A full Twitter index in your Thinkup

Posted: April 25th, 2012 | Filed under: Research, Talks

An interesting bit of news came to light at Privacy International a while back: “What does Twitter know about its users?”

Residents of the EU can request from Twitter all of the data it has stored about them, in accordance with European data protection laws (just follow the steps). Some Twitter users have requested their data and filled in the necessary paperwork. After a while they received all of their records, including a file containing all of their tweets.

I had seen Martin Weber’s post about this before, but when I saw Anne Helmond post about her experiences as well, I was prompted to carry out an idea I’d had for a while: to import an entire Twitter archive into Thinkup to complement the partial archive it contains of my longtime Twitter use (since September 2006).


I use Thinkup myself enthusiastically to supplement existing archival, statistics and API functionality around the web and, more importantly, to have it under my own control. These services serve as my social memory and it is nice to have a copy of them that can’t disappear because of some M&A mishap. It has proven useful more than once to be able to search through either all of my tweets or all of my @replies. But as noted, Thinkup can only go back 3200 tweets from the moment you first install it because of Twitter API limits. For people like me (35k tweets) or Anne (50k tweets), that’s just not enough.

I installed a fresh Thinkup on a test domain, asked Anne and Martin for (sample) files and went at it. The command line being the easiest route, I took the upgrade.php script and ripped out most of its innards. I then spent an afternoon scouring the Thinkup source code to see how it does a Twitter crawl itself, and mirrored that functionality. PHP is not my language of choice (by a long shot), but I have dabbled in it occasionally and with a bit of a refresher it is pretty easy to get going.

I finally managed to insert everything into the right table using the Thinkup DAO, but it still wasn’t showing anything. Gina Trapani, Thinkup’s creator, told me which tables I had to supplement for the website to show something, and after that it worked! A fully searchable archive of all your tweets in Thinkup.
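To give an idea of what such an import boils down to, here is a minimal sketch. It is written in Python rather than the PHP of the actual gist, and it rests on assumptions that are mine, not Thinkup's: the archive is taken to be a list of records with `id`, `user`, `created_at` and `text` fields (the real Twitter export may look different), the `posts` table and its columns are hypothetical stand-ins for Thinkup's real schema, and SQLite stands in for Thinkup's database. The account names and tweets are made up.

```python
import sqlite3


def import_archive(conn, tweets, tracked_users):
    """Insert archived tweets, skipping duplicates and untracked accounts.

    The `posts` table is a hypothetical stand-in, not Thinkup's schema.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "post_id INTEGER PRIMARY KEY, author TEXT, pub_date TEXT, post_text TEXT)"
    )
    inserted = 0
    for tweet in tweets:
        # Only import tweets that belong to an account the install tracks.
        if tweet["user"] not in tracked_users:
            continue
        # INSERT OR IGNORE leaves rows that are already present untouched,
        # so tweets the regular crawl has stored are not duplicated.
        cursor = conn.execute(
            "INSERT OR IGNORE INTO posts VALUES (?, ?, ?, ?)",
            (tweet["id"], tweet["user"], tweet["created_at"], tweet["text"]),
        )
        inserted += cursor.rowcount
    conn.commit()
    return inserted


# Made-up sample data in the assumed record layout.
sample_archive = [
    {"id": 1, "user": "alice", "created_at": "2006-09-01T12:00:00", "text": "first tweet"},
    {"id": 2, "user": "alice", "created_at": "2006-09-02T09:30:00", "text": "second tweet"},
    {"id": 2, "user": "alice", "created_at": "2006-09-02T09:30:00", "text": "second tweet"},
    {"id": 3, "user": "bob", "created_at": "2006-09-03T18:45:00", "text": "not tracked"},
]

conn = sqlite3.connect(":memory:")
imported = import_archive(conn, sample_archive, tracked_users={"alice"})
print(imported)
```

Keying the table on the tweet id is what makes the import safe to run more than once. As described above, inserting the posts alone was not enough for Thinkup to display them, so a real import has to update other tables as well.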

web_martin on Twitter | ThinkUp

The code is a gist on GitHub right now and not usable (!) without programming knowledge. It is hackish and needs to be cleaned up, but it works. (Among many, many other things, it should scan the available instances and only import tweets if they match an instance in your install.) Ideally this would eventually become a plugin for Thinkup, but that is still a bit off.

What’s the point of all this? There are a couple:

First, it shows that data protection laws such as the ones we have in Europe do have an effect (see also for instance: Europe v. Facebook). Even on the internet, laws have teeth and practical applications. Data protection laws can be useful if they are drafted on general principles and applied judiciously.

But the result you get, a massive text file in your inbox, is not the most usable way to explore half a decade’s worth of social media history. That’s where Thinkup comes in. Its brilliant functionality makes this data live again and magnifies, for each person, the effect of their data request.

Secondly, for any active user of Thinkup, supplementing their archive with a full history is a definitive WANT feature. Twitter has been very lax in providing access to more than the last 3200 tweets. If a lot of users used this analog API, the data request, to demand their tweets, Twitter might be forced to create a general solution sooner.

Lastly, Thinkup has applied for funds with the Knight Foundation to turn itself into a federated social network piggy-backed on top of the existing ones. Thinkup would draw in all of the data that is already out there into its private store and then build functionality on top of that (sort of an inverse Privatesquare). Having access to all of your data would be a first step for any plan that involves data ownership and federation.

I presented this hack yesterday at the Berlin Hack and Tell. Your ideas and comments and help are very welcome.

Personal Mapping Platforms

Posted: January 18th, 2011 | Filed under: Research, Statlas

For the project Statlas we are looking into making a personal mapping platform for journalists. We submitted the grant proposal for this almost half a year ago and the idea had been alive for far longer (we started about this time last year).

It’s good to see that there is a wider trend in consumer mapping platforms right when we are underway with ours. Here’s a brief survey of the ones we found during a cursory examination. There are bound to be more. If you know of any, please let us know in the comments.


ArcGIS has a mapping platform, probably based on the ArcGIS server, a paid-for cloud mapping platform.

ArcGIS - My Map

It looks nice, like a web-based version of Google Maps combined with Google Earth, with all the different overlays you can put on there. I tried to create a map and share it on Facebook, which oddly enough did not work. The sharing, embedding and standalone map versions do look well thought out, but if they don’t work they have probably not been tested well.



ESRI, the company behind ArcGIS, has another ‘Make a map’ tool which is a lot more restricted but, because of that, provides a clearer experience.

Make a Map | Free Embeddable Maps | Embed Map Web Page | Embedded Maps

This doesn’t offer a ridiculous amount of options, but it is very clear and nicely done and the sharing options are also very straightforward. An embed of that map is below:


Dotspotting is Stamen’s platform for putting dots on a map, currently in its ‘SUPER ALPHA-BETA-DISCO-BALL VERSION’. As they describe it, it’s intended to make the process of visualizing city data easier, more open and more robust.

Those are pretty much the same reasons we started on this road in the first place. Mapping and data literacy are necessary in web development, and the other way around: web literacy is necessary for those who make the heavy-duty maps. The two need to meet to create the applications and ease of use we are looking for.

Dotspotting - "My Flickr Photos (Sept 2010)", a sheet of dots by straup

A script to export my Foursquare checkins in an easy way and create a sheet with those is forthcoming. Anyway, Statlas is best described as that: a way to project values onto regions and enable people to play with that dynamic.
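To make "projecting values onto regions" a bit more concrete, here is a minimal sketch of the core step behind that kind of map: turning one value per region into a colour class. Everything in it is an assumption for illustration. The function name, the region names, the values and the palette are made up, and equal-interval classes are just one simple choice, not necessarily what Statlas will use.

```python
def classify(values_by_region, palette):
    """Assign each region a colour from `palette` using equal-interval classes."""
    lowest = min(values_by_region.values())
    highest = max(values_by_region.values())
    # Fall back to 1 so identical values do not cause a division by zero.
    span = (highest - lowest) or 1
    classes = len(palette)
    colours = {}
    for region, value in values_by_region.items():
        # Scale the value to a class index; the maximum lands in the last class.
        index = min(int((value - lowest) / span * classes), classes - 1)
        colours[region] = palette[index]
    return colours


# Made-up values and an arbitrary light-to-dark palette.
palette = ["#eff3ff", "#bdd7e7", "#6baed6", "#2171b5"]
values = {"region_a": 0, "region_b": 25, "region_c": 60, "region_d": 100}

colours = classify(values, palette)
for region, colour in colours.items():
    print(region, colour)
```

The resulting region-to-colour mapping is what a map renderer would then use to fill each region's shape. Letting people swap the values, the palette or the classification is the "playing with that dynamic" part.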

Weet meer

Weet Meer launched very recently as a beta release and has limited availability until next month. It does a decent job of displaying the statistics offered by the CBS (Statistics Netherlands) and offers some statistical relations and tools for comparing things.

Weetmeer.nl - Aantal inwoners (number of inhabitants)

That is a brief overview of what is already out there. We’re glad that our timing is good and that we can develop ours to fulfill an actual need: to easily map a set of values onto a group of regions.