It is possible for residents of the EU to request from Twitter all of the data it has stored about them, in accordance with European data protection laws (just follow the steps). Some Twitter users have requested their data and filled in the necessary paperwork. After a while they received all of their records, including a file containing all of their tweets.
I had seen Martin Weber’s post about this before, but when I saw Anne Helmond write about her experience as well, I was prompted to carry out an idea I’d been sitting on: to import an entire Twitter archive into Thinkup to complement the partial archive it holds of my longtime Twitter use (since September 2006).
I myself use Thinkup enthusiastically to supplement existing archiving, statistics and API functionality around the web and, more importantly, to have it under my own control. These services serve as my social memory and it is nice to have a copy of them that can’t disappear because of some M&A mishap. It has proven useful more than once to be able to search through either all of my tweets or all of my @replies. But as noted, Thinkup can only go back 3200 tweets from the moment you first install it, because of Twitter API limits. For people like me (35k tweets) or Anne (50k tweets), that’s just not enough.
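Why that cap bites: a crawler pages backwards through a user’s timeline, but Twitter’s API stops returning anything older than roughly the most recent 3200 tweets. A rough sketch of that paging loop (in Python rather than Thinkup’s PHP, with `fetch_page` and `make_fetch` as made-up stand-ins for the real `statuses/user_timeline` call):

```python
# Sketch of paging backwards through a timeline with max_id until the API
# runs dry. fetch_page simulates the API; it is not Thinkup's actual code.
def crawl_backwards(fetch_page):
    """Collect tweets by paging backwards until no more are returned."""
    tweets, max_id = [], None
    while True:
        page = fetch_page(max_id)
        if not page:
            return tweets
        tweets.extend(page)
        max_id = page[-1]["id"] - 1  # continue below the oldest tweet seen

def make_fetch(all_ids, cap_floor, page_size=200):
    """Simulate the API: only the most recent tweets (id > cap_floor) exist."""
    def fetch(max_id):
        visible = [i for i in all_ids
                   if i > cap_floor and (max_id is None or i <= max_id)]
        return [{"id": i} for i in visible[:page_size]]
    return fetch

# 5,000 tweets ever posted, but the API only exposes the newest 3,200
fetch = make_fetch(list(range(5000, 0, -1)), cap_floor=1800)
print(len(crawl_backwards(fetch)))  # → 3200
```

However diligently you page, everything below the cap floor is simply unreachable, which is exactly the gap the data-request archive fills.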
I installed a new Thinkup on a test domain, asked for (sample) files from Anne and Martin and went at it. Command-line being the easiest, I took the upgrade.php script, ripped out most of its innards, and spent an afternoon scouring the Thinkup source code to see how it does a Twitter crawl itself, then mirrored that functionality. PHP is not my language of choice (by a long shot), but I have dabbled in it occasionally and with a bit of a refresher it is pretty easy to get going.
I finally managed to insert everything into the right table using the Thinkup DAO but it still wasn’t showing anything. Gina Trapani —Thinkup’s creator— told me which tables I had to supplement for the website to show something and after that it worked! A fully searchable archive of all your tweets in Thinkup.
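The real import goes through Thinkup’s PHP DAO against its MySQL tables; as a language-neutral illustration of the idea, here is a Python sketch. The tab-separated archive format and the SQLite stand-in for the tweets table are assumptions for the example, not Twitter’s or Thinkup’s actual schema. Duplicates are skipped so re-running the import is safe:

```python
# Illustrative sketch only: assumes a hypothetical archive format of one
# tab-separated line per tweet: tweet_id <TAB> timestamp <TAB> tweet text.
import sqlite3

def import_archive(lines, conn):
    """Insert archived tweets into a minimal tweets table, skipping duplicates."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "  tweet_id INTEGER PRIMARY KEY,"
        "  pub_date TEXT,"
        "  tweet_text TEXT)"
    )
    inserted = 0
    for line in lines:
        tweet_id, pub_date, text = line.rstrip("\n").split("\t", 2)
        # INSERT OR IGNORE leaves already-imported tweets untouched
        cur = conn.execute(
            "INSERT OR IGNORE INTO tweets (tweet_id, pub_date, tweet_text) "
            "VALUES (?, ?, ?)",
            (int(tweet_id), pub_date, text),
        )
        inserted += cur.rowcount
    conn.commit()
    return inserted

conn = sqlite3.connect(":memory:")
archive = [
    "1\t2006-09-01T12:00:00\tmy first tweet",
    "2\t2011-07-01T09:30:00\tanother tweet",
    "1\t2006-09-01T12:00:00\tmy first tweet",  # duplicate, will be skipped
]
print(import_archive(archive, conn))  # → 2
```

The same shape applies in Thinkup: parse the archive file, insert into the tweets table, and (as Gina pointed out) supplement the related tables so the web frontend actually picks the tweets up.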
The code is a gist on Github right now and not usable (!) without programming knowledge. It is hackish and needs to be cleaned up, but it works ((It should scan available instances and only import tweets if they match an instance in your install among many many other things.)). Ideally this would eventually become a plugin for Thinkup but that is still a bit off.
What’s the point of all this? There are a couple:
First, it shows that data protection laws such as the ones we have in Europe do have an effect (see also, for instance, Europe v. Facebook). Even on the internet, laws have teeth and practical applications. Data protection laws can be useful if they are drafted on general principles and applied judiciously.
But the result you get, a massive text file in your inbox, is not the most usable way to explore half a decade’s worth of social media history. That’s where Thinkup comes in. Its brilliant functionality serves as a way to make this data live again and magnifies for each person the effect of their data request.
Second, for any active user of Thinkup, supplementing their archive with a full history is a definitive WANT feature. Twitter has been very lax in providing access to more than the last 3200 tweets. If a lot of users used this analog API to demand their tweets, Twitter might be forced to create a general solution sooner.
Lastly, Thinkup has applied for funds with the Knight Foundation to turn itself into a federated social network piggy-backed on top of the existing ones. Thinkup would draw in all of the data that is already out there into its private store and then build functionality on top of that (sort of an inverse Privatesquare). Having access to all of your data would be a first step for any plan that involves data ownership and federation.
I have written here before about the need for web developers to learn more about GIS technologies and how to either work with or work around the traditional geographical software packages and data formats. There is a lot of synergy to be achieved in working together.
In the summer lull over at Hack de Overheid we are organizing a day of programming at a fortress which in itself already is a unique event: Apps for Noord Holland. But during the day the people from ESRI will give a workshop about geo data which we think is very worthwhile for any programmer who wants to get started in this field.
So if you want to spend a day on a fortress learning about GIS and programming, go right ahead and register. It promises to be a terrific day.
Here are the slides for a talk I gave at /dev/haag last Friday, ambitiously titled “Fixing Reality with Data Visualization”, which was well received. I promised to write it up here, so here it is.
Starting off with some introductions. We are Monster Swell, this equation is the central challenge of our practice.
To start with the title inspiration for this talk. I recently finished this book by Jane McGonigal.
“Reality is Broken” by Jane McGonigal recently came out. The title is not literally true, but it is quite opportune. Reality isn’t broken, but there is, as always, a lot that can be improved. Slapping a gamification label on that is a false exit because it implies that such improvement can be done easily by the magic of games.
The core idea of the book is that:
1. Reality can be fixed by game mechanics (voluntary participation, epic stories, social collaboration, fitting rewards), and
2. That reality should be fixed by game mechanics.
Both of these points, the possibility and the desirability of doing so, are the subject of fierce debate both within game design circles and outside of them.
We are now seeing a superficial trend of gamification, badge-ification and pointification, where everybody rushes to add as many ‘game-like’ features as possible to their application or concept in order to look tuned in to the fun paradigm.
Fortunately this does not work. Checking in for points and badges is fun at first, but is hardly a sustainable engagement vector. Foursquare mostly did a bait and switch with their game until they got enough critical mass to be useful along other vectors.
Things that are difficult remain difficult even if they are gamified. ‘An obstacle remains an obstacle even with a cherry on top.’
Ian Bogost terms this exploitationware. Our own discussions concluded that if you are not the one playing, you are being played.
In our practice we look for deeper ways to engage people and affect them. There are hardly any one-to-one mappings to be found and the effects that are most worthwhile are the higher order ones. As Kars Alfrink says:
“We don’t tell them to coordinate, we create a situation within which the way to win is to coordinate.”
Corollary: A game about violence does not immediately make people violent.
Coming back to the map parallel, this picture of center pivot irrigation systems (by NASA) in Garden City, Kansas looks awfully similar to the goban and this is just an aerial photograph with some processing applied to it.
So to come to this point:
‘Any sufficiently abstract game is indistinguishable from a data visualization.’
The difference is just that a game is a visualization of a game model and its rules. The whole point of playing a game is learning those rules, and uncovering the model of the game is in essence ‘breaking’ it. After that point it usually ceases to be fun.
And its complementary point:
‘Any sufficiently interactive data visualization is indistinguishable from a game.’
And indeed the best ones are highly interactive and offer various controls, abstraction levels and displays of data deep enough to engage users/players for a long time. It is also the reason that in our practice we don’t occupy ourselves much with visualizations in print media.
To continue the point about games: many games are either quite concrete or very abstract simulations. This is most obvious with sim games such as Sim City pictured below.
Simulations are subjective projections of reality, both because of the choices the designer has embedded in the projection and because of the player’s interpretation: their ingrained notions shape how they read the simulation.
Bogost says that all games are in some way simulations, and that any simulation is subjective. The response people have to this subjectivity is either resignation (uncritically subjecting oneself to the rules of the simulation, taking it at face value) or denial (rejecting simulations wholesale since their subjectivity makes them useless). Taken together, Bogost calls these reactions simulation fever: a discomfort created by the friction between our idea of how reality functions and how it is presented by a game system. The way to shake this fever, says Bogost, is to work through it, that is to say, to play in a critical way and become aware of what the simulation includes and excludes.
I think we could use the correspondence between games and visualizations to coin a corresponding term called Visualization Fever.
Those are my most important points, that good and interesting games and good and interesting data visualizations share many of the same characteristics. We can use data and its correspondence with reality (or lack thereof) to create a similar fever.
(This graphic is somewhat rudimentary but it was made within Keynote in five minutes and I hope it gets the point across.)
The visualization process shares a lot of similarities with the open data process that we are involved in. It is a perpetual conversation, and the visual part is only one place where it can be improved. Data collection, discussion of results and errors, sharing of data and the resulting products, controllability of the outputs, being able to remix and reuse them, and feeding this process back into the world of atoms are all areas that need active participation.
There is nothing easy about this. It is a ton of hard work and long tedious conversations. Fortunately most of it is worth it.
Some examples of visualization fever in action.
Verbeter de Buurt is the Dutch version of See Click Fix and it works admirably. It creates a subjective map of an area with the issues that a group of people have signalled in their neighborhood. Nothing is said about who these people are or whether these issues are indeed the most pressing ones (we all know the annoying neighbour who complains about dog poo to whoever will listen). By making issues visible, this map imposes its view of the city onto the councils and exerts change.
Planning at an urban scale is a very difficult process. These planning stages are being opened up to the general public using consultation and other means, but it remains to be seen if and how citizens can comprehend the complex issues that underlie city planning.
One step to help both experts and laypeople come to grips with the city they inhabit is to create macroscopes: single views that show a system at its full scale, with everything in it, in such a way that we can make (some) sense of it. These Flowprints by Anil Bawa-Cavia are a great example of doing this for public transportation.
And done right, these visualizations can reveal the systems of the world, or in this case the flow of trains in the Netherlands. Everybody knows how crowded Dutch rail is and which trains go where along which routes, but actually seeing it happen in front of your eyes in a real-time visualization gives you an insight and a tangible grip on the system that you did not have before.
So what do we fix?
We use visualizations and their compressed interactive views to expose system design choices and errors. They can also be used to give depth to a specific point, something which journalists are increasingly finding necessary. People consuming data-heavy news want to be able to poke at that data themselves.
A lot of visualizations I have seen thus far serve as little more than reinforcement of pre-existing judgements, almost as if the person creating the visualization sought to build that which they wanted to see. Visualizations will need to be better, more flexible and draw upon more data if we want to break out of these troughs of shallow insight.
The point, also made by the nice people at Bloom, is that having a visualization serve solely as visual output is too limited a use of the interactions created. You should be able to use the same interactions in the visualization to also influence the underlying model, either directly or indirectly. That is to say, the model and the representation should influence each other bidirectionally.
Planetary, the latest app by Bloom, is a great example of that. It shows you a beautifully crafted astromusical view, but it also allows you to play your music library from within that very same visualization.
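The bidirectional idea can be reduced to a toy model–view pair (all names made up; Python used purely for illustration): the view renders the model, and interacting with the view writes back into that same model, so the next render reflects the change.

```python
# Toy sketch of a bidirectional model-view pair. Not Bloom's code; just the
# shape of the idea: interaction flows back into the model it visualizes.
class Model:
    def __init__(self):
        self.tracks = {"Track A": 0, "Track B": 0}  # play counts

    def play(self, track):
        self.tracks[track] += 1

class View:
    def __init__(self, model):
        self.model = model

    def render(self):
        # visualize the model: tracks ordered by play count, busiest first
        return sorted(self.model.tracks.items(), key=lambda kv: -kv[1])

    def click(self, track):
        # interaction in the visualization changes the underlying model...
        self.model.play(track)
        # ...and the next render reflects that change
        return self.render()

view = View(Model())
view.click("Track B")
print(view.render()[0][0])  # → Track B
```

A visualization wired up this way stops being a read-only report and becomes a control surface for the thing it depicts.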
We need to bring visualization and deep data literacy to the web and infuse every relevant site and system (that is to say, all of them) with them. Many people asking for data visualization think it is some magical fairy dust that will make a site awesome by its very touch. This is of course not true.
Data and interactive visuals can generate value and insight for any site that employs them properly.
In his presentation Data Visualization for Web Designers, Tom Carden remarks that web developers already know how to do all of this. These are exactly the tools we have been employing over the last years to create interactive experiences (and we plan to use them more and more).
Internet Explorer is still the crippled old man of the web, but given understanding clients (and users) and some compatibility layers, you may be able to get away with using a lot of this stuff as long as the result is awesome enough.
The other trend is the idea that bridges need to be built between web people and GIS people, preferably bridges that show how to create GIS-like experiences using the affordances the web requires. A trend we had been thinking about was neatly summarized (blog) by Mike Migurski at a #NoGIS meetup.
GIS people have tremendous tools and knowledge, but they are not accustomed to working in a very web way: quick, usable, beautiful. Web people can build nice sites pretty quickly, but they tend to fall flat when they need to work with geographical tools that are more complex than the Google Maps API.
If we can combine these two powers, the gains will be immense.
We can create subjective views to exert power upon reality and try to fix things for the better. The subjectivity is not a problem, as often the values embedded in the views are the very point. Subjectivity creates debate and debate moves things forward.
The tools we have to create these views are getting ever more powerful, but there is also a lot of work to be done.
As a wise man said: “The best way to complain is to make things.” (picture)
Our Alper has joined the board of Hack de Overheid, a Dutch think tank that creates software and events to advance thinking about transparent government and open data in the Netherlands. Actually more of a do tank in that respect.
Each year Hack de Overheid holds a developer day where civically inclined programmers gather to exchange knowledge and create new open data projects, with or without the government’s consent.
This year the devcamp is part of a broader program along with an application contest for local data and local applications in the city of Amsterdam called Apps for Amsterdam. There is a lot of momentum and it looks like open data is finally being taken seriously.
Until the event, updates here may be a bit sparse, but do register for the March 12th event if you have any interest in data and let’s create something great together.
For the project Statlas we are looking into making a personal mapping platform for journalists. We submitted the grant proposal for this almost half a year ago and the idea had been alive for far longer (we started about this time last year).
It’s good to see that there is a wider trend in consumer mapping platforms right when we are underway with ours. Here’s a brief survey of the ones we found during a cursory examination. There are bound to be more. If you know them, please let us know in the comments.
Looks nice, like a web-based version of Google Maps combined with Google Earth, with all the different overlays you can put on there. I tried to create a map and share it on Facebook, which oddly enough did not work. The sharing, embedding and standalone map versions do look well thought out, but if they don’t work they probably weren’t tested well.
ESRI, the company behind ArcGIS, has another ‘Make a map’ tool which is a lot more restricted but because of that provides a clearer experience.
This doesn’t offer a ridiculous number of options, but it is very clear and nicely done, and the sharing options are also very straightforward. An embed of that map is below:
Dotspotting is Stamen’s platform for putting dots on a map, currently in its ‘SUPER ALPHA-BETA-DISCO-BALL VERSION’. As they describe it, it’s intended to make the process of visualizing city data easier, more open and more robust.
Those are pretty much the same reasons we started down this road in the first place. Mapping and data literacy are necessary in web development, as well as the other way around: web literacy is necessary for those who make the heavy-duty maps. The two need to meet to create the applications and ease of use we are looking for.
A script to easily export my Foursquare checkins and create a sheet from them is forthcoming. Anyway, Statlas is best described as that: a way to project values onto regions and enable people to play with that dynamic.
Weet Meer launched very recently in a beta release and is available in limited form until next month. It does a decent job of displaying the statistics offered by the CBS and offers some statistical relations and tools for comparison.
That is a brief overview of what is already out there. We’re glad that we have hit a nice window to develop ours and fulfill an actual need: to easily make maps from a set of values onto a group of regions.
The technology demo of Dutchstats as presented during Hack de Overheid last year has been a nice trigger for further development along that axis. For now the project under codename Dutchstats2 —a new name and identity is forthcoming— will be underway.
We got the announcement a couple of weeks ago that our proposal for subsidy had been accepted. We spent December gauging interest, taking in the project and building a team that can execute this in Q1 2011. Team introductions forthcoming after we’ve kicked it off.
The assignment is still the same one that prompted the original Dutchstats:
Given a set of values for a set of geographical regions visualize the mapping from the values to the regions in a way that is interesting, useful and pleasant.
Simple enough to be doable. Broad enough to be generally applicable.
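In its most reduced form the assignment is a classification problem: bucket each region’s value into a handful of classes that a color ramp can then shade. A minimal sketch using equal-interval breaks (the region names, values and class count are illustrative only, not project data):

```python
# Minimal choropleth classification sketch: map each region's value to a
# class index 0..n_classes-1 using equal-interval breaks. Illustrative only.
def classify(values, n_classes=3):
    """Return {region: class index} for a {region: value} mapping."""
    lo, hi = min(values.values()), max(values.values())
    span = (hi - lo) or 1  # avoid division by zero when all values are equal
    return {
        region: min(int((v - lo) / span * n_classes), n_classes - 1)
        for region, v in values.items()
    }

# hypothetical turnout percentages per municipality
turnout = {"Amsterdam": 62.0, "Utrecht": 71.5, "Rotterdam": 55.3}
classes = classify(turnout)
print(classes)  # → {'Amsterdam': 1, 'Utrecht': 2, 'Rotterdam': 0}
```

Each class index would then pick a shade from a color ramp. Equal intervals are only one choice of breaks; quantiles or manually chosen thresholds change the story the map tells, which is exactly where the subjectivity discussed above comes in.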
The original Dutchstats was mainly concerned with Dutch municipalities as geographical regions and election results as values and we will be continuing along that line, but we will be looking into opening up both the values and also the geographical regions for anybody who has something to contribute to either. The idea is to create a generative atlas.
A generative atlas mostly to see if we can give the concept of an atlas new currency in the online world.
In the Netherlands there is an atlas called the Grote Bosatlas, which is still the standard atlas for everybody in and out of school. But when you ask people when they last even thought of an atlas, let alone picked up and leafed through a Bosatlas, everybody draws a complete blank. Google Maps has supplanted most of the topographical and wayfinding functionality of paper maps and atlases, to the extent that it has wiped the original concept out of people’s heads.
The social geographical function of the atlas has been replaced by a ton of projects working either with or on Google Maps/Earth, using GIS or placing points on the map (using location or geocoded data); Stamen’s Dotspotting is a good example of that. Besides those web-centric approaches, there is also a slew of closed or semi-closed mapping tools from statistical offices, government bodies and the like that are built on poor, closed technology and are limited to the task at hand (which they usually do poorly at that).
We’re going to determine as we go the technology that we’re going to use, but the project needs to be webcentric and is allowed to be bleeding edge (though perhaps not as bleeding as the original prototype) so I hope we can avoid using Flash completely.
Depending on how much of the base components are already available (data stores, tile servers, rendering engines), we will be focusing more on the application part. But if such components are not yet available or up to par, we will be investing in building them ourselves.
In our practice we believe in standing on the shoulders of giants, sharing alike and giving credit where credit is due. We will be doing this project completely in the open not because we don’t have a customer for it but because everybody is a potential customer and they should be able to see and participate from the earliest stages on.
Any software that we produce will be released under a very liberal open source license, so that anybody can use our stuff, and we hope to advance the state of mapping online in our own modest way. All our design research and progress will also be posted to this blog in chunks of a week or a bit more (depending on our sprints).
Fully open is the only way we can imagine doing this. We hope you will join us.
Alper used the opportunity to take three minutes to address the council before the meeting and posted a call to action for better and more effective digital public services using open data and asked the city to open up more of its data.
When the proposal finally came up for discussion, it was adopted near unanimously (tweet) by the entire council, with a positive recommendation by the alderman as well. The alderman commented that, because he used to be an open source developer, an open data project had been on his list of things to do for a while now and he welcomed this proposal. His idea was to spend the allocated €10,000 on projects in the form of bounties to maximize the effectiveness and first grab the low-hanging fruit.
An open data policy will provide benefits to the City, which include:
enhanced government transparency and accountability
development of new analyses or applications based on the unique data the City provides
mobilization of San Francisco’s high-tech workforce to use City data to create useful civic tools at no cost to the city
creation of social and economic benefits based on innovation in how residents interact with government stemming from increased accessibility to City data sets
City departments should take further steps to make their data sets available to the public in a more timely and efficient manner.
It would seem that the time is now ripe to push this agenda through local legislative bodies. Given the current trend towards better digital services and transparency a suitably drafted proposal for open data with a realistic goal can scarcely have any opponents.
We’re going to look into passing more proposals towards open data like this following the lead of Amsterdam.
Update: the minutes for the commission meeting have been posted: Dutch PDF