We help people with their data by creating compelling visual representations and interactive tools.

Designing in the Face of Defeat

Posted: May 13th, 2012 | Filed under: Inspiration, Talks

I initially called this talk, which I gave for the Willem de Kooning Academy’s CrossLab night, ‘New Design for a New Aesthetic’, but I reconsidered that title. Not because of the person who took semantic issue with the idea of a ‘new aesthetic’; I couldn’t really care less about that. The idea that there can be a new design that addresses the issues within the New Aesthetic is just too ambitious. We cannot possibly succeed, which is why I’m calling the discipline we’re engaged in ‘designing in the face of defeat’ (I blogged about this before). It is what we will be doing for the foreseeable future.

I pre-rolled a screen capture of Aaron Straup Cope’s Wanderdrone to add ominous foreboding to the mix of design/advertising enthusiasm permeating the room. CrossLab called the night a night about Dynamic Design, which I didn’t really get, but I reworked an old talk about algorithmic design, now heavily updated to incorporate current thinking about algorithms, the New Aesthetic and object-oriented ontology.

notlion.github.com/streetview-stereographic/#o=.022,.363,-.645,.673&z=1.623&mz=18&p=52.50029,13.41857
Our offices in Berlin-Kreuzberg

As Monster Swell we do a lot of work with maps, so we are affected by the fact that mapping is being turned on its head. That is not because there aren’t enough interesting maps; there are now more than ever before. Here are a couple.

Eric Fischer’s Twitter Traffic Maps of various cities, here New York:
Is this the structure of New York City?

Google Hotel Finder with an isochrone projection in London:
Google Hotel Finder

Timemaps, a map of the Netherlands distorted by the amount of travel time required during various times of day:
Timemaps

But right now a projection inversion is going on: much of the time we no longer project the real world onto flat surfaces and call that a map, but instead overlay maps themselves back onto reality. And what we call maps no longer needs to have any relation to physical reality; we can map anything onto anything using any (non-)geometric form we choose.

This is mainly a consequence of us putting the internet into maps. But if you think again, maps are not the only place where we put the internet. We put the internet into pretty much everything by now.

So maps are creeping back into the real world and we get odd clashes when we try to overlay a map back onto the territory or when we try to perfectly capture a capricious world, as you can see in these Google Maps and Street View examples: 1, 2, 3, 4. I don’t know how long they will be online over at the New Aesthetic Tumblr since that has been closed.

We got QR codes to enable the machine readable world. These hardly have any real world use (just go over to the WTF QR Codes Tumblr) but they function more as cultural icons, precursors of a strange and inscrutable future.

And even more interestingly they are being used for instance by the Chinese to calibrate spy satellites. So these are maps on the earth that are being used to create better maps of the earth.

The New Aesthetic is when this kind of projection inversion happens more widely, not just in the realm of maps, but in all of the places in the world that the internet touches. By now that is nearly everything. The examples that were being collected over at the New Aesthetic Tumblr showed how the arts were picking up on this trend.

All of these things have been created by algorithms, which are not as mysterious as many people make them out to be. Algorithms are how computers work and increasingly how the world works. They codify behaviour and, quoting Robert Fabricant, as designers ‘behaviour is our medium’. Being a designer should entail more than a passing knowledge of and proficiency with algorithms. We are moving into a world where creative work is becoming procedural. The most important media are prescriptive and set rules for the world more than they are descriptive and depict the world.

The real problem with algorithms is that they often involve us but they are completely alien to us (in the Bogostian sense). They are operationally closed. Operational closure means that things may work in ways that are not at all obvious to us, neither at first nor after we poke into them, because any kind of sense we make of them is either partial or does not translate into our frame of reference. Algorithms take inputs and produce outputs, but the way they operate on these has nothing to do with how we as humans think about the world. ‘Think’ is not even the right word, but we try to relate to them from our human cognition. The machines see us, but they do not ‘see’ us in any way we would recognize as seeing and we have no idea what it is that they see.

The Machine Pareidolia experiment over at Urban Honking is a good example.

The ability to see faces in things is a basic aspect of our visual pattern recognition. When we teach that same skill to computers we get unexpected consequences. It is the same with the flash crash on the stock market that happens in the blink of an eye without anybody really knowing what caused it. The rationales of the algorithms are opaque to us and their emergent behaviour unpredictable.

As Kevin Slavin mentioned in an interview: the more autonomous the algorithms are and the more effects they have on our daily lives, the more we may be accommodating them without realizing it.

There was this story recently that scientists have created a robot fish that is so good at mimicking the behaviour of regular fish that it can become their leader. This is what worries me. Who says we are not all following robot fish most of the time?

So that is what I think is the biggest challenge right now for designers. Try to create systems that harness the open and generative power of the internet while on the other hand remaining human and aligned with human interest. One way would be to make the internals of algorithms transparent so people can enter into an informed relationship with them.

Unfortunately there are no magic bullets for this, whatever your local design visionary has been telling you. There never have been. Everything is made up of withdrawn objects that are mediated towards one another with unexpected consequences. To quote Graham Harman from Prince of Networks:

“the engineer must negotiate with the mountain at every stage of the project, testing to see where the rock resists and where it yields, and is quite often surprised by the behaviour of the rock.”

There are no ideas that will solve all problems, there are no products that will do everything. There is only the work through which we may gain more understanding and make better things. So with that, I hope we all can do good work.


A full Twitter index in your Thinkup

Posted: April 25th, 2012 | Filed under: Research, Talks

An interesting bit of news came to light at Privacy International a while back: “What does Twitter know about its users?”

Under European data protection laws, residents of the EU can request from Twitter all of the data it has stored about them (just follow the steps). Some Twitter users have requested their data and filled in the necessary paperwork. After a while they received all of their records, including a file with all of their tweets in it.

I had seen Martin Weber’s post about this before but when I saw Anne Helmond post about her experiences as well, I was prompted to carry out the idea I’d had before: to import an entire Twitter archive into Thinkup to complement the partial archive it contains of my longtime Twitter use (since September 2006).

http://thinkupapp.com/

I use Thinkup enthusiastically to supplement existing archival, statistics and API functionality around the web and, more importantly, to have it under my own control. These services serve as my social memory and it is nice to have a copy of them that can’t disappear because of some M&A mishap. It has proven useful more than once to be able to search through either all of my tweets or all of my @replies. But as noted, Thinkup can only go back 3200 tweets from when you first install it because of Twitter API limits. For people like me (35k tweets) or Anne (50k tweets), that’s just not enough.

I installed a new Thinkup on a test domain and asked for (sample) files from Anne and Martin and went at it. Command-line being the easiest, I took the upgrade.php script, ripped out most of its innards and spent an afternoon scouring the Thinkup source code to see how it does a Twitter crawl itself and mirrored the functionality. PHP is not my language of choice (by a long shot), but I have dabbled in it occasionally and with a bit of a refresher it is pretty easy to get going.

I finally managed to insert everything into the right table using the Thinkup DAO, but it still wasn’t showing anything. Gina Trapani, Thinkup’s creator, told me which tables I had to supplement for the website to show something, and after that it worked! A fully searchable archive of all your tweets in Thinkup.

web_martin on Twitter | ThinkUp

The code is a gist on Github right now and not usable (!) without programming knowledge. It is hackish and needs to be cleaned up, but it works ((It should scan available instances and only import tweets if they match an instance in your install among many many other things.)). Ideally this would eventually become a plugin for Thinkup but that is still a bit off.

What’s the point of all this? There are a couple:

First it shows that data protection laws such as the ones we have in Europe do have an effect (see also for instance: Europe v. Facebook). Even on the internet laws have teeth and practical applications. Data protection laws can be useful if they are drafted on general principles and applied judiciously.

But the result you get, a massive text file in your inbox, is not the most usable way to explore half a decade’s worth of social media history. That’s where Thinkup comes in. Its brilliant functionality serves as a way to make this data live again and magnifies for each person the effect of their data request.

Secondly, for any active user of Thinkup, supplementing their archive with a full history is a definitive WANT feature. Twitter has been very lax in providing access to more than the last 3200 tweets. If a lot of users used their analog API to demand their tweets, Twitter may be forced to create a general solution sooner.

Lastly, Thinkup has applied for funds with the Knight Foundation to turn itself into a federated social network piggy-backed on top of the existing ones. Thinkup would draw in all of the data that is already out there into its private store and then build functionality on top of that (sort of an inverse Privatesquare). Having access to all of your data would be a first step for any plan that involves data ownership and federation.

I presented this hack yesterday at the Berlin Hack and Tell. Your ideas and comments and help are very welcome.


Early 2012 Schedule

Posted: February 3rd, 2012 | Filed under: Events

The year has started nicely and we already have a nice line-up of events. Thursday a week ago saw the iBestuur Congress in the Netherlands, where the winners of the Apps voor Nederland competition were announced. I’m pleased that we managed to shape the data and developer programme of this national event, and with how it turned out. See a write-up of the winners over at the Hack de Overheid site. Future plans along the same track are already being worked on.

There are two upcoming events at which I will be speaking that bear mentioning here.

There will be an evening in Pakhuis de Zwijger to celebrate the Nederland van Boven television series that the VPRO produced in the Netherlands ((Borrowing conceptually from Britain from Above among others.)). I will be joining the esteemed panel there as a board member of Hack de Overheid to talk about issues of democracy, participation and truth in cartography.

The week after that there’s “Social Cities of Tomorrow”. In a brief timeslot I will be speaking about Apps for Amsterdam: how you can create a data commons for your government or organization and where to take it from there.


Parliamentary Interruptions

Posted: January 1st, 2012 | Filed under: Parliamentary Interruptions

Last week Sargasso procured a dataset of interruptions by politicians in our House of Representatives. Using the counts of which politician had interrupted whom in debates, they made some nice infographics and a couple of blog posts. I thought this was the ideal opportunity to put all of the data (aggregated by party) into the D3 example chord diagram.
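A chord diagram is driven by a square matrix in which cell (i, j) holds the flow from group i to group j. The aggregation step could look roughly like the Python sketch below. The function name chord_matrix, the party labels and the sample records are hypothetical and are not the Sargasso data.

```python
from collections import Counter

def chord_matrix(interruptions, parties):
    """Build a square matrix where matrix[i][j] counts how often
    parties[i] interrupted parties[j]."""
    counts = Counter(interruptions)
    return [[counts[(src, dst)] for dst in parties] for src in parties]

# Hypothetical records: (interrupting party, interrupted party).
sample = [("A", "B"), ("A", "B"), ("B", "A"), ("C", "A"), ("A", "C")]
parties = ["A", "B", "C"]
matrix = chord_matrix(sample, parties)
```

Each row then describes what one party sends out, and each column what it receives.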

Never having used D3 before, this was an ideal excuse to learn it and a near ideal dataset to employ. The result is as follows (click through for the interactive version):

Interrupties van en naar kamerleden van elke partij

This was featured on Sargasso the next day.

The graphic is not immediately clear, but the data is deep and interesting enough to afford some exploration, and it yields insight into the behaviours of various political parties during the reign of this cabinet. And what seems to matter a lot to people: it looks quite pretty.

With regard to D3, I think I will use it more often. It works quite similarly to Protovis, with which we have done some stuff before, but it feels much more current. Protovis itself is discontinued in favor of D3 according to a notice on the site, and D3 seems a very worthy successor.


How Erlang and the Dutch railways power a real-time data visualization

Posted: November 8th, 2011 | Filed under: TIMEMAPS

The backend of the TIMEMAPS project is based on the Zotonic web framework and Erlang. This article highlights the technical challenges and concessions that were considered while building the visualization.

The NS API and its limitations

The NS, the Dutch railway operator, provides an API which allows developers to build upon it. While it is a nice effort and opens up a lot of possible applications, we found that for the TIMEMAPS project it was not an ideal API to work with.

But our requirements were pretty ambitious to begin with and, from a practical point of view, not what an API designer would call “typical”. In TIMEMAPS, given any point T in time, we need to know, for every train station, how long it takes to travel at moment T to any other train station in the Netherlands. Even for a small country like the Netherlands, this becomes a pretty big matrix of travel possibilities, given that there are 379 train stations in the country.

Ideally, an API call would have to be made for every element in this matrix to get the actual planning.

Given that the NS API only allows an app to make up to 50,000 requests per day, and that we did not want to hammer the already stressed API servers too much, we needed to come up with a solution that did not sacrifice the real-time aspect too much.
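To make the mismatch concrete, here is a back-of-the-envelope calculation in Python using only the two figures mentioned above (379 stations, 50,000 requests per day). It assumes one request per ordered station pair, which is the naive worst case and not necessarily how the API had to be used.

```python
import math

STATIONS = 379        # train stations in the Netherlands, as stated above
DAILY_QUOTA = 50_000  # maximum NS API requests per day, as stated above

# Every station to every other station, in both directions.
ordered_pairs = STATIONS * (STATIONS - 1)

# Days of quota needed to fill the matrix once, for a single moment T.
days_per_snapshot = ordered_pairs / DAILY_QUOTA

print(ordered_pairs)                 # 143262
print(math.ceil(days_per_snapshot))  # 3
```

Filling the matrix once would take nearly three days of quota, so a naive "one call per cell" approach cannot be real time.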

An open source travel planner..?

Another API call that the NS offers is the “Actuele vertrektijden”: given a station, it returns the first 10 trains that depart from it. It also returns the train numbers: a “unique” number which is assigned to a train on a single trajectory for the day (though it might be re-used over time). By linking the departure times from different stations through this train number, it should be possible to see when a train that departs from A passes through B, if both are on the same trajectory.
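The linking idea can be sketched in a few lines of Python. The record layout (station, train number, departure time), the function name and the sample values are assumptions for illustration and do not reflect the actual API response format.

```python
from datetime import datetime, timedelta

def link_by_train_number(departures, origin, destination):
    """Estimate origin -> destination travel times by matching train numbers
    that show up in the departure lists of both stations."""
    by_train = {}
    for station, train_no, when in departures:
        # Simplification: assumes a train number is not re-used within the data.
        by_train.setdefault(train_no, {})[station] = when
    results = []
    for train_no, stops in by_train.items():
        if origin in stops and destination in stops:
            elapsed = stops[destination] - stops[origin]
            if elapsed > timedelta(0):  # the train reaches origin first
                results.append((train_no, elapsed))
    return results

# Hypothetical scraped departures.
departures = [
    ("A", 1234, datetime(2011, 10, 30, 13, 0)),
    ("B", 1234, datetime(2011, 10, 30, 13, 30)),
    ("B", 5678, datetime(2011, 10, 30, 13, 40)),
]
links = link_by_train_number(departures, "A", "B")
```

Because only departure times are known, the difference measures departure to departure and therefore includes the stopover at the destination.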

However, some drawbacks popped up while implementing this approach.

  • For long trajectories (>1h) this approach did not work, because the arrival station did not yet list the departure of the train you departed on: it was too far in the future.
  • There was no API call for arrival times of trains at stations: this made it impossible to take the stopover time into account, and it was not possible to use this planning mechanism for destinations at the very end of the trajectory (e.g., no departure listed for the arriving train).
  • Doing a “naive” planning this way takes a considerable amount of database processing power as each stopover adds 2 self-joins to the database query, thus increasing exponentially in complexity.

Scraping the departure times gives an increasingly complete graph of the railway system, and this graph, combined with the geographical location of stations, might be used in a search algorithm to make an offline planner. For me, however, this approach was too much of a long shot for the already pretty complex project, so I decided to put it in the fridge for now.

However, this effort has brought me in contact with the OpenOV guys, who are dedicated to liberating all public transportation data in the Netherlands. In the future, I hope I can contribute something to their wonderful initiative.

Making concessions

Luckily, the TIMEMAPS project had one “business rule” with respect to its visualization: only stations that are near the border of the map are allowed to modify the map. That made the list of stations considerably smaller: after selection there were 60 stations left.

However, this limited the practical application of the map in that some of the displayed travel times are not accurate. For the remaining, smaller / non-border stations we chose to interpolate the travel times between the “main” stations: an inaccuracy, given the fact that it often takes longer to travel from a minor station (e.g. Eindhoven Beukenlaan) to any other city. But for the sake of the clarity of the visualization, we agreed on this concession.

Data model & worker processes

There are two worker processes running in the background.

One process constantly (approximately 1 request per 1.5 seconds) queries the NS API for any A → B trip that has no planning in the future. This process favors distance: it first tries to find plannings for the longest A → B trajectories, since the NS API also returns all timing information for intermediate stops, which yields more than one planning per API request. This planning information is stored in the database and kept for at least a week.

Table "public.static_planning"

 Column             |            Type             | Modifiers
--------------------+-----------------------------+------------------------
 id                 | integer                     | not null
 station_from       | character varying(32)       | not null
 station_to         | character varying(32)       | not null
 time               | timestamp without time zone | not null
 duration           | integer                     | not null
 ns                 | boolean                     | not null default true
 fetchtime          | timestamp without time zone |
 spoor              | character varying(32)       |
 aankomstvertraging | integer                     |
 vertrekvertraging  | integer                     |
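The "longest trip first" strategy of the first worker can be illustrated with a small Python sketch. The function names, the distance table and the shape of a trip (an ordered list of stops with seconds since departure) are hypothetical, and the real module is written in Erlang.

```python
from itertools import combinations

def next_pair_to_fetch(missing_pairs, distance_km):
    """Pick the pair without a future planning whose stations lie furthest apart."""
    return max(missing_pairs, key=lambda pair: distance_km[pair])

def plannings_from_trip(stops):
    """Derive a duration for every ordered pair of stops on one planned trip.

    stops: ordered list of (station, seconds_since_departure).
    Simplification: ignores stopover time at intermediate stations.
    """
    return {(a, b): tb - ta for (a, ta), (b, tb) in combinations(stops, 2)}

# Hypothetical data for illustration.
pair = next_pair_to_fetch([("A", "B"), ("A", "C")],
                          {("A", "B"): 10, ("A", "C"): 50})
derived = plannings_from_trip([("A", 0), ("B", 600), ("C", 1500)])
```

One request for the long A → C trip also yields A → B and B → C, which is why favoring distance saves requests.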

Another process constantly queries the Actuele Vertrektijden API for every station (not only border stations). This information is used for the “fallback” scenario of step 3), in which no real planning is found for the station combination and we fall back on a fixed travel time, but do include the scraped departure time.

Table "public.vertrektijd"
     Column     |            Type             | Modifiers
----------------+-----------------------------+-----------
 station        | character varying(32)       | not null
 time           | timestamp without time zone | not null
 vertraging     | integer                     | not null
 ritnummer      | integer                     | not null
 eindbestemming | character varying(32)       |
 fetchtime      | timestamp without time zone |

Building the travel time matrix

The current map exposes the N^2 matrix for the current time through an API at the URL /api/reisplanner/actueel. It is a long JSON list where each entry looks like this:

["std",
 "amf",
 "2011-10-30 13:13:00",
 7440,
 "2b"]

This particular entry shows that the next train from Sittard (std) to Amersfoort (amf) leaves at 13:13h from track 2B and takes 7440 seconds (2 hours and 4 minutes). For every combination of one “border station” with another there is an entry in this list.
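As an illustration of how a client might consume such an entry, here is a minimal Python sketch. The field order follows the description above; the function name parse_entry and the dictionary keys are my own choices, not part of the API.

```python
import json
from datetime import datetime, timedelta

def parse_entry(entry):
    """Turn one [from, to, departure, seconds, track] entry into typed values."""
    origin, destination, departure, seconds, track = entry
    return {
        "from": origin,
        "to": destination,
        "departure": datetime.strptime(departure, "%Y-%m-%d %H:%M:%S"),
        "duration": timedelta(seconds=seconds),
        "track": track,
    }

raw = '["std", "amf", "2011-10-30 13:13:00", 7440, "2b"]'
trip = parse_entry(json.loads(raw))
arrival = trip["departure"] + trip["duration"]
print(arrival)  # 2011-10-30 15:17:00
```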

A second URL, /api/reisplanner/history?date=2011-10-29T22:00:00Z, gives this list for a certain date in the past.

Given the fact that we were unable to query every planning in real time, these results are built up in three steps:

  1. Given each station A, B, check if there has been a planning retrieved for A → B for which the start time is in the future. Return the planning that is closest to the current time.
  2. Failing condition 1), check if there has been a planning retrieved for A → B last week. Return the planning that is closest to the current time minus 7 days. We assume that for every day of the week, the planning is the same. Note that this does not hold for holidays / festive days.
  3. Failing condition 1) and 2), return the planning tuple in which we assume a constant, pre-fetched travel time (a static matrix for times between A and B without time information). We assume that the first train leaving from A is the right train for getting to B.

A combiner algorithm retrieves for every station-to-station combination the results from step 1, otherwise those from step 2 and as final fallback step 3 (which always has a result, although it might not be accurate).
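The three-step fallback can be sketched as follows for a single A → B pair. This is a Python illustration under assumptions, not the Erlang implementation: the function name, the tuple layout and especially the one-hour tolerance used to decide what counts as "the same time last week" are hypothetical.

```python
from datetime import datetime, timedelta

def pick_planning(plannings, now, static_duration, next_departure,
                  tolerance=timedelta(hours=1)):
    """plannings: list of (departure_time, duration_seconds) for one A -> B pair."""
    # Step 1: a retrieved planning that still lies in the future, nearest to now.
    future = [p for p in plannings if p[0] > now]
    if future:
        return ("live", min(future, key=lambda p: p[0]))
    # Step 2: the planning closest to the same moment one week ago.
    target = now - timedelta(days=7)
    nearby = [p for p in plannings if abs(p[0] - target) <= tolerance]
    if nearby:
        return ("last_week", min(nearby, key=lambda p: abs(p[0] - target)))
    # Step 3: static travel time combined with the next scraped departure.
    return ("static", (next_departure, static_duration))

now = datetime(2011, 10, 30, 13, 0)
next_dep = datetime(2011, 10, 30, 13, 20)
live = pick_planning([(datetime(2011, 10, 30, 13, 13), 7440)], now, 7200, next_dep)
older = pick_planning([(datetime(2011, 10, 23, 13, 5), 7500)], now, 7200, next_dep)
fallback = pick_planning([], now, 7200, next_dep)
```

Returning the source label alongside the planning makes it possible to show users how trustworthy a given travel time is.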

mod_reisplanner – the module making all this happen

The processes above have all been implemented in Erlang as a module for Zotonic. It will be open-sourced soon, so that it hopefully can serve as a basis and/or inspiration for other applications using Erlang and the NS API.

This article is the second in a series about the TIMEMAPS project. TIMEMAPS’ concept and design are by Vincent Meertens, the implementation is by Arjan Scherpenisse.


Introducing Arjan Scherpenisse

Posted: November 7th, 2011 | Filed under: TIMEMAPS

It is my pleasure to introduce here on Monster Swell a new collaboration and a spectacular piece of work. Arjan Scherpenisse of Miracle Things will be collaborating with us in the field of data visualization.

Arjan is that rare breed of artist né programmer, formally trained in both but picking neither side. He is active on the most innovative edge of software as well as building physical interaction projects and schooling others in programming, be it in Erlang or some other language.

The TIMEMAPS project written up just before this post is the first of, we hope, many forays into data visualization for Arjan, and we look forward to collaborating on many such projects in the future.


TIMEMAPS: a different perspective

Posted: November 7th, 2011 | Filed under: TIMEMAPS

TIMEMAPS visualizes how the map of the Netherlands would look if it were scaled proportionally to the travel times (by train) between cities. I was asked by the designer of the concept, Vincent Meertens of graphsic, to transform his manually crafted PDF files into a real-time, interactive visualization. TIMEMAPS has been exhibited at the Graduation Show event during the Dutch Design Week 2011.

The map is a real-time interactive map. Clicking a city sets the perspective to that city. Hovering over the map shows a pop-up which highlights the time it takes to travel to the city the mouse is currently over. Every coloured “ring” on the map denotes 30 minutes of travel time, at the current time.

Drawing the map with Canvas

The visualization is done using the HTML5 canvas. Why canvas, and not just SVG, one would ask? Good question: I wanted to learn more about the canvas and thus was a bit biased. I think the project could have been done with SVG as well.

The map consists of a set of polygons: the outline of the Netherlands and its various islands. All the cities, with all 379 train stations, are located on those shapes. Furthermore, there are several bridges between the islands, like the big “Afsluitdijk”, which each connect 2 vertices of the polygons.

The initial, un-transformed shape of the country and the station positions is the same as that on the famous yellow overview map that the NS uses in the stations: it is a schematic view of the Netherlands, constrained in a grid of 0, 45 and 90-degree lines.

The drawing algorithm first draws all the polygons and bridges, and subsequently fills those areas with a pattern of colored concentric circles. This is done in canvas by blitting a pre-rendered image of the circles over the previous shape using the globalCompositeOperation property. The distances between the circles are scaled to represent 30 minutes of travel time. Then, the cities are drawn as big/small dots (main stations are bigger) and connected to the current city by a thin white line.

The information hovers (a plain HTML div) are done by using the “mousemove” event on the canvas and calculating which city is the closest to the current mouse location. Clicking a city causes the current perspective to shift to the clicked city in an animated fashion, using a simple (cosine) transition.
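The hover lookup and the cosine transition are both small enough to sketch. The Python below is an illustration only (the real code runs in the browser): the function names, the sample coordinates and the exact form of the cosine easing curve are assumptions.

```python
import math

def nearest_city(cities, x, y):
    """Return the name of the city closest to the mouse position (x, y)."""
    return min(cities, key=lambda name: math.hypot(cities[name][0] - x,
                                                   cities[name][1] - y))

def cosine_ease(t):
    """Map linear progress t in [0, 1] to eased progress: slow start, slow end."""
    return (1 - math.cos(math.pi * t)) / 2

def interpolate(start, end, t):
    """Position between start and end at animation progress t."""
    e = cosine_ease(t)
    return (start[0] + (end[0] - start[0]) * e,
            start[1] + (end[1] - start[1]) * e)

# Hypothetical canvas coordinates.
cities = {"Utrecht": (100, 100), "Rotterdam": (40, 120)}
hovered = nearest_city(cities, 45, 118)
halfway = interpolate((0, 0), (10, 20), 0.5)
```

A linear scan over a few hundred stations is cheap enough to run on every mouse move, so no spatial index is needed at this scale.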

Map deformation

The angles at which cities view each other are kept constant. So, for example, viewed from Rotterdam, Utrecht Centraal is always at a 45-degree angle, regardless of the time it takes to travel from Rotterdam to Utrecht. The actual city location is scaled proportionally along these angles: if it takes less time to travel, the city is pulled closer; if it takes more time it is pushed further away. But the angle remains constant.
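In other words, each city keeps its bearing from the selected city while its distance becomes a function of travel time. A minimal Python sketch of that idea, where the function name, the pixels_per_minute scale and the sample coordinates are assumptions for illustration:

```python
import math

def deform_city(origin, city, travel_minutes, pixels_per_minute):
    """Place a city at its original angle from origin, at a distance
    proportional to the travel time."""
    angle = math.atan2(city[1] - origin[1], city[0] - origin[0])
    radius = travel_minutes * pixels_per_minute
    return (origin[0] + radius * math.cos(angle),
            origin[1] + radius * math.sin(angle))

# A city at 45 degrees from the origin, 60 minutes away by train.
moved = deform_city((0, 0), (10, 10), travel_minutes=60, pixels_per_minute=1.0)
```

With this scale a 30-minute ring has a radius of 30 * pixels_per_minute, so the rings and the city positions stay consistent with each other.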

The polygons (that make the actual shape of the map) are “magnetic” and each vertex “sticks” to the cities it is initially closest to, in a weighted fashion. This algorithm is loosely based on the article “Feature point based mesh animation applied to MPEG-4 facial animation”.

For the islands, this mesh-stretching was mixed 60%/40% with a simple vertex displacement to prevent the islands from becoming unrecognizable: since there are no stations on islands, they are prone to more deformation since the feature points (cities) lie further away.
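One way to picture the "magnetic" vertices and the blend for the islands is the Python sketch below. It is not the algorithm from the cited article: the inverse-distance weighting, the choice of the k nearest cities and the helper names are assumptions, and the text does not say which of the two methods gets the 60% share, so the weight is left as a parameter.

```python
import math

def weighted_displacement(vertex, cities_before, cities_after, k=3):
    """Move a vertex by the inverse-distance-weighted average displacement
    of its k nearest cities (nearness measured on the undeformed map)."""
    nearest = sorted(cities_before,
                     key=lambda c: math.dist(vertex, cities_before[c]))[:k]
    total = dx = dy = 0.0
    for name in nearest:
        bx, by = cities_before[name]
        ax, ay = cities_after[name]
        w = 1.0 / (math.dist(vertex, (bx, by)) + 1e-9)  # closer cities pull harder
        total += w
        dx += w * (ax - bx)
        dy += w * (ay - by)
    return (vertex[0] + dx / total, vertex[1] + dy / total)

def blend(p, q, weight_p=0.6):
    """Mix two candidate positions for the same vertex."""
    return (weight_p * p[0] + (1 - weight_p) * q[0],
            weight_p * p[1] + (1 - weight_p) * q[1])

# Hypothetical data: both cities shift 5 units to the right.
before = {"a": (0, 0), "b": (10, 0)}
after = {"a": (5, 0), "b": (15, 0)}
stretched = weighted_displacement((4, 3), before, after)
mixed = blend((0, 0), (10, 10))
```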

Problems in the visualization

The shape of the map is sometimes deformed beyond recognition. In certain cases, cities which are normally close are pushed out beyond cities that are normally far away, which turns the polygon “inside out” and makes cities appear to be located in the sea.

Another issue is the 45-degree grid constraint. The mesh-stretching algorithm does not take it into account because the constraint is applied in a later calculation stage, which sometimes causes cities to be located in the sea as well. A temporary solution for this was to add more vertices to the polygons so the map had more flexibility while stretching.
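A cheap way to notice when a polygon has turned inside out is to compare the sign of its area before and after deformation, using the shoelace formula. This check is a suggestion and was not part of the implementation described here; it only detects a flip of the whole polygon, not local self-intersections. The function names are hypothetical.

```python
def signed_area(polygon):
    """Shoelace formula: positive for counter-clockwise vertex order
    (with the y axis pointing up), negative for clockwise."""
    area = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        area += x1 * y2 - x2 * y1
    return area / 2

def has_flipped(before, after):
    """True when the deformation reversed the polygon's orientation."""
    return signed_area(before) * signed_area(after) < 0

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
mirrored = [(0, 0), (-1, 0), (-1, 1), (0, 1)]  # same vertices mirrored in x
```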

Application to other maps

The Netherlands is a pretty ideal country in the way the transportation system is organized: viewed from the center (“de Randstad”, or Utrecht or Amersfoort), travel times do indeed increase almost linearly with geographical distance. I do not think this holds for every country: especially with the advance of faster railways (the fast Fyra train was not taken into account in our implementation!), the map might deform in ways that are beyond recognition and beyond representation in the 2D domain. However, it might be an interesting experiment to apply the same techniques to a different country.

This article is the first in a series about the TIMEMAPS project. TIMEMAPS’ concept and design are by Vincent Meertens, the implementation is by Arjan Scherpenisse.


Foursquare Map for Leidse Square ‘Entertainment Area’ in Effect

Posted: September 5th, 2011 | Filed under: Foursquare Map

I finally got around to going to the AUB Ticketshop at Leidse Square during the daytime to view the Foursquare Display we set up in action (previous blog post).

A video of the screen:

The screen in context:
Screen in context

It is a welcome refresher from the static posters and the static videos that usually litter these high profile locations. The foursquare coloured view of the area is always fresh and shows a view on the local flavour and the people that visit the venues around.

From an urban development point of view it may be odd to draw more attention to the already highly crowded Leidse Square area. But it stands to reason that new developments such as these will be tested on high-density locations first. We would be very interested in creating augmentations in public space to make locations in Amsterdam’s periphery more appealing.


Hackathons as gateways to more and better open data

Posted: August 22nd, 2011 | Filed under: Inspiration

There is a piece up on O’Reilly Radar by Andy Oram about the sustainability of applications built during hackathons. I am involved in Hack de Overheid and we have organized (Apps for Amsterdam) and still are organizing (Apps for Noord Holland) several hackathons and I thought it would be good to add our experiences to the fray.

First: I do not agree with the premise that most apps created in government challenges are quickly abandoned. I have not done a tally of our Apps for Amsterdam contest, but the completeness and polish of most apps submitted was impressive. I still use several of the apps from that contest regularly. Snelstepontje.nl, for finding out which ferry to take, is a godsend, just to name one.

Maintenance is indeed an issue. It is my personal experience that if the app is deployed to a suitably robust platform (Google App Engine is a notable one), it may continue to run unsupervised for many years.

But yes, I do have my own doubts when it comes to the sustainability of apps from app contests as I have stated in my review of Apps for Amsterdam.

Data quality is the largest issue on all levels and it needs to be addressed, from gathering data, to publishing it, to responding adequately to issues. Most datasets that are released for contests are not of the highest quality due to time constraints. And after the contest is over they are seldom kept up to date by the publishing office. When it comes to sustainability, government should first turn to itself and start releasing its data in a way that is sustainable.

Besides releasing the data in a proper format, a very important consideration is the licensing. Re-using data should happen under conditions as liberal as possible (CC0 preferred) so as not to deter companies from investing in using that data.

But even then creating apps that are successful and sustainable at scale may be too lofty a goal. Productizing apps in a professional way implies conceiving, building and expanding a startup company. If one or more such initiatives come out of a hackathon, that may be called a resounding success. But what of the rest?

Well, communities of practice are built on exactly that: practice. Data does not overnight become readily at hand and usable. It takes a lot of hard work from all of us.

Having organized several hackdays we are seeing an increase in number of people attending and their proficiencies as well as a wider awareness of the possibilities of data in journalism, government and politics. Those are exactly the things we need if we want to make open data (and not just applications) the foundational fabric of our information society.


NoGIS on a fortress: Apps for Noord Holland

Posted: August 18th, 2011 | Filed under: Events

I have written here before about the need for web developers to learn more about GIS technologies and how to either work with or work around the traditional geographical software packages and data formats. There is a lot of synergy to be achieved in working together.

In the summer lull over at Hack de Overheid we are organizing a day of programming at a fortress which in itself already is a unique event: Apps for Noord Holland. But during the day the people from ESRI will give a workshop about geo data which we think is very worthwhile for any programmer who wants to get started in this field.

So if you want to spend a day on a fortress learning about GIS and programming, go right ahead and register. It promises to be a terrific day.

Harbour