Posted: January 1st, 2012 | Author: alper | Filed under: Parliamentary Interruptions | Tags: chord, d3, politics, Protovis, sargasso, the Netherlands, visualization | No Comments »
Last week Sargasso procured a dataset of interruptions by politicians in our House of Representatives. Using the counts of which politician had interrupted which other politician in debates, they made some nice infographics and a couple of blog posts. I thought this was the ideal opportunity to put all of the data (aggregated by party) into the D3 example chord diagram.
As I had never used D3 before, this was an ideal excuse to learn it and a near-ideal dataset to employ. The result is as follows (click through for the interactive version):
This was featured on Sargasso the next day.
The graphic is not immediately clear, but the data is deep and interesting enough to afford some exploration, and it yields insight into the behaviour of the various political parties during the reign of this cabinet. And what seems to matter a lot to people: it looks quite pretty.
With regard to D3, I think I will use it more often. It works quite similarly to Protovis, with which we have done some work before, but it feels much more current. According to a notice on its site, Protovis itself has been discontinued in favor of D3, and D3 seems a very worthy successor.
Posted: November 8th, 2011 | Author: arjan | Filed under: TIMEMAPS | Tags: API, data model, Erlang, JSON, NS, OpenOV, train, travel time matrix, Zotonic | No Comments »
The backend of the TIMEMAPS project is based on the Zotonic web framework and Erlang. This article highlights the technical challenges and concessions that were considered while building the visualization.
The NS API and its limitations
The NS, the Dutch national railway operator, provides an API that developers can build upon. While it is a nice effort and opens up a lot of possible applications, we found that for the TIMEMAPS project it was not an ideal API to work with.
But our requirements were pretty ambitious to begin with and, from a practical point of view, not what an API designer would call “typical”. In TIMEMAPS, given any point T in time, we need to know, for every train station, how long it takes to travel at moment T to any other train station in the Netherlands. Even for a small country like the Netherlands, this becomes a pretty big matrix of travel possibilities, given that there are 379 train stations in the country.
Ideally, an API call would be made for every element in this matrix to get the actual planning.
Given that the NS API only allows an app to make up to 50,000 requests per day, and that we did not want to hammer the already stressed API servers too much, we needed to come up with a solution that did not sacrifice the real-time aspect too much.
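To make the scale concrete, here is a quick back-of-the-envelope calculation. It assumes one request per ordered station pair and ignores the fact that a single response may cover more than one pair, so it is an upper bound rather than a measurement.

```python
import math

STATIONS = 379        # number of train stations, from the text
DAILY_LIMIT = 50_000  # NS API request limit per day, from the text

# One planning per ordered pair (A -> B, with A != B).
pairs = STATIONS * (STATIONS - 1)

# Days needed to fill the matrix once when staying within the daily limit.
days_for_one_snapshot = math.ceil(pairs / DAILY_LIMIT)

print(pairs)                  # 143262
print(days_for_one_snapshot)  # 3
```

In other words, a single full snapshot of the matrix would already take about three days of quota, which rules out anything close to real time.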
An open source travel planner..?
Another API call that the NS offers is “Actuele vertrektijden” (current departure times): given a station, it returns the first 10 trains that depart from it. It also returns the train numbers: a “unique” number that is assigned to a train on a single trajectory for the day (although it might be re-used over time). By linking the departure times from different stations through this train number, it should be possible to see when a train that departs from A passes through B, if both are on the same trajectory.
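The linking idea can be sketched roughly as follows. This is an illustration in Python, not the Erlang code used in the project; the sample departures, the train numbers and the function name `travel_time` are all made up for the example.

```python
from datetime import datetime

# Hypothetical scraped departures: (station, train_number, departure_time).
departures = [
    ("ut", 3045, datetime(2011, 10, 30, 13, 0)),
    ("amf", 3045, datetime(2011, 10, 30, 13, 15)),
    ("amf", 5621, datetime(2011, 10, 30, 13, 20)),
]


def travel_time(departures, station_from, station_to):
    """Return the shortest departure-to-departure time on a single train, or None."""
    # Group the departure times by train number.
    by_train = {}
    for station, train, time in departures:
        by_train.setdefault(train, {})[station] = time

    best = None
    for stops in by_train.values():
        if station_from in stops and station_to in stops:
            delta = stops[station_to] - stops[station_from]
            # Only count trains that reach station_to after station_from.
            if delta.total_seconds() > 0 and (best is None or delta < best):
                best = delta
    return best


print(travel_time(departures, "ut", "amf"))  # 0:15:00
```

Note that this measures the time between two departures, so the result includes the stopover time at the destination.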
However, some drawbacks popped up while implementing this approach.
- For long trajectories (>1h) this approach did not work: the arrival station did not yet list the departure of the train you departed on, since it was too far in the future.
- There was no API call for arrival times of trains at stations. This made it impossible to take the stopover time into account, and it was not possible to use this planning mechanism for destinations at the very end of the trajectory (i.e., no departure is listed for the arriving train).
- Doing a “naive” planning this way takes a considerable amount of database processing power, as each stopover adds 2 self-joins to the database query, so the query complexity grows exponentially with the number of stopovers.
Scraping the departure times gives an increasingly complete graph of the railway system, and this graph, combined with the geographical locations of the stations, might be used in a search algorithm to make an offline planner. For me, however, this approach was too much of a long shot for the already pretty complex project, so I decided to put it in the fridge for now.
However, this effort has brought me into contact with the OpenOV guys, who are dedicated to liberating all public transportation data in the Netherlands. In the future, I hope I can contribute something to their wonderful initiative.
Making concessions
Luckily, the TIMEMAPS project had one “business rule” with respect to its visualization: only stations that are near the border of the map are allowed to modify the map. That made the list of stations considerably smaller: after selection there were 60 stations left.
However, this limited the practical application of the map in that some of the displayed travel times are not accurate: for the remaining, smaller, non-border stations we chose to interpolate the travel times between the “main” stations. This is an inaccuracy, given that it often takes longer to travel from a minor station (e.g. Eindhoven Beukenlaan) to any other city. But for the sake of the clarity of the visualization, we agreed on this concession.
Data model & worker processes
There are two worker processes running in the background.
One process constantly (approximately 1 request per 1.5 seconds) queries the NS API for any A → B trip that has no planning in the future. This process favors distance: it first tries to find plannings for the longest A → B trajectories, since the NS API also returns timing information for every intermediate stop, which makes it possible to get more than one planning per API request. This planning information is stored in the database and kept for at least a week.
          Table "public.static_planning"
       Column       |            Type             |       Modifiers
--------------------+-----------------------------+------------------------
 id                 | integer                     | not null
 station_from       | character varying(32)       | not null
 station_to         | character varying(32)       | not null
 time               | timestamp without time zone | not null
 duration           | integer                     | not null
 ns                 | boolean                     | not null default true
 fetchtime          | timestamp without time zone |
 spoor              | character varying(32)       |
 aankomstvertraging | integer                     |
 vertrekvertraging  | integer                     |
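The "favor distance" ordering of the first worker can be sketched like this. It is only an illustration: it assumes straight-line distance between station positions as the measure of trajectory length, and the function name and data layout are hypothetical rather than taken from the actual module.

```python
import math


def next_pair_to_fetch(positions, has_future_planning):
    """Return the uncovered (a, b) pair whose stations lie furthest apart.

    positions: dict mapping station code -> (x, y) position (assumed layout)
    has_future_planning: set of (a, b) pairs that already have a future planning
    """
    candidates = [
        (a, b)
        for a in positions
        for b in positions
        if a != b and (a, b) not in has_future_planning
    ]
    if not candidates:
        return None
    # Longest trips first: one response also covers the intermediate stops.
    return max(
        candidates,
        key=lambda pair: math.dist(positions[pair[0]], positions[pair[1]]),
    )


positions = {"a": (0, 0), "b": (1, 0), "c": (5, 0)}
print(next_pair_to_fetch(positions, set()))  # ('a', 'c')
```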
Another process constantly queries the Actuele Vertrektijden API for every station (not only border stations). This information is used for the “fallback” scenario (step 3 of the matrix-building procedure described below), in which no real planning is found for the station combination and we fall back on a fixed travel time, but do include the scraped departure time.
           Table "public.vertrektijd"
     Column     |            Type             | Modifiers
----------------+-----------------------------+-----------
 station        | character varying(32)       | not null
 time           | timestamp without time zone | not null
 vertraging     | integer                     | not null
 ritnummer      | integer                     | not null
 eindbestemming | character varying(32)       |
 fetchtime      | timestamp without time zone |
Building the travel time matrix
The current map exposes the N^2 travel time matrix for the current time through an API at the URL /api/reisplanner/actueel. The response is a long JSON list in which each entry looks like this:
["std",
"amf",
"2011-10-30 13:13:00",
7440,
"2b"]
This particular entry shows that the next train from Sittard (std) to Amersfoort (amf) leaves at 13:13 from track 2B and takes 7440 seconds (2 hours and 4 minutes). There is an entry in this list for every ordered pair of “border stations”.
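As a small illustration of how a client might read such an entry, here is a sketch in Python. The variable names are my own labels for the five positions, inferred from the description above; the API itself does not name them.

```python
import json
from datetime import datetime, timedelta

raw = '["std", "amf", "2011-10-30 13:13:00", 7440, "2b"]'

# Positions: from-station, to-station, departure time, duration (s), track.
station_from, station_to, departure, duration, track = json.loads(raw)

departs_at = datetime.strptime(departure, "%Y-%m-%d %H:%M:%S")
arrives_at = departs_at + timedelta(seconds=duration)

hours, remainder = divmod(duration, 3600)
minutes = remainder // 60

print(f"{station_from} -> {station_to}: {hours}h {minutes}m, arriving {arrives_at:%H:%M}")
# std -> amf: 2h 4m, arriving 15:17
```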
A second URL, /api/reisplanner/history?date=2011-10-29T22:00:00Z, gives this list for a certain date in the past.
Given that we were unable to query every planning in real time, these results are built up in three steps:
1. For each station pair A, B, check whether a planning has been retrieved for A → B whose start time is in the future. Return the planning that is closest to the current time.
2. Failing step 1, check whether a planning was retrieved for A → B last week. Return the planning that is closest to the current time minus 7 days. We assume that the planning for a given day of the week is the same from one week to the next. Note that this does not hold for holidays and festive days.
3. Failing steps 1 and 2, return a planning tuple in which we assume a constant, pre-fetched travel time (a static matrix of travel times between A and B without time information). We assume that the first train leaving from A is the right train for getting to B.
A combiner algorithm retrieves for every station-to-station combination the results from step 1, otherwise those from step 2 and as final fallback step 3 (which always has a result, although it might not be accurate).
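The fallback order can be sketched as follows for a single A → B pair. This is a simplified illustration in Python, not the Erlang implementation: the function name, the tuple layout and the source labels are hypothetical, and shifting last week's departure forward by 7 days is my own assumption about how such a result would be presented.

```python
from datetime import datetime, timedelta


def pick_planning(now, fetched, static_duration, first_departure):
    """Pick a planning for one A -> B pair using the three-step fallback.

    fetched: list of (departure_time, duration_seconds) retrieved from the API
    static_duration: constant, pre-fetched travel time in seconds
    first_departure: next scraped departure time from A
    """
    # Step 1: a fetched planning that departs in the future, closest to now.
    future = [p for p in fetched if p[0] >= now]
    if future:
        departure, duration = min(future, key=lambda p: p[0])
        return ("live", departure, duration)

    # Step 2: a planning from the past week, closest to now minus 7 days.
    week_ago = now - timedelta(days=7)
    last_week = [p for p in fetched if week_ago <= p[0] < now]
    if last_week:
        departure, duration = min(last_week, key=lambda p: p[0])
        # Assumption: shift the departure so that it lies in the current week.
        return ("last_week", departure + timedelta(days=7), duration)

    # Step 3: constant travel time plus the scraped departure time.
    return ("static", first_departure, static_duration)
```

Step 3 always returns something, which matches the remark that the final fallback always has a result, although it might not be accurate.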
mod_reisplanner – the module making all this happen
The processes above have all been implemented in Erlang as a module for Zotonic. It will be open-sourced soon, so that it hopefully can serve as a basis and/or inspiration for other applications using Erlang and the NS API.
This article is the second in a series about the TIMEMAPS project. TIMEMAPS’ concept and design are by Vincent Meertens, the implementation is by Arjan Scherpenisse.
Posted: November 7th, 2011 | Author: alper | Filed under: TIMEMAPS | Tags: Artificial Intelligence, Erlang, Miracle Things, Rietveldacademie | No Comments »
It is my pleasure to introduce here on Monster Swell a new collaboration and a spectacular piece of work. Arjan Scherpenisse of Miracle Things will be collaborating with us in the field of data visualization.
Arjan is that rare breed of artist-cum-programmer, formally trained in both but picking neither side. He is active on the most innovative edge of software, builds physical interaction projects and schools others in programming, be it in Erlang or some other language.
The TIMEMAPS project, written up just before this post, is the first of what we hope will be many forays into data visualization for Arjan, and we look forward to collaborating on many such projects in the future.
Posted: November 7th, 2011 | Author: arjan | Filed under: TIMEMAPS | Tags: blitting, Canvas, Dutch Design Week, HTML5, map deformation | No Comments »
TIMEMAPS visualizes how the map of the Netherlands would look if it were scaled in proportion to the travel times (by train) between cities. I was asked by the designer of the concept, Vincent Meertens of graphsic, to transform his manually crafted PDF files into a real-time, interactive visualization. TIMEMAPS has been exhibited at the Graduation Show event during the Dutch Design Week 2011.
The map is real-time and interactive. Clicking a city sets the perspective to that city. Hovering over the map shows a pop-up that highlights the time it takes to travel to the city the mouse is currently over. Every coloured “ring” on the map denotes 30 minutes of travel time, at the current time.
Drawing the map with Canvas
The visualization is done using the HTML5 canvas. Why canvas, and not just SVG, one would ask? Good question: I wanted to learn more about the canvas and thus was a bit biased. I think the project could have been done with SVG as well.
The map consists of a set of polygons: the outline of the Netherlands and its various islands. The cities, that is, all 379 train stations, are located on those shapes. Furthermore, there are several bridges between the islands, like the big “Afsluitdijk”, each of which connects 2 vertices of the polygons.
The initial, untransformed shape of the country and the station positions are the same as on the famous yellow overview map that the NS uses in its stations: a schematic view of the Netherlands, constrained to a grid of 0-, 45- and 90-degree lines.
The drawing algorithm first draws all the polygons and bridges, and subsequently fills those areas with a pattern of colored concentric circles. This is done in canvas by blitting a pre-rendered image of the circles onto the previously drawn shape, using the canvas context's globalCompositeOperation property. The distances between the circles are scaled to represent 30 minutes of travel time. Then, the cities are drawn as big or small dots (main stations are bigger) and connected to the current city by a thin white line.
The information hovers (a plain HTML div) are done by using the “mousemove” event on the canvas and calculating which city is the closest to the current mouse location. Clicking a city causes the current perspective to shift to the clicked city in an animated fashion, using a simple (cosine) transition.
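Both ideas, the nearest-city lookup and the cosine transition, are small enough to sketch. The original runs in JavaScript on the canvas; the version below is a Python illustration, and the exact easing formula is an assumption (a common half-cosine ease), since the post only says that a simple cosine transition was used.

```python
import math


def nearest_city(cities, mouse_x, mouse_y):
    """Return the name of the city closest to the mouse position.

    cities: dict mapping city name -> (x, y) canvas position
    """
    return min(
        cities,
        key=lambda name: math.hypot(cities[name][0] - mouse_x,
                                    cities[name][1] - mouse_y),
    )


def cosine_ease(t):
    """Map linear progress t in [0, 1] to eased progress in [0, 1]."""
    return (1 - math.cos(math.pi * t)) / 2


def interpolate(start, end, t):
    """Position between start and end at animation progress t."""
    eased = cosine_ease(t)
    return (start[0] + (end[0] - start[0]) * eased,
            start[1] + (end[1] - start[1]) * eased)
```

The easing starts and ends slowly and moves fastest halfway, which gives the shift between perspectives a smoother feel than a linear interpolation.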
Map deformation
The angles at which cities view each other are kept constant. So, for example, viewed from Rotterdam, Utrecht centraal is always at a 45-degree angle, regardless of the time it takes to travel from Rotterdam to Utrecht. The actual city location is scaled proportionally along these angles: if it takes less time to travel, the city is pulled closer; if it takes more time it is pushed further away. But the angle remains constant.
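A minimal sketch of this rule, in Python for illustration: the bearing from the current city stays fixed and only the radius changes. The `pixels_per_second` scale factor and the function name are hypothetical; the post does not state the actual scale.

```python
import math


def deform(origin, city, travel_seconds, pixels_per_second):
    """Place a city along its original bearing at a distance set by travel time.

    origin: (x, y) of the city the map is viewed from
    city: (x, y) of the other city on the untransformed map
    """
    # The angle from origin to city is kept constant.
    angle = math.atan2(city[1] - origin[1], city[0] - origin[0])
    # The distance becomes proportional to the travel time.
    radius = travel_seconds * pixels_per_second
    return (origin[0] + radius * math.cos(angle),
            origin[1] + radius * math.sin(angle))
```

With this rule a city that is quick to reach is pulled towards the origin and a slow one is pushed outwards, both along the same line.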
The polygons (that make the actual shape of the map) are “magnetic” and each vertex “sticks” to the cities it is initially closest to, in a weighted fashion. This algorithm is loosely based on the article “Feature point based mesh animation applied to MPEG-4 facial animation”.
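One simple way to get this kind of “magnetic” behaviour is inverse-distance weighting: every vertex moves by a weighted average of the city displacements, with nearby cities counting most. The sketch below is a stand-in chosen for illustration, not the weighting from the cited article nor necessarily the one used in TIMEMAPS; the function name and the `power` parameter are hypothetical.

```python
import math


def displace_vertex(vertex, cities_before, cities_after, power=2.0):
    """Move a polygon vertex along with the cities it is closest to.

    cities_before / cities_after: dicts mapping city name -> (x, y)
    before and after the deformation.
    """
    total_weight = 0.0
    dx = dy = 0.0
    for name, (cx, cy) in cities_before.items():
        ax, ay = cities_after[name]
        distance = math.hypot(vertex[0] - cx, vertex[1] - cy)
        if distance == 0:
            # The vertex coincides with a city: follow that city exactly.
            return (ax, ay)
        # Closer cities get a larger weight.
        weight = 1.0 / distance ** power
        dx += weight * (ax - cx)
        dy += weight * (ay - cy)
        total_weight += weight
    if total_weight == 0:
        return vertex
    return (vertex[0] + dx / total_weight, vertex[1] + dy / total_weight)
```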
For the islands, this mesh stretching was mixed 60%/40% with a simple vertex displacement to prevent the islands from becoming unrecognizable: there are no stations on the islands, so the feature points (cities) lie further away and the islands are prone to more deformation.
Problems in the visualization
The shape of the map is sometimes deformed beyond recognition because, in certain cases, cities that are normally close are pushed out beyond cities that are normally far away. This causes the polygon to turn “inside out” and makes cities appear to be located in the sea.
Another issue is the 45-degree grid constraint. The mesh-stretching algorithm does not take it into account because the constraint is applied in a later calculation stage, which sometimes causes cities to be located in the sea as well. A temporary solution was to add more vertices to the polygons so that the map had more flexibility while stretching.
Application to other maps
The Netherlands is a pretty ideal country in the way its transportation system is organized: viewed from the center (“de Randstad”, Utrecht or Amersfoort), travel times do indeed increase almost linearly with geographical distance. I do not think this holds for every country: especially with the advent of faster railways (the fast Fyra train was not taken into account in our implementation!), the map might deform in ways that are beyond recognition and beyond representation in the 2D domain. However, it might be an interesting experiment to apply the same techniques to a different country.
This article is the second in a series about the TIMEMAPS project. TIMEMAPS’ concept and design are by Vincent Meertens, the implementation is by Arjan Scherpenisse.
Posted: September 5th, 2011 | Author: alper | Filed under: Foursquare Map | Tags: Amsterdam, display, entertainment, foursquare, Google Maps, Leidseplein, night life, video | No Comments »
Finally got around to going to the AUB Ticketshop at Leidse Square during the daytime to view the Foursquare Display we set up in action (previous blog post).
A video of the screen:
The screen in context:
It is a welcome change from the static posters and the static videos that usually litter these high-profile locations. The foursquare-coloured view of the area is always fresh and shows the local flavour and the people who visit the venues around it.
From an urban development point of view it may be odd to draw more attention to the already highly crowded Leidse Square area. But it stands to reason that new developments such as these will be tested in high-density locations first. We would be very interested to create augmentations in public space to make locations in Amsterdam’s periphery more appealing.
Posted: June 28th, 2011 | Author: alper | Filed under: Statlas | Tags: cartography, mapping, Polymaps | No Comments »
It’s been some time in the making but today we are proud to do a very early beta release of Statlas, the project we have been working on these past months. The Dutch Press Innovation fund funded this project and we collaborated with Fluxility and Alexander Zeh on this version. So please do check out: Statlas
There are several similar tools out there that help you create your own map, but we feel that they are not as easy as they should be, and almost all of them are built in Flash. Statlas is built on Polymaps and therefore fully compatible with the open web. Creating a map is as simple as painting by numbers.
Our initial explorations set us on our way to create the easiest and most generative atlas tool we could imagine. Statlas is set up to allow you to choose a group of regions and enter a value (numerical, color or other) for each of those regions to create a map coloring. That map can then be shared, printed or embedded wherever you want. But anybody can also take a public map and edit it to improve upon the existing data or to express their disagreement with it. It is also possible to export data to CSV, use other tools to collect statistics and re-import them into Statlas.
Feedback
This initial release is geared towards the Dutch context, as we have been developing it with the Netherlands in mind first. We are going to quickly add more regions and we are soliciting requests for regions you may want to add. If you have ideas, requests and/or Shapefiles, please send them our way so we may add them.
This is a most preliminary beta release of a functional piece of software. We are envisioning much more data-heavy and live-updating views in the near future, but a project of this scope can balloon too easily. We’ve heard from no end of people who wanted to use it for one cause or another, and we wanted to show something first. After this release we’ll see which direction is most in demand.
Posted: May 4th, 2011 | Author: alper | Filed under: Foursquare Map, Projects | Tags: Amsterdam, cache, entertainment, foursquare, Google Maps, Javascript, Leidseplein, mapping, OAuth, PHP | 1 Comment »
For the Amsterdam UIT Bureau and I Amsterdam we created this Foursquare map designed to display nightlife activity around the Leidseplein (entertainment) area with recent checkins, specials and current mayor and photographs of a selected group of venues. We strongly believe in creating autonomous displays that take cues from the environment —in this case using Foursquare— and deliver clear actions to the audience as well as a sense that the area they are in is alive and all they have to do is go out and connect to it.
The project is live at its own URL and in an iframe on the IAmsterdam site.
Technically we used Foursquare’s OAuth2 API which is outstanding. To be able to share one token across all requests we employ a file based PHP cache that relays the necessary requests for us. Main technology was created in collaboration with Panman Productions.
Posted: April 28th, 2011 | Author: alper | Filed under: PvdA Canvassing | Tags: campaign, elections, Google Maps, politics, Protovis, pvda, sentiment | 1 Comment »
We ran a major update to the previous concept we did for the Dutch Labour Party using their canvassing results for the previous elections. The previous version crammed all the interaction into a tabbed balloon on a Google Map. This update turns that inside out and creates a full blown site called: “PvdA – Altijd in de buurt”.
The site shows canvassing results tallied per city to show the biggest positive and negative issues according to constituents, as well as their perception of politics.
It got some attention on various weblogs: Arnhem Direct, Sargasso, PvdA.nl, Johnny Wonder
The potential for a data driven approach to politics is tremendous. A site like this in effect gauges the sentiment in any given locality and in an ideal scenario it would also give people and politicians ways to collaborate to improve the situation. Any improvement realized can then be recorded and used to rally voters at subsequent elections.
Posted: April 25th, 2011 | Author: alper | Filed under: 75 Social Scientists | Tags: de Groene Amsterdammer, journalism, Protovis, social science | 1 Comment »
An exploratory project for the Dutch weekly de Groene Amsterdammer (yes: the Green Amsterdammer) concerning a survey that asked a large number of social scientists for their assessment of the most important problems currently troubling the Netherlands.
As an end result 75 submissions were returned with answers in essay form detailing the biggest problem of the Netherlands, the most overblown issues and the most unnoticed issues according to the scientists. This made for a very large amount of textual content which would have been difficult to quickly get into.
We chose to see how quickly we could hook up Protovis to visualize the key issues according to each scientist. All of the essay style answers were clustered to a set of themes (by the people preparing the story) and this was input to Protovis’s bubble chart to give a tag cloud like representation of the issues. See the interactive chart on Groene.nl or the screenshot below:
The quick visual summary and the filters help drill down to a specific issue in a specific problem category quickly. Clicking a bubble displays links to the full text contribution of the relevant scientists.
This was mostly a process exploration to see how a default library such as Protovis could be employed in a journalistic context and to see where the bottlenecks fall. We found that Protovis’s explanatory power really shines if you have a good dataset. However it took some time to get the data machine-ready. The result was produced efficiently and adds a much needed visual summary to the slew of textual content. Most time was spent on wrangling the dataset and finalizing the interaction details of the chart.
The project got a fair amount of attention in national media (and links to the chart), e.g.: ‘Integratie meest overschatte probleem van deze tijd’, ‘Wat zijn de 10 grootste sociale problemen van Nederland?’
Posted: February 25th, 2011 | Author: alper | Filed under: Events, Statlas, Talks | Tags: conference, devcamp, Hack de Overheid, infographics, lecture, Statlas, visualization, Willem de Kooning | No Comments »
Some minor updates from the studio.
Statlas is going into full production.
The past weeks Alper has been giving lectures at the Willem de Kooning design academy on the subject of data visualization. The students should be busy creating their projects these coming weeks and we eagerly anticipate their results.
Hack de Overheid, which we are co-organizing, is going into full swing with the annual developer event on March 12th in Amsterdam (more on which in a separate post).
We will be represented at the Cognitive Cities conference in Berlin this weekend to talk about city data visualization. And next week we’ll be at the Infographics conference trying to talk some sense into those who think print is the be-all and end-all of data.