Guardian Hackday
At the end of July the Guardian held an internal hackday at their offices in King's Cross. They invited two engineers from BBC Radio's A&Mi department, including David Rogers. We teamed up with Leigh Dodds and Ian Davis from Semantic Web specialists Talis to produce an 'Interactive-MP-Media-Appearance-Timeline' by mashing up data from BBC Programmes and the Guardian's website.
Before the event, Talis extracted data about MPs from the Guardian and converted it into a linked datastore. This store contains data about every British MP: the Guardian articles in which they have appeared, a photo, related links and other data. Talis also provide an endpoint to allow searching and extraction of the data from the store.
Coincidentally, the BBC programmes data is also available as a linked datastore. By crawling this data using the MP's name as the search key, we were able to extract information about the TV and radio programmes in which a given MP had appeared. We then created a second datastore by combining these two datasets and pulling in some related data from elsewhere. Using this new datastore we created a web application containing an embedded visualisation of the data.
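The combination step can be illustrated with a minimal sketch. It is written in Python for brevity rather than in the tools we used on the day, and the record fields (`mp`, `title`, `date`) and the `normalise` helper are hypothetical names chosen for illustration, not the structure of the real datastores. It assumes each record already carries the MP's name, which acts as the join key.

```python
from collections import defaultdict


def normalise(name):
    """Lower-case a name and collapse whitespace so both sources share a key."""
    return " ".join(name.lower().split())


def merge_appearances(guardian_articles, bbc_programmes):
    """Combine two lists of records into one mapping keyed by MP name.

    Each record is tagged with the source it came from so that a
    visualisation can later choose the right brand logo.
    """
    merged = defaultdict(list)
    for article in guardian_articles:
        merged[normalise(article["mp"])].append({**article, "source": "guardian"})
    for programme in bbc_programmes:
        merged[normalise(programme["mp"])].append({**programme, "source": "bbc"})
    return dict(merged)
```

Matching on a name alone is fragile: two people can share a name, and one person's name can be written in several ways. A real join would need a more reliable identifier wherever one is available.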
We created the web application using a lightweight Ruby web framework. A simple schema provided access to a web page showing basic information about an MP.
In addition, we queried the datastore for a list of all of an MP's appearances across Guardian and BBC content. This was returned as a JSON document and passed to an embedded Java applet built with Processing. A Java applet may seem like an unusual choice in 2009, but Processing is well suited to the rapid development of responsive graphics applications, thanks to its integration with existing Java libraries and its powerful graphics framework.
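As a rough illustration of the kind of JSON document handed to the applet, here is a minimal sketch in Python (the application itself was written in Ruby). The field names and the top-level `appearances` key are assumptions made for the example; the actual document format is not described in this post.

```python
import json


def appearances_to_json(appearances):
    """Serialise a list of appearance records, oldest first.

    Dates are assumed to be ISO 8601 strings (YYYY-MM-DD), which sort
    chronologically when compared as plain text.
    """
    ordered = sorted(appearances, key=lambda item: item["date"])
    return json.dumps({"appearances": ordered}, indent=2)
```

Sorting on the server side keeps the client simple: the applet can walk the list once to build its timeline.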
Leigh at Talis put together a demonstration showing the app in action. The Processing applet shows a month-by-month scrollable timeline. The user can move backwards and forwards in time, at variable speeds, by pressing the mouse on either side of the applet frame. In each month slot a stack of media appearances is displayed, each represented by the logo of the BBC brand or, in the case of Guardian articles, the Guardian brand. Moving the mouse over a media appearance reveals the headline or programme description, and clicking it navigates the browser to the episode page on BBC /programmes or the article page on the Guardian website.
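The month slots described above can be sketched as a simple grouping step. This is an illustrative Python version, not the Processing code from the applet, and it assumes each appearance has an ISO 8601 `date` field (a hypothetical name). Months with no appearances are kept as empty stacks so that the timeline scrolls evenly.

```python
from collections import defaultdict
from datetime import date


def month_slots(appearances):
    """Return [((year, month), [appearances...]), ...] from first to last month."""
    stacks = defaultdict(list)
    for item in appearances:
        day = date.fromisoformat(item["date"])
        stacks[(day.year, day.month)].append(item)
    if not stacks:
        return []

    (year, month), last = min(stacks), max(stacks)
    timeline = []
    while (year, month) <= last:
        # .get avoids creating new keys for months with no appearances
        timeline.append(((year, month), stacks.get((year, month), [])))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return timeline
```

Each entry in the result corresponds to one slot on screen, and the length of its list gives the height of the stack drawn in that slot.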
We demonstrated the application to the hackday audience, and at the prize-giving ceremony we won the 'Best use of third-party data' award. We think the application demonstrates some of the ways in which the structured RDF data provided by the BBC's /programmes website can be used. The project also shows how powerful the linked-data concept is when used in conjunction with other data that has been exposed in a similar way. As more media organisations expose their domains in this manner, more interesting and wide-reaching visualisations and web applications can be built.
David Rogers is a Software Engineer, Future Media & Technology for BBC Audio & Music & Mobile.
Comment number 1.
At 25th Aug 2009, H_the_eagle wrote: So is this data freely available for anyone to use? If so, is anyone making use of this or similar data from the BBC or the Guardian, and can anyone give examples of such mashup sites?
Comment number 3.
At 27th Aug 2009, daverog2 wrote: All the BBC data is publicly available on the website (including the RDF, JSON and XML resources), e.g. /programmes/b00lyx6c.rdf
The Guardian data was pulled from the content API, which is also publicly available, but you need to get an API key first, and then you'll be limited to 5,000 requests a day. The content API is still in beta and the Guardian may introduce further restrictions.