Digital Public Space: Turning a big idea into a big thing
The remains of the Metroon. Photo used under licence.
The story begins about 2,500 years ago, in Athens.
Around 500 BC, in the Ancient Greek city-state of Athens, the state archive was housed in a building called the Metroon, or 'mother building'. This temple, dedicated to the mother goddess, was filled with papers relating to the day-to-day civic, legal, commercial and cultural life of its citizens.
The Metroon was open to every citizen, and all were entitled both to read and to make copies of anything held there, giving them a level of access to the building blocks of their society that is unrivalled in the modern age despite our Freedom of Information laws and open data initiatives.
Today, there are simply too many Metroons, even if we had permission to enter all of them. The vast majority of current archives remain undigitised and available only by visiting a physical building.
But the bigger challenge is that public archives are run by organisations and institutions that have collected, crafted and labelled their archives to meet their operational needs rather than general access. Gathering a comprehensive or authoritative set of materials from different archives becomes an Olympian task: there is simply so much stuff in so many places described and recorded in different ways using different systems.
Digitisation does not solve this problem, since the different databases are not necessarily compatible. Even though we are entering an age of , the challenges in achieving Athenian levels of access to digitised material are many, massive and varied.
Fortunately, the emergence of the semantic web and of linked data practices and standards means that a modern Metroon, a digital public space, is at last a possibility.
Digital Public Space
I'm Jake Berger, the Programme Manager for the Digital Public Space project. Essentially this means that I have to work out and describe the scope and challenges of the overall vision, then break the big challenges of the wider Digital Public Space project down into smaller ones and work with my BBC colleagues and external partners (like those mentioned by Bill earlier in the week) to get these smaller projects off the ground. I also try to make sure that our project's thinking aligns with other thinking around the BBC and beyond.
Mo McRoberts introduced the Digital Public Space project in a blog post in April.
In summary, our ambition is to create an online space in which much of the UK's publicly-held cultural and heritage media assets and data could be found - connected together, searchable, machine-readable, open, accessible, visible and usable in a way that allows individuals, institutions and machines to add additional material, meaning and context to each other's media, indexed and tagged to the highest level of detail.
A couple of weeks ago, . When Jemima Kiss described DPS as a big library Bill explained how the shared metadata would work:
The stuff's not in the library. You have the best catalogue ever, and when you want something there's an instant delivery service. Those organisations that want to keep their material in the library can do so. Those that want to keep it to themselves because they're worried about rights issues or whatever can keep it to themselves and only make it available when they're asked for it to people they're sure will look after it.
Bill wrote earlier this week about how the BBC was working with partner organisations, and on his post you can see how the partners, the catalogue, the assets, and the products and services all fit together.
Data model and reference implementation
As Mo wrote, an 'umbrella' data model is being developed. This brings together a number of catalogues - data sets describing the holdings of a range of partners - classifying their contents in a consistent way, identifying themes and types, and mapping out connections and associations across diverse data sets.
The data model is lossless - it does not attempt to truncate or simplify any of the extensive detail within partners' catalogues. There may be elements or fields within each catalogue that do not currently have an obvious connection to any other field in any other catalogue, but the 'lossless' approach ensures that any such connections can be found and mapped in the future.
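To make the 'lossless' idea concrete, here is a minimal sketch in Python. It is not the project's actual data model: the partner names, field names and mappings below are all hypothetical, chosen only for illustration. The point it shows is that each catalogue record is mapped onto shared fields where a mapping is known, while the complete original record is kept alongside, so that unmapped fields remain available for connections to be found later.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical mappings from each partner's own field names to shared
# 'umbrella' field names. Real catalogues and mappings will differ.
FIELD_MAPPINGS: dict[str, dict[str, str]] = {
    "partner_a": {"programme_title": "title", "tx_date": "date"},
    "partner_b": {"name": "title", "created": "date"},
}


@dataclass
class UmbrellaRecord:
    source: str                                               # which catalogue the record came from
    common: dict[str, Any] = field(default_factory=dict)      # fields mapped to shared names
    original: dict[str, Any] = field(default_factory=dict)    # untouched copy of the source record


def to_umbrella(source: str, record: dict[str, Any]) -> UmbrellaRecord:
    """Map known fields to shared names and keep the whole original record."""
    mapping = FIELD_MAPPINGS.get(source, {})
    common = {mapping[key]: value for key, value in record.items() if key in mapping}
    return UmbrellaRecord(source=source, common=common, original=dict(record))


def unmapped_fields(rec: UmbrellaRecord) -> set[str]:
    """Fields with no shared equivalent yet; they are preserved, not discarded."""
    mapping = FIELD_MAPPINGS.get(rec.source, {})
    return set(rec.original) - set(mapping)


if __name__ == "__main__":
    rec = to_umbrella(
        "partner_a",
        {"programme_title": "Example Programme", "tx_date": "2000-01-01", "spool_no": "A12"},
    )
    print(rec.common)
    print(unmapped_fields(rec))
```

In this sketch the hypothetical `spool_no` field has no shared equivalent, but it survives in `original`, so a mapping added later can still reach it.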
So far the data model can encompass catalogue information from the BBC and a number of partner organisations.
Early versions of this data model indicate that - as hoped - there will be many, varied and often unexpected journeys that can be made through these catalogues and the material they describe. For example, a user starting out by watching a film of a might then look at a scan of a rare musical manuscript from The National Archives, then browse similar manuscript scans held at the British Library, watch a clip from a BBC documentary about how paper was produced in Shakespeare's era, before ending up learning about the . In a DPS, all of this could happen in the same online space.
Clearly, a 'critical mass' of data and material needs to be brought together before we see such innovative journeys emerge. So we are now beginning to assemble this critical mass.
Screen grab from a Metabroadcast prototype to help folk navigate a large dataset.
Dealing with Complexity
The project throws up a lot of complex questions and we have already started projects to address some of them.
Bill recently mentioned a , which combined digitised video with tools to search and tag it for research or teaching.
There are a whole series of challenges around the user experience and navigation through vast and diverse data sets. We have worked with Metabroadcast to
Mo blogged about the development of a web browser-based user interface, which navigates through these catalogues using the concepts of "people", "places", "events", "things" and "collections". I hope soon to share some other user interfaces that we've developed within the Archive Development team.
Of course these are only the beginning; soon we will launch projects with other collaborating institutions which will explore issues around rights, identity, access, privacy, provenance, persistence, user-generated content and data, augmentation and amplification.
As an insight into these issues, it will be a large task to administer huge numbers of rights holders when the current rights situation is so complex. Earlier this year a study by our Rights and Business Affairs Department, submitted to the , revealed an average of 85 separate rights-clearance transactions per episode of Doctor Who.
Another question is how to track contributions, usage and amendments whilst preserving the privacy of contributors. This leads to the question of provenance and how we can preserve this information over time.
When more projects to address these complex questions are off the ground, I will blog about them too.
Rome wasn't built in a day. Neither was Athens, and nor will the Digital Public Space be. But I hope that the blueprint we are beginning to develop, the plan that will deliver it and the Digital Public Space itself are as valuable to every modern-day citizen as the Metroon was to the citizens of Athens.
I look forward to blogging again when these projects have delivered, as part of consulting our partners and audience over the next steps.
Jake Berger is the Programme Manager for Digital Public Space in BBC Archive Development.