Linked Data and the LOCAH project
Bethan Ruddock, Library and Archival Services, Mimas, University of Manchester
bethan.ruddock@manchester.ac.uk
@bethanar
#ILI2011
Linked Open Copac & Archives Hub
• JISC-funded project (under JISCexpo: exposing digital content for education and research)
• September 2010 – August 2011
• Staff from Mimas, UKOLN, Eduserv
• Additional expertise from Talis, OCLC, Library of Congress
Project Aims
• Put archival and bibliographic data at the heart of the Linked Data Web, making new links between diverse content sources, enabling the free and flexible exploration of data and enabling researchers to make new connections between subjects, people, organisations and places to reveal more about our history and society.
• Make a collection of resources available on the Web as structured data, in particular linked data, where a case can be made that it would benefit teaching, learning, research, administration and/or knowledge transfer in UK higher education.
• Develop a prototype with instructional step-by-step demonstration and documentation to show how the structured content can be used by 3rd party tools and services.
• Explore and report on the opportunities and barriers in making content structured and exposed on the Web for discovery and use. Such opportunities and barriers may coalesce around licensing implications, trust, provenance, sustainability and usability.
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
The Data: Copac
• Merged union catalogue of the holdings of over 60 UK libraries
• Over 50 million records
• Consolidated records
• MODS XML (not MARC)
[Figure: a Copac consolidated record created from 5 contributed records. Lines show how contributed records match with one another.]
The Data: Archives Hub
• Descriptions of archive collections from over 200 UK repositories
• Nearly 25,000 descriptions, both collection-level and multi-level
• EAD (Encoded Archival Description)
Challenges: variance
• Data from many sources, which should adhere to standards:
  • AACR2
  • ISAD(G)
• BUT: differences in implementation
Challenges: Data
• 260 $b: unknown
• dct:publisher: unknown
• dct:publisher definition: ‘entity responsible for making the resource available’
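The publisher problem can be made concrete in code. This is a minimal sketch, assuming that a placeholder value such as "unknown" in a publisher field should produce no dct:publisher statement at all, since "unknown" is not an entity responsible for anything. The function name, the placeholder list and the `ex:` identifiers are hypothetical illustrations, not the LOCAH transformation.

```python
# Hypothetical placeholder values; a real list would come from inspecting the data.
PLACEHOLDERS = {"unknown", "s.n.", "[s.n.]"}


def publisher_triple(record_uri, publisher_value):
    """Return a (subject, predicate, object) tuple, or None for placeholder values."""
    if publisher_value is None:
        return None
    # Trim whitespace and the trailing punctuation common in catalogue fields.
    value = publisher_value.strip().rstrip(",;:").strip()
    if not value or value.lower() in PLACEHOLDERS:
        return None
    return (record_uri, "dct:publisher", value)


if __name__ == "__main__":
    print(publisher_triple("ex:record1", "unknown"))
    print(publisher_triple("ex:record2", "Chapman and Hall,"))
```

Dropping the statement is only one option; another would be to keep the original string as a plain note so that nothing from the source record is lost.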
Challenges: multiple sources A ‘match graph’ of a consolidated Copac record
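One way to read a match graph is as a set of connected components: contributed records joined by a match line, directly or through other records, end up in the same consolidated record. The sketch below works under that assumption only; the record identifiers and the function name are hypothetical, and Copac's actual matching and consolidation rules are not described in these slides.

```python
from collections import defaultdict


def consolidate(record_ids, matches):
    """Group record ids into sets connected by pairwise matches."""
    neighbours = defaultdict(set)
    for a, b in matches:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    groups = []
    for rid in record_ids:
        if rid in seen:
            continue
        # Depth-first walk over everything reachable from this record.
        group = set()
        stack = [rid]
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbours[current] - group)
        seen |= group
        groups.append(group)
    return groups


if __name__ == "__main__":
    ids = ["a", "b", "c", "d", "e", "f"]
    pairs = [("a", "b"), ("b", "c"), ("d", "e")]
    for group in consolidate(ids, pairs):
        print(sorted(group))
```

Note that "a" and "c" are grouped even though they never match directly, which is exactly where a consolidated record can drift away from any single contributed record.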
Challenges: Vocabulary
[Diagram: ORIGINATION "relates to" Stuff. Labels: created, collected.]
Licensing
• Data comes from contributors
  • Not ours to redistribute!
• Concerns
  • Provenance
  • Trust
  • Control
• Consulted
  • Liaised with contributors and stakeholders
• Result
  • Released test data set as CC0
The Techy Stuff Specifications required a lot of brainstorming… Image used under a CC licence from http://www.flickr.com/photos/blankdots/4865831504/
Archives Hub Model
[Diagram of the Archives Hub data model.
Entities: Finding Aid, EAD Document, Repository (Agent), Place, PostcodeUnit, ArchivalResource, Level, Biographical History, Language, Extent, Creation, TemporalEntity, Concept, ConceptScheme, Agent, Person, Family, Organisation, Object, Book, Genre, Function, Birth, Death.
Relationships: in, administeredBy/administers, maintainedBy/maintains, encodedAs/encodes, hasPart/partOf, accessProvidedBy/providesAccessTo, hasBiogHist/isBiogHistFor, topic/page, level, language, origination, extent, associatedWith, inScheme, representedBy, foaf:focus, is-a, product of, participates in, at time.]
data.copac.ac.uk data.archiveshub.ac.uk
Linking
[Diagram of linked resources: BBC:Cranford, Copac:Cranford, Hub:Gaskell, DBPedia:Gaskell, Hub:Dickens, DBPedia:Dickens, VIAF:Dickens, Geonames:Manchester]
Challenges: Anonymous
Variant forms in the data: Anonymous, Anon., anonymous, anon., anon
Mask image used under a CC licence from http://www.yourbdnews.com
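The variant spellings matter because each distinct string could otherwise be minted as a separate named agent. A minimal sketch, assuming the variants differ only in case and punctuation; the function name and the set of normalised forms are hypothetical and would need checking against the real data.

```python
import re

# Assumed normalised forms that all mean "no named author".
ANONYMOUS_FORMS = {"anonymous", "anon"}


def is_anonymous(name):
    """True if the name string is a variant spelling of 'Anonymous'."""
    # Lower-case the string and drop everything except the letters a-z.
    normalised = re.sub(r"[^a-z]", "", name.lower())
    return normalised in ANONYMOUS_FORMS


if __name__ == "__main__":
    for candidate in ["Anonymous", "Anon.", "anon", "Gaskell, Elizabeth"]:
        print(candidate, is_anonymous(candidate))
```

Whether such strings should then map to one shared "anonymous" agent, or to no agent at all, is a modelling decision that this check leaves open.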
data.copac.ac.uk/doc/bibliographicresource/6947473 data.copac.ac.uk/doc/concept/agent/6947473 lacywilliam
data.copac.ac.uk/doc/bibliographicresource/6947473 data.copac.ac.uk/doc/agent/rys
data.archiveshub.ac.uk/doc/archivalresource/gb1086colour data.archiveshub.ac.uk/doc/concept/unesco/photography
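The addresses on these slides appear to share one shape: host, then `doc`, then a resource type, then an identifier. A minimal sketch of that pattern, inferred only from the examples shown here; the function name is hypothetical and the projects' full URI design may include patterns not covered by this sketch.

```python
def doc_uri(host, resource_type, identifier):
    """Join host, 'doc', resource type and identifier with single slashes."""
    parts = [
        host.strip("/"),
        "doc",
        resource_type.strip("/"),
        identifier.strip("/"),
    ]
    return "/".join(parts)


if __name__ == "__main__":
    print(doc_uri("data.archiveshub.ac.uk", "archivalresource", "gb1086colour"))
    print(doc_uri("data.archiveshub.ac.uk", "concept/unesco", "photography"))
```

Keeping the pattern in one function means a change to the path layout only has to be made in one place.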
Visualisation Prototype
• Using Timemap (Google Maps and Simile)
  • http://code.google.com/p/timemap/
• Early stages with this
• Will give location and ‘extent’ of archive
• Will link through to Archives Hub
What Next?
• Linking Lives
  • name-based approach into the data
  • integrating archival resource with other resources
    • DBPedia, VIAF, Copac...
  • route into archives for different audiences?
  • issues around trust and provenance to be explored
Finally… The LOCAH data is open for use… …please play with it! Image used under a CC licence from http://www.flickr.com/photos/huladancer22/530743543/
LOCAH blog: http://blogs.ukoln.ac.uk/locah/ @bethanar bethaninfoprof.wordpress.com bethan.ruddock@manchester.ac.uk