Linked Logainm. Dr. Nuno Lopes. Special guests: Dr. Sandra Collins Dr. Seathrún Ó Tuairisg. Mission Statement. DRI is an interactive trusted digital repository for contemporary and historical, social and cultural data held by Irish institutions. Funding.
Linked Logainm • Dr. Nuno Lopes • Special guests: Dr. Sandra Collins, Dr. Seathrún Ó Tuairisg
Mission Statement • DRI is an interactive trusted digital repository for contemporary and historical, social and cultural data held by Irish institutions
Funding • Exchequer funded; HEA PRTLI 5, €5.2M • RIA (lead), NUIM, TCD, DIT, NUIG, NCAD • Partners: academic, cultural, social, industry • Sep 2011 – Sep 2015
Services • Preservation • Access • Sharing, linking • Cultural & Social heritage
DRI Presentation • [Diagram: DRI at the centre. Labels: Storytelling platform, Policy, Educational tool, Users, Content, Shared services, Digitisation, e-infrastructure, Preservation, Tools]
Partnership Project • Place Names Branch • Fiontar • DRI • NLI • DERI
Logainm.ie • The authority list of Irish place names, validated by the Place Names Branch. • Offers finer-grained detail than DBpedia or GeoNames. • Unique source of Irish-language place names.
The NLI Longfield Map Collection • The Longfield Maps are a set of 1,570 surveys carried out in Ireland between 1770 and 1840. • Currently catalogued in MARCXML, using data from Logainm, GeoNames and DBpedia. • Logainm data is being integrated into the NLI cataloguing workflow.
Longfield Map example
<marc:datafield tag="650" ind1="" ind2="">
  <marc:subfield code="a">Land tenure</marc:subfield>
  <marc:subfield code="z">Ireland</marc:subfield>
  <marc:subfield code="z">Rathdown (Barony)</marc:subfield>
</marc:datafield>
<marc:datafield tag="650" ind1="" ind2="">
  <marc:subfield code="a">Land use surveys</marc:subfield>
  <marc:subfield code="z">Ireland</marc:subfield>
  <marc:subfield code="z">Wicklow (County)</marc:subfield>
</marc:datafield>
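To link a catalogue record to Logainm, the place names first have to be pulled out of the subject fields. The following is a minimal sketch, not the NLI's actual tooling: it wraps the two 650 fields from the slide in a `marc:record` element that declares the standard MARC21 slim namespace (the wrapper and the function name `geographic_subdivisions` are assumptions) and collects every `$z` (geographic subdivision) subfield.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML namespace; the record wrapper below is an assumption,
# added only so the slide's fragment parses as a complete document.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

RECORD = """<marc:record xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:datafield tag="650" ind1="" ind2="">
    <marc:subfield code="a">Land tenure</marc:subfield>
    <marc:subfield code="z">Ireland</marc:subfield>
    <marc:subfield code="z">Rathdown (Barony)</marc:subfield>
  </marc:datafield>
  <marc:datafield tag="650" ind1="" ind2="">
    <marc:subfield code="a">Land use surveys</marc:subfield>
    <marc:subfield code="z">Ireland</marc:subfield>
    <marc:subfield code="z">Wicklow (County)</marc:subfield>
  </marc:datafield>
</marc:record>"""


def geographic_subdivisions(record_xml):
    """Return the text of every $z subfield in the record's 650 fields."""
    root = ET.fromstring(record_xml)
    places = []
    for field in root.findall("marc:datafield[@tag='650']", MARC_NS):
        for sub in field.findall("marc:subfield[@code='z']", MARC_NS):
            places.append((sub.text or "").strip())
    return places


if __name__ == "__main__":
    print(geographic_subdivisions(RECORD))
```

Strings such as "Rathdown (Barony)" carry the place type in parentheses, which is the kind of hint a later linking step could use to pick the right Logainm entry.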
Linked Logainm • [Figure: Linked Open Data cloud diagram, http://lod-cloud.net/. Domain labels: Media, User-generated, Government, Publications, Cross-domain, Geo, Life sciences. Datasets called out: Logainm, LinkedGeoData]
Geographic Data Providers • DBpedia: includes latitude and longitude for geographic entities • LinkedGeoData: export of data from OpenStreetMap; goes beyond lat/lon (e.g., areas as polygons) • GeoNames: data accessible as RDF (download as TSV) • GeoLinkedData: Spain • Ordnance Survey: UK
Geo-Vocabularies • W3C Geo: SpatialThing, latitude and longitude • NeoGeo (http://geovocab.org/doc/neogeo.html): Feature vs Geometry; spatial relations (is_part_of) • Most providers define their own vocabulary
Approach • Translate the Logainm database dump into RDF • Determine links to other datasets based on place names, geographical coordinates, and the hierarchy of places • Evaluation of generated links • Deployment at Logainm.ie
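The three linking signals can be pictured as independent pieces of evidence about a candidate pair. The sketch below is illustrative only and is not the project's linking rule (the deck reports using Silk): the record layout, the function names, the 5 km threshold and the sample coordinates are all assumptions. A signal is reported only when both records carry the data it needs.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def link_evidence(a, b, max_km=5.0):
    """Compare two place records (hypothetical dict layout) signal by signal.

    Keys: "name" (str), optional "coords" (lat, lon), optional "parent" (str).
    max_km is an arbitrary illustrative threshold.
    """
    evidence = {"name": a["name"].casefold() == b["name"].casefold()}
    if a.get("coords") and b.get("coords"):
        evidence["coords"] = haversine_km(*a["coords"], *b["coords"]) <= max_km
    if a.get("parent") and b.get("parent"):
        evidence["hierarchy"] = a["parent"].casefold() == b["parent"].casefold()
    return evidence


if __name__ == "__main__":
    # Illustrative records; the coordinates are approximate, not taken from any dataset.
    local = {"name": "Dublin", "coords": (53.35, -6.26), "parent": "Ireland"}
    remote = {"name": "dublin", "coords": (53.33, -6.25)}
    print(link_evidence(local, remote))
```

Keeping the signals separate, rather than folding them into one score, makes it easy to see which evidence is missing for a given pair.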
1. Converting Logainm dump to RDF • ~100,000 place names converted to ~800,000 triples • Example: http://data.logainm.ie/1375542 has foaf:name “Dublin” and owl:sameAs http://sws.geonames.org/2964574/
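A rough sketch of what the conversion step produces for the Dublin example, written as N-Triples with the standard library only. The input dictionary layout, the function names and the `en` language tag are assumptions made for illustration; the deck does not describe the dump schema or the converter actually used.

```python
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def escape_literal(text):
    """Escape the characters that N-Triples requires escaping in a literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def place_to_ntriples(place):
    """Serialise one place record (hypothetical dict layout) as N-Triples lines."""
    subject = f"<http://data.logainm.ie/{place['id']}>"
    lines = []
    for name, lang in place["names"]:
        lines.append(f'{subject} <{FOAF_NAME}> "{escape_literal(name)}"@{lang} .')
    for target in place.get("same_as", []):
        lines.append(f"{subject} <{OWL_SAME_AS}> <{target}> .")
    return lines


# Values taken from the slide; the language tag is an assumption.
DUBLIN = {
    "id": 1375542,
    "names": [("Dublin", "en")],
    "same_as": ["http://sws.geonames.org/2964574/"],
}

if __name__ == "__main__":
    print("\n".join(place_to_ntriples(DUBLIN)))
```

At roughly eight triples per place, a record would carry more than a name and a link (coordinates, type and hierarchy are plausible candidates), but the deck does not list the properties, so only the two shown on the slide appear here.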
2a. Linking based on Place Names • Place name lookup in DBpedia • Example lookups: “Airport, Dublin”; “Hospital, Limerick” • Figures on the slide: 7828 “Places” in DBpedia; 1217 (label not preserved)
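The examples “Airport, Dublin” and “Hospital, Limerick” suggest why a bare name is a weak key: place names such as Airport and Hospital collide with ordinary words. A minimal sketch of a name lookup that tries the qualified form first follows. The index contents and the example.org URIs are hypothetical, and the normalisation rules (accent stripping, case folding) are assumptions rather than the project's method.

```python
import unicodedata
from collections import defaultdict


def normalise(name):
    """Strip diacritics, fold case and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_marks.casefold().split())


def build_index(labelled_uris):
    """Map each normalised label to the list of URIs that carry it."""
    index = defaultdict(list)
    for label, uri in labelled_uris:
        index[normalise(label)].append(uri)
    return index


def lookup(index, name, county=None):
    """Try "name, county" first, then fall back to the bare name."""
    keys = [normalise(name)]
    if county:
        keys.insert(0, normalise(f"{name}, {county}"))
    for key in keys:
        if key in index:
            return list(index[key])
    return []


# Hypothetical labels and URIs standing in for a remote dataset.
INDEX = build_index([
    ("Hospital, Limerick", "http://example.org/place/hospital-limerick"),
    ("Hospital", "http://example.org/thing/hospital"),
])

if __name__ == "__main__":
    print(lookup(INDEX, "Hospital", county="Limerick"))
    print(lookup(INDEX, "Hospital"))
```

Without the county qualifier, the lookup lands on the generic entry, which is exactly the kind of false link the evaluation step has to catch.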
2b. Linking based on geographical coordinates • ~50,000 of the ~100,000 place names in Logainm carry geographical information • Coordinates are given as Irish Grid references, e.g. W 35619 58358 ≈ lat 51.77, lon -8.93
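An Irish Grid reference is a letter naming a 100 km square plus an easting and northing within that square. The sketch below covers only the first half of the conversion, from the lettered reference to full grid metres; getting from there to latitude and longitude additionally needs an inverse Transverse Mercator projection and a datum shift, which are not shown. The function name is an assumption.

```python
# 5x5 layout of 100 km squares, A in the north-west corner, letter I unused.
GRID_LETTERS = "ABCDEFGHJKLMNOPQRSTUVWXYZ"


def irish_grid_to_metres(ref):
    """Convert e.g. "W 35619 58358" to full (easting, northing) in metres."""
    parts = ref.split()
    if len(parts) != 3:
        raise ValueError(f"expected 'LETTER EASTING NORTHING', got {ref!r}")
    letter, east, north = parts
    letter = letter.upper()
    if len(letter) != 1 or letter not in GRID_LETTERS:
        raise ValueError(f"unknown grid letter {letter!r}")
    if not (east.isdigit() and north.isdigit()):
        raise ValueError("easting and northing must be digits")
    if len(east) != len(north) or not 1 <= len(east) <= 5:
        raise ValueError("easting and northing must have 1 to 5 digits each")

    index = GRID_LETTERS.index(letter)
    column = index % 5          # counted from the west
    row = 4 - index // 5        # counted from the south
    scale = 10 ** (5 - len(east))  # shorter references are coarser
    return column * 100_000 + int(east) * scale, row * 100_000 + int(north) * scale


if __name__ == "__main__":
    print(irish_grid_to_metres("W 35619 58358"))
```

For the slide's example this gives easting 135619 and northing 58358, placing the point in the south-west of the grid, consistent with the quoted latitude and longitude.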
3. Current status • [Table not preserved] • Table notes: (1) entities of type “Place” or “Feature”; (2) entities of type “Node”; (3) no hierarchy info; (4) including internal & Freebase links • Using Silk for discovering links • Links in other datasets
Next steps • Evaluation of generated links: golden set; LIMES vs Silk • Links to other datasets (e.g., Freebase) • Publishing Linked Data at logainm.ie with OpenLink Virtuoso • Using the data: NLI showcase with the Longfield Map Collection; DRI/NUIG Irish Language Collection
NUI Galway and the DRI • A DRI Demonstration Project that will showcase the wealth of the archives held by the University and its external partners • We will collate, curate and contextualise some content from our various collections, focusing on unique features of Ireland’s cultural heritage: language, traditional music, folklore and indigenous maritime heritage • It will show the evolution of the Irish language, from early audio recordings of traditional music, through the birth of Raidió na Gaeltachta in the ’70s, to its modern manifestation in broadcast video and audio.
Why Do This? • For researchers it opens up datasets for potential research in the fields of linguistics and socio-linguistics, literature and folklore, history, social and political studies, place names, film and media, music and song • It can function as an educational resource, in language teaching, history, etc. • Good collaboration opportunities with external partners under increasing pressure to open up archives to the public • Repurposing legacy material (e.g. Raidió na Gaeltachta recordings) ensures a public appetite for archives.
• How can we create a homogeneous user experience from heterogeneous datasets? • How can we add value to a contemporary online digital archive by linking to other online datasets? • How can we make an Irish-language archive accessible in a meaningful way to non-Irish speakers? • How can we realise the information content in (Irish-language) audio and video without relying solely on descriptive metadata?