
Semantic Technology Applicability to CH


Presentation Transcript

  1. Ontotext Experience in Cultural Heritage: Bulgariana Collections in Europeana. Vladimir Alexiev, PhD, PMP; Mariana Damova, PhD

  2. Semantic Technology Applicability to CH • Best way to interconnect data. If the Web (1.0) is a giant hyper-linked document, the Semantic Web (3.0) is a giant linked database • Unified, globalized and abstracted representation (RDF, RDFS, OWL2, RIF). Schema info (metadata) is represented the same way as data • Ontologies and schemas ensure metadata interoperability (ESE, EDM, LIDO, CIDOC CRM, EADS, MODS…) • Linked Open Data provides additional context (DBpedia, GeoNames, FreeBase, WordNet, …) • Thesauri ensure consistent vocabulary (Getty ULAN, AAT, TGN; IconClass, VIAF, etc.) • Europeana adopts semtech for all future development (EDM). Its first White Paper, "Knowledge = Information in Context", looks at the key role of LOD • "Linked data gives machines the ability to make associations and put search terms into context. Without linked data, Europeana could be seen as a simple collection of digital objects. With linked data, the potential is far greater" Ontotext experience in CH; Bulgariana collections in Europeana
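
The point that schema info is represented the same way as data can be illustrated with a minimal sketch. It stores RDF-style triples as plain Python tuples rather than using a real triple store, and every `ex:` name is a hypothetical identifier invented for the example, not taken from any actual dataset. Because the class hierarchy lives in the same triple set as the object descriptions, one matching function serves both.

```python
# Each triple is (subject, predicate, object); prefixed names stand in for full URIs.
# All "ex:" identifiers are hypothetical and exist only for this illustration.
TRIPLES = {
    # Schema-level statements (metadata)
    ("ex:Manuscript", "rdfs:subClassOf", "ex:CulturalObject"),
    ("ex:creator", "rdfs:domain", "ex:CulturalObject"),
    # Instance-level statements (data)
    ("ex:object1", "rdf:type", "ex:Manuscript"),
    ("ex:object1", "ex:creator", "ex:person1"),
}


def match(s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def instances_of(cls):
    """Return instances of cls or of any subclass, following rdfs:subClassOf."""
    classes = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, _, _ in match(p="rdfs:subClassOf", o=current):
            if sub not in classes:
                classes.add(sub)
                frontier.append(sub)
    return sorted({s for c in classes for s, _, _ in match(p="rdf:type", o=c)})


if __name__ == "__main__":
    # The schema triple lets a query for the superclass find the manuscript.
    print(instances_of("ex:CulturalObject"))  # ['ex:object1']
```

A semantic repository performs this kind of inference at scale; the sketch only shows why a uniform triple representation makes it straightforward.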

  3. Ontotext • Ontotext is a Bulgarian company with 65 staff: Sofia, Varna, Ruse, Asenovgrad, Innsbruck (AT), London (UK), Connecticut (US), Wellington (NZ) • Started in 2000 as a research lab in Sirma Group. Spun off in 2008 with investment from NEVEQ • World leader in semantic technologies. 360-degree semtech: repository (OWLIM), text mining (KIM, GATE), web mining (WMF), Ontology and Linked Data Management • Most successful Bulgarian participant in EU FP5, FP6 and FP7 research projects (16 completed, 7 in execution). Received the prestigious Pythagoras award • Revenue growth in the last 3 years: 210%. 5M BGL in 2011, over 7M expected in 2012

  4. Commercial Projects • Commercial revenue grew 10x in the last 3 years and is close to 2/3 of total • Data providers 27% (jobs, food, cars), Media/Publishing 26%, Government 18%, Life Sciences 11%, Cultural Heritage 10%, Telecom 4% • Technical topics range from core semtech to ontology design, master data management, web services, SOA, business processes, eGov, etc. • UK 59%, US 18%, Global 9%, BG 7%, IT 3%, KR 2%, MX 2%, DE, NL • Regular SemTech training courses in London • Great potential in Cultural Heritage, so we want to focus on that

  5. Clients Related to Media and Cultural Heritage • Project clients: UK, KR, JP, SE, NL, BG • Research projects executed by Ontotext • Projects using OWLIM: EU, PL, JP, UK

  6. Projects Related to Media and Cultural Heritage (1) • British Broadcasting Corporation (BBC): Dynamic Semantic Publishing. The World Cup (2010), BBC Sports (2011) and Olympics (2012) multi-sites run on top of OWLIM. KIM-based Concept Extraction • Press Association (UK): commercial image annotation and search, Concept Extraction • The National Archives (UK): Semantic KB and search for Government Web Archive. 780M documents (150M after de-duplication), 10B facts • British Museum (UK): ResearchSpace project funded by Mellon Foundation (US): Collaborative web-based research for the cultural heritage scholarly community. Based on the CIDOC CRM ontology • de Bibliothek (NL): data aggregation from 150 national/local sources to semantic format, unified search (40M objects) • National Institute of Informatics (JP): Linked Open Data in Academia (LODAC): aggregates museum and other data across multiple Japanese resources

  7. Projects Related to Media and Cultural Heritage (2) • Polish Digital National Museum (PL): aggregates artifacts from 70 contributing cultural institutions • PrestoSpace (FP6): Preservation towards storage and access. Standardized Practices for Audiovisual Contents in Europe. Continuation: PrestoCenter.org • MOLTO (FP7): Multilingual Online Translation. Knowledge infrastructure, interoperability between natural language and structured queries, museum object descriptions in 15 languages. Based on the CIDOC CRM ontology • Gothenburg City Museum (SE): 9K museum objects for a use case of CH knowledge representation that allows querying and presenting semantic search results in natural language • Bulgariana (BG, KR): a Bulgarian aggregator for Europeana, including digital repository for CH objects, semantic conversion (ESE, EDM), submission to Europeana, and community building

  8. Bulgariana • A Bulgarian aggregator to Europeana that includes • A public website for sharing information • A wiki (Confluence) for discussion, technical materials, coordination and collaboration • A digital repository (DSpace) for storing and presenting digitized cultural heritage • Conversion/ingestion tools for converting objects to the required Europeana formats: ESE and EDM (pilot) • An OAI-PMH endpoint for serving content to Europeana • Semantic search using OWLIM (in the future) • Partners • BG-KR IT Cooperation Center: initial funding • Ontotext: initiative, semtech, Europeana contact • Sirma Media: digital repository • And we want you!
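
As a rough illustration of how a harvester would address an OAI-PMH endpoint, the sketch below builds `ListRecords` request URLs. The verb and the `metadataPrefix`, `set` and `resumptionToken` arguments come from the OAI-PMH 2.0 protocol; the base URL, the set name and the token value are placeholders, since the slides do not give the actual Bulgariana endpoint or its configuration. The sketch only constructs URLs and makes no network calls.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the real repository URL is not given in the slides.
BASE_URL = "https://repository.example.org/oai/request"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    In OAI-PMH, resumptionToken is an exclusive argument: when continuing a
    partial list, it is sent alone with the verb.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec is not None:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # First page of one collection (the set name is a placeholder).
    print(list_records_url(BASE_URL, set_spec="collection1"))
    # Follow-up page, using a token returned by the previous response.
    print(list_records_url(BASE_URL, resumption_token="token-123"))
```

The `oai_dc` prefix is the one format every OAI-PMH repository must support; a prefix for ESE or EDM records would depend on how the repository is configured.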

  9. Collaboration and Networking • Google Group "Cultural Heritage Digitalisation" • Jointly created with SU FMI in Oct 2011 • 40 members, 80 messages in 5 months (still not a lot of activity…) • Meetings • 20121010: joint MS program in Digitalization (IMI BAS, UNIBIT). Welcome by Ontotext, proposed to use Bulgariana as a platform • 20120119: Restart(?) of expert working group (Ministry of Culture) • 20120130: Europeana1 "Mission Possible" (Ontotext, Sofia University) • 20120319: Europeana2: “Bulgarian projects for digitalization and presentation of cultural heritage Europeana" (V.Tarnovo Regional Library) • All presentations and contacts are published • 20120305: “Workshop on Multilingual Digital Repositories and Services” (Sofia: ITD, VirtSOI, DSLL, ATLAS, Share.TEC) • 20120918: "Digitalization, Preservation and Presentation of Cultural and Scientific Heritage" (DiPP 2012, organized by IMI BAS, hosted by V.Tarnovo Library) • 201211xx: Europeana3 (Varna Regional Library)

  10. Current Proposals • WSR4Europeana (web science research for Europeana): FP7 People (Marie Curie) Initial Training Network (Multi-Partner ITN). Doctoral research, exchanges, training • Partners: Humboldt U (DE), Tampere (FI), Aalto U (FI), FORTH (GR), RSLIS (DK), U Mannheim (DE), VU Brussel (BE), NTUA (GR), Ontotext (BG), Seme4 (UK), Net7 (IT) • Associated: Europeana (NL), CNR ISTI (IT), U Carlos III (ES), CCS (DE), Tufts U (US) • Emerging fields, incl. semantic repositories for Europeana, semantic annotation • SmartCulture: Regions of Knowledge 2012 cluster of clusters • International: Madrid, Basque, Birmingham, Siena, Eindhoven, Central Denmark, Sofia • BG cluster: Sofia Development Organization, Sofia University, UNIBIT, IMI-BAS, Ontotext, Tetracom, DSLL • Ontology-based Digital Platform for Knowledge Sustainability: ICT Call9 • U Lyon, invited GeoCad93. Tentative • Balkan Wars: PSP Call6 • Idea by PrimaSoft/SoftLib. Interest from VTU, V.Tarnovo library, IMI BAS, Plovdiv Library • Need 6 international partners: Turkey, Serbia, Macedonia, etc. Tentative • Geographical Regions: PSP Call6. BAS, Austria, Romania, GeoCad93. Tentative • Slavonic Manuscripts: PSP Call6. BAS. Tentative

  11. Bulgariana Wiki

  12. Bulgariana Collections (1) • Prehistoric and Thracian Civilizations • Unpublished Thracian archeological objects. Prof. Valeria Fol, Center of Thracology at the Institute for Balkan Studies at the Bulgarian Academy of Sciences

  13. Bulgariana Collections (2) • Golden Pages from the Bulgarian Renaissance • Unique manuscripts of Bulgarian folk songs collected in the 19th century by the Miladinov Brothers, published in 2008 by Dr Luchia Antonova, Institute of Bulgarian Language, BAS • МАРКО КРАЛЕВИКИ БОЛЕН СЕ КАИТ И СЕ ИСПОВЕДВИТ Поболил се Марко Кралевике, що си лежал токму три години, от нищо се иляч (1) не на’ож’ал. И му рече негва стара майќа: “Ай ти, Марко, ай ти, синко милий; не си болен, синко, от господа, тук си болен, синко, от гре’о’и, да ти викна попой (2), ду’овници, лепо да се синко исповедиш, да си кажиш твоите гре’о’и!” ….

  14. Bulgariana Collections Published to Europeana (1)

  15. Bulgariana Collections Published to Europeana (2)
