
Antoine Isaac Europeana – VU University Amsterdam

Antoine Isaac, Europeana – VU University Amsterdam. Dagstuhl Multilingual Semantic Web seminar. Europeana: “A digital library that is a single, direct and multilingual access point to the European cultural heritage.” (European Parliament). 24 M objects (images, text, sound and video).


Presentation Transcript


  1. Antoine Isaac Europeana – VU University Amsterdam Dagstuhl Multilingual Semantic Web seminar

  2. Europeana • “A digital library that is a single, direct and multilingual access point to the European cultural heritage.” (European Parliament) • 24 M objects (images, text, sound and video) • From over 2,200 libraries, museums, archives • From 33 countries • For everyone

  3. Multilingual Access in Europeana

  4. Dimensions of multilingual access • Interface • Search (query translation or document translation) • Result presentation • Browsing

  5. Europeana's efforts • Interface translated into 26 languages • Query translation: only prototype • Query result filtering by country/language • Document translation (user enabled) • Semantic contextualization of objects • Multilingual enrichment/annotation of metadata

  6. Making metadata work for multilingual access

  7. Current metadata in Europeana • Simple object records • Flat (text values) • Without language tags! • Only language-related info on metadata is at collection level • Can be "mul" • Need to change! A new Europeana Data Model (EDM)

  8. "Semantic layer" of contextual resources (concepts, persons, places, events...) • Cultural artefact: building, sculpture, painting • Networked objects • Exploiting semantic relations • e.g. “broader concept”, “place of birth”, “involved person”…

  9. Multilingual metadata

  10. Fetching already available linked data • E.g., from libraries: http://www.w3.org/2005/Incubator/lld/XGR-lld-vocabdataset/

  11. Interoperability • Encouraging the use of RDF + common and simple elements

  12. Interoperability • Encouraging the use of common and simple data elements

      <skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2308">
        <skos:prefLabel xml:lang="fr">Piano carré</skos:prefLabel>
        <skos:prefLabel xml:lang="it">Pianoforte a tavolino</skos:prefLabel>
        <skos:prefLabel xml:lang="en">Square pianoforte</skos:prefLabel>
        <skos:prefLabel xml:lang="de">Tafelklavier</skos:prefLabel>
        <skos:prefLabel xml:lang="nl">Tafelpiano</skos:prefLabel>
        <skos:prefLabel xml:lang="sv">Taffel</skos:prefLabel>
        <skos:broader>
          <skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2273">
            <skos:prefLabel xml:lang="en">Pianofortes</skos:prefLabel>
          </skos:Concept>
        </skos:broader>
      </skos:Concept>
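The SKOS record on this slide can be read with a few lines of standard-library Python. The following is a minimal sketch, not Europeana's actual tooling: the `rdf:RDF` wrapper and namespace declarations are added here so the abridged fragment parses on its own, and the helper names `pref_labels` and `label_for` are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Abridged copy of the slide's record. The rdf:RDF wrapper and the
# namespace declarations are assumptions added so the fragment parses.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2308">
    <skos:prefLabel xml:lang="fr">Piano carré</skos:prefLabel>
    <skos:prefLabel xml:lang="en">Square pianoforte</skos:prefLabel>
    <skos:prefLabel xml:lang="de">Tafelklavier</skos:prefLabel>
    <skos:broader>
      <skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2273">
        <skos:prefLabel xml:lang="en">Pianofortes</skos:prefLabel>
      </skos:Concept>
    </skos:broader>
  </skos:Concept>
</rdf:RDF>"""


def pref_labels(concept):
    """Map language tag -> preferred label, using direct children only."""
    return {el.get(XML_LANG): el.text
            for el in concept.findall(f"{{{SKOS}}}prefLabel")}


def label_for(labels, lang, fallback="en"):
    """Pick the label in the requested language, else the fallback language."""
    return labels.get(lang) or labels.get(fallback)


root = ET.fromstring(DOC)
concept = root.find(f"{{{SKOS}}}Concept")
labels = pref_labels(concept)

# Follow the skos:broader link to the nested, more general concept.
broader = concept.find(f"{{{SKOS}}}broader/{{{SKOS}}}Concept")
broader_labels = pref_labels(broader)

print(label_for(labels, "de"))          # label shown to a German-speaking user
print(label_for(broader_labels, "fr"))  # no French label: falls back to English
```

Because each label carries its own `xml:lang` tag, the display language can be chosen per user, and a missing translation degrades to a fallback instead of an empty field.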

  13. Interoperability • mixed nature of eligible contextual resources: dictionaries, synonym/translation lists, thesauri, authority lists, gazetteers… • interplay: “semantic” data next to multilingual data

  14. Simultaneous approaches • Getting richer semantic/multilingual metadata from providers • Fetching third-party contextual data and linking it to “un-contextualized” objects • Linking contextual data from one institution to another, more general / more commonly used contextual dataset • DBpedia.org, VIAF.org…
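The third approach, linking an institution's vocabulary to a more general dataset, can be sketched as a search for shared language-tagged labels. This is only an illustration under assumptions: the `example.org` URIs and the toy label data are hypothetical, exact label matching is a stand-in for whatever alignment method is really used, and the output is a list of candidates that would still need review.

```python
from collections import defaultdict


def normalize(label):
    """Case-fold and collapse whitespace so trivial variants compare equal."""
    return " ".join(label.casefold().split())


def index_labels(concepts):
    """Build (language, normalized label) -> set of concept URIs."""
    index = defaultdict(set)
    for uri, labels in concepts.items():
        for lang, label in labels.items():
            index[(lang, normalize(label))].add(uri)
    return index


def candidate_links(local, general):
    """Propose links where a local and a general concept share a label
    in the same language. Concepts without any match are omitted."""
    general_index = index_labels(general)
    links = defaultdict(set)
    for uri, labels in local.items():
        for lang, label in labels.items():
            links[uri] |= general_index.get((lang, normalize(label)), set())
    return {uri: sorted(targets) for uri, targets in links.items() if targets}


# Hypothetical toy data: URI -> {language tag: label}.
LOCAL = {
    "http://example.org/local/tafelpiano": {"nl": "Tafelpiano", "en": "Square pianoforte"},
    "http://example.org/local/luit": {"nl": "Luit"},
}
GENERAL = {
    "http://example.org/general/square-piano": {"en": "square  pianoforte", "de": "Tafelklavier"},
    "http://example.org/general/harp": {"en": "Harp"},
}

print(candidate_links(LOCAL, GENERAL))
```

Keying the index on the language tag as well as the label means a string is only compared with labels in the same language, which is possible only once the metadata carries language tags.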

  15. Status and challenges

  16. Current status • All this is work in progress and will take time • R&D prototypes (EuropeanaConnect) show the challenges of gathering appropriate multilingual tools and data • First tests of simple techniques in the production portal: GeoNames (places) and GEMET (concepts) • Encouraging, but they illustrate issues with overly naïve approaches (no NLP) and incomplete data • Cheval • Poison • http://www.europeana.eu
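The slide does not spell out what went wrong in its two examples, so the following is only one plausible illustration of the kind of issue a naïve approach runs into. It assumes metadata values without language tags (as described for the current records) and a collection-level language hint that may be "mul". The concept URIs, the labels and the function `match_value` are all hypothetical.

```python
def match_value(value, concepts, collection_lang=None):
    """Return sorted (uri, lang) pairs whose label equals the value, ignoring case.

    If the collection-level language is missing or "mul" (multiple
    languages), labels in every language are considered.
    """
    wanted = value.casefold().strip()
    restrict = collection_lang not in (None, "mul")
    hits = []
    for uri, labels in concepts.items():
        for lang, label in labels.items():
            if restrict and lang != collection_lang:
                continue
            if label.casefold() == wanted:
                hits.append((uri, lang))
    return sorted(hits)


# Hypothetical toy vocabulary: the same string is a label of two
# different concepts, once in French and once in English.
CONCEPTS = {
    "http://example.org/concept/cat": {"fr": "chat", "en": "cat"},
    "http://example.org/concept/conversation": {"en": "chat", "fr": "conversation"},
}

print(match_value("Chat", CONCEPTS))                         # ambiguous: two concepts
print(match_value("Chat", CONCEPTS, collection_lang="mul"))  # "mul" gives no help
print(match_value("Chat", CONCEPTS, collection_lang="fr"))   # one concept
```

With only string equality and no language information, the untagged value matches two unrelated concepts. A usable language hint resolves this toy case, but a "mul" collection leaves the ambiguity in place.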

  17. Problems & requirements • For providers & Europeana: • Continue work on metadata • Benchmarking (cf. CHiC lab @ CLEF) • Positioning as consumers and contributors of data (cf. Asun’s slides) • data.europeana.eu • For language-intensive tools and resources: • Availability: open resources • Interoperability • Simplicity • But not always! E.g., not only “first hit” translations • Scale: scalability of tools, number and scope of datasets • Many languages, some lesser-resourced (compared with English)

  18. Another illustration: VOICES project • Something entirely different but not completely unrelated • Voice-based community-centric mobile services for social development • Easing communication on agricultural trade • Listing of products/prices via phone/radio • Pilot in Mali • Challenges: • Data-centric project, but language technology plays a crucial role • Objects should be provided with textual and audio labels (text-to-speech system) in different languages • Local languages: e.g., Bambara • Lack of resources: need low-cost, easy-to-adapt solutions • Victor de Boer, VU Amsterdam (v.de.boer@cs.vu.nl)

  19. Thank you • aisaac@few.vu.nl • http://www.few.vu.nl/~aisaac/ • Some slides based on Marlies Olensky and Juliane Stiller, Multilingual Web Workshop, June 11, 2012, Dublin
