
Semantic annotation and search of large virtual heritage collections

Guus Schreiber, Free University Amsterdam. Overview: a non-technical view on the Semantic Web; work on Semantic-Web deployment (SKOS, RDFa); semantic annotation and search in virtual collections: the E-Culture example.


Presentation Transcript


  1. Semantic annotation and search of large virtual heritage collections • Guus Schreiber, Free University Amsterdam

  2. Overview • A non-technical view on the Semantic Web • Work on Semantic-Web deployment • SKOS, RDFa • Semantic annotation and search in virtual collections: the E-Culture example

  3. The Web: resources and links • Diagram labels: URL, Web link, URL

  4. The Semantic Web: typed resources and links • Resource (URL): painting “Femme au chapeau”, SFMOMA • Typed Web link: Dublin Core creator • Resource (URL): Henri Matisse, ULAN
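
The difference between an untyped Web link and a typed statement can be sketched in a few lines of Python. This is only an illustration: the two resource URIs are made-up placeholders, and only the Dublin Core `creator` property URI is a real term.

```python
# Hypothetical resource URIs (assumptions for illustration only).
PAINTING = "http://example.org/sfmoma/femme-au-chapeau"
ARTIST = "http://example.org/ulan/henri-matisse"

# Real Dublin Core property URI.
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

# A plain Web link only records that two URLs are connected.
web_link = (PAINTING, ARTIST)

# A Semantic Web statement also names the type of the connection.
triple = (PAINTING, DC_CREATOR, ARTIST)


def describe(statement):
    """Render a (subject, predicate, object) statement in N-Triples-like form."""
    subject, predicate, obj = statement
    return f"<{subject}> <{predicate}> <{obj}> ."


print(describe(triple))
```

The typed version carries exactly one extra piece of information, the property, and that is what later allows search results to be grouped and explained.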

  5. Principle 1: semantic annotation • Description of web objects with “concepts” from a shared vocabulary

  6. Principle 2: semantic search • Search for objects which are linked via concepts (semantic link) • Use the type of semantic link to provide meaningful presentation of the search results • Example terms: ape, great ape, orang-utan, orange
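
A minimal sketch of this principle, assuming a tiny made-up thesaurus and made-up object identifiers: objects are found through the concepts they are annotated with, and the chain of links is kept so that each result can be explained.

```python
# Hypothetical mini-thesaurus: concept -> its broader concept.
BROADER = {"orang-utan": "great ape", "great ape": "ape"}

# Hypothetical annotations: object identifier -> concept.
ANNOTATIONS = {
    "photo-17": "orang-utan",
    "drawing-3": "great ape",
    "still-life-8": "orange",
}


def path_to(concept, target):
    """Follow broader links from concept up to target; None if unreachable."""
    path = [concept]
    while path[-1] != target:
        parent = BROADER.get(path[-1])
        if parent is None:
            return None
        path.append(parent)
    return path


def semantic_search(query):
    """Return matching objects together with the semantic path that links them."""
    results = {}
    for obj, concept in ANNOTATIONS.items():
        path = path_to(concept, query)
        if path is not None:
            results[obj] = path
    return results


hits = semantic_search("ape")
for obj, path in hits.items():
    print(obj, "via", " -> ".join(path))
```

A plain string match on "ape" would miss the orang-utan photo, and a substring match on "orang" would wrongly return the orange still life. Following concept links avoids both problems in this toy setting.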

  7. Principle 3: multiple vocabularies, or: the myth of a unified vocabulary • In large virtual collections there are always multiple vocabularies • In multiple languages • Every vocabulary has its own perspective • You can’t just merge them • But you can use vocabularies jointly by defining a limited set of links • “Vocabulary alignment” • It is surprising what you can do with just a few links

  8. Example “Tokugawa” • AAT style/period: Edo (Japanese period); Tokugawa • SVCN period: Edo • SVCN is a local in-house thesaurus

  9. A link between two thesauri
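
The effect of a single alignment link can be sketched as follows. The concept identifiers, the label lists, and the annotated object are all assumptions made for illustration; they are not taken from AAT or SVCN data.

```python
# Hypothetical labels per concept in two separate vocabularies.
LABELS = {
    "aat:edo": ["Edo (Japanese period)", "Tokugawa"],
    "svcn:edo": ["Edo"],
}

# Hypothetical collection object, indexed with the in-house vocabulary only.
ANNOTATIONS = {"print-42": "svcn:edo"}


def equivalents(concept, alignments):
    """Return the concept plus every concept directly aligned with it."""
    found = {concept}
    for a, b in alignments:
        if a == concept:
            found.add(b)
        if b == concept:
            found.add(a)
    return found


def search(term, alignments):
    """Find objects whose concept, or an aligned concept, carries the label."""
    concepts = {c for c, labels in LABELS.items() if term in labels}
    expanded = set()
    for c in concepts:
        expanded |= equivalents(c, alignments)
    return sorted(obj for obj, c in ANNOTATIONS.items() if c in expanded)


one_link = [("aat:edo", "svcn:edo")]
print(search("Tokugawa", []))        # no alignment: nothing is found
print(search("Tokugawa", one_link))  # one link: the print is found
```

The vocabularies stay separate; only the link is added, which is the sense in which a few links already go a long way.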

  10. RDF/OWL language constructs • classes and individuals • subclasses • properties • subproperties • domain/range of properties • XML Schema datatypes • equality, inequality • inverse, transitive, symmetric, functional properties • property constraints: cardinality, allValuesFrom, someValuesFrom • conjunction, disjunction, negation of classes • hasValue, enumerated type

  11. How useful are RDF and OWL? • RDF: basic level of interoperability • Some constructs of OWL are key: • Logical characteristics of properties: symmetric, transitive, inverse • Identity: sameAs • OWL pitfalls • Bad: if it is written in OWL it is an ontology • Worse: if it is not in OWL, then it is not an ontology
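
To show why the logical characteristics of properties matter, here is a small sketch of the extra statements that transitive and symmetric declarations license. It is a naive fixed-point loop over Python tuples, not an OWL reasoner, and the property and resource names are hypothetical.

```python
def infer(triples, transitive=(), symmetric=()):
    """Add statements implied by transitive and symmetric properties."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p in symmetric:
                new.add((o, p, s))
            if p in transitive:
                for s2, p2, o2 in facts:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= facts:  # nothing new was derived: fixed point reached
            return facts
        facts |= new


# Hypothetical statements and property names.
stated = {
    ("Amsterdam", "partOf", "Noord-Holland"),
    ("Noord-Holland", "partOf", "Netherlands"),
    ("ulan:matisse", "sameAs", "local:matisse"),
}
closed = infer(stated, transitive={"partOf"}, symmetric={"sameAs"})
print(sorted(closed - stated))
```

Two declarations about properties yield two statements that nobody had to enter by hand, which is the kind of modest inference the slide singles out as useful.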

  12. W3C Semantic Web Deployment Working Group: making vocabularies/thesauri/ontologies available on the Web • Schema for interoperable RDF/OWL representation of vocabularies • SKOS • Publication guidelines: • URI management, representation of versions • Embedding RDF in (X)HTML pages • RDFa

  13. SKOS: pattern for thesaurus modeling • Based on ISO standard • RDF representation • Documentation: http://www.w3.org/TR/swbp-skos-core-guide/ • Base class: SKOS Concept

  14. Multi-lingual labels for concepts

  15. Semantic relation: broader and narrower • No subclass semantics assumed!
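
The SKOS pattern from the last three slides (a concept, labels per language, broader/narrower links) can be sketched as a small data structure. The class below is an illustrative assumption, not the SKOS schema itself, and the URIs and labels are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    uri: str
    pref_label: dict = field(default_factory=dict)  # language tag -> label
    broader: list = field(default_factory=list)     # URIs of broader concepts

    def label(self, lang, fallback="en"):
        """Preferred label in the requested language, else the fallback."""
        return self.pref_label.get(lang, self.pref_label.get(fallback))


def narrower(concept, concepts):
    """Concepts that name the given concept as broader (the inverse link)."""
    return [c for c in concepts if concept.uri in c.broader]


# Hypothetical concepts with English and Dutch labels.
paintings = Concept(
    "http://example.org/c/paintings",
    {"en": "paintings", "nl": "schilderijen"},
)
oil_paintings = Concept(
    "http://example.org/c/oil-paintings",
    {"en": "oil paintings", "nl": "olieverfschilderijen"},
    broader=[paintings.uri],
)
scheme = [paintings, oil_paintings]

print(oil_paintings.label("nl"))
print([c.label("en") for c in narrower(paintings, scheme)])
```

Note that `broader` is stored as a plain link between concepts. Nothing here treats it as subclass-of, in line with the remark that no subclass semantics is assumed.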

  16. Indexing a resource with a SKOS concept • primarySubject is defined as subproperty

  17. Adding semantics • Adding OWL statements • Interpretations of thesaurus relations such as narrower as subclass-of are often imprecise (but can still be useful) • Learning relations between thesauri is an important form of additional semantics • Example: AAT contains styles; ULAN contains artists, but there is no link • Availability of this kind of alignment knowledge is extremely useful

  18. W3C standardization process • Input: draft specification • Collect use cases • Derive requirements • Create issues list: requirements that cannot be handled by the draft spec • Propose resolutions for issues • Continuously: ask for public feedback/comments • Get consensus on amended spec • Find two independent implementations for each feature in the spec

  19. Example issue: relationships between lexical labels • In draft SKOS spec lexical labels of concepts are represented as datatype properties • Use cases require relations between labels, e.g. “AAT” is an acronym of “Art & Architecture Thesaurus” • This is a problem because literals have no URI (so cannot be subject of an RDF property) • Possible resolutions: • Labels/terms as classes • Relaxing constraints on label property • …..
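
The issue can be made concrete with a toy graph that enforces the RDF rule that a literal may not be the subject of a statement. The sketch then shows one possible workaround in the spirit of "labels/terms as classes": give each label its own resource. All `ex:` names are hypothetical, and this is not presented as the resolution the working group chose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    value: str


def add(graph, subject, predicate, obj):
    """Add a statement, rejecting literals in the subject position."""
    if isinstance(subject, Literal):
        raise ValueError("a literal cannot be the subject of a statement")
    graph.add((subject, predicate, obj))


graph = set()

# Labels as plain literals: the acronym relation cannot be stated.
try:
    add(graph, Literal("AAT"), "ex:acronymOf",
        Literal("Art & Architecture Thesaurus"))
except ValueError as err:
    print("rejected:", err)

# Labels as resources (hypothetical modelling): the relation can be stated.
add(graph, "ex:label-aat", "ex:literalForm", Literal("AAT"))
add(graph, "ex:label-full", "ex:literalForm",
    Literal("Art & Architecture Thesaurus"))
add(graph, "ex:label-aat", "ex:acronymOf", "ex:label-full")
print(len(graph), "statements stored")
```

The price of the workaround is an extra level of indirection for every label, which is why it appears as one option among several rather than an obvious fix.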

  20. Recipes for vocabulary URIs • Simplified rule: • Use the “hash” variant for vocabularies that are relatively small and require frequent access http://www.w3.org/2004/02/skos/core#Concept • Use the “slash” variant for large vocabularies, where you do not always want the whole vocabulary to be retrieved http://xmlns.com/foaf/0.1/Person • For more information and other recipes, see: http://www.w3.org/TR/swbp-vocab-pub/
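
The practical difference between the two variants follows from how fragments work: a client removes the part after `#` before it sends the request. The sketch below uses the standard library's `urldefrag` to show which URL would actually be requested for each of the two example terms; the helper name is an assumption.

```python
from urllib.parse import urldefrag


def document_to_fetch(term_uri):
    """Return the URL a client requests for a term (fragment removed)."""
    url, _fragment = urldefrag(term_uri)
    return url


hash_term = "http://www.w3.org/2004/02/skos/core#Concept"
slash_term = "http://xmlns.com/foaf/0.1/Person"

# Hash variant: every term resolves to the same vocabulary document.
print(document_to_fetch(hash_term))

# Slash variant: each term has its own URL, so a server can answer per term.
print(document_to_fetch(slash_term))
```

With the hash variant one request returns the whole vocabulary, which suits small vocabularies. With the slash variant the server sees the individual term and can return just its description.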

  21. Query for WordNet URI returns “concept-bounded description”

  22. RDFa: embedding RDF metadata in an (X)HTML file • Regular HTML • HTML with RDFa • Resulting RDF statements
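
The idea of reading statements out of annotated markup can be sketched with the standard library's HTML parser. This handles only a tiny subset (an `about` attribute that sets the subject and a `property` attribute on an element with text content), assumes every element is explicitly closed, and is in no way a conforming RDFa processor. The sample markup and URI are made up.

```python
from html.parser import HTMLParser


class TinyRDFa(HTMLParser):
    """Collect (subject, property, text) from about/property attributes."""

    def __init__(self):
        super().__init__()
        self.stack = []    # one (subject, property) pair per open element
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        inherited = self.stack[-1][0] if self.stack else None
        subject = attrs.get("about", inherited)
        self.stack.append((subject, attrs.get("property")))

    def handle_endtag(self, tag):
        self.stack.pop()

    def handle_data(self, data):
        if not self.stack:
            return
        subject, prop = self.stack[-1]
        text = data.strip()
        if subject and prop and text:
            self.triples.append((subject, prop, text))


PAGE = """
<div about="http://example.org/painting/1">
  <span property="dc:title">Femme au chapeau</span> by
  <span property="dc:creator">Henri Matisse</span>
</div>
"""

parser = TinyRDFa()
parser.feed(PAGE)
parser.close()
for triple in parser.triples:
    print(triple)
```

The page still renders as ordinary HTML for a human reader; the attributes only add a machine-readable layer on top of the visible text.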

  23. More information

  24. E-Culture demonstrator • Part of large Dutch knowledge-economy project MultimediaN • Partners: VU, CWI, UvA, DEN, ICN • People: • Alia Amin, Lora Aroyo, Mark van Assem, Victor de Boer, Lynda Hardman, Michiel Hildebrand, Laura Hollink, Marco de Niet, Borys Omelayenko, Marie-France van Orsouw, Jos Taekema, Annemiek Teesing, Anna Tordai, Jan Wielemaker, Bob Wielinga • Artchive.com, ICN: Rijksmuseum Amsterdam, Dutch ethnology museums (Amsterdam, Leiden), National Library (Bibliopolis)

  25. Use case: painting style • Find paintings of a similar style • Example: KLIMT, Gustav, Portrait of Adele Bloch-Bauer I, 1907, oil and gold on canvas, 138 x 138 cm, Austrian Gallery, Vienna

  26. How can we find this other ‘Art nouveau’ painting? • MUNCH, Edvard, The Scream, 1893, oil, tempera and pastel on cardboard, 91 x 73.5 cm, National Gallery, Oslo

  27. Issues w.r.t. the use case • Parse annotation to find matches with thesauri terms • E.g. match artists to ULAN individuals • Artists-style links • AAT contains styles; ULAN contains artists, but there is no link • Learn link from corpora • Derive it from other annotations • Domain-specific rules/reasoning needed • see example in SWRL doc • Painters may have painted in multiple styles
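
Once artist-style links exist, the use case reduces to a join over two small tables. The sketch below assumes such links are already available; the style assignments are illustrative placeholders, not data from AAT or ULAN, and the second style for one painter is included only to show that multiple styles per artist are allowed.

```python
# Painting -> artist, as parsed from the annotations in the use case.
PAINTINGS = {
    "Portrait of Adele Bloch-Bauer I": "KLIMT, Gustav",
    "The Scream": "MUNCH, Edvard",
}

# Hypothetical artist -> styles links (the missing AAT/ULAN connection).
ARTIST_STYLES = {
    "KLIMT, Gustav": {"Art nouveau"},
    "MUNCH, Edvard": {"Art nouveau", "Expressionism"},
}


def similar_style(title):
    """Other paintings whose artist shares at least one style, with the reason."""
    styles = ARTIST_STYLES.get(PAINTINGS[title], set())
    result = {}
    for other, artist in PAINTINGS.items():
        if other == title:
            continue
        shared = styles & ARTIST_STYLES.get(artist, set())
        if shared:
            result[other] = sorted(shared)
    return result


print(similar_style("Portrait of Adele Bloch-Bauer I"))
```

Because the link is made at the level of the artist, a painter with several styles produces matches for each of them. That is one reason domain-specific rules are needed to decide which style applies to a given work.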

  28. Example enrichment • Learning relations between art styles in AAT and artists in ULAN through NLP of art-historic texts • But don’t learn things that already exist!

  29. Culture Web demonstrator: http://e-culture.multimedian.nl

  30. 16 Nov 2006

  31. Perspectives • Basic Semantic Web technology is ready for deployment • in open knowledge-rich domains • Important research issues: scalability, vocabulary alignment, metadata extraction • Web 2.0 features: • Involving community experts in annotation • Personalization, myArt • Social barriers have to be overcome! • “open door” policy • Involvement of general public => issues of “quality” • Importance of using open standards • Away from custom-made flashy web sites
