1 / 53

Principles and pragmatics of a Semantic Culture Web

This web overview explores the principles and pragmatics of a semantic culture, focusing on virtual collections, semantic collection-search demonstrator, metadata and vocabulary representation and enrichment, and knowledge engineering on the Web.

alexj
Download Presentation

Principles and pragmatics of a Semantic Culture Web

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Tearing down walls and Building bridges Principles and pragmatics of a Semantic Culture Web

  2. Overview • Virtual collections and Semantic Web • Semantic collection-search demonstrator • For cultural heritage objects • Metadata & vocabulary representation and enrichment • Principles for knowledge engineering on the Web

  3. Acknowledgements • Part of large Dutch knowledge-economy project MultimediaN • Partners: VU, CWI, UvA, DEN,ICN • People: Alia Amin, Lora Aroyo, Mark van Assem, Victor de Boer, Lynda Hardman, Michiel Hildebrand, Laura Hollink, Marco de Niet, Borys Omelayenko, Marie-France van Orsouw, Jacco van Ossenbruggen, Guus SchreiberJos Taekema, Annemiek Teesing,Anna Tordai, Jan Wielemaker, Bob Wielinga • Artchive.com, Rijksmuseum Amsterdam, Dutch ethnology musea (Amsterdam, Leiden), National Library (Bibliopolis)

  4. Hypothesis • Semantic Web technology is in particular useful in knowledge-rich domains or formulated differently • If we cannot show added value in knowledge-rich domains, then it may have no value at all

  5. The Web: resources and links Web link URL URL

  6. The Semantic Web: typed resources and links Painting “Woman with hat SFMOMA Dublin Core creator ULAN Henri Matisse Web link URL URL

  7. Principle 1: semantic annotation • Description of web objects with “concepts” from a shared vocabulary

  8. Search for objects which are linked via concepts (semantic link) Use the type of semantic link to provide meaningful presentation of the search results Principle 2: semantic search Query “Paris” Paris PartOf Montmartre

  9. The myth of a unified vocabulary • In large virtual collections there are always multiple vocabularies • In multiple languages • Every vocabulary has its own perspective • You can’t just merge them • But you can use vocabularies jointly by defining a limited set of links • “Vocabulary alignment” • It is surprising what you can do with just a few links

  10. AAT style/period Edo (Japanese period) Tokugawa SVCN period Edo SVCN is local in-house ethnology thesaurus AAT is Getty’s Art & Architecture Thesaurus Principle 3: vocabulary alignment “Tokugawa”

  11. A link between two thesauri

  12. Levels of interoperability • Syntactic interoperability • using data formats that you can share • XML family is the preferred option • Semantic interoperability • How to share meaning / concepts • Technology for finding and representing semantic links

  13. Distributed vs. centralized collection data • Minimal requirement: collection object has image URI • Preference for external metadata, accessed through protocol such as OAI • In practice, external metadata access is still cumbersome

  14. http://e-culture.multimedian.nl/demo/search

  15. Search strategies • Basic search: keyword-oriented • Advanced search: • Tweaking default search parameters • Time-related queries • Faceted search • Relation search • How are two URIs related?

  16. Keyword search with semantic clustering • Btree of literals plus Porter stem and metaphone index • Find resources with matching labels • Default resources are “Work”s • Find related resources by one-way graph traversal • owl:inverseOf is used • Threshold used for constraining search • Cluster results (group instances)

  17. Search: WordNet patterns that increase recall without sacrificing precisions

  18. Term disambiguation is key issue in semantic search • Post-query • Sort search results based on different meanings of the search term • Mimics Google-type search • Pre-query • Ask user to disambiguate by displaying list of possible meanings • Interface is more complex, but more search functionality can be offered

  19. Faceted search • Use Dublin Core scheme to formulate complex queries • Navigate through relevant metadata

  20. Faceted search Faceted search

  21. What do you need to do to make your collection part of a Semantic Culture Web? Four activities

  22. From metadata to semantic metadata 1. Make vocabulary interoperable 2. Align metadata schema 3. Enrich metadata 4. Align vocabulary

  23. Activity 1: syntactic vocabulary interoperability • Making vocabularies available in the Web standard RDF • Many organizations already do this • W3C provides the SKOS template to make this almost straightforward • Effort required: at most a few days

  24. Multi-lingual labels for concepts

  25. Semantic relation:broader and narrower • No subclass semantics assumed!

  26. Activity 2: aligning the metadata schema • Specify your collection metadata scheme as a specialization of Dublin Core • With RDF/OWL this is easy/trivial! • Cf. DC Application Profiles

  27. Aligning VRA with Dublin Core • VRA is specialization of Dublin Core for visual resources • VRA properties “material.medium” and “material.support” are specializations of Dublin Core property “format” vra:material.medium rdfs:subPropertyOf dc:fotmat . vra:material.medium rdfs:subPropertyOf dc:format .

  28. Activity 3: enriching the metadata • Extracting additional concepts from an annotation • Matching the string “Paris” to a vocabulary term • Information-extraction techniques exists (and continue to be developed) • Effort required can be up to a few weeks • The more concepts, the better, but no need to be perfect!

  29. Example textual annotation

  30. Resulting semantic annotation (rendered as HTML with RDFa)

  31. Regular HTML HTML with RDFa Resulting RDF statements RDFa: embedding RDF in (X)HTML

  32. Activity 4: aligning the vocabulary • Find semantic links between vocabulary links • Derain (ULAN) related-to Fauve (AAT)) • Automatic techniques exists, but performance varies • Often combination of automatic and manual alignment • Effort strongly dependent on vocabularies • But “a little semantic goes a long way” (Hendler)

  33. Learning alignments • Learning relations between art styles in AAT and artists in ULAN through NLP of art historic texts • “Who are Impressionist painters?”

  34. Extracting additional knowledge from scope notes

  35. Principles for knowledge engineering on the Web

  36. Principle 1: Be modest! • Ontology engineers should refrain from developing their own idiosyncratic ontologies • Instead, they should make the available rich vocabularies, thesauri and databases available in web format • Initially, only add the originally intended semantics

  37. Principle 2: Think large! Doug Lenat "Once you have a truly massive amount of information integrated as knowledge, then the human-software system will be superhuman, in the same sense that mankind with writing is superhuman compared to mankind before writing."

  38. Principle 3: Develop and use patterns! • Don’t try to be (too) creative • Ontology engineering should not be an art but a discipline • Patterns play a key role in methodology for ontology engineering • See for example patterns developed by the W3C Semantic Web Best Practices group http://www.w3.org/2001/sw/BestPractices/ • SKOS can also be considered a pattern

  39. Principle 4: Don’t recreate, but enrich and align • Techniques: • Learning ontology relations/mappings • Semantic analysis, e.g. OntoClean • Processing of scope notes in thesauri

  40. Principle 5: Beware of ontologicalover-commitment!

More Related