
Towards a Similarity-Based Identity Assumption Service for Historical Places

Towards a Similarity-Based Identity Assumption Service for Historical Places. Establishing Meaningful Links Krzysztof Janowicz; Muenster Semantic Interoperability Lab (MUSIL). Outline. Motivation Scenario Annotation Theory Further Work.


Presentation Transcript


  1. Towards a Similarity-Based Identity Assumption Service for Historical Places: Establishing Meaningful Links. Krzysztof Janowicz, Muenster Semantic Interoperability Lab (MUSIL)

  2. Outline • Motivation • Scenario • Annotation • Theory • Further Work Image from: http://de.wikipedia.org/wiki/HMS_Victory (Bleiglass, 1998) Similarity-Based Identity Assumption Service for Historical Places

  3. Motivation • For the cultural heritage community • Incomplete and vague knowledge • Interchange between external sources is necessary to answer complex scientific questions and to clean up local knowledge • Local versus global identifiers • Accessible service-based infrastructure!

  4. Motivation • For semantic similarity research • Application of similarity in a real-world domain • Similarity as part of the identity assumption puzzle • Combination of similarity and classical reasoning • Using a stable upper-level ontology (CIDOC CRM) • Theory of similarity assumptions for historical places

  5. Motivation • For an identity assumption service • To run queries against multiple sources, it has to be ensured that they refer to the same real-world phenomena; just a common language is not enough! • Non-unique place names (even within the same area) • Place names refer to cities, rivers, valleys, mountains, … • Misinterpreted place names (e.g. 'Al Wahat' → Oasis) • Names also refer to varying geopolitical units (e.g. nomads) or prominent (artificial) landmarks (e.g. telegraph stations) • Outdated place or even country names (e.g. USSR) → Gazetteers can only partially solve these problems (From discussions with Dr. Karl-Heinz Lampe; ZFMK)

  6. Battle of Trafalgar - Scenario • Took place at Cape Trafalgar (Province Cádiz) in 1805 • British victory under the command of Horatio Nelson • HMS Victory was Nelson's flagship • Nelson was shot during the battle and died afterwards → Should be easy to annotate!? Callouts: • Place names: Cabo Trafalgar, Taraf al-Gharb, رأس الطرف الأغر • HMS Victory: Which one?! • Vice-Admiral Horatio Nelson, 1st Viscount Nelson? Also in a historical source written from a French perspective? • Spatial relation between naval battleground and terrestrial cape, Province Cádiz, ...? • Temporal relations? Image from: http://en.wikipedia.org/wiki/Horatio_Nelson (painted by Nicholas Pocock)

  7. From: http://en.wikipedia.org/wiki/Image:Trafalgar_aufstellung.jpg

  8. Annotation of Historical Knowledge • CIDOC conceptual reference model (CRM) as upper-level ontology for the cultural heritage domain • Specifies an abstract and interrelated vocabulary instead of concrete definitions (such as for kinds of exhibits) → heterogeneous domain! • Describes historical knowledge by relations between places, events, actors and objects • RDF(S)-based representation • ISO Standard (ISO/PRF 21127)

  9. Annotation Examples (RDF Triples) • P89F.falls_within(E53.Place(Cape Trafalgar), E53.Place(Province Cádiz)) Subject-Predicate-Object: The place Cape Trafalgar falls within a place called Province Cádiz • P8F.took_place_at(E7.Activity(Battle of Trafalgar), E53.Place(Cape Trafalgar)) • P117F.occurs_during(E7.Activity(Battle of Trafalgar), E5.Event(Trafalgar Campaign)) • P14F.carried_out_by(E7.Activity(Battle of Trafalgar), E21.Person(Nelson)) • P2F.has_type(E53.Place(Andalusia), E55.Type(regions))
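The annotations on this slide can be held as plain subject-predicate-object tuples. The following is a minimal Python sketch, not the author's implementation: the `Triple` type, the `"<class>:<label>"` string encoding and the `triples_about` helper are hypothetical names introduced for illustration, while the CRM class and property labels are copied from the slide.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str


# Entities are encoded as "<CRM class>:<label>"; this encoding is an assumption.
ANNOTATIONS = [
    Triple("E53.Place:Cape Trafalgar", "P89F.falls_within", "E53.Place:Province Cádiz"),
    Triple("E7.Activity:Battle of Trafalgar", "P8F.took_place_at", "E53.Place:Cape Trafalgar"),
    Triple("E7.Activity:Battle of Trafalgar", "P117F.occurs_during", "E5.Event:Trafalgar Campaign"),
    Triple("E7.Activity:Battle of Trafalgar", "P14F.carried_out_by", "E21.Person:Nelson"),
    Triple("E53.Place:Andalusia", "P2F.has_type", "E55.Type:regions"),
]


def triples_about(entity, triples):
    """Return every triple in which the entity appears as subject or object."""
    return [t for t in triples if entity in (t.subject, t.obj)]


if __name__ == "__main__":
    for t in triples_about("E53.Place:Cape Trafalgar", ANNOTATIONS):
        print(t.subject, t.predicate, t.obj)
```

Collecting all triples that mention a place gives the "description" of that place, which is the unit the later slides compare.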

  10. Theory • In practice, semi-automatic disambiguation via gazetteers and other global authorities (such as for historical figures) is often difficult, expensive and error-prone (especially for subordinate geopolitical units, events, actors, …) • Use the links established via the CIDOC CRM annotation between places, actors, objects and events as additional reference points!

  11. Theory • Use thematic information as support for spatiotemporal reference [Diagram labels: Geoinformation = <x, z> (Mike Goodchild: Geographic Reality); interpretation via Spatiotemporal Reference Systems; interpretation via Semantic Reference Systems (CIDOC CRM + Reasoning + Similarity)]

  12. Theory: Framework for Comparing Place Descriptions • Extract new triples out of existing ones → Spatiotemporal & Subsumption Reasoning • Compute overlap between source and target triples → Semantic Similarity Measurement • Compare remaining labels & identifiers → Syntactic Identifier Matching • How probable is it that the compared places correspond? → Identity Assumption

  13. Theory: Reasoning • Entities are described by sets of RDF triples • Inference rules to generate new triples • Make local knowledge explicit! • More comparable information about entities • Example: Spatial & temporal inference rules • Be careful - names are ambiguous! (HMS XYZ (1805) ? HMS XYZ (1804))
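Generating new triples from existing ones can be sketched as simple forward chaining. The two rules below are assumptions chosen for illustration (the slides do not list the actual rule set): `falls_within` is treated as transitive, and an event that took place at a place is taken to have also taken place at any place containing it. Predicate names are shortened and the `infer` function is a hypothetical helper.

```python
def infer(triples):
    """Apply two assumed spatial rules until no new triple can be derived."""
    known = set(triples)
    while True:
        derived = set()
        for (s1, p1, o1) in known:
            for (s2, p2, o2) in known:
                # Both rules chain a triple with a following falls_within triple.
                if o1 != s2 or p2 != "falls_within":
                    continue
                # Rule 1 (assumed): falls_within is transitive.
                # Rule 2 (assumed): took_place_at propagates to the containing place.
                if p1 in ("falls_within", "took_place_at"):
                    derived.add((s1, p1, o2))
        if derived <= known:  # fixed point reached
            return known
        known |= derived


if __name__ == "__main__":
    facts = {
        ("Battle of Trafalgar", "took_place_at", "Cape Trafalgar"),
        ("Cape Trafalgar", "falls_within", "Province Cádiz"),
        ("Province Cádiz", "falls_within", "Andalusia"),
    }
    for triple in sorted(infer(facts) - facts):
        print(triple)
```

The derived triples make implicit local knowledge explicit, so a source that only mentions Cape Trafalgar becomes comparable with a target that only mentions Andalusia.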

  14. Theory: Similarity [Diagram: a source and a target description of Cape Trafalgar compared relation by relation. Source labels: falls within Province Cádiz; Nelson; performed; Napoleonic Wars. Target labels: overlaps with / falls within Province Cádiz; Nelson; died in. Similarity labels: sims, simp]

  15. Theory: Network Approach to Similarity • For all tuples from the source entity: find equal or similar tuples within the target entity description • Define meaningful notions of similarity for given predicates (relations): spatial, temporal, thematic • Define a meaningful notion of similarity for all objects that are not subjects of other triples themselves (e.g. ADL Feature Types)
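The tuple-by-tuple comparison described here could look roughly like the sketch below. It is an illustration under stated assumptions, not the measure from the talk: each description is reduced to (predicate, object) pairs, the predicate similarity values in `PREDICATE_SIM` are hand-picked, object similarity is a placeholder exact match, the two are combined by multiplication, and the best match per source tuple is averaged. All function names are hypothetical.

```python
# Assumed, hand-picked similarity between distinct predicates (symmetric lookup).
PREDICATE_SIM = {
    frozenset(["falls_within", "overlaps_with"]): 0.5,
}


def sim_predicate(p, q):
    """Similarity of two relations: 1.0 if equal, else the assumed table value."""
    if p == q:
        return 1.0
    return PREDICATE_SIM.get(frozenset([p, q]), 0.0)


def sim_object(a, b):
    """Placeholder object similarity: exact label match only."""
    return 1.0 if a == b else 0.0


def description_similarity(source, target):
    """Average, over all source tuples, of the best-matching target tuple."""
    if not source:
        return 0.0
    total = 0.0
    for (p, o) in source:
        total += max(
            (sim_predicate(p, q) * sim_object(o, r) for (q, r) in target),
            default=0.0,
        )
    return total / len(source)


if __name__ == "__main__":
    source = [("falls_within", "Province Cádiz"), ("witnessed", "Battle of Trafalgar")]
    target = [("overlaps_with", "Province Cádiz"), ("witnessed", "Battle of Trafalgar")]
    print(description_similarity(source, target))
```

Because the loop runs over the source tuples only, the measure is asymmetric: a short source description fully contained in a long target description scores 1.0, while the reverse comparison scores lower.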

  16. Theory: Neighborhoods & Hierarchies • Different similarity measures for neighborhoods & hierarchies [Figure panels: spatial, temporal, thematic; figure credit: Egenhofer & Al-Taha 1992]

  17. Theory: Syntactic Matching • After recursively applying (semantic) similarity measurements, only labels, vague appellations and identifiers are left → Requires syntactic matching / measuring [Figure labels (Getty Thesaurus): ID: 7008751; ID: 7008750; Cape Trafalgar; Wrexham. Image found at: www.gwjokes.com]
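For the labels that remain, a generic string similarity can serve as a stand-in. The sketch below uses Python's standard `difflib.SequenceMatcher` after stripping diacritics and case; the talk does not name a specific string measure, so this choice and the helper names `normalize` and `label_similarity` are assumptions.

```python
import difflib
import unicodedata


def normalize(label):
    """Lower-case a label and strip diacritics (e.g. 'Cádiz' -> 'cadiz')."""
    decomposed = unicodedata.normalize("NFKD", label)
    without_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return without_marks.casefold().strip()


def label_similarity(a, b):
    """Return a ratio in [0, 1] based on matching character blocks."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()


if __name__ == "__main__":
    for other in ("Cabo Trafalgar", "Wrexham"):
        print(other, round(label_similarity("Cape Trafalgar", other), 2))
```

A purely syntactic measure cannot relate "Cape Trafalgar" to "Taraf al-Gharb", which is one reason it is applied only after the semantic comparison.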

  18. Theory: Identity Assumptions • Two place descriptions probably refer to the same (real-world) place if they are linked via equal or similar relations to equal or similar events, actors, objects, … • Similar position within a network of historical facts • Stepwise application of new restrictions to the set of compared historical places • The number of compared tuples is a critical issue!
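The stepwise narrowing of candidate places can be sketched as a chain of filters. Everything specific below is invented for illustration: the candidate records, their scores, the thresholds and the order of restrictions are assumptions, and `narrow_candidates` is a hypothetical helper rather than part of the proposed service.

```python
def narrow_candidates(candidates, restrictions):
    """Apply restrictions one at a time and record the survivors after each step."""
    remaining = list(candidates)
    history = []
    for name, keep in restrictions:
        remaining = [c for c in remaining if keep(c)]
        history.append((name, [c["label"] for c in remaining]))
    return remaining, history


# Invented example scores for target places compared against "Cape Trafalgar".
CANDIDATES = [
    {"label": "Cabo Trafalgar", "compared": 5, "semantic": 0.8, "syntactic": 0.86},
    {"label": "Trafalgar Square", "compared": 4, "semantic": 0.2, "syntactic": 0.7},
    {"label": "Wrexham", "compared": 1, "semantic": 1.0, "syntactic": 0.1},
]

# Assumed thresholds; too few compared tuples make a high score meaningless.
RESTRICTIONS = [
    ("enough compared tuples", lambda c: c["compared"] >= 3),
    ("similar network position", lambda c: c["semantic"] >= 0.7),
    ("similar label", lambda c: c["syntactic"] >= 0.6),
]

if __name__ == "__main__":
    survivors, history = narrow_candidates(CANDIDATES, RESTRICTIONS)
    for step, labels in history:
        print(step, "->", labels)
```

Putting the tuple-count restriction first reflects the last bullet: a candidate such as the invented "Wrexham" record, with a perfect score over a single tuple, is dropped before its similarity is even considered.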

  19. Further Work & Evidence • Similarity is only one part of the puzzle! • Other parts: trust, contradictions & consistency, ... • Which inference rules may lead to difficulties? • How to handle complementary knowledge? • Connections to Time Map and ECAI • Evidence! Battle of Trafalgar Scenario? • Develop an identity assumption pilot • Combination of similarity measurement with itineraries • Based on real-world data from ZFMK, Bonn (biodiversity museum)

  20. Questions • Thank You! • Special thanks to • Martin Doerr, Foundation for Research and Technology - Hellas (FORTH), Institute of Computer Science, Heraklion, Crete, Greece • Karl-Heinz Lampe, Zoologisches Forschungsmuseum Alexander Koenig (ZFMK), Bonn, Germany • Any Questions?

  21. ‘Real World’-Place? From: http://de.wikipedia.org/wiki/Bild:Atlantis_map_kircher.gif

  22. Gazetteer Feature Types [Figure labels: Andalucía; ADLG; Getty Thesaurus]
