Associative and Spatial Relationships in Thesaurus-based Retrieval. Harith Alani (1), Christopher Jones (2), Douglas Tudhope (1). (1) School of Computing, University of Glamorgan, {halani,dstudhope}@glam.ac.uk. (2) Department of Computer Science, University of Cardiff, c.b.jones@cs.cf.ac.uk.
OASIS - Ontologically Augmented Spatial Information System
• Aims:
• Investigate terminology systems for thematic and spatial access in digital library applications, and in particular:
• Investigate the retrieval potential of a geographical metadata schema consisting of rich place name data with limited locational information
• Explore the retrieval potential of reasoning over the underlying semantic relationships
Presentation
• Overview of OASIS
• Spatial query expansion
• Semantic distance measures for query expansion
• Role of thesaurus associative relationships
• Conclusions
OASIS datasets
• Art & Architecture Thesaurus (AAT): thematic descriptors such as town, arrow, axe (J. Paul Getty Trust)
• Thesaurus of Geographic Names (TGN): place names, hierarchies, and centroid co-ordinates (J. Paul Getty Trust)
• Bartholomew digital map data: place names, co-ordinates, and adjacency relationships
• Ontological schema implemented using the Semantic Index System (SIS), which is also used to store the data collection
• Royal Commission on the Ancient and Historical Monuments of Scotland (RCAHMS): dataset of Scottish archaeological sites and objects
OASIS Schema
[Figure: schema diagram in three parts. TGN & Bartholomew: Place, with place type, latitude, longitude, area, and Scope Note; Geographical Name, split into Standard Name (Preferred Term) and Alternative Name (Non Preferred Term), with date, language, and variant spelling; Topological Relationships (meets, overlaps, partOf). AAT: Concept, linked by isA. RCAHMS: Object and Artefact, with description, date found, date made, found at, made at, and made of (Material).]
AAT Hierarchies
[Figure: excerpts from two AAT hierarchies, Tools & Equipment and Weapons & Ammunition, with BT and RT links between terms illustrating semantic distance. Terms shown include axes (tools), axes (weapons), <cutting tools>, <wood-cutting and finishing tools>, weapons, edged weapons, staff weapons, pollaxes, Pulaskis, battle-axes, gisarmes, tomahawks (weapons), halberds, throwing axes, harpoons, hatchets, and franciscas.]
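One way to read semantic distance over BT and RT links is as a weighted shortest path through the thesaurus graph, with query expansion returning every term within a distance threshold. The sketch below illustrates that idea only. The edge list, the cost values in `COST`, and the names `build_graph` and `expand` are assumptions made for this example and are not taken from the slides or from the AAT itself.

```python
import heapq

# Hypothetical traversal cost per relationship type (assumed values).
COST = {"BT": 1.0, "NT": 1.0, "RT": 1.5}
INVERSE = {"BT": "NT", "NT": "BT", "RT": "RT"}

# Illustrative links only; the real AAT structure may differ.
EDGES = [
    ("axes (weapons)", "NT", "battle-axes"),
    ("axes (weapons)", "NT", "throwing axes"),
    ("throwing axes", "NT", "franciscas"),
    ("axes (weapons)", "RT", "axes (tools)"),
    ("axes (tools)", "NT", "hatchets"),
]


def build_graph(edges):
    """Build an adjacency list, adding the inverse of every link."""
    graph = {}
    for src, rel, dst in edges:
        graph.setdefault(src, []).append((dst, rel))
        graph.setdefault(dst, []).append((src, INVERSE[rel]))
    return graph


def expand(graph, start, threshold):
    """Return {term: distance} for terms within `threshold` of `start`."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, term = heapq.heappop(queue)
        if dist > best.get(term, float("inf")):
            continue  # stale queue entry
        for neighbour, rel in graph.get(term, []):
            new_dist = dist + COST[rel]
            if new_dist <= threshold and new_dist < best.get(neighbour, float("inf")):
                best[neighbour] = new_dist
                heapq.heappush(queue, (new_dist, neighbour))
    return best


GRAPH = build_graph(EDGES)
expanded = expand(GRAPH, "axes (weapons)", threshold=2.0)
```

With these assumed costs, a query on axes (weapons) reaches axes (tools) across the RT link but stops short of hatchets, which shows how a threshold limits expansion across loosely related terms.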
RT Expansion Experiments
• RTs are sometimes used in a very loose way
• This causes problems for term expansion over RTs
• We developed a set of experimental scenarios for more precise control of RT expansion
Specialisation of RTs
• An alternative approach is to make use of a richer set of thesaurus relationships by specialising the main relationships
• This allows optional filtering on RT subtypes
• In creating the AAT, a subset of RT types was employed by the thesaurus editors
AAT RT Codes & Rules
1. Alternate hierarchical relationship: alternative broader/narrower terms (arrows - edged weapons)
2. Whole/Part relationships (arrows - nocks)
3. Interfacet links (19 subtypes), e.g. Activity - Equipment Needed or Produced (arrows - archery)
4. Distinguished-from links (axes (weapons) - axes (tools))
5. Conjuncted-term links (arrows - bows (weapons))
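Optional filtering on RT subtypes can be sketched by tagging each RT link with one of the five codes listed above and expanding only over the codes a query allows. This is a minimal illustration, assuming a flat list of links. The names `RTCode`, `RT_LINKS`, and `related_terms` are hypothetical, and only the example term pairs come from the slide.

```python
from enum import IntEnum


class RTCode(IntEnum):
    """The five RT codes from the slide; the identifiers are assumed."""
    ALTERNATE_HIERARCHICAL = 1
    WHOLE_PART = 2
    INTERFACET = 3
    DISTINGUISHED_FROM = 4
    CONJUNCTED_TERM = 5


# Example pairs taken from the slide, one per code.
RT_LINKS = [
    ("arrows", "edged weapons", RTCode.ALTERNATE_HIERARCHICAL),
    ("arrows", "nocks", RTCode.WHOLE_PART),
    ("arrows", "archery", RTCode.INTERFACET),
    ("axes (weapons)", "axes (tools)", RTCode.DISTINGUISHED_FROM),
    ("arrows", "bows (weapons)", RTCode.CONJUNCTED_TERM),
]


def related_terms(term, allowed=None):
    """Return RT neighbours of `term`, optionally limited to `allowed` codes."""
    results = []
    for left, right, code in RT_LINKS:
        if allowed is not None and code not in allowed:
            continue  # skip subtypes the query context excludes
        if left == term:
            results.append(right)
        elif right == term:
            results.append(left)
    return results


# Expand "arrows" over whole/part and conjuncted-term links only.
narrow = related_terms("arrows", {RTCode.WHOLE_PART, RTCode.CONJUNCTED_TERM})
```

Passing `allowed=None` reproduces unfiltered RT expansion, so the same function covers both the loose and the specialised behaviour.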
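AAT Hierarchies
[Figure: the same excerpts from the AAT Tools & Equipment and Weapons & Ammunition hierarchies, with BT links and RT links labelled by subtype code, RT (1) and RT (4), illustrating semantic distance. Terms shown include axes (tools), axes (weapons), <cutting tools>, <wood-cutting and finishing tools>, weapons, edged weapons, staff weapons, pollaxes, Pulaskis, battle-axes, gisarmes, tomahawks (weapons), halberds, throwing axes, harpoons, hatchets, and franciscas.]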
Conclusions
• We have explored the use of semantic distance measures for spatial and associative (RT) thesaurus relationships:
• Distance measures can be used with place name hierarchies, as found in online gazetteers and geographical thesauri, when footprint data is limited
• The experimental RT scenarios suggest a potential for specialising RTs into different sub-types, and optionally linking RT type to query context
• Relationship subclasses used in thesaurus design should be retained for later use in retrieval