
Reflections from the FACET Project

Explore the integration of thesaurus into the interface, semantic term expansion, and the need for standard representations. The FACET project focuses on cost/benefit issues, semantic web integration, and pilot terminology services. The system allows for ranking matching items, automatic term suggestions, and semantic browsing tools.



Presentation Transcript


  1. Reflections from the FACET Project Doug Tudhope Hypermedia Research Unit University of Glamorgan NKOS Workshop, JCDL 2005

  2. Presentation • FACET Project • Faceted Knowledge Organisation Systems (KOS) • Semantic expansion • Web Demonstrator • Reflections / Current work • Need for standard representations and API • Pilot Terminology Services • KOS and Semantic Web • Cost/Benefit issues

  3. FACET - Faceted Access to Cultural hEritage Terminology FACET - a collaborative project investigating the potential of semantic term expansion in retrieval Aims: • Integration of thesaurus into the interface • Semantic term expansion and matching function taking advantage of facet structure http://www.comp.glam.ac.uk/~FACET/

  4. FACET Collaborators • Research Council Funding: EPSRC 3 years • National Museum of Science and Industry (NMSI): National Railway Museum and Science Museum Collections Database • J. Paul Getty Trust Art and Architecture Thesaurus (AAT) • Museum Documentation Association (MDA) Railway Thesaurus • Canadian Heritage Information Network (CHIN) Advisors

  5. NRM Collection examples of free text object descriptor fields • Chair, London Midland & Scottish Railway, straight wooden back initials carved on back, green leatherette seat. • Chair, Railway Clearing House, Curved back with blue leather inset & blue leather seat. R. C.H. carved on back • Chair, M.S. & L.R., Straight back, blue leather seat with M.S. & L.R. carved across back • Armchair, Pullman, green plush, fringed from Pullman section. • Carver chair, Oak with oval brocade seat. Prince of Wales crest on back from Royal Saloon of 1876 • Armchair, Upholstered in blue maquette with curved, buttoned back & scroll arms. Wooden legs • Occasional table, Oak with drawer, ornately carved. From Royal Saloon of 1876 • Set of 4 chairs, High-backed carver chairs upholstered in floral maquette • Clock, made by Jno Walker, 250 Regent Street. Metal face/Roman numerals. Carved wooden square case. 20"x18"x10"

  6. Semantic Term Expansion Reasoning over thesaurus semantic relationships allows the system to play an active role • Ranking of matching items in a result set • Automatic suggestion of terms to be considered for query • Query reformulation and ‘more like this’ option • Augmented Browsing tools – semantic expansion Underpinning technologies: • Measures of distance over the semantic index space • Matching Function for sets of terms
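The distance measure mentioned on this slide can be sketched roughly as follows. This is a minimal illustration, not the FACET engine: the traversal costs per relationship type, the function names `expand` and `closeness`, and the tiny thesaurus fragment are all assumptions made up for the example. The idea is that each thesaurus relationship (BT, NT, RT) carries a cost, and expansion collects every term reachable from the starting term within a cost budget.

```python
import heapq

# Hypothetical traversal costs per relationship type (assumed values, not FACET's).
COSTS = {"BT": 3, "NT": 3, "RT": 4}

# Invented thesaurus fragment for illustration: term -> [(relationship, neighbour)].
THESAURUS = {
    "armchairs": [("BT", "chairs")],
    "carver chairs": [("BT", "chairs")],
    "chairs": [("NT", "armchairs"), ("NT", "carver chairs"),
               ("BT", "seating furniture")],
    "seating furniture": [("NT", "chairs"), ("NT", "benches")],
    "benches": [("BT", "seating furniture")],
}

def expand(term, max_cost):
    """Return {term: cheapest traversal cost} for terms within max_cost of `term`."""
    best = {term: 0}
    queue = [(0, term)]
    while queue:
        cost, current = heapq.heappop(queue)
        if cost > best.get(current, float("inf")):
            continue  # stale queue entry; a cheaper path was already found
        for rel, neighbour in THESAURUS.get(current, []):
            new_cost = cost + COSTS[rel]
            if new_cost <= max_cost and new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return best

def closeness(cost, max_cost):
    """Map a traversal cost to a closeness score in [0, 1] (1 = same term)."""
    return 1.0 - cost / max_cost

if __name__ == "__main__":
    for t, c in sorted(expand("armchairs", 6).items(), key=lambda kv: kv[1]):
        print(t, c, round(closeness(c, 6), 2))
```

Terms found this way could feed both ranking (closer terms score higher) and term suggestion; the budget controls how far the expansion spreads.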

  7. FACET Prototype • SQLServer database: collections DB and Thesaurus • C++ thesaurus term expansion engine • Dual thesaurus representations • database • in-memory data structure • Visual Basic and Web client interfaces • ‘Find Term’ mapping to terms, alternates, scope notes • Browse hierarchies • Semantic browsing • Query Builder • Ranked results

  8. Faceted Knowledge Organisation Systems Faceted classifications based on primary division into fundamental, high-level categories (facets) Compound descriptors (multi-concept headings) are synthesised by combination of terms from limited number of fundamental facets In constructing AAT, adjectival noun phrases very common: e.g. painted oak furniture “Rather than enumerate the nearly infinite number of object and subject descriptions needed by thesaurus users, the AAT decided to pursue the building blocks of these descriptors in the form of a faceted vocabulary” (Guide to Indexing and Cataloging with the Art & Architecture Thesaurus)

  9. Matching Problem “The major problem lies in developing a system whereby individual parts of subject headings containing multiple AAT terms are broken apart, individually exploded hierarchically, and then reintegrated to answer a query with relevance” (Toni Petersen, AAT Director) Query: mahogany, dark yellow, brocading, Edwardian, armchair Descriptor: oak, light yellow, crests, ovals, brocade, Victorian, Carver chair Potentially extra / missing / partially and non-matching terms
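One possible shape for a matching function over sets of terms, using the query and descriptor from this slide, is sketched below. It is a deliberately simplified assumption, not the published FACET matching function: the pairwise closeness values are invented, and facet structure is ignored. Each query term is paired with its closest descriptor term, the scores are averaged, extra descriptor terms are ignored, and a query term with no counterpart contributes zero.

```python
# Assumed pairwise semantic closeness in [0, 1]; values are invented for illustration.
CLOSENESS = {
    ("mahogany", "oak"): 0.6,
    ("dark yellow", "light yellow"): 0.8,
    ("brocading", "brocade"): 0.7,
    ("Edwardian", "Victorian"): 0.5,
    ("armchair", "Carver chair"): 0.75,
}

def closeness(a, b):
    """Closeness of two terms: 1.0 if identical, table lookup otherwise, else 0.0."""
    if a == b:
        return 1.0
    return CLOSENESS.get((a, b), CLOSENESS.get((b, a), 0.0))

def match_score(query, descriptor):
    """Average, over query terms, of the best closeness to any descriptor term."""
    if not query:
        return 0.0
    total = 0.0
    for q in query:
        total += max((closeness(q, d) for d in descriptor), default=0.0)
    return total / len(query)

if __name__ == "__main__":
    query = ["mahogany", "dark yellow", "brocading", "Edwardian", "armchair"]
    descriptor = ["oak", "light yellow", "crests", "ovals", "brocade",
                  "Victorian", "Carver chair"]
    print(round(match_score(query, descriptor), 2))
```

In a fuller version the closeness table would be replaced by a distance measure computed over the thesaurus, and matching would be restricted to terms from the same facet.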

  10. System Architecture

  11. FACET standalone system http://www.comp.glam.ac.uk/~facet/webdemo/ dstudhope@glam.ac.uk

  12. FACET Web Demonstrator • Illustrates thesaurus content and semantic expansion in a fairly realistic Web prototype application • Intended more as an exploration of FACET research outcomes as dynamically generated Web components than as a general interface, but suggestive of possible interface components • Does not rely on pre-built static HTML pages - thesaurus content is generated dynamically http://www.comp.glam.ac.uk/~FACET/webdemo/

  13. FACET Web Demonstrator implementation • Browser-based interface (ASP application), using a combination of server-side scripting and compiled components • Persistence of state information between page requests a problematic issue - HTTP protocol is (by design) stateless • Solution adopted for current demonstrator involved small 'scriptlet' interface components to communicate with server without causing a browser to refresh the entire page. • But side effect of introducing some (IE) platform dependence

  14. FACET Web Demonstrator

  15. Some lessons learned • Results from FACET show potential of faceted KOS for • Query expansion (ranked results based on semantic closeness) • Semantic expansion as a browsing tool when wishing to use KOS behind the scenes • Web demonstrator first step • Based on custom API • KOS and database on same server (but need not be) • How to generalise these techniques? Need for • Common KOS representations and APIs for general terminology (KOS) services

  16. KOS integration into DL services from Hill et al Research Agenda (SigCR Workshop 2002) • Taxonomy of KOS - KOS types linked to DL service protocols • Registries of KOS and KOS-level metadata to represent them • RDF/XML KOS representations - customisable • Core set of relationship types across all KOS • General KOS service protocol from which protocols for specific types of KOS can be derived • Robust linking model in which DL entities (collections, objects, and services) can refer to KOS entities (concepts, labels, and relationships) • Visualization tools that fully use and display the rich semantics embedded in KOS

  17. Towards Terminology Services • KOS-based services as elements of applications with some form of search/indexing component • Next phase of work looks at common KOS representation formats and API protocols - making content available via programmatic interfaces • Eg SKOS Core (RDF/XML) Schema and SKOS API deliverables of SWAD-Europe Thesaurus Activity - http://www.w3.org/2001/sw/Europe/reports/thes • Experiments with XPATH-based KOS interfaces (using XML and SKOS schemas) promising for relatively small KOS held within the web browser
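A path-based lookup over a small SKOS RDF/XML document might look like the sketch below. It uses Python's `xml.etree.ElementTree` (which supports a limited XPath subset) rather than the in-browser environment the slide refers to, so it only illustrates the idea. The sample document, the `example.org` URIs, and the helper names `load_concepts`, `pref_label` and `related_labels` are assumptions for the example.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Invented two-concept sample; URIs are placeholders, not a real KOS.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/kos/chairs">
    <skos:prefLabel>chairs</skos:prefLabel>
    <skos:narrower rdf:resource="http://example.org/kos/armchairs"/>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/kos/armchairs">
    <skos:prefLabel>armchairs</skos:prefLabel>
    <skos:altLabel>arm chairs</skos:altLabel>
    <skos:broader rdf:resource="http://example.org/kos/chairs"/>
  </skos:Concept>
</rdf:RDF>"""

def load_concepts(xml_text):
    """Parse SKOS RDF/XML and index the skos:Concept elements by their URI."""
    root = ET.fromstring(xml_text)
    return {c.get(RDF_ABOUT): c for c in root.findall("skos:Concept", NS)}

def pref_label(concept):
    """Return the preferred label of a concept element ('' if absent)."""
    return concept.findtext("skos:prefLabel", default="", namespaces=NS)

def related_labels(concepts, uri, relation):
    """Labels of concepts linked from `uri` by `relation` (e.g. 'broader')."""
    links = concepts[uri].findall("skos:" + relation, NS)
    targets = [link.get(RDF_RESOURCE) for link in links]
    return [pref_label(concepts[t]) for t in targets if t in concepts]

if __name__ == "__main__":
    kos = load_concepts(SAMPLE)
    print(related_labels(kos, "http://example.org/kos/armchairs", "broader"))
```

Holding the whole document in memory like this is only practical for relatively small KOS, which matches the caveat on the slide.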

  18. Pilot KOS Browser Client Web Service • SKOS API designed to provide programmatic access to thesauri and related KOS via the web • Builds on Zthes, ADL Protocols • DREFT demonstration web services server based on SKOS API available(?) at ILRT http://www.w3.org/2001/sw/Europe/reports/thes/dreft/ • Only a subset of SKOS API calls was available at the time of the work, so we investigated possibilities with just 2 API calls - pilot SKOS API browsing client demonstrates browsing of an online thesaurus (GEMET - GEneral Multilingual Environmental Thesaurus) via web service calls • Also GEMET thesaurus own work on web service API

  19. Pilot SKOS API Web Service Browser getConcept getAllConceptRelatives show semantically connected concepts but not relationships Navigation history and local cache of retrieved concepts implemented API needs more work but is a basis for web services
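The browsing pattern described here (two calls, a navigation history and a local cache) can be sketched as below. The real SKOS API signatures and response formats are not reproduced: `StubKosService`, `KosBrowser`, the method names `get_concept` and `get_all_concept_relatives`, and the sample concepts are hypothetical stand-ins, and the stub answers from memory where a real client would make web service calls.

```python
class StubKosService:
    """In-memory stand-in for a remote KOS web service (hypothetical interface)."""

    def __init__(self, concepts, relatives):
        self._concepts = concepts      # uri -> preferred label
        self._relatives = relatives    # uri -> list of connected concept URIs
        self.calls = 0                 # counts simulated remote calls

    def get_concept(self, uri):
        self.calls += 1
        return self._concepts[uri]

    def get_all_concept_relatives(self, uri):
        self.calls += 1
        return list(self._relatives.get(uri, []))


class KosBrowser:
    """Browsing client that caches retrieved concepts and keeps a history."""

    def __init__(self, service):
        self.service = service
        self.cache = {}     # uri -> (label, relatives)
        self.history = []   # URIs in visiting order

    def visit(self, uri):
        """Fetch a concept and its relatives, using the cache when possible."""
        if uri not in self.cache:
            label = self.service.get_concept(uri)
            relatives = self.service.get_all_concept_relatives(uri)
            self.cache[uri] = (label, relatives)
        self.history.append(uri)
        return self.cache[uri]

    def back(self):
        """Step back one entry in the history; None if there is nowhere to go."""
        if len(self.history) < 2:
            return None
        self.history.pop()
        return self.cache[self.history[-1]]


if __name__ == "__main__":
    # Invented sample data with placeholder URIs.
    service = StubKosService(
        {"ex:pollution": "pollution", "ex:air": "air pollution"},
        {"ex:pollution": ["ex:air"], "ex:air": ["ex:pollution"]},
    )
    browser = KosBrowser(service)
    print(browser.visit("ex:pollution"))
    print(browser.visit("ex:air"))
    print(browser.back(), service.calls)
```

Because the relatives come back as a flat list, the client can show which concepts are connected but not how, which is the limitation noted on the slide.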

  20. Semantic Expansion Service • API should reflect use patterns and include composite calls in addition to returning atomic KOS data elements • Ongoing work - semantic expansion as a service • as an API protocol element would yield • different configurations KOS interface displays by single call • novel interfaces, such as navigation via semantic expansion • Query expansion for various ranked result query services • Term suggestion to assist indexing/annotation • More details: KOS at your Service: Programmatic Access to Knowledge Organisation Systems http://jodi.ecs.soton.ac.uk/Articles/v04/i04/Binding/

  21. Future work - KOS and Semantic Web? • Important to provide a bridge/migration between KOS and Ontologies. KOS can be an element of higher level ontologies and schemas and can help leverage them. Eg utilising SKOS RDF/XML Schemas Eg DELOS JPA semantic interoperability project mapping a thesaurus to CRM Upper Ontology • Ontologies as formal precise definition of relationships can be combined with inference rules and automated systems many useful applications (eg e-Science) where well defined objects and operations but also • Take advantage of existing KOS in Semantic Web Some confusion as to how KOS intended to be used Need for education as to KOS design context/purpose

  22. The ‘ontological ideology’ (Adorno) • Assumption that allocation of instances to categories is unproblematic (in everyday life) • tendency to make invisible the ‘interpretive work’ in assigning objects to concepts, the bending of categories and evolution of the meaning of concepts through use • DL application of concepts to ‘documents’ in indexing/search is also not unproblematic • Related via “aboutness” not clear-cut instance relationship • Indexer - Searcher (and Indexer) variation in concept selection • Use of results based on probable relevance judgements

  23. KOS (intellectual) usually • Designed in order to assist generalised retrieval • Basis of construction is perceived assistance in indexing/ searching/browsing as much as logical properties of attributes • Recognition that the semantic structure is to some extent ‘conventional’ with different possible cognitive viewpoints but that users can be assisted to explore a given structure and make use of it for own purposes

  24. Domain dependent level of precision in concept use Important to take into account how applications will process concepts Current KOS relationships at a useful level of generality for many applications (with some specialisation?) where results are based on probable relevance judgements Eg Thesaurus pragmatic tool includes semantics, domain lexicon (UF/ALTs, Scope Notes) Cost/benefit issues for KOS applications in granularity of relationships and degree of formalisation Role for knowledge-based interactive tools in semantic web old debates on Expert Systems Vs Systems for Experts How to apply KOS?

  25. NKOS Workshop at ECDL 2005 on related theme to this workshop • NKOS Workshop - Mapping Knowledge Organisation Systems: User-centred Strategies, ECDL 2005, September 22nd, Vienna, see http://www2.db.dk/nkos2005/ • Selected papers from the NKOS workshop will be considered for a forthcoming special issue of the journal New Review of Hypermedia and Multimedia, along with an open call for papers.

  26. References • Binding C., Tudhope D. 2004. KOS at your Service: Programmatic Access to Knowledge Organisation Systems. JoDI 4(4), http://jodi.ecs.soton.ac.uk/Articles/v04/i04/Binding/ • FACET Case Study, DigiCult Thematic Issue 6: Resource Discovery Technologies for the Heritage Sector, http://www.digicult.info/pages/Themiss.php • FACET website. http://www.comp.glam.ac.uk/~FACET/ • FACET Web demonstrator. http://www.comp.glam.ac.uk/~FACET/webdemo/ • FACET Xpath work. http://www.comp.glam.ac.uk/~FACET/formats/ • Hill et al. 2002. Integration of Knowledge Organization Systems into Digital Library Architectures. ASIST SigCR - http://www.lub.lu.se/SEMKOS/docs/Hill_KOSpaper7-2-final.doc • Tudhope D., Binding C., Blocks D., Cunliffe D. 2002. Compound Descriptors in Context: A Matching Function for Classifications and Thesauri. JCDL 2002, 84-93.

  27. Contact Information Doug Tudhope School of Computing University of Glamorgan Pontypridd CF37 1DL Wales, UK dstudhope@glam.ac.uk http://www.comp.glam.ac.uk/pages/staff/dstudhope
