Issues in Reusing and Sharing the Content of Thesauri and Taxonomies in OOR
Marcia Zeng, NKOS (Networked Knowledge Organization Systems/Services)
My participation in OOR: introducing the work done by the NKOS folks
About NKOS (Networked Knowledge Organization Systems/Services)
An informal network for enabling knowledge organization systems (KOS), such as classification systems, thesauri, gazetteers, ontologies and folksonomies, as networked interactive information services that support the description and retrieval of diverse information resources through the Internet.
• Ongoing series of NKOS workshops
• JCDL (Joint Conference on Digital Libraries), US
• ECDL (European Conference on Research and Advanced Technology for Digital Libraries)
• International Conference on Dublin Core and Metadata (DC)
• NKOS website http://nkos.slis.kent.edu/
Standards
• ANSI/NISO Z39.19-2005 Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies
• Forthcoming: ISO 25964 Structured Vocabularies for Information Retrieval (based on the published BS 8723)
• Part 1 – Definitions, symbols and abbreviations
• Part 2 – Thesauri
• Part 3 – Vocabularies other than thesauri
• Part 4 – Interoperability between vocabularies
• Part 5 – Interoperation between vocabularies and other components of information storage and retrieval systems
Leader: Stella Dextre Clarke, Information Consultant, UK
Terminology Registries and Services (1): HILT
http://hilt.cdlr.strath.ac.uk/index.html
Dennis Nicholson, University of Strathclyde
Large structured vocabularies, each containing thousands of controlled terms/classes and the relationships among terms/classes.
Funded by the UK Joint Information Systems Committee (JISC)
HILT phase I: mapping between schemes
HILT phase II: terminologies server
HILT phase III: M2M pilot demonstrator
HILT phase IV: transition to service testbed and future requirements study
Terminology Registries and Services (2): OCLC Terminology Services
Research: http://www.oclc.org/research/projects/termservices/
Service: http://www.oclc.org/terminologies/default.htm
OCLC Research Office: Diane Vizine-Goetz (Lead)
Accessed through the MS Office Research Task Pane, the service provides 10 vocabularies for tagging, searching, translation, etc.
Terminology Registries and Services (3): NSDL Registry
http://metadataregistry.org/vocabulary/list.html
Funded by the NSF NSDL Project
Aims: supporting registration of schemes and schemas; supporting the machine mapping of relationships among terms and concepts in those schemes and schemas.
U. Washington: Stuart A. Sutton
Cornell Univ.: Diane Hillmann, Jon Phipps
Terminology Registries and Services (4): STAR
• Semantic Technologies for Archaeological Resources (2007-2010)
• Funded by AHRC (Arts & Humanities Research Council)
• Doug Tudhope, University of Glamorgan
• Aims to develop new methods for linking digital archive databases, vocabularies and the associated grey literature, exploiting the potential of a high-level core ontology and natural language processing techniques.
STAR website http://hypermedia.research.glam.ac.uk/kos/star/
SKOS terminology services http://hypermedia.research.glam.ac.uk/kos/terminology_services/
Terminology Registries and Services (5): TRSS
http://www.ukoln.ac.uk/projects/trss/
Lead institution: UKOLN at the University of Bath
Project partners: University of Glamorgan, Hypermedia Research Unit; OCLC Office of Research, USA
• Analyzes issues related to the potential delivery of a Terminology Registry as a shared infrastructure service within the JISC (UK Joint Information Systems Committee) Information Environment.
• Focuses more on a KOS registry, but aims to stay compatible with more formal AI ontology registries to the extent practical without imposing excessive overheads.
Discussions: Reusing thesauri and taxonomies for ontologies
KOS: large, structured vocabularies; good representations of domain knowledge.
Basic functional requirements:
• eliminating ambiguity
• controlling synonyms
• establishing relationships among terms where appropriate
• testing and validation of terms
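The functional requirements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any of the services named in these slides is built: the `Concept` and `Vocabulary` classes, their fields, and the sample terms are all hypothetical. Qualifiers handle ambiguity, an entry-term index controls synonyms, `broader` links record relationships, and `add` performs a simple validation.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One controlled-vocabulary entry (hypothetical structure)."""
    preferred: str                               # the single authorized term
    qualifier: str = ""                          # disambiguates homographs
    synonyms: set = field(default_factory=set)   # non-preferred entry terms
    broader: set = field(default_factory=set)    # labels of broader concepts

    @property
    def label(self) -> str:
        if self.qualifier:
            return f"{self.preferred} ({self.qualifier})"
        return self.preferred


class Vocabulary:
    def __init__(self):
        self.concepts = {}   # label -> Concept
        self.entry = {}      # lowercased entry term -> label

    def add(self, concept):
        terms = {concept.label, *concept.synonyms}
        # Validation: every entry term must lead to exactly one concept.
        for term in terms:
            if term.lower() in self.entry:
                raise ValueError(f"ambiguous term: {term!r}")
        for term in terms:
            self.entry[term.lower()] = concept.label
        self.concepts[concept.label] = concept

    def resolve(self, term):
        """Map any entry term (preferred or synonym) to its preferred label."""
        return self.entry.get(term.lower())

    def narrower(self, label):
        """Derive narrower concepts from the stored broader links."""
        return sorted(c.label for c in self.concepts.values() if label in c.broader)


vocab = Vocabulary()
vocab.add(Concept("Birds"))
vocab.add(Concept("Cranes", qualifier="birds", synonyms={"Gruidae"}, broader={"Birds"}))
vocab.add(Concept("Cranes", qualifier="lifting equipment"))
print(vocab.resolve("gruidae"))   # Cranes (birds)
print(vocab.narrower("Birds"))    # ['Cranes (birds)']
```

Storing only the broader link and deriving the narrower one keeps the two directions of the hierarchy from drifting apart; that is a design choice of this sketch, not a requirement of the standards listed earlier.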
Issues (1)
1. Thesauri and taxonomies may have looser control for hierarchical relationships. Terms are selected based on
• literary warrant: the natural language used to describe content objects
• user warrant: the language of users
• organizational warrant: the needs and priorities of the organization
-- i.e., not consistently based on logic, not always highly structured.
Issues (2)
2. The common interchange format is SKOS -- i.e., not OWL.
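To make the SKOS point concrete, here is a minimal Python sketch that writes one thesaurus entry as a SKOS concept in Turtle syntax. The `skos:` namespace and the properties `skos:prefLabel`, `skos:altLabel` and `skos:broader` come from the SKOS vocabulary; the `ex:` namespace, the function name and the sample terms are assumptions made up for illustration.

```python
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."
EX_PREFIX = "@prefix ex: <http://example.org/vocab/> ."   # hypothetical namespace


def concept_to_turtle(local_id, pref_label, alt_labels=(), broader=()):
    """Serialize one thesaurus entry as a SKOS concept (Turtle syntax).

    Labels are assumed to contain no double quotes or backslashes;
    a real serializer would have to escape them.
    """
    lines = [f"ex:{local_id} a skos:Concept"]
    lines.append(f'    skos:prefLabel "{pref_label}"@en')
    for alt in alt_labels:
        lines.append(f'    skos:altLabel "{alt}"@en')
    for parent in broader:
        lines.append(f"    skos:broader ex:{parent}")
    return " ;\n".join(lines) + " ."


doc = "\n".join([
    SKOS_PREFIX,
    EX_PREFIX,
    "",
    concept_to_turtle("cranes", "Cranes", alt_labels=["Gruidae"], broader=["birds"]),
])
print(doc)
```

The entry is typed as a `skos:Concept` and linked with `skos:broader`, which records a thesaurus-style hierarchical link rather than an OWL class axiom. Anyone reusing such content as an ontology has to decide what each of those links should become.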
[Borrowing an example from SWED]: SWED is a prototype directory of environmental organizations and projects.
Source: http://www.swed.org.uk/swed/index.html
[from SWED]: Source: Alistair Miles, Taxonomies and the Semantic Web, CISTRANA Workshop 02/06 http://isegserv.itd.rl.ac.uk/public/skos/press/cistrana200602/taxonomies-semanticweb.ppt
[from SWED]: • cost/benefit tradeoffs involved in investing in semantics … Source: Alistair Miles, Taxonomies and the Semantic Web, CISTRANA Workshop 02/06 http://isegserv.itd.rl.ac.uk/public/skos/press/cistrana200602/taxonomies-semanticweb.ppt
Issues (3)
3. Concept mapping
• The various current ontology registries/repositories and terminology services usually do not provide concept-based mapping.
• E.g., searching for "aging" in a large ontology repository, we got classes like "biological imaging methods", "Imaging device", "lavaging", etc.
Note: the actual screenshot is omitted here to protect the reputation of the registry.
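The "aging" result above is what plain substring matching produces. The Python sketch below reproduces it with the three class labels quoted in the slide, plus one hypothetical label ("Aging") added so that a stricter lookup has something to find. The function names are illustrative, and nothing here reflects how the unnamed registry is actually implemented.

```python
import re

# Class labels quoted in the slide, plus one hypothetical label ("Aging").
LABELS = ["biological imaging methods", "Imaging device", "lavaging", "Aging"]


def substring_search(query, labels):
    """Naive string matching: any label containing the query text is a hit."""
    q = query.lower()
    return [label for label in labels if q in label.lower()]


def whole_word_search(query, labels):
    """Match the query only as a complete word inside a label."""
    pattern = re.compile(rf"\b{re.escape(query)}\b", re.IGNORECASE)
    return [label for label in labels if pattern.search(label)]


print(substring_search("aging", LABELS))
# ['biological imaging methods', 'Imaging device', 'lavaging', 'Aging']
print(whole_word_search("aging", LABELS))
# ['Aging']
```

Whole-word matching removes the accidental hits, but it is still string matching. Concept-based mapping would also need to connect "aging" to synonyms and to equivalent concepts in other schemes, which neither function attempts.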
Issues (4)
4. Multilingual and multi-cultural issues in the mapping process
-- non-English schemes
-- non-symmetrical schemes
• [Based on my experience in building a conceptual framework for Complementary and Alternative Medicine (CAM)]
Summary (1)
• Some well-established thesauri and taxonomies should be available for reuse by content developers.
• Terminology services may already have been through processes that OOR may encounter.
• Issues in reusing and sharing the content of thesauri and taxonomies in OOR include granularity, structure, encoding, etc. Addressing them requires OOR to have policies, strategies, and enabling tools.
• Concept-based mapping will be a major need and will also bring many issues.
Summary (2)
Questions to OOR:
• What representation formats will OOR allow?
• Will OOR provide access to individual ontology elements?
• Does OOR see itself as providing (web) services in addition to providing access to discover and download ontologies?
References
• NKOS website http://nkos.slis.kent.edu/
• HILT (High Level Thesaurus) project http://hilt.cdlr.strath.ac.uk/index.html
• OCLC Terminologies Service. Research: http://www.oclc.org/research/projects/termservices/ Service: http://www.oclc.org/terminologies/default.htm
• NSDL [vocabulary] Registry http://metadataregistry.org/vocabulary/list.html
• STAR (Semantic Technologies for Archaeological Resources) http://hypermedia.research.glam.ac.uk/kos/star/ SKOS terminology services: http://hypermedia.research.glam.ac.uk/kos/terminology_services/
• TRSS (Terminology Registry Scoping Study) http://www.ukoln.ac.uk/projects/trss/
• Alistair Miles, Taxonomies and the Semantic Web, CISTRANA Workshop 02/06 http://isegserv.itd.rl.ac.uk/public/skos/press/cistrana200602/taxonomies-semanticweb.ppt
• JISC state-of-the-art review "Terminology Services and Technologies" http://www.ukoln.ac.uk/terminology/JISC-review2006.html