Federating Terminology: Can We Avoid Reinventing the Wheel?
Gail Hodge
U.S. National Biological Information Infrastructure / Information International Associates, Inc.
CIDOC 2000, Ottawa, Canada, August 25, 2000
The Problem
• Web and digital library projects have sharply increased the need for controlled subject terminology schemes
• Many of these subjects are already covered by existing thesauri, subject heading lists, etc., but the match is often imperfect
• Development and maintenance costs need to be kept under control
• Audiences are already accustomed to certain controlled schemes in their fields
• Partnerships are also valuable
A Possible Solution: Federated Terminologies
• A terminology system “built” from a series of distributed terminologies that are linked from a core
• The core may link to the whole remote terminology, to top terms and hierarchies, or to lower-level nodes
• The distributed vocabulary is displayed through the same terminology browser so that it appears to be integrated
• Ultimately, we would also like to take advantage of these sources as generalized “Terminology Services” on the Internet
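The linking idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the NBII design: the class names (`CoreTerm`, `RemoteLink`, `FederatedTerminology`), the `Fetcher` callable, and the sample terms other than "Biosphere", "Earth" and "Organisms" are all hypothetical, and the remote terminology is simulated with an in-memory dictionary rather than a network call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical: a fetcher takes a node identifier (None means "the whole
# terminology") and returns the labels of the terms directly beneath it.
Fetcher = Callable[[Optional[str]], List[str]]


@dataclass
class RemoteLink:
    scheme: str                  # name of the remote terminology
    node: Optional[str] = None   # None links to the whole terminology


@dataclass
class CoreTerm:
    label: str
    narrower: List[str] = field(default_factory=list)       # local terms
    links: List[RemoteLink] = field(default_factory=list)   # remote attachments


class FederatedTerminology:
    def __init__(self) -> None:
        self.core: Dict[str, CoreTerm] = {}
        self.fetchers: Dict[str, Fetcher] = {}

    def add_term(self, term: CoreTerm) -> None:
        self.core[term.label] = term

    def register(self, scheme: str, fetcher: Fetcher) -> None:
        self.fetchers[scheme] = fetcher

    def browse(self, label: str) -> List[str]:
        """Return local and remote narrower terms as one integrated list."""
        term = self.core[label]
        result = list(term.narrower)
        for link in term.links:
            for remote_label in self.fetchers[link.scheme](link.node):
                if remote_label not in result:
                    result.append(remote_label)
        return result


# Stand-in for a remote thesaurus; a real one would be queried over the network.
REMOTE = {
    None: ["Biosphere", "Atmosphere"],
    "Biosphere": ["Organisms", "Ecosystems"],
}

fed = FederatedTerminology()
fed.register("Theme", lambda node: REMOTE.get(node, []))
# Link to the whole remote terminology from one core term ...
fed.add_term(CoreTerm("Earth", narrower=["Oceans"], links=[RemoteLink("Theme")]))
# ... and to a lower-level node from another.
fed.add_term(CoreTerm("Life", links=[RemoteLink("Theme", node="Biosphere")]))
```

Calling `fed.browse("Earth")` merges the local term with the remote top terms, so a browser built on `browse` would show a single list without revealing which entries are remote.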
What is Needed to Federate Terminologies?
• A protocol for communicating between terminologies: requesting terms, receiving terms, and interpreting structure
• Metadata that describes terminologies in a way that allows generalized services to be provided
Protocol
• Query a terminology via the Internet and retrieve terms (a structure) from it
• Allow browsers and clients to access sources of which they have no previous knowledge
• Components
  - Identification of the term as a digital object (a URL or a more persistent scheme)
  - Identification of the scheme
  - Structure for conveying the query
  - Structure for returning the terminology structure
Exact Match
Query: http://localhost/thesauri/Theme?biosphere
Response:
<DESCRIPTOR ID="Biosphere" LABEL="Biosphere" SN="Organisms and ecosystems">
<CAT LABEL="Earth"
<BT LABEL="Earth"
<NT LABEL="Organisms"
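A client for the exact-match exchange above might look roughly like the following Python sketch. The response on the slide is cut off, so the sample here assumes a well-formed variant with self-closing child elements and a closing `</DESCRIPTOR>` tag; that completion is an assumption, as are the function names `exact_match_url` and `parse_descriptor`. SN, BT and NT are read as the usual thesaurus abbreviations (scope note, broader term, narrower term), and CAT is assumed to mean category.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote


def exact_match_url(base: str, scheme: str, term: str) -> str:
    """Build a query URL of the form shown on the slide: <base>/<scheme>?<term>."""
    return f"{base}/{scheme}?{quote(term)}"


# Assumed well-formed version of the slide's response (the original is truncated).
SAMPLE_RESPONSE = """
<DESCRIPTOR ID="Biosphere" LABEL="Biosphere" SN="Organisms and ecosystems">
  <CAT LABEL="Earth"/>
  <BT LABEL="Earth"/>
  <NT LABEL="Organisms"/>
</DESCRIPTOR>
"""


def parse_descriptor(xml_text: str) -> dict:
    """Turn one DESCRIPTOR element into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    return {
        "id": root.get("ID"),
        "label": root.get("LABEL"),
        "scope_note": root.get("SN"),
        "categories": [e.get("LABEL") for e in root.findall("CAT")],
        "broader": [e.get("LABEL") for e in root.findall("BT")],
        "narrower": [e.get("LABEL") for e in root.findall("NT")],
    }


url = exact_match_url("http://localhost/thesauri", "Theme", "biosphere")
descriptor = parse_descriptor(SAMPLE_RESPONSE)
```

Because the client only depends on the element and attribute names, it could in principle read any source that returns this structure, which is the point of the "no previous knowledge" requirement.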
Major Protocol Issues
• Many terminologies are not electronic; those that are may exist only as PDF, and need restructuring to provide persistent identification
• Slow speed of the Internet
• Drivers have proven difficult for the target sites to install
• Moving from RDF to XML
Metadata Content Standard
• Data elements needed to describe a Web-based terminology source for use by any browser/client
• Available for review at www.alexandria.ucsb.edu/nkos
• Key metadata elements could be contained in metatags or in a registry
• An Environmental Thesaurus Registry is being developed by the NBII as a prototype
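To make the "metatags or registry" option concrete, here is a small Python sketch of a terminology description that can be rendered as HTML meta tags or stored in a registry. The element names (`title`, `url`, `kind`, `owner`, `subject`), the `terminology.` prefix, and the sample values are hypothetical placeholders; the draft standard itself defines the real element set.

```python
from dataclasses import dataclass, asdict
from html import escape
from typing import Dict, List


@dataclass
class TerminologyRecord:
    # Hypothetical elements, chosen only for illustration.
    title: str
    url: str
    kind: str      # e.g. "thesaurus"
    owner: str
    subject: str


def to_metatags(record: TerminologyRecord, prefix: str = "terminology") -> str:
    """Render the record as one HTML <meta> tag per element."""
    lines = []
    for name, value in asdict(record).items():
        content = escape(value, quote=True)
        lines.append(f'<meta name="{prefix}.{name}" content="{content}">')
    return "\n".join(lines)


class Registry:
    """Minimal in-memory registry keyed by the terminology's URL."""

    def __init__(self) -> None:
        self._records: Dict[str, TerminologyRecord] = {}

    def register(self, record: TerminologyRecord) -> None:
        self._records[record.url] = record

    def by_kind(self, kind: str) -> List[TerminologyRecord]:
        return [r for r in self._records.values() if r.kind == kind]


record = TerminologyRecord(
    title="Example Environmental Thesaurus",   # made-up sample values
    url="http://localhost/thesauri/Theme",
    kind="thesaurus",
    owner="Example Agency",
    subject="environment",
)
registry = Registry()
registry.register(record)
tags = to_metatags(record)
```

The same record serves both routes: an owner can embed the output of `to_metatags` in the terminology's own Web page, or submit the record to a shared registry.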
Major Metadata Issues
• Metadata is difficult to create unless you are the terminology “owner”
• Self-registration is going to be necessary
• A taxonomy is needed that generalizes the description of terminology resources
Draft Taxonomy of Terminologies
• There is no consistent approach to classifying terminology sources, particularly with a view to how they behave in a networked environment
• A draft taxonomy has been developed (www.alexandria.ucsb.edu/)
• Next step is to add to the definitions how each type should behave in the networked environment
• Comments are welcome
Draft Taxonomy
• Term Lists
  - Authority Files
  - Glossaries
  - Gazetteers
  - Dictionaries
• Classification and Categorization
  - Subject Headings
  - Classification Schemes, Taxonomies and Categorization Schemes
• Relationship Groups
  - Thesauri
  - Semantic Networks
  - Ontologies
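For a registry or client that needs to classify a source, the draft taxonomy could be held as a simple lookup table. The sketch below is one possible encoding in Python; the grouping of types under the three headings is read from the slide's order, and the names `DRAFT_TAXONOMY` and `group_of` are hypothetical.

```python
from typing import Dict, List, Optional

# Three top-level groups and their member types, as listed on the slide.
DRAFT_TAXONOMY: Dict[str, List[str]] = {
    "Term Lists": [
        "Authority Files", "Glossaries", "Gazetteers", "Dictionaries",
    ],
    "Classification and Categorization": [
        "Subject Headings",
        "Classification Schemes, Taxonomies and Categorization Schemes",
    ],
    "Relationship Groups": [
        "Thesauri", "Semantic Networks", "Ontologies",
    ],
}


def group_of(source_type: str) -> Optional[str]:
    """Return the top-level group for a terminology type, or None if unknown."""
    for group, members in DRAFT_TAXONOMY.items():
        if source_type in members:
            return group
    return None
```

A description of expected network behavior for each type could later be attached to the same table.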
Networked Knowledge Organization Sources Working Group
• Bring together knowledge organization source developers
• Work across the various types of terminologies: ontologies, thesauri, classification schemes, etc.
• Discuss and support best practices related to using these sources in networked environments
• Provide a “breeding ground” for collaboration
NKOS, contd.
• Ad hoc group of terminology developers
  - 70 members from 10 countries
  - Humanities, sciences and social sciences
  - Annual workshops at ACM Digital Library meetings
  - Dinner meetings at the American Society for Information Science and Technology meetings
  - Web site (UC Santa Barbara), listserv (Coalition for Networked Information), listserv archive (OCLC)
Contact Information
• NKOS Web Site - www.alexandria.ucsb.edu/~lhill/nkos
• Linda Hill - lhill@alexandria.ucsb.edu
• Gail Hodge - gailhodge@aol.com
• Joseph Busch - jbusch@metacode.com
• National Biological Information Infrastructure - www.nbii.gov
• Anne Frondorf (Project Mgr.) - anne_frondorf@usgs.gov
• Gail Hodge (Vocabulary Mgr.) - gailhodge@aol.com