UK-based developments in online thesauri for taxonomic information
Copp, C., Grant, M., Hewzulla, D., Hussey, C., Robinson, J., van Breda, J. & White, R.
Why do we need a thesaurus?
UK National Biodiversity Network, The Recorder Project, BioCASE
• Improve indexing
• Standardise query terms for better retrieval
• Links to synonyms and overlapping terms
• Expand or narrow queries
• Links to other information systems
• Potential to build knowledge-bases
• Potential to provide checklists for data entry
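The "expand or narrow queries" point can be illustrated with a minimal sketch. It assumes a thesaurus entry is simply a preferred term with sets of synonyms and narrower terms; the names `ThesaurusEntry`, `expand_query` and `ENTRIES`, and the sample bird names, are illustrative assumptions and not part of any of the projects named above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ThesaurusEntry:
    """Hypothetical entry: one preferred term plus its related terms."""
    preferred: str
    synonyms: set[str] = field(default_factory=set)
    narrower: set[str] = field(default_factory=set)


def expand_query(term: str, entries: list[ThesaurusEntry],
                 include_narrower: bool = False) -> list[str]:
    """Return all terms equivalent to `term`, optionally with narrower terms.

    Matching is case-insensitive. An unknown term is returned unchanged.
    """
    wanted = term.casefold()
    for entry in entries:
        names = {entry.preferred, *entry.synonyms}
        if wanted in {name.casefold() for name in names}:
            result = set(names)
            if include_narrower:
                result |= entry.narrower
            return sorted(result)
    return [term]


# Illustrative data only (assumption, not taken from a real checklist).
ENTRIES = [
    ThesaurusEntry("Turdus merula", synonyms={"Blackbird", "Common Blackbird"}),
    ThesaurusEntry("Turdus", narrower={"Turdus merula", "Turdus philomelos"}),
]

print(expand_query("blackbird", ENTRIES))
print(expand_query("Turdus", ENTRIES, include_narrower=True))
```

A user searching for a common name would then also retrieve records indexed under the scientific name, and a search on a genus could be widened to the species listed beneath it.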
The Role of the BioCASE Thesaurus
[Diagram: possible links by which indexing software or data access and reporting software can check for terms not in the master thesaurus or get added-value data.]
• Column headings: Sources, Management, Applications, Use
• Components: Static Lists, Maintained Lists, Published Standards, Master Thesaurus, Thesaurus Interface, Indexing software, Central Metadatabase, On-line Thesauri, Data access and reporting software, Partner Databases, User Interface, Users, other potential products derived from or using the Thesaurus
• Link labels: Copy basic terms; Copy terms; Agreements to copy terms & updates; Check for term; Check for and add terms; Supply search terms; Submit search terms; Derive equivalent and related terms
• Open question on the diagram: Wrapper, API or copy of Master?
The BioCASE Thesaurus: Basic logical model
[Diagram: entity-relationship model.]
• Tables: Term Word, Term, Term Type, Term Language, Term Version, Term Version Fact, Term Version Relation, Term in Item, List Item, List Item Fact, Related List Item, Hierarchy, List Item in List Version, List Version, List, List Type, Edit Session
• Notes on the diagram:
  • Term Word: single word index of terms in Term Table
  • Table includes synonyms, common language forms and preferred status
  • Table includes broader term & narrower term relations
  • Related terms in other lists
  • Associates terms with different versions of lists
  • Edit Session: linked to all tables
The BioCASE Thesaurus API
[Diagram: components of the API architecture.]
• Client Application
• Thesaurus Tools
• Thesaurus
• Thesaurus DB Manager
• RMI (HTTP tunnel)
• Thesaurus Broker
• Thesaurus Server
• List Importers
• JDBC
• Term Lists
• DBMS
A Modified logical model
[Diagram: entity-relationship model.]
• Tables: Word, Word in Term, Term, Term Language, Term Type, Term Version, Term Version Fact, Term Version Relation, Term Version in Concept, Concept (List Item), Concept Fact, Related Concept, Hierarchy, Concept Code, Concept in Concept Group Version, Concept Group Version, Concept Group (List), Concept Domain (List Type)
• Notes on the diagram:
  • Single word index of terms in Term Table
  • Every term has at least one version
  • A term type may be linked to one or more hierarchy schemes
  • Table includes synonyms, common language forms and preferred status
  • Tables include broader term & narrower term relations
  • Table holds level terms for specific hierarchies
  • Related terms in other lists
  • Associates terms with different versions of lists
  • Codes might be used for alphanumeric sorting
  • Every list has at least one version
Principles
• Terms fall into various ‘domains’
• Within domains there can be many different lists
• Lists may exist in several versions and some are dynamic (always changing)
• Lists may be in various languages and may use a wide range of diacritic and other characters
• There is no single correct list of terms
• Terms frequently have synonyms, variants and language versions both in and between lists
• Terms commonly fall into hierarchies but may also be arranged in more complex ways (networks, ancestor & descendant trees etc.)
• Terms can be related to terms in other domains
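A few of these principles can be sketched as plain data classes. This is a heavily simplified illustration, not the BioCASE schema: the class names echo table names from the model, but the fields, the single `broader` link standing in for the hierarchy tables, and the sample data are all assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TermVersion:
    """One textual form of a term: a synonym, variant or language version."""
    text: str
    language: str            # language code, e.g. "la" or "en" (assumed field)
    preferred: bool = False  # preferred status within the concept


@dataclass
class Concept:
    """A list item: groups term versions and links to a broader concept."""
    code: str  # concept code, e.g. usable for sorting
    term_versions: list[TermVersion] = field(default_factory=list)
    broader: Optional[Concept] = None

    def preferred_term(self) -> Optional[str]:
        """Return the text of the first preferred term version, if any."""
        for version in self.term_versions:
            if version.preferred:
                return version.text
        return None

    def ancestors(self) -> list[Concept]:
        """Walk the broader-term links up to the top of the hierarchy."""
        chain = []
        node = self.broader
        while node is not None:
            chain.append(node)
            node = node.broader
        return chain


@dataclass
class ConceptGroupVersion:
    """One version of a list (concept group) within a domain."""
    group: str
    domain: str
    version: str
    concepts: list[Concept] = field(default_factory=list)


# Illustrative data only (assumption).
genus = Concept("G1", [TermVersion("Turdus", "la", preferred=True)])
species = Concept(
    "S1",
    [TermVersion("Turdus merula", "la", preferred=True),
     TermVersion("Blackbird", "en")],
    broader=genus,
)
checklist = ConceptGroupVersion("Example bird list", "Taxa", "v1", [genus, species])

print(species.preferred_term(), [c.preferred_term() for c in species.ancestors()])
```

A single `broader` reference only covers strict hierarchies. The networks and ancestor and descendant trees mentioned in the principles would need a separate relation structure, which is what the relation tables in the model provide.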
The logical model can be expressed in many ways.
We may change the physical model as we learn more or for performance reasons.
We use an API (Application Programming Interface) to shield users from underlying complexity and change.
‘The BioCASE/Luxembourg Model’
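The idea of an API shielding users from the physical model can be sketched as follows. This is not the BioCASE Thesaurus API: `ThesaurusAPI`, `InMemoryThesaurus` and `search_terms` are hypothetical names, and the in-memory store merely stands in for whatever database sits behind the real interface.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class ThesaurusAPI(ABC):
    """Hypothetical interface: the only thing client code depends on."""

    @abstractmethod
    def synonyms(self, term: str) -> list[str]:
        """Return terms equivalent to `term`, excluding `term` itself."""


class InMemoryThesaurus(ThesaurusAPI):
    """One possible backing store; it could be swapped for a database."""

    def __init__(self, groups: list[list[str]]) -> None:
        # Map each case-folded term to its whole group of equivalents.
        self._index: dict[str, list[str]] = {}
        for group in groups:
            for term in group:
                self._index[term.casefold()] = list(group)

    def synonyms(self, term: str) -> list[str]:
        key = term.casefold()
        return [t for t in self._index.get(key, []) if t.casefold() != key]


def search_terms(api: ThesaurusAPI, term: str) -> list[str]:
    """Client code: knows the interface, not the storage behind it."""
    return [term] + api.synonyms(term)


# Illustrative data only (assumption).
api = InMemoryThesaurus([["Bellis perennis", "Daisy"]])
print(search_terms(api, "daisy"))
```

Because `search_terms` depends only on the abstract class, the physical model behind it can change without the client code being rewritten.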