Languages for aboutness • Indexing languages: • Terminological tools • Thesauri (CV – controlled vocabulary) • Subject headings lists (CV) • Authority files for named entities (people, places, structures, organizations) • Classification • Keyword lists • Natural language systems (broad interpretation)
Subject Analysis • What is something about? • What is the content of an object “about”? • Different methods (Wilson, 1968) • Counting (objective method) • Purposive method • Method appealing to unity • What stands out • Challenges • Non-text
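To make the counting (objective) method concrete, here is a minimal sketch that treats the most frequent non-trivial words of a text as candidate subjects. It is a simplification for illustration, not Wilson's own formulation; the function name `candidate_subjects`, the tiny stopword list, and the sample sentence are all assumptions.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "for"}


def candidate_subjects(text, top_n=3):
    """Return the top_n most frequent non-stopword words with their counts."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)


# Invented sample text, used only to exercise the function.
sample = "Robins are birds. Birds migrate, and robins migrate in the spring."
print(candidate_subjects(sample))
```

Raw frequency says nothing about purpose or emphasis, which is one reason the slide lists other methods alongside counting.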
Aboutness: How to do it! • Read the document [Intellectual reading] • look for key features • many indexers mark up the items • rarely have time to read the whole document • Determine aboutness [Conceptual analysis] • Translate aboutness into the vocabulary or scheme you are using • In general: Subject headings: 1-3 headings • Descriptors: 5-8 descriptors • Classification: 1 notation (should it be only one!?).
Features of indexing languages: • Involve rules and require maintenance • Can be generated via automatic, human, or auto-human processes • Different processes generally display different strengths and weaknesses.
Features of indexing languages: • With the exception of a few general-domain tools, they are usually domain-specific. • MeSH • NASA Thesaurus • Astronomy Thesaurus • ERIC thesaurus http://www.darmstadt.gmd.de/~lutes/thesoecd.html • Concepts (or concept representations) are arranged in a discernible order
Language schema designs • Classified--grouping • Hierarchies and facets MeSH Browser http://www.nlm.nih.gov/mesh/MBrowser.html Art and Architecture (Getty AAT) http://www.getty.edu/research/conducting_research/vocabularies/aat/ • Alphabetical -- horizontal • Verbal/Alphabetical (ordering/filing challenges)
Controlled Vocabulary • A list or database of subject terms in which each concept has a preferred term or phrase that is used to represent it in the retrieval tool; the terms not used have references pointing to the preferred term (syndetic structure), and entries often carry scope notes.
Thesaurus (structured thesaurus) • Lexical semantic relationships • Composed of indexing terms/descriptors • Descriptors = representations of concepts • Concepts = Units of meaning (Svenonius)
Thesaurus • Preferred terms • Non-preferred terms • Semantic relations between terms • How to apply terms (guidelines, rules) • Scope notes • Adding terms (How to produce terms that are not listed explicitly in the thesaurus)
Preferred Terms • Control the form of the term • Spelling, grammatical form • Theatre / Theater • MLA / Modern Language Association • Choose the preferred term among synonyms • Brain cancer or Brain Neoplasms?
Common thesaural identifiers • SN Scope Note • Instruction, e.g. don’t invert phrases • USE Use (another term in preference to this one) • UF Used For • BT Broader Term • NT Narrower Term • RT Related Term
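The identifiers above can be read as fields of a single term record. The sketch below is one possible representation, not a format prescribed by any standard; the class name `TermRecord` and its field names are assumptions, and the sample entry reuses the Nuclear Energy / Nuclear Power pair from this deck. USE is omitted because it belongs on the non-preferred term's entry.

```python
from dataclasses import dataclass, field


@dataclass
class TermRecord:
    """One preferred term with its thesaural relationships (hypothetical layout)."""
    term: str
    scope_note: str = ""                           # SN
    used_for: list = field(default_factory=list)   # UF
    broader: list = field(default_factory=list)    # BT
    narrower: list = field(default_factory=list)   # NT
    related: list = field(default_factory=list)    # RT

    def display(self):
        """Render the record in the conventional tagged layout."""
        lines = [self.term]
        if self.scope_note:
            lines.append(f"  SN {self.scope_note}")
        tagged = (("UF", self.used_for), ("BT", self.broader),
                  ("NT", self.narrower), ("RT", self.related))
        for tag, values in tagged:
            for value in values:
                lines.append(f"  {tag} {value}")
        return "\n".join(lines)


record = TermRecord(term="Nuclear Energy", used_for=["Nuclear Power"])
print(record.display())
```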
Semantic Relationships • Hierarchy • Equivalence • Association
Hierarchies of Meaning [diagram] • ‘Glass’ • ‘Beer Glass’ • ‘Wine Glass’ • ‘White wine glass’ • ‘Red wine glass’ • From: Controlled Vocabularies / Paul Miller, Interoperability Focus, UKOLN
Hierarchy • Level of generality – both preferred terms • BT (broader term) • Robins BT Birds • NT (narrower term) • Birds NT Robins • Inheritance, very specific rules
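BT and NT are reciprocal: recording one implies the other. The sketch below stores both directions in one step and walks upward to collect every broader term, which is one simple reading of "inheritance". The class `Hierarchy` and its methods are hypothetical names, and the glass terms are read as Glass > Wine Glass > Red wine glass, which is an assumption about the original diagram.

```python
from collections import defaultdict


class Hierarchy:
    """Stores reciprocal BT/NT links between preferred terms."""

    def __init__(self):
        self.broader = defaultdict(set)   # term -> its BT terms
        self.narrower = defaultdict(set)  # term -> its NT terms

    def add(self, narrow, broad):
        """Record 'narrow BT broad' and the reciprocal 'broad NT narrow'."""
        self.broader[narrow].add(broad)
        self.narrower[broad].add(narrow)

    def ancestors(self, term):
        """Return every term reachable by following BT links upward."""
        seen = set()
        stack = [term]
        while stack:
            current = stack.pop()
            for parent in self.broader.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


h = Hierarchy()
h.add("Robins", "Birds")
h.add("Wine Glass", "Glass")          # assumed reading of the glass diagram
h.add("Red wine glass", "Wine Glass")
print(h.narrower["Birds"])
print(h.ancestors("Red wine glass"))
```

A search on a broad term can then be expanded to its narrower terms, or a narrow term generalized to its ancestors, depending on what the retrieval tool supports.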
Equivalence • When two or more terms represent the same concept • One is the preferred term (descriptor), where all the information is collected • The other is the non-preferred and helps the user to find the appropriate term
Equivalence • Non-preferred term USE Preferred term • Nuclear Power USE Nuclear Energy • Periodicals USE Serials • Preferred term UF (used for) Non-preferred term • Nuclear Energy UF Nuclear Power • Serials UF Periodicals
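USE and UF are two views of the same link, so only one direction needs to be stored. A minimal sketch, assuming a plain dictionary of USE references (the names `USE`, `preferred`, and `used_for` are illustrative):

```python
# USE references: non-preferred term -> preferred term (examples from the slide).
USE = {
    "Nuclear Power": "Nuclear Energy",
    "Periodicals": "Serials",
}


def preferred(term):
    """Follow a USE reference; a term with no reference is returned unchanged."""
    return USE.get(term, term)


def used_for(use_map):
    """Derive the reciprocal UF listing from the USE references."""
    uf = {}
    for non_preferred, pref in use_map.items():
        uf.setdefault(pref, []).append(non_preferred)
    return uf


print(preferred("Periodicals"))
print(used_for(USE))
```

Deriving UF from USE, rather than maintaining both by hand, keeps the two listings from drifting apart.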
Association • One preferred term is related to another preferred term • Non-hierarchical • “See also” function • In any large thesaurus, a significant number of terms will mean similar things or cover related areas, without necessarily being synonyms or fitting into a defined hierarchy
Association • Related Terms (RT) can be used to show these links within the thesaurus • Bed RT Bedding • Paint Brushes RT Painting • Vandalism RT Hostility • Programming RT Software
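RT links are usually treated as reciprocal: if Bed RT Bedding, then Bedding RT Bed. The sketch below assumes that convention and stores each pair in both directions; the names `related` and `add_related` are hypothetical.

```python
from collections import defaultdict

# term -> set of related preferred terms
related = defaultdict(set)


def add_related(a, b):
    """Record an RT link in both directions."""
    related[a].add(b)
    related[b].add(a)


# RT pairs taken from the slide.
pairs = [
    ("Bed", "Bedding"),
    ("Paint Brushes", "Painting"),
    ("Vandalism", "Hostility"),
    ("Programming", "Software"),
]
for a, b in pairs:
    add_related(a, b)

print(related["Bedding"])
```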
Thesauri Guides • National Information Standards Organization. (1993). Guidelines for the construction, format, and management of monolingual thesauri. ANSI/NISO Z39.19-1993. Bethesda, MD: NISO Press. [SILS reference Z695.N36 1994 or http://www.niso.org/standards/resources/z39-19.pdf] • Aitchison, Jean & Gilchrist, Alan. Thesaurus Construction: A Practical Guide. 3rd ed. London: Aslib, 1997. • Willpower Information Management Consultants http://www.willpower.demon.co.uk/thesprin.htm
Thesauri Directory • Indexing Resources on the WWW • http://www.slais.ubc.ca/resources/indexing/database1.htm • -- explore ASIST Thesaurus • Controlled vocabularies • http://sky.fit.qut.edu.au/~middletm//cont_voc.html • Web Compendium • http://www.darmstadt.gmd.de/~lutes/thesauri.html
Thesauri/Keywords • Created according to standards: Z39.19 (ANSI) • Single-term concepts / post-coordination • “Wireless network” & “home computer” • “Terrorism” & “Attacks” & “United States” • More popular in the online environment • Lend themselves to recall • Lend themselves to multilingual environments
Subject Heading Lists • Rules and guidelines • “Thesaurification” • Multi-word concepts / pre-coordination • “Wireless home computer network” • $y Terrorism attacks $z United States (strings) • Lend themselves to precision
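The difference between post-coordination and pre-coordination can be sketched in a few lines. With post-coordination, documents carry single-concept terms that the searcher combines at query time; with pre-coordination, the indexer assigns the combined heading as one string. The document numbers and the function names below are invented for illustration.

```python
# Post-coordination: each document carries single-concept terms (invented data).
docs = {
    1: {"Wireless network", "Home computer"},
    2: {"Wireless network"},
    3: {"Home computer"},
}


def build_index(documents):
    """Build an inverted index: term -> set of document ids."""
    index = {}
    for doc_id, terms in documents.items():
        for term in terms:
            index.setdefault(term, set()).add(doc_id)
    return index


def search_all(index, *terms):
    """Boolean AND: documents indexed with every one of the given terms."""
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


# Pre-coordination: the combined heading is assigned as a single string.
headings = {1: ["Wireless home computer network"]}


def search_heading(assigned, heading):
    """Exact match on a pre-coordinated heading string."""
    return {doc_id for doc_id, hs in assigned.items() if heading in hs}


index = build_index(docs)
print(search_all(index, "Wireless network", "Home computer"))
print(search_heading(headings, "Wireless home computer network"))
```

Searching on a single term such as "Wireless network" returns more documents (recall), while the exact heading string returns only the document indexed under that specific combination (precision).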