
SKOS-2-HIVE

Dive deep into characterizing knowledge organization structures through the exploration of thesauri, taxonomy, and ontology. Learn about controlled vocabularies, semantic relationships, and the application of terms.


Presentation Transcript


  1. SKOS-2-HIVE GWU workshop

  2. Introductions Hollie White hcwhite1@email.unc.edu Jane Greenberg janeg@email.unc.edu

  3. Morning Session Schedule • Introductions • Section 1: Characterizing Knowledge Organization Structures • Section 2: Thesauri and What They Represent • BREAK • Section 3: From Thesauri to SKOS • Section 4: From SKOS to HIVE • Exploring HIVE

  4. Section 1: Characterizing knowledge organization structures

  5. Types of knowledge organization structures From least to most structure • Term lists • Controlled vocabularies • Thesauri • Taxonomy • Ontology

  6. Languages for aboutness Indexing languages: Terminological tools • Thesauri (CV – controlled vocabulary) • Subject headings lists • Authority files for named entities (people, places, structures, organizations) Classification / Classificatory systems Keyword lists Natural language systems (broad interpretation)

  7. Term lists: controlled but loosely structured lists. Term list in practice: http://library.lib.asu.edu/search/y

  8. Authority files: standardization of names, subjects, and titles for easier identification and interoperability of information. Authority Files: http://authorities.loc.gov/

  9. Thesauri • Less-structured and structured thesauri • Lexical semantic relationships • Composed of indexing terms/descriptors • Descriptors: representations of concepts • Concepts: units of meaning

  10. Thesaurus basics • Preferred terms vs. non-preferred terms --ex. dress vs. clothing • Semantic relations between terms --broader, narrower, related • How to apply terms (guidelines, rules) • Scope notes

  11. Common thesaural identifiers • SN Scope Note (instruction, e.g. don’t invert phrases) • USE Use (another term in preference to this one) • UF Used For • BT Broader Term • NT Narrower Term • RT Related Term

  12. Controlled Vocabularies (less-structured thesauri, also referred to as subject heading lists) • Library of Congress Subject Headings (LCSH) • Sears Subject Headings • Medical Subject Headings (MeSH) http://www.nlm.nih.gov/mesh/MBrowser.html

  13. Thesauri Thesaurus in practice • ERIC • NBII http://thesaurus.nbii.gov/portal/server.pt • NASA thesaurus http://www.sti.nasa.gov/thesfrm1.htm

  14. Taxonomy First used by Carl von Linné (Linnaeus) to classify living organisms. A grouping of terms representing topics or subject categories. A taxonomy is typically structured so that its terms exhibit hierarchical relationships to one another, between broader and narrower concepts. Taxonomy: a subject-based classification that arranges the terms in the controlled vocabulary into a hierarchy (Garshol 2004)

  15. Ontology • In general (in the LIS domain): • a tool to help organize knowledge • a way to convey or represent a class (or classes) of things, and the relationships among those classes • No single definition; how the term is used depends on the community you come from

  16. KOS used in Digital Libraries A study of 269 online digital libraries and collections found the following KOS in use: • Locally developed taxonomy (113) • LCSH (78) • Author list (34) • Thesauri (26) • Alphabetical listing (20) • Geographic arrangement (16) Shiri, A. and Chase-Kruszewski, S. (2009) Knowledge organization systems in North American digital library collections. Program: electronic library and information systems, 43 (2), pp. 121-139.

  17. Discussion: Think about your own organization. What type of controlled vocabularies, thesauri, and ontologies does your organization use for everyday work? How do these vocabulary choices help you meet the goals of your institution?

  18. Organizing Knowledge Organization Structures

  19. Hodge’s Types of Knowledge Organization Systems Hodge, G. (2000) Systems of Knowledge Organization for Digital Libraries: Beyond Traditional Authority Files. http://www.clir.org/pubs/abstract/pub91abst.html • Term Lists: Authority Files, Glossaries, Gazetteers, Dictionaries • Classifications and Categories: Subject Headings, Classification Schemes, Taxonomies, and Categorization Schemes • Relationship Lists: Thesauri, Semantic Networks, Ontologies

  20. McGuinness, D. L. (2003). Ontologies Come of Age. In Fensel, et al., Spinning the Semantic Web. Cambridge, MIT Press, p. 175 [see also pp. 181 and 189].

  21. Greenberg’s Ontology Continuum Classical view of ILS languages, a continuum from least to most structure: • Keyword lists • Simple CV • Thesauri / deeper thesauri • Taxonomies • Low-level ontologies (WordNet) • Full/intricate ontologies (OWL)

  22. (http://jodi.tamu.edu/Articles/v04/i04/Smith/#section12)

  23. http://www.semantic-conference.com

  24. Section 2: Thesauri and what they represent

  25. Examples of different types of “thesauri” • Cook’s Thesaurus http://www.foodsubs.com/ • BZZURKK! Thesaurus of Champions http://epe.lac-bac.gc.ca/100/200/300/ktaylor/kaboom/bzzurkk.htm • General Multilingual Environmental Thesaurus http://www.eionet.europa.eu/gemet

  26. Common thesaural identifiers • SN Scope Note (instruction, e.g. don’t invert phrases) • USE Use (another term in preference to this one) • UF Used For • BT Broader Term • NT Narrower Term • RT Related Term

  27. Syndetic Relationships • Hierarchical • Equivalent • Associative

  28. Hierarchical • Level of generality – both preferred terms • BT (broader term) • Birthday cakes BT Cakes • NT (narrower term) • Cakes NT Birthday cakes …remember inheritance

  29. Equivalent • When two or more terms represent the same concept • One is the preferred term (descriptor), where all the information is collected • The other is the non-preferred and helps the user to find the appropriate term

  30. Equivalent • Non-preferred term USE Preferred term • Biological diversification USE Biodiversity • Preferred term UF (used for) Non-preferred term • Biodiversity UF Biological diversification

  31. Associative • One preferred term is related to another preferred term • Non-hierarchical • “See also” function • In any large thesaurus, a significant number of terms will mean similar things or cover related areas, without necessarily being synonyms or fitting into a defined hierarchy

  32. Associative • Related Terms (RT) can be used to show these links within the thesaurus • Bed RT Bedding • Paint Brushes RT Painting • Vandalism RT Hostility • Programming RT Software
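The three syndetic relationships above come in reciprocal pairs (BT/NT, USE/UF, RT/RT). The following is a minimal Python sketch of that idea, using the example terms from the slides. The `Thesaurus` class, its method names, and the single-parent assumption in `broader_chain` are hypothetical illustrations, not part of any thesaurus standard or tool named in this presentation.

```python
from collections import defaultdict

# Each identifier and the reciprocal identifier recorded on the other term.
RECIPROCAL = {"BT": "NT", "NT": "BT", "RT": "RT", "USE": "UF", "UF": "USE"}


class Thesaurus:
    """Hypothetical in-memory thesaurus keyed by term label."""

    def __init__(self):
        # term -> identifier (BT, NT, RT, USE, UF) -> set of target terms
        self.relations = defaultdict(lambda: defaultdict(set))

    def add(self, source, identifier, target):
        """Record `source identifier target` plus the reciprocal statement."""
        self.relations[source][identifier].add(target)
        self.relations[target][RECIPROCAL[identifier]].add(source)

    def preferred(self, term):
        """Follow USE to the preferred term; a descriptor maps to itself."""
        targets = self.relations[term]["USE"]
        return next(iter(targets)) if targets else term

    def broader_chain(self, term):
        """Walk BT links upward (assumes at most one BT per term)."""
        chain = []
        seen = {term}
        while self.relations[term]["BT"]:
            term = sorted(self.relations[term]["BT"])[0]
            if term in seen:  # guard against accidental cycles
                break
            seen.add(term)
            chain.append(term)
        return chain


thesaurus = Thesaurus()
thesaurus.add("Birthday cakes", "BT", "Cakes")                      # hierarchical
thesaurus.add("Biological diversification", "USE", "Biodiversity")  # equivalent
thesaurus.add("Bed", "RT", "Bedding")                               # associative

print(dict(thesaurus.relations["Cakes"]))
print(thesaurus.preferred("Biological diversification"))
```

Storing the reciprocal at insertion time means that entering "Birthday cakes BT Cakes" also yields "Cakes NT Birthday cakes" without a second entry.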

  33. Exercise: Thesauri Building • Montages • Digital photographs • Illustrations • Pictures • Photographic prints • Drawings • Photographs • Daguerreotypes • Negatives

  34. Where to start: • Look at the overall offering • Determine the aboutness • Identify the “root” element or broadest term • Identify groups/categories of information • Start structuring based on the syndetic relations you know • Create hierarchies based on the semantic relations • Use the appropriate identifiers to show the relationships

  35. Section 3: From Thesauri to SKOS

  36. Simple Knowledge Organization System (SKOS) Classical view of ILS languages, a continuum from least to most structure: • Keyword lists • Simple CV • Thesauri / deeper thesauri • Taxonomies • Low-level ontologies (e.g. WordNet) • Full/intricate ontologies (e.g. OWL) SKOS is placed on this continuum.

  37. Example 1:web view of NBII entry

  38. Descriptive Markup “the markup is used to label parts of the document rather than to provide specific instructions as to how they should be processed. The objective is to decouple the inherent structure of the document from any particular treatment or rendition of it. Such markup is often described as "semantic".” --from Wikipedia

  39. Markup Languages “is a system for annotating a text in a way which is syntactically distinguishable from that text.” Using tags: <tag>content to be rendered</tag> Or a keyword in brackets to distinguish texts --from Wikipedia

  40. HTML Hypertext Markup Language --language used to mark up webpages --both descriptive and processing

  41. HTML encoding
      <!doctype html>
      <html>
        <head>
          <title>Hello HTML</title>
        </head>
        <body>
          <p>Hello World!</p>
        </body>
      </html>

  42. NBII in HTML
      <a href="#" onclick="return oamSubmitForm4178('result','result:j_id_jsp_1679715049_7:0:j_id_jsp_1679715049_9',null,[['synonym','Heterozygotes']]);">Heterozygotes</a></td><td class="valign"><table><tbody id="result:j_id_jsp_1679715049_7:0:j_id_jsp_1679715049_14:tbody_element">
      <tr class="odd"><td class="type">BT</td><td class="synonym"><a href="#" onclick="return oamSubmitForm4178('result','result:j_id_jsp_1679715049_7:0:j_id_jsp_1679715049_14:0:j_id_jsp_1679715049_18',null,[['synonym','Genotypes']]);">Genotypes</a></td></tr>
      <tr class="even"><td class="type">NT</td><td class="synonym"><a href="#" onclick="return oamSubmitForm4178('result','result:j_id_jsp_1679715049_7:0:j_id_jsp_1679715049_14:1:j_id_jsp_1679715049_18',null,[['synonym','Carriers (genetics)']]);">Carriers (genetics)</a></td></tr>
      <tr class="odd"><td class="type">RT</td><td class="synonym"><a href="#" onclick="return oamSubmitForm4178('result','result:j_id_jsp_1679715049_7:0:j_id_jsp_1679715049_14:2:j_id_jsp_1679715049_18',null,[['synonym','Heterozygosity']]);">Heterozygosity</a></td></tr>
      <tr class="even"><td class="type">RT</td><td class="synonym"><a href="#" onclick="return oamSubmitForm4178('result','result:j_id_jsp_1679715049_7:0:j_id_jsp_1679715049_14:3:j_id_jsp_1679715049_18',null,[['synonym','Homozygotes']]);">Homozygotes</a></td></tr>
      <tr class="odd"><td class="type">SC</td><td class="synonym">LSC Life Sciences</td></tr></tbody></table></td></tr>
      <tr class="even"><td class="valign"><a href="#" onclick="return oamSubmitForm4178('result','result:j_id_jsp_1679715049_7:1:j_id_jsp_1679715049_9',null,[['synonym','Homozygotes']]);">Homozygotes</a></td><td class="valign"><table><tbody id="result:j_id_jsp_1679715049_7:1:j_id_jsp_1679715049_14:tbody_element">
      <tr class="odd"><td class="type">BT</td><td class="synonym"><a href="#" onclick="return oamSubmitForm4178('result','result:j_id_jsp_1679715049_7:1:j_id_jsp_1679715049_14:0:j_id_jsp_1679715049_18',null,[['synonym','Genotypes']]);">Genotypes</a></td></tr>
      <tr class="even"><td class="type">RT</td><td class="synonym"><a href="#" onclick="return oamSubmitForm4178('result','result:j_id_jsp_1679715049_7:1:j_id_jsp_1679715049_14:1:j_id_jsp_1679715049_18',null,[['synonym','Heterozygotes']]);">Heterozygotes</a></td></tr>
      <tr class="odd"><td class="type">RT</td><td class="synonym"><a href="#" onclick="return oamSubmitForm4178('result','result:j_id_jsp_1679715049_7:1:j_id_jsp_1679715049_14:2:j_id_jsp_1679715049_18',null,[['synonym','Homozygosity']]);">Homozygosity</a></td></tr>
      <tr class="even"><td class="type">SC</td><td class="synonym">LSC Life Sciences</td></tr></tbody></table></td></tr>
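To see what it takes to recover the thesaurus relationships from presentation-oriented HTML like the NBII page, here is a minimal Python sketch using the standard-library `html.parser`. The `RelationParser` class is a hypothetical helper, and `HTML_ROWS` is a simplified excerpt of the markup above (the `onclick` handlers are dropped for brevity). It relies on the page's `class="type"` and `class="synonym"` cells, which are a layout convention of that page rather than a guaranteed interface.

```python
from html.parser import HTMLParser

# Simplified rows modeled on the NBII result table (onclick handlers omitted).
HTML_ROWS = (
    '<tr class="odd"><td class="type">BT</td>'
    '<td class="synonym"><a href="#">Genotypes</a></td></tr>'
    '<tr class="even"><td class="type">NT</td>'
    '<td class="synonym"><a href="#">Carriers (genetics)</a></td></tr>'
    '<tr class="odd"><td class="type">RT</td>'
    '<td class="synonym"><a href="#">Heterozygosity</a></td></tr>'
)


class RelationParser(HTMLParser):
    """Collect (identifier, term) pairs from type/synonym table cells."""

    def __init__(self):
        super().__init__()
        self.cell = None          # class of the <td> currently open, if any
        self.current_type = None  # last identifier seen (BT, NT, RT, ...)
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.cell = dict(attrs).get("class")

    def handle_endtag(self, tag):
        if tag == "td":
            self.cell = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self.cell == "type":
            self.current_type = text
        elif self.cell == "synonym":
            self.pairs.append((self.current_type, text))


parser = RelationParser()
parser.feed(HTML_ROWS)
parser.close()
pairs = parser.pairs
print(pairs)
```

The meaning of each cell has to be inferred from styling hooks, which is the motivation for moving to descriptive markup.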

  43. XML Extensible Markup Language --Created by the World Wide Web Consortium (W3C). --Used to mark up documents on the internet or other electronic documents. --Users define their own tags and how those tags are used.

  44. XML encoding

  45. NBII in XML
      <CONCEPT>
        <DESCRIPTOR>Zygotes</DESCRIPTOR>
        <UF>Ookinetes</UF>
        <BT>Ova</BT>
        <NT>Oocysts</NT>
        <RT>Hemizygosity</RT>
        <RT>Reproduction</RT>
        <RT>Zygosity</RT>
        <SC>ASF Aquatic Sciences and Fisheries</SC>
        <SC>LSC Life Sciences</SC>
        <STA>Approved</STA>
        <TYP>Descriptor</TYP>
        <INP>2007-08-14</INP>
        <UPD>2007-08-14</UPD>
      </CONCEPT>
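Because the XML record names each relationship with its own tag, reading it takes very little code. A minimal Python sketch with the standard-library `xml.etree.ElementTree` follows. The `read_concept` function and the shape of the returned dictionary are assumptions made for illustration, and the sample is shortened to the descriptor and its UF/BT/NT/RT elements.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the NBII concept record shown above.
NBII_XML = """
<CONCEPT>
  <DESCRIPTOR>Zygotes</DESCRIPTOR>
  <UF>Ookinetes</UF>
  <BT>Ova</BT>
  <NT>Oocysts</NT>
  <RT>Hemizygosity</RT>
  <RT>Reproduction</RT>
  <RT>Zygosity</RT>
</CONCEPT>
"""


def read_concept(xml_text):
    """Return the descriptor and its related terms grouped by identifier."""
    concept = ET.fromstring(xml_text.strip())
    record = {"descriptor": concept.findtext("DESCRIPTOR")}
    for tag in ("UF", "BT", "NT", "RT"):
        # findall returns direct children in document order
        record[tag] = [element.text for element in concept.findall(tag)]
    return record


record = read_concept(NBII_XML)
print(record)
```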

  46. RDF Resource Description Framework “is a family of World Wide Web Consortium (W3C) specifications originally designed as a metadata data model. It has come to be used as a general method for conceptual description or modeling of information that is implemented in web resources, using a variety of syntax formats” --from Wikipedia

  47. RDF data model Similar to entity-relationship or class diagrams: statements about resources are made as subject-predicate-object expressions called “triples”. subject = the resource being described. predicate = a trait or aspect of the resource, expressing a relationship between the subject and the object. object = the value or resource that the predicate relates the subject to.

  48. “The sky has the color blue” as an RDF triple: a subject denoting "the sky", a predicate denoting "has the color", an object denoting "blue"
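The sky example can be written down as data. The sketch below models a triple as a plain Python tuple and prints it as one N-Triples line. The `Triple` type, the `to_ntriples` helper, and the `http://example.org/` identifiers are hypothetical, chosen only for illustration. The helper treats the object as a plain literal and does no escaping, so it is not a general serializer.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


EX = "http://example.org/"  # hypothetical namespace for the example


def to_ntriples(triple):
    """Format a triple as an N-Triples line with a plain literal object."""
    return f'<{triple.subject}> <{triple.predicate}> "{triple.obj}" .'


sky = Triple(subject=EX + "sky", predicate=EX + "hasColor", obj="blue")
line = to_ntriples(sky)
print(line)
```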

  49. OWL Web Ontology Language --knowledge representation language for authoring ontologies, based on formal logic
