
Creating Topic Maps + Topic Maps and Knowledge Organization

Creating Topic Maps + Topic Maps and Knowledge Organization. Steve Pepper pepper.steve@gmail.com Oslo University College, 2007-09-15. Week 37 – 09-08 Introduction to Topic Maps – Part 1 Week 38 – 09-15 Creating a topic map Week 39 – 09-22 Introduction to Topic Maps – Part 2



  1. Creating Topic Maps + Topic Maps and Knowledge Organization
     Steve Pepper, pepper.steve@gmail.com
     Oslo University College, 2007-09-15
     www.ontopedia.net

  2. Course agenda
     • Week 37 – 09-08: Introduction to Topic Maps – Part 1
     • Week 38 – 09-15: Creating a topic map
     • Week 39 – 09-22: Introduction to Topic Maps – Part 2
     • Week 42 – 10-13: Ontology-driven editing
     • Week 43 – 10-20: The machinery of Topic Maps
     • Week 46 – 11-10: (Semantic Web)
     • Week 48 – 11-24: (Ontologies)
     Terminology:
     • Topic Maps: the technology and the standard
     • topic maps: the artefacts (documents) we create

  3. Today’s agenda
     • Quick recap: basic concepts and building blocks
     • Topic Maps and Knowledge Organization
       • Metadata, taxonomies, thesauri, faceted classification
     • Interchange syntaxes
       • XTM, LTM and CTM
     • Demo: Creating a topic map using LTM
       • Pay close attention...

  4. Recap: Core concepts
     • The TAO of Topic Maps
     • A pool of information or data, and a knowledge layer consisting of:
       • Topics
         • a set of topics representing the key subjects of the domain in question
       • Associations
         • representing relationships between subjects
       • Occurrences
         • links to information that is somehow relevant to a given subject
     [Figure: a knowledge layer and an information layer; topics Puccini, Tosca, Madame Butterfly and Lucca, with associations labelled “composed by” and “born in”]

  5. Recap: Basic building blocks
     • Basic building blocks are
       • Topics: e.g. “Puccini”, “Lucca”, “Tosca”
       • Associations: e.g. “Puccini was born in Lucca”
       • Occurrences: e.g. “http://www.opera.net/puccini/bio.html is a biography of Puccini”
     • Each of these constructs can be typed
       • Topic types: “composer”, “city”, “opera”
       • Association types: “born in”, “composed by”
       • Occurrence types: “biography”, “street map”, “synopsis”
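
The building blocks above can be sketched as a small in-memory model. This is only an illustration, not how any Topic Maps engine stores its data: the class names, field names and the `about` helper are assumptions made for this sketch, and the example values are the ones used on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    id: str
    name: str
    types: list = field(default_factory=list)   # topic types, e.g. "composer"


@dataclass
class Association:
    type: str          # association type, e.g. "born-in"
    players: tuple     # ids of the topics taking part


@dataclass
class Occurrence:
    topic: str         # id of the topic the resource is relevant to
    type: str          # occurrence type, e.g. "biography"
    value: str         # a URL or a piece of inline data


topics = [
    Topic("puccini", "Puccini", ["composer"]),
    Topic("lucca", "Lucca", ["city"]),
    Topic("tosca", "Tosca", ["opera"]),
]
associations = [
    Association("born-in", ("puccini", "lucca")),
    Association("composed-by", ("tosca", "puccini")),
]
occurrences = [
    Occurrence("puccini", "biography", "http://www.opera.net/puccini/bio.html"),
]


def about(topic_id):
    """Collect everything the map says about one topic (hypothetical helper)."""
    return {
        "associations": [a for a in associations if topic_id in a.players],
        "occurrences": [o for o in occurrences if o.topic == topic_id],
    }
```

Calling `about("puccini")` gathers both associations and the biography occurrence in one place, which is the subject-centric lookup the rest of the lecture builds on.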

  6. Topic Maps and Knowledge Organization
     • Keywords & controlled vocabularies
     • Taxonomies, thesauri & classifications
     • Indexes & glossaries
     • Ontologies

  7. Bibliographic languages
     Svenonius, Elaine (2000): The Intellectual Foundation of Information Organization. Cambridge, MA: MIT Press (p.54)
     [Figure: diagram of bibliographic languages, with the labels Work language, Author language, Title language, Edition language, Subject language, Classification language, Index language, Document language, Production language, Carrier language, Location language]
     • Work languages
       • “Work languages describe information entities, their intellectual (as opposed to physical) attributes, and relationships among them.” (p.87)
     • Document languages
       • “A document is a particular space-time embodiment of information: a document language describes and provides access to this embodiment.” (p.107)
     • Subject languages
       • “A subject language is used to depict what a document is about.” (p.127)

  8. Two perspectives
     • Works have tended to be conflated with documents
     • So in practice there have been two kinds of language
       • Document languages
         • describe the work and its manifestations
         • document-centric (or resource-centric), e.g.
           • document metadata (Dublin Core)
           • bibliographic records (MARC)
       • Subject languages
         • describe the subject space in which the work exists
         • subject-centric, e.g.
           • thesauri, taxonomies (ICD)
           • classification schemes (LCSH, DDC)
           • faceted classification (Colon)

  9. Metadata
     Example record:
         Title: Creating Topic Maps
         Author: Steve Pepper
         Date: 2007-09-13
         Format: appl/ppt
         Keywords: topic maps, syntax, knowledge organization
     • “Data about data”
       • Information about documents
       • e.g. author, title, publisher, date, format, keywords
     • Useful for managing the content
       • Especially suitable for librarians
     • Somewhat useful for searching
       • Especially for experts
     • Less useful for end-users
       • the user starts out wanting to know more about a subject
       • traditional metadata, however, focuses on the document
       • if aboutness is provided at all, it gets squeezed into a single field

  10. Keywords
      • Primitive form of subject-based classification
        • The keywords are used to describe the subject
      • Cheap and simple… Folksonomies and tagging.
      • But also problematic because authors
        • misspell keywrods,
        • use different keywords/terms/tags for the same thing, and
        • use keywords that make no sense
      • Secondary problem
        • No way for the user to find out what keywords have been used
      • A keyword is a topic name

  11. Controlled vocabularies
      • Solution: create a list of legal keywords!
        • Requires somewhere to keep the list, and a process for new terms
      • Benefits
        • Solves problems of misspelling and duplicates (synonyms)
      • Disadvantages
        • Introduces some overhead (a flat list is difficult to manage)
        • Users can still search using the wrong terms
        • Users (and authors) still have difficulty finding terms
      • A controlled vocabulary is a well-defined set of topics with one name per topic

  12. Taxonomies
      • Organize the keywords into a tree
        • Most general at the top, more specific further down
      • Common structure used by Yahoo!, etc.
      • The folder metaphor
        • file systems, email, favourites
      • Requires relationships between terms
        • Relationships state that one term is more specific than another
      • Advantage: terms somewhat easier to find
      • Disadvantage: real world does not fit neatly into a hierarchy
      • A taxonomy is a set of topics related through a specific type of hierarchical association
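
The last point, a taxonomy as topics joined by a single hierarchical association type, can be sketched in a few lines. The terms below are hypothetical examples, and the `PARENT` mapping and helper names are assumptions made for this illustration.

```python
# Each entry is one instance of the single association type
# "is more specific than": child term -> parent term.
PARENT = {
    "opera": "music",
    "music": "culture",
    "film": "culture",
}


def ancestors(term):
    """Walk up the tree from a term towards the root, nearest parent first."""
    path = []
    while term in PARENT:
        term = PARENT[term]
        path.append(term)
    return path


def descendants(term):
    """Collect every more specific term below the given one."""
    found = []
    for child, parent in PARENT.items():
        if parent == term:
            found.append(child)
            found.extend(descendants(child))
    return found
```

Because every term has at most one parent here, a term that belongs under two headings cannot be represented, which is the "real world does not fit neatly into a hierarchy" problem in miniature.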

  13. Thesauri
      • Like a taxonomy, but with some extensions
        • Also better defined: there are ISO standards for thesauri
      • Relationship types:
        • BT Broader term / NT Narrower term
        • USE Preferred term / UF Non-preferred terms
        • RT Related term
        • SN Scope note
      • A thesaurus is a set of topics related through particular, predefined association types
        • BT/NT (hierarchical) and RT (untyped, associative)
        • (Scope notes are a kind of occurrence)
        • (USE and UF represent multiple names for the same concept/topic)
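
A rough sketch of how the thesaurus relationship types pair up: BT and NT are two directions of one hierarchical association, USE and UF are two directions of a naming relation, and RT is symmetric. The `Thesaurus` class, its method names and the sample terms are assumptions for illustration only.

```python
from collections import defaultdict


class Thesaurus:
    def __init__(self):
        # (term, relation code) -> set of related terms
        self.relations = defaultdict(set)
        # term -> scope note text (SN), comparable to an occurrence
        self.scope_notes = {}

    def add_bt(self, narrower, broader):
        """Record BT on the narrower term and the inverse NT on the broader one."""
        self.relations[(narrower, "BT")].add(broader)
        self.relations[(broader, "NT")].add(narrower)

    def add_rt(self, a, b):
        """RT is associative and symmetric, so store it in both directions."""
        self.relations[(a, "RT")].add(b)
        self.relations[(b, "RT")].add(a)

    def add_use(self, non_preferred, preferred):
        """USE points to the preferred term; UF is the inverse."""
        self.relations[(non_preferred, "USE")].add(preferred)
        self.relations[(preferred, "UF")].add(non_preferred)

    def get(self, term, relation):
        return sorted(self.relations.get((term, relation), ()))


thesaurus = Thesaurus()
thesaurus.add_bt("opera", "music")
thesaurus.add_rt("opera", "libretto")
thesaurus.add_use("lyric drama", "opera")
thesaurus.scope_notes["opera"] = "Staged musical drama"
```

In topic map terms, `add_use` would not create a second topic at all: "lyric drama" would simply become another name on the "opera" topic.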

  14. Faceted classification
      • Invented by S. R. Ranganathan in the 1930s
      • Defines a number of facets or dimensions
      • Defines a set of terms within each facet
        • Sometimes these terms are arranged in a taxonomy
      • Documents are classified against each facet separately
      • A faceted classification is a collection of topic “hierarchies”
        • Each “hierarchy” contains topics whose names are used as terms within a particular facet
      • XFML: An XML interchange syntax for faceted classification inspired by Topic Maps
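
Classifying documents against each facet separately can be sketched as one term per facet per document, with selection narrowing on any combination of facets. The facet names, terms and document ids below are invented for this example.

```python
# Each document is classified once per facet (hypothetical data).
documents = {
    "doc1": {"genre": "opera", "period": "19th century", "country": "Italy"},
    "doc2": {"genre": "opera", "period": "20th century", "country": "Italy"},
    "doc3": {"genre": "symphony", "period": "19th century", "country": "Germany"},
}


def select(**wanted):
    """Return the ids of documents matching every requested facet term."""
    return sorted(
        doc_id
        for doc_id, facets in documents.items()
        if all(facets.get(facet) == term for facet, term in wanted.items())
    )
```

For example, `select(genre="opera")` narrows on one facet, and adding `period="19th century"` narrows further without the facets having to share a single hierarchy.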

  15. Expressivity progression
      (Scale labels from the figure: open model, fixed model, no model)
      • Topic maps
        • use any types, properties, and relationships you like
      • Faceted classification
        • multiple vocabularies, taxonomies or thesauri (one per facet)
      • Thesauri
        • more formal taxonomy; still no topic types; two association types
      • Taxonomy
        • terms arranged in a hierarchy; no topic types; single association type
      • Controlled vocabulary, folksonomies
        • just a list of terms; no topic types; no associations

  16. Document-centric approaches
      • Traditional metadata is document-centric
        • Provides substantial descriptive power for documents
        • Allows connection into subject-based classification
        • Crucial for the management of content
        • However, users are most interested in the subjects
      • Taxonomies, thesauri, and faceted classification are also document-centric
        • These are methods for subject-based classification
        • They provide hardly any descriptive power for subjects

  17. Subject-centric approaches
      • Topic maps are subject-centric
        • They provide great descriptive power for subjects
        • Good as finding aids, because subjects are what users care about
      • Documents can be treated as subjects
        • This enables topic maps to capture metadata as well
        • It also enables topic maps to stitch metadata and subject-based classification together into one seamless whole
      • Topic Maps is the knowledge model par excellence:
        • A subject-centric knowledge model that encompasses every other kind of knowledge organization model
        • Topic Maps can therefore be used to relate and combine taxonomies, indexes, thesauri, classifications, etc.

  18. Syntaxes
      XTM, LTM and CTM
      • What are they?
      • When should I use which?

  19. Topic Maps Syntaxes
      • HyTM (HyTime Topic Maps)
        • Original syntax, expressed in terms of SGML and HyTime
        • No longer part of ISO 13250
      • XTM (XML Topic Maps Syntax)
        • Later, XML-based syntax, recently moved to version 2.0
        • Easy to understand but very verbose
      • LTM (Linear Topic Map Notation)
        • Defined by Ontopia in 2001 and supported by other products
        • A simple ASCII syntax for rapid prototyping
      • CTM (Compact Topic Maps Syntax)
        • ISO standard replacement for LTM
        • Complete draft exists, but no implementations yet

  20. Topic Map – XTM 1.0 Syntax
      Declaration:
          <!ELEMENT topicMap ( topic | association | mergeMap )* >
          <!ATTLIST topicMap
             id           ID     #IMPLIED
             xmlns        CDATA  #FIXED 'http://www.topicmaps.org/xtm/1.0/'
             xmlns:xlink  CDATA  #FIXED 'http://www.w3.org/1999/xlink'
             xml:base     CDATA  #IMPLIED
          >
      Example:
          <?xml version="1.0" encoding="ISO-8859-1"?>
          <topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
                    xmlns:xlink="http://www.w3.org/1999/xlink">
            <!-- topics, associations, and mergeMap elements go here -->
          </topicMap>

  21. Topic Map – LTM Syntax
          /* topics, associations, and occurrences go here */

  22. Topic – XTM 1.0 Syntax
      Declaration:
          <!ELEMENT topic ( instanceOf*, subjectIdentity?, ( baseName | occurrence )* ) >
          <!ATTLIST topic id ID #REQUIRED >
      Example:
          <topic id="italy"> ... </topic>
          <topic id="puccini"> ... </topic>

  23. Topic – LTM Syntax
          [topic-id]
          [italy]
          [puccini]

  24. Topic Name – XTM 1.0 Syntax (1 of 2)
          <!ELEMENT baseName ( scope?, baseNameString, variant* ) >
          <!ATTLIST baseName id ID #IMPLIED >
          <!ELEMENT baseNameString ( #PCDATA ) >
          <!ATTLIST baseNameString id ID #IMPLIED >
          <!ELEMENT variant ( parameters, variantName?, variant* ) >
          <!ATTLIST variant id ID #IMPLIED >
          <!ELEMENT variantName ( resourceRef | resourceData ) >
          <!ATTLIST variantName id ID #IMPLIED >

  25. Topic Name – XTM 1.0 Syntax (2 of 2)
          <topic id="la-boheme">
            <baseName>
              <baseNameString>La Bohème</baseNameString>
              <variant>
                <parameters>
                  <subjectIndicatorRef
                    xlink:href="http://www.topicmaps.org/xtm/1.0/core.xtm#sort"/>
                </parameters>
                <variantName>
                  <resourceData>Bohème, La</resourceData>
                </variantName>
              </variant>
            </baseName>
          </topic>

  26. Topic Name – LTM Syntax
          [topic-id = basename; sortname?; dispname?]
          [la-boheme = "La Bohème"; "Bohème, La"]

  27. Topic Type – XTM 1.0 Syntax
      • Use <instanceOf> subelement
          <topic id="opera"> ... </topic>
          <topic id="tosca">
            <instanceOf> <topicRef xlink:href="#opera"/> </instanceOf>
          </topic>
          <topic id="boito">
            <instanceOf> <topicRef xlink:href="#composer"/> </instanceOf>
            <instanceOf> <topicRef xlink:href="#librettist"/> </instanceOf>
          </topic>

  28. Topic Type – LTM Syntax
          [topic-id : topic-type]
          [tosca : opera]
          [boito : composer librettist]
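
To make the correspondence between the two notations concrete, here is a minimal sketch that turns the information in an LTM topic line such as `[boito : composer librettist]` into the XTM 1.0 `<instanceOf>` form, using Python's standard `xml.etree.ElementTree`. The function name `topic_to_xtm` is made up for this example, and the sketch leaves out the XTM default namespace declaration, which in a real file sits on the enclosing `<topicMap>` element.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("xlink", XLINK)   # serialize the namespace with the usual prefix
HREF = "{%s}href" % XLINK               # ElementTree's name for xlink:href


def topic_to_xtm(topic_id, type_ids):
    """Build a <topic> element with one <instanceOf> per topic type."""
    topic = ET.Element("topic", {"id": topic_id})
    for type_id in type_ids:
        instance_of = ET.SubElement(topic, "instanceOf")
        ET.SubElement(instance_of, "topicRef", {HREF: "#" + type_id})
    return topic


xml_text = ET.tostring(
    topic_to_xtm("boito", ["composer", "librettist"]), encoding="unicode"
)
```

One short LTM line expands into a nested element per type, which illustrates why XTM is described as easy to understand but very verbose.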

  29. Occurrence – XTM 1.0 Syntax
      • Use <occurrence> subelement
        • external/internal resources: <resourceRef> or <resourceData>
      Declaration:
          <!ELEMENT occurrence ( instanceOf?, scope?, ( resourceRef | resourceData ) ) >
          <!ATTLIST occurrence id ID #IMPLIED >
      Example:
          <topic id="la-boheme">
            <occurrence>
              <instanceOf><topicRef xlink:href="#homepage"/></instanceOf>
              <resourceRef xlink:href="http://www.opera.it/Opere/La-Boheme/La-Boheme.html"/>
            </occurrence>
            <occurrence>
              <instanceOf><topicRef xlink:href="#premiere-date"/></instanceOf>
              <resourceData>1896 (1 Feb)</resourceData>
            </occurrence>
          </topic>

  30. Occurrence – LTM Syntax
          {topic-id, occurrence-type, [URL | data]}
          {la-boheme, homepage, "http://www.opera.it/Opere/La-Boheme/La-Boheme.html"}
          {la-boheme, premiere-date, [[1896 (1 Feb)]]}

  31. Topic – Complete XTM 1.0 Syntax
          <topic id="la-boheme">
            <instanceOf><topicRef xlink:href="#opera"/></instanceOf>
            <baseName>
              <baseNameString>La Bohème</baseNameString>
              <variant>
                <parameters>
                  <subjectIndicatorRef
                    xlink:href="http://www.topicmaps.org/xtm/1.0/core.xtm#sort"/>
                </parameters>
                <variantName><resourceData>Boheme, La</resourceData></variantName>
              </variant>
            </baseName>
            <occurrence>
              <instanceOf><topicRef xlink:href="#homepage"/></instanceOf>
              <resourceRef xlink:href="http://www.opera.it/Opere/La-Boheme/La-Boheme.html"/>
            </occurrence>
            <occurrence>
              <instanceOf><topicRef xlink:href="#premiere-date"/></instanceOf>
              <resourceData>1896 (1 Feb)</resourceData>
            </occurrence>
          </topic>

  32. Topic – Complete LTM Syntax
          [la-boheme : opera = "La Bohème"; "Boheme, La"]
          {la-boheme, homepage, "http://www.opera.it/Opere/La-Boheme/La-Boheme.html"}
          {la-boheme, premiere-date, [[1896 (1 Feb)]]}
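
The complete LTM form above is regular enough to generate from data. The following is a minimal sketch, assuming a hypothetical `topic_to_ltm` function and a simple `(occurrence type, value, is_url)` tuple layout; it does no escaping of quotes inside names, which a real serializer would have to handle.

```python
def topic_to_ltm(topic_id, types, name, sort_name=None, occurrences=()):
    """Render one topic and its occurrences as LTM text (illustrative only)."""
    head = topic_id
    if types:
        head += " : " + " ".join(types)
    head += ' = "%s"' % name
    if sort_name:
        head += '; "%s"' % sort_name
    lines = ["[" + head + "]"]
    for occ_type, value, is_url in occurrences:
        # External resources are quoted URLs; inline data goes in [[ ... ]].
        resource = '"%s"' % value if is_url else "[[%s]]" % value
        lines.append("{%s, %s, %s}" % (topic_id, occ_type, resource))
    return "\n".join(lines)


ltm_text = topic_to_ltm(
    "la-boheme",
    ["opera"],
    "La Bohème",
    sort_name="Boheme, La",
    occurrences=[
        ("homepage", "http://www.opera.it/Opere/La-Boheme/La-Boheme.html", True),
        ("premiere-date", "1896 (1 Feb)", False),
    ],
)
```

The result is three lines of LTM, against roughly twenty lines for the same topic in XTM 1.0.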

  33. Association – XTM 1.0 Syntax
      Declaration:
          <!ELEMENT association (instanceOf?, scope?, member+)>
          <!ATTLIST association id ID #IMPLIED>
          <!ELEMENT member (roleSpec?, (topicRef | ...)+) >
          <!ATTLIST member id ID #IMPLIED>
          <!ELEMENT roleSpec (topicRef | ...) >
      Example:
          <association>
            <instanceOf><topicRef xlink:href="#composed-by"/></instanceOf>
            <member>
              <roleSpec><topicRef xlink:href="#composer"/></roleSpec>
              <topicRef xlink:href="#puccini"/>
            </member>
            <member>
              <roleSpec><topicRef xlink:href="#work"/></roleSpec>
              <topicRef xlink:href="#tosca"/>
            </member>
          </association>

  34. Association – LTM Syntax
          assoc-type ( role-player, role-player, ... )
          composed-by( puccini , tosca )
      Note 1: There can be more than two role-players in an association. We’ll talk about that next week.
      Note 2: The above is an oversimplification, because we have not yet talked about role types. We’ll do that next week. The exact syntax is as follows:
          assoc-type ( role-player : role-type, role-player : role-type, ... )
          composed-by( puccini : composer, tosca : work )
      When omitted, the role type will be assumed to be identical to the type of the role-playing topic. This can be a useful short-hand and we will use it for now, but it is not always what you want...
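
The two forms of the association syntax can be read with one small parser. This is a sketch for the simple cases shown on the slide only: the regular expression and the `parse_association` name are assumptions, and scope and any other LTM features are not handled. A missing role type is returned as `None` rather than being filled in from the topic's type.

```python
import re

# assoc-type ( player [: role-type], player [: role-type], ... )
ASSOC = re.compile(r"^\s*([\w-]+)\s*\(\s*(.*?)\s*\)\s*$")


def parse_association(line):
    """Split an LTM association line into its type and (player, role type) pairs."""
    match = ASSOC.match(line)
    if not match:
        raise ValueError("not an association: %r" % line)
    assoc_type, body = match.groups()
    roles = []
    for part in body.split(","):
        player, _, role_type = (piece.strip() for piece in part.partition(":"))
        roles.append((player, role_type or None))
    return assoc_type, roles
```

Parsing `composed-by( puccini , tosca )` yields two players with no role types, while the exact form yields `composer` and `work` as the roles, which is the distinction Note 2 draws.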

  35. Subject Identity – XTM 1.0 Syntax
      Declaration:
          <!ELEMENT topic (instanceOf*, subjectIdentity?,...)>
          <!ELEMENT subjectIdentity (resourceRef?, (topicRef | subjectIndicatorRef)*) >
      Example:
          <!-- Refer to a resource as subject: -->
          <topic id="foo">
            <subjectIdentity>
              <resourceRef xlink:href="http://www.ontopia.net"/>
            </subjectIdentity>
            <baseName>
              <baseNameString>The Ontopia Website</baseNameString>
            </baseName>
          </topic>
          <!-- Refer to a subject indicator: -->
          <topic id="bar">
            <subjectIdentity>
              <subjectIndicatorRef xlink:href="http://www.ontopia.net/about.html"/>
            </subjectIdentity>
            <baseName>
              <baseNameString>Ontopia</baseNameString>
            </baseName>
          </topic>

  36. Subject Identity – LTM Syntax
          [topic-id = names %subject-address-URL]
          [topic-id = names @subject-indicator-URL]
          /* Refer to a resource as subject: */
          [foo = "The Ontopia Website" %"http://www.ontopia.net" ]
          /* Refer to a subject indicator: */
          [bar = "Ontopia" @"http://www.ontopia.net/about.html"]

  37. Scope – XTM 1.0 Syntax
          <!-- "scope" subelements on baseName, occurrence, and association
               (also "parameters" on variantName) -->
          <topic id="composed-by">
            <baseName>
              <baseNameString>composed by</baseNameString>
            </baseName>
            <baseName>
              <scope><topicRef xlink:href="#composer"/></scope>
              <baseNameString>composer of</baseNameString>
            </baseName>
          </topic>
          <topic id="la-boheme2">
            <baseName>
              <baseNameString>La Bohème (Leoncavallo)</baseNameString>
            </baseName>
            <baseName>
              <scope><topicRef xlink:href="#leoncavallo"/></scope>
              <baseNameString>La Bohème</baseNameString>
            </baseName>
          </topic>

  38. Scope – LTM syntax
          (name or occurrence or association) / scoping-topic(s)
          [composed-by = "composed by" = "composer of" / composer ]
          [la-boheme1 = "La Bohème (Puccini)" = "La Bohème" / puccini ]
          [la-boheme2 = "La Bohème (Leoncavallo)" = "La Bohème" / leoncavallo ]
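
One plausible way an application might use scoped names is to prefer a name whose scope fits the current context and fall back to the unscoped name otherwise. The selection rule, the `name_in_scope` function and the data layout below are assumptions for this sketch; the slides themselves only show the syntax, and scope is covered in more detail in the next lecture.

```python
def name_in_scope(names, context):
    """Pick a display name for the given context.

    names:   list of (text, scope) pairs, where scope is a set of topic ids
    context: set of topic ids describing the current viewing context
    """
    # Prefer a scoped name whose scoping topics are all in the context.
    for text, scope in names:
        if scope and scope <= context:
            return text
    # Otherwise fall back to the first unscoped name, if any.
    for text, scope in names:
        if not scope:
            return text
    return None


la_boheme1 = [("La Bohème (Puccini)", set()), ("La Bohème", {"puccini"})]
la_boheme2 = [("La Bohème (Leoncavallo)", set()), ("La Bohème", {"leoncavallo"})]
```

When the user is already looking at Leoncavallo, the short name is unambiguous; without that context, the longer name keeps the two operas apart.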

  39. Demo: Creating a topic map

  40. Home assignment
      • Prerequisites
        • You have installed Java and the OKS Samplers
        • You know the basics of LTM
          • http://www.ontopia.net/download/ltm.html
      • Create your first topic map
        • Decide what domain you want to cover
        • Write LTM in a text editor (Notepad, TextPad, emacs, ...)
        • Keep it in its own directory
        • Copy to .../apache-tomcat/webapps/omnigator/WEB-INF/topicmaps for testing in the Omnigator
        • Use Reload function

  41. Your own topic map
      • Choose something that really interests you
        • It’s much more fun than something boring!
      • Some ideas:
        • Sport (football, cricket, ...)
        • Culture (music, film, literature, theatre, ...)
        • Study courses
        • Project management
        • Conference website
        • Languages
        • Geography
      • This first topic map is your own personal one
        • The next one will be a group project for term assessment
      • Requirements:
        • Minimum 4 topic types, 4 association types, 4 occurrence types
        • Minimum 10 topics, 20 associations, 10 occurrences
        • Send to pepper.steve@gmail.com by Monday 29 September
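
Before sending the file in, the minimum counts above can be checked roughly with a few regular expressions. This is a crude sketch, not an LTM parser: it only recognizes the constructs shown in this lecture, counts only topics that are explicitly declared in square brackets, and the function names are invented for the example. Loading the file in the Omnigator remains the real test.

```python
import re

# Minimums quoted from the assignment requirements.
MINIMUMS = {
    "topic types": 4, "association types": 4, "occurrence types": 4,
    "topics": 10, "associations": 20, "occurrences": 10,
}


def count_constructs(ltm):
    """Roughly count the constructs in a piece of LTM text."""
    text = re.sub(r"/\*.*?\*/", "", ltm, flags=re.S)      # drop comments
    occurrences = re.findall(r"\{([^}]*)\}", text)         # {topic, type, resource}
    text = re.sub(r"\{[^}]*\}", "", text)
    topics = re.findall(r"\[([^\]]*)\]", text)             # [id : types = "name" ...]
    text = re.sub(r"\[[^\]]*\]", "", text)
    associations = re.findall(r"([\w-]+)\s*\(", text)      # assoc-type( ... )

    topic_ids, topic_types = set(), set()
    for body in topics:
        head = re.split(r"[=@%]", body, maxsplit=1)[0]     # part before names/identity
        topic_id, _, types = head.partition(":")
        topic_ids.add(topic_id.strip())
        topic_types.update(types.split())
    occurrence_types = {
        occ.split(",")[1].strip() for occ in occurrences if occ.count(",") >= 2
    }
    return {
        "topic types": len(topic_types),
        "association types": len(set(associations)),
        "occurrence types": len(occurrence_types),
        "topics": len(topic_ids),
        "associations": len(associations),
        "occurrences": len(occurrences),
    }


def shortfalls(ltm):
    """Return {requirement: (found, minimum)} for every unmet minimum."""
    counts = count_constructs(ltm)
    return {k: (counts[k], MINIMUMS[k]) for k in MINIMUMS if counts[k] < MINIMUMS[k]}


sample = """
/* a tiny, incomplete topic map */
[puccini : composer = "Puccini"]
[la-boheme : opera = "La Bohème"; "Boheme, La"]
composed-by( puccini : composer, la-boheme : work )
{puccini, biography, "http://www.opera.net/puccini/bio.html"}
{la-boheme, premiere-date, [[1896 (1 Feb)]]}
"""
```

Running `shortfalls(sample)` on this five-line map reports every requirement as unmet, each with the count found next to the minimum.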

  42. Next lecture
      • Monday September 22
        • Same time, same place
      • Agenda
        • Advanced features (roles, scope, identity, reification)
        • Help with home assignment
