Topic Maps • ISO standard • Describing knowledge structures and associating them with information resources • An enabling technology for knowledge management • Providing powerful new ways of navigating large and interconnected corpora
Introduction • A book without an index is like a country without a map • Tools that compensate for the sheer speed of modern communications: • In the realm of transportation: GPS (Global Positioning System) • In the realm of publishing and information management: Topic Maps • Topic Maps try to provide an efficient solution for indexing online documents: • Back-of-book indexing • Full-text indexing • Topic Maps (indexing for web documents) • Topic Maps is an approach that brings together the best of several fields: • Traditional indexing • Library science • Knowledge representation
What is Indexing • A map of the knowledge contained in a book • A book may contain multiple indexes: • Index of names • Index of places • Index of subjects
Example index entries:
  Madama Butterfly, 70-71, 234-236, 326
  Puccini, Giacomo, 69-71
  soprano, 41-42, 337
  Tosca, 26, 70, 274-276, 326
Glossaries and thesauri
Glossary:
  bass: The lowest of the male voice types. Basses usually play priests or fathers in operas, but they occasionally get star turns as the Devil.
  diva: Literally, “goddess”; a female opera star. Sometimes refers to a fussy, demanding opera star. See also prima donna.
  first lady: See prima donna.
  Leitmotif (German, “LIGHT-mo-teef”): A musical theme assigned to a main character or idea of an opera; invented by Richard Wagner.
  prima donna (“PREE-mah DOAN-na”): Italian for “first lady”. The singer who plays the heroine, the main female character in an opera; or anyone who believes the world revolves around her.
  soprano: The female voice category with the highest notes and the highest paycheck.
Thesaurus:
  soprano
  definition: The highest category of female (or artificial male) voice.
  broader term(s): vocalist, singer
  narrower term(s): lyric soprano, dramatic soprano, coloratura soprano
  related term(s): mezzo-soprano, treble
Basic Model • TAO • Topics • Associations • Occurrences
Topic • A topic in TM can be anything: • A person, an entity, ..., an object • A topic reifies a subject: there is a one-to-one relationship between topics and subjects • Topic types • Any topic can be an instance of zero or more topic types (such as names, works, or places) • The standard defines topic types as topics themselves (name, work, and place must be declared explicitly as topics) • Topics contain: • Names • Occurrences • Roles in associations
Topic names • A topic may have zero or more names, each of which is considered valid within a certain scope. • Normally topics have explicit names. • However, topics do not always have names • Each name may exist in multiple forms: • A name should always have exactly one base name, but can have one or more variant names • TM provides the facility to assign multiple base names to a single topic and to provide variants of each base name for use in specific processing contexts (such as language, domain, geographical area, historical period, etc.) • Base name • The base form of a topic name; it is always a string • Variant name • An alternative form of a base name that is optimized for a particular computational purpose, such as sorting or display • Parameters • Information, in the form of a set of topics, that expresses the appropriate processing context for a variant name
Occurrence • A topic can be linked to one or more information resources that are deemed to be relevant to the topic in some way. Such resources are called occurrences of the topic. • An occurrence could be: • A monograph • An article • A picture or video • A commentary on the topic • Or any of a host of other forms • Occurrence role or occurrence role type • Provides typed information (a category) for an occurrence • Such as monograph, article, illustration, mention, commentary
Association • Topic associations describe the relationships between topics • Such as • “Tosca was written by Puccini” • “Tosca takes place in Rome” • “Puccini was born in Lucca” • “Lucca is in Italy” • “Puccini was influenced by Verdi” • Association types • Types of association • Make it possible to group together the set of topics • written_by, takes_place_in, born_in, is_in (or geographical containment), and influenced_by • Association roles • Each topic that participates in an association plays a role in that association, called the association role • E.g., in “Puccini was born in Lucca”, expressed by the association between Puccini and Lucca, the roles might be “person” and “place”
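The five example associations can be written down as data to show how association types and roles fit together. This is a minimal sketch, not part of any Topic Maps API: the `Association` class, the helper functions, and every role name other than "person" and "place" are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """A typed relationship; `members` holds (role, topic) pairs."""
    assoc_type: str
    members: tuple


def make_association(assoc_type, **roles):
    # Sort the pairs so the same association compares equal
    # regardless of the order in which the roles were given.
    return Association(assoc_type, tuple(sorted(roles.items())))


# Role names other than "person" and "place" are hypothetical.
associations = [
    make_association("written_by", work="tosca", author="puccini"),
    make_association("takes_place_in", work="tosca", place="rome"),
    make_association("born_in", person="puccini", place="lucca"),
    make_association("is_in", containee="lucca", container="italy"),
    make_association("influenced_by", influenced="puccini", influencer="verdi"),
]


def topics_playing(role, assoc_type, associations):
    """Topics that play `role` in associations of type `assoc_type`."""
    return sorted(
        topic
        for assoc in associations
        if assoc.assoc_type == assoc_type
        for member_role, topic in assoc.members
        if member_role == role
    )


def association_types_of(topic, associations):
    """Types of all associations in which `topic` plays any role."""
    return [
        assoc.assoc_type
        for assoc in associations
        if any(player == topic for _, player in assoc.members)
    ]
```

For instance, `topics_playing("place", "born_in", associations)` looks up the birthplace side of every born_in association, which is the kind of grouping that association types make possible.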
Subject identity • Topic maps try to achieve a one-to-one mapping between topics and subjects. However, the same subject is sometimes represented by more than one topic; the identity of a subject can be established in two ways: • Subject identity (when the subject is an addressable information resource, its identity should be established directly through its address) • Subject indicator (subject descriptor) • When a subject cannot be addressed, use an indicator to provide a positive, unambiguous indication of the identity of the subject. • The address (URI) of such an indicator resource can serve as the subject identifier.
Facets • Provide a mechanism for assigning property-value pairs to information resources • A facet is a property; its values are called facet values. • E.g., language, security, applicability, user level, online/offline, etc. • Facets can be used for filtering • E.g., select the resources whose language is Italian and whose user level is secondary school student.
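Facet-based filtering can be sketched as a match over property-value pairs. The resource ids, the facet values other than those named on the slide, and the function `filter_by_facets` are all hypothetical; this only illustrates the idea, not a standard interface.

```python
# Hypothetical resources, each described by facet -> facet value pairs.
resources = {
    "doc1": {"language": "Italian", "user_level": "secondary school student"},
    "doc2": {"language": "English", "user_level": "secondary school student"},
    "doc3": {"language": "Italian", "user_level": "expert"},
}


def filter_by_facets(resources, **wanted):
    """Return ids of resources whose facet values match every requested pair."""
    return sorted(
        resource_id
        for resource_id, facets in resources.items()
        if all(facets.get(name) == value for name, value in wanted.items())
    )
```

Calling `filter_by_facets(resources, language="Italian", user_level="secondary school student")` corresponds to the filter described on the slide.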
Scope • Specifies context (a topic has a certain meaning in a certain context) • Scope is defined in terms of themes (a theme is a member of the set of topics used to specify a scope) • Scope helps to provide different navigation views for different users
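One possible way to use scope for filtering is to keep only the names whose themes all appear in the user's current context. The matching rule below is an assumption chosen for illustration (the slides do not prescribe one), and the themes "singular" and "plural" are example data.

```python
# Each name carries the set of themes (topics) that make up its scope.
# An empty scope is treated here as "valid in every context" (an assumption).
names = [
    ("room", frozenset({"singular"})),
    ("rooms", frozenset({"plural"})),
    ("chamber", frozenset()),
]


def names_in_context(names, context):
    """Return the names whose scope themes are all present in `context`."""
    context = set(context)
    return [name for name, scope in names if scope <= context]
```

Different users can then be given different views simply by passing different contexts, for example `names_in_context(names, {"plural"})`.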
XTM (Topic Map in XML)
<topic id="tempest">
  <instanceOf><topicRef xlink:href="#play"/></instanceOf>
  <baseName>
    <baseNameString>The Tempest</baseNameString>
  </baseName>
  <occurrence>
    <instanceOf><topicRef xlink:href="#plain-text-format"/></instanceOf>
    <resourceRef xlink:href="ftp://www.gutenberg.org/pub/gutenberg/etext97/1ws4110.txt"/>
  </occurrence>
</topic>
<association>
  <instanceOf><topicRef xlink:href="#written-by"/></instanceOf>
  <member>
    <roleSpec><topicRef xlink:href="#author"/></roleSpec>
    <topicRef xlink:href="#shakespeare"/>
  </member>
  <member>
    <roleSpec><topicRef xlink:href="#work"/></roleSpec>
    <topicRef xlink:href="#hamlet"/>
  </member>
</association>
Merging • Merging two topic maps • When two topic maps are merged, any topics that the application determines to have the same subject are merged, and any duplicate associations are removed. • Merging two topics • When two topics are merged, the result is a single topic whose characteristics are the union of the characteristics of the original topics, with duplicates removed. • Two topics are always deemed to have the same subject if • They have one or more subject indicators in common, • They reify the same addressable subject, or • They have the same base name in the same scope
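The three same-subject rules and the union-of-characteristics rule can be sketched as follows. The `Topic` class and its field names are hypothetical simplifications (real topics also carry variants, types, and roles), and the example URL in the usage note is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Topic:
    subject_indicators: set = field(default_factory=set)
    subject_address: Optional[str] = None  # set when the subject is addressable
    base_names: set = field(default_factory=set)  # (name, frozenset of themes)
    occurrences: set = field(default_factory=set)


def same_subject(a, b):
    """Apply the three rules under which topics always share a subject."""
    if a.subject_indicators & b.subject_indicators:
        return True  # one or more subject indicators in common
    if a.subject_address is not None and a.subject_address == b.subject_address:
        return True  # same addressable subject
    return bool(a.base_names & b.base_names)  # same base name in the same scope


def merge_topics(a, b):
    """Union of characteristics; sets drop duplicates automatically."""
    return Topic(
        subject_indicators=a.subject_indicators | b.subject_indicators,
        subject_address=a.subject_address or b.subject_address,
        base_names=a.base_names | b.base_names,
        occurrences=a.occurrences | b.occurrences,
    )


def merge_topic_lists(topics):
    """Merge every group of topics that share a subject."""
    merged = []
    for topic in topics:
        # A new topic may link several previously separate topics,
        # so keep folding matches in until none remain.
        while True:
            match = next((m for m in merged if same_subject(m, topic)), None)
            if match is None:
                break
            merged.remove(match)
            topic = merge_topics(match, topic)
        merged.append(topic)
    return merged
```

As a usage example, a topic identified by the placeholder indicator `http://example.org/psi/puccini` with the unscoped base name "Puccini" would be merged with another topic that has the same unscoped base name, and the result would carry the occurrences of both.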
XTM Syntax • <topicRef>: Reference to a Topic element • <subjectIndicatorRef>: Reference to a Subject Indicator • <scope>: Reference to Topic(s) that comprise the Scope • <instanceOf>: Points to a Topic representing a class • <topicMap>: Topic Map document element • <topic>: Topic element • <subjectIdentity>: Subject reified by Topic • <baseName>: Base Name of a Topic • <baseNameString>: Base Name String container • <variant>: Alternate forms of Base Name • <variantName>: Container for Variant Name • <parameters>: Processing context for Variant • <association>: Topic Association • <member>: Member in Topic Association • <roleSpec>: Points to a Topic serving as an Association Role • <occurrence>: Resources regarded as an Occurrence • <resourceRef>: Reference to a Resource • <resourceData>: Container for Resource data • <mergeMap>: Merge with another Topic Map
XTM Examples • An XTM topic map instance is an XML document and always starts with <topicMap> and ends with </topicMap> <topicMap> [...] </topicMap>
XTM Examples • Base name and identifier • The id attribute is used internally within the topic map instance document to refer to the topic element. • The base name of a topic is contained within an element called baseNameString, itself contained in an element called baseName.
<topic id="t1">
  <baseName>
    <baseNameString>New York</baseNameString>
  </baseName>
</topic>
The topic's base name is "New York".
XTM Examples • Variant name • In this example, the topic whose base name is New York has a variant name "NYC" used for wireless devices. The parameters element contains the information indicating the context of use of the variant name.
<topic id="t1">
  <baseName>
    <baseNameString>New York</baseNameString>
    <variant>
      <parameters>
        <topicRef xlink:href="http://www.topicmaps.org/procs.xtm#psi-display"/>
        <subjectIndicatorRef xlink:href="http://www.whatever.com/wap"/>
      </parameters>
      <variantName>
        <resourceData>NYC</resourceData>
      </variantName>
    </variant>
  </baseName>
</topic>
XTM Examples • Multiple names • In this example, the topic has two base names: New York and New York City.
<topic id="t1">
  <baseName>
    <baseNameString>New York</baseNameString>
  </baseName>
  <baseName>
    <baseNameString>New York City</baseNameString>
  </baseName>
</topic>
XTM Examples • Multiple names with scopes • In this example, the topic has two base names with different scopes.
<topic id="t1">
  <baseName>
    <scope><topicRef xlink:href="http://www.x.com/index.xtm#singular"/></scope>
    <baseNameString>room</baseNameString>
  </baseName>
  <baseName>
    <scope><topicRef xlink:href="http://www.x.com/index.xtm#plural"/></scope>
    <baseNameString>rooms</baseNameString>
  </baseName>
</topic>
XTM Examples • Occurrence • In this example, there is a photograph described as an occurrence of the topic "New York". Note that "Photograph" is a pointer to a topic presumably explaining what a photograph is.
<topic id="t1">
  <baseName>
    <baseNameString>New York</baseNameString>
  </baseName>
  <occurrence>
    <instanceOf>
      <topicRef xlink:href="#Photograph"/>
    </instanceOf>
    <resourceRef xlink:href="doc1#n001"/>
  </occurrence>
</topic>
XTM Examples • Topic types • The topic with the base names "New York" and "Big Apple" has as its type the topic with the unique identifier "cty", which is the defining topic for the notion of "city". That topic has three different base names ("city", "ville" and "ciudad").
<topic id="t1">
  <instanceOf><topicRef xlink:href="#cty"/></instanceOf>
  <baseName><baseNameString>New York</baseNameString></baseName>
  <baseName><baseNameString>Big Apple</baseNameString></baseName>
</topic>
<topic id="cty">
  <baseName><baseNameString>city</baseNameString></baseName>
  <baseName><baseNameString>ville</baseNameString></baseName>
  <baseName><baseNameString>ciudad</baseNameString></baseName>
</topic>
XTM Examples • Association • This example associates the two topics named "New York" and "Brooklyn Bridge". The semantics of the association is: "When in New York, visit Brooklyn Bridge."
<topic id="t1">
  <baseName><baseNameString>New York</baseNameString></baseName>
</topic>
<topic id="b298">
  <baseName><baseNameString>Brooklyn Bridge</baseNameString></baseName>
</topic>
<association>
  <member>
    <roleSpec><topicRef xlink:href="#when-in"/></roleSpec>
    <topicRef xlink:href="#t1"/>
  </member>
  <member>
    <roleSpec><topicRef xlink:href="#visit"/></roleSpec>
    <topicRef xlink:href="#b298"/>
  </member>
</association>
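To show how an application might read such an association, here is a small sketch using Python's standard xml.etree.ElementTree. It assumes the fragment is wrapped in a topicMap element that declares the XTM 1.0 and XLink namespaces (the slide fragments omit these declarations), and the function `read_associations` is a hypothetical helper, not part of any Topic Maps toolkit.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; the slide fragments do not declare them.
XTM_NS = "http://www.topicmaps.org/xtm/1.0/"
XLINK_NS = "http://www.w3.org/1999/xlink"

DOC = f"""<topicMap xmlns="{XTM_NS}" xmlns:xlink="{XLINK_NS}">
  <topic id="t1">
    <baseName><baseNameString>New York</baseNameString></baseName>
  </topic>
  <topic id="b298">
    <baseName><baseNameString>Brooklyn Bridge</baseNameString></baseName>
  </topic>
  <association>
    <member>
      <roleSpec><topicRef xlink:href="#when-in"/></roleSpec>
      <topicRef xlink:href="#t1"/>
    </member>
    <member>
      <roleSpec><topicRef xlink:href="#visit"/></roleSpec>
      <topicRef xlink:href="#b298"/>
    </member>
  </association>
</topicMap>"""


def read_associations(xtm_text):
    """Return one {role: player name} dict per association in the document."""
    ns = {"xtm": XTM_NS}
    href = f"{{{XLINK_NS}}}href"  # fully qualified attribute name
    root = ET.fromstring(xtm_text)

    # Map each topic id to its first base name string.
    names = {
        topic.get("id"): topic.findtext(
            "xtm:baseName/xtm:baseNameString", namespaces=ns
        )
        for topic in root.findall("xtm:topic", ns)
    }

    result = []
    for assoc in root.findall("xtm:association", ns):
        members = {}
        for member in assoc.findall("xtm:member", ns):
            role = member.find("xtm:roleSpec/xtm:topicRef", ns).get(href).lstrip("#")
            # The direct child topicRef is the topic playing the role.
            player = member.find("xtm:topicRef", ns).get(href).lstrip("#")
            members[role] = names.get(player, player)
        result.append(members)
    return result
```

Running `read_associations(DOC)` resolves each member's topic reference to its base name, so the "when-in" role maps to New York and the "visit" role to Brooklyn Bridge.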
References: • http://www.topicmaps.org/xtm/ • Asun Gomez-Perez, Mariano Fernandez-Lopez and Oscar Corcho (2004): Ontology Engineering. Springer