Introduction to Topic Maps (2)
Understanding the details
Steve Pepper
pepper.steve@gmail.com
Oslo University College, 2007-09-22
www.ontopedia.net
Course agenda
• Week 37 – 09-08 Introduction to Topic Maps – Part 1
• Week 38 – 09-15 Creating a topic map
• Week 39 – 09-22 Introduction to Topic Maps – Part 2
• Week 42 – 10-13 Ontology-driven editing
• Week 43 – 10-20 The machinery of Topic Maps
• Week 46 – 11-10 (Semantic Web)
• Week 48 – 11-24 (Ontologies)
Terminology:
• Topic Maps: The technology and the standard
• topic maps: The artefacts (documents) we create
Today’s agenda
• Advanced modeling issues
  • Types and type hierarchies
  • Association roles and arity
  • Variant names and name types
  • Scope
  • Identity
• Q&A on your personal topic maps
More about topic types
Topic types
• A topic type defines a class of things
  • It’s a particular kind of category that has instances
  • You can also think of it as a set of things that have one or more properties in common
• Rule #1: If it doesn’t have instances, it isn’t a type!
  • “Music” is a category, but not a type (there are no instances)
    • nothing “is a” music
  • “Opera” is a type, because there are things which are operas
    • Tosca “is an” opera
• A diagnostic for deciding if ‘foo’ is a type:
  • If you can think of things which are ‘foos’ the answer is yes
    • But be careful: Is ‘wine’ a type?
  • If the answer is no, ask what kind of thing ‘foo’ is
    • Now, that really is a type!
ISA and type-instance
• The relationship between a type and its instance is actually a special kind of association
  • We call it (guess what): a type-instance relationship
  • It’s also often called an ISA relationship
• It can be represented as an association in XTM or LTM
  • But there’s no real point
  • Use the syntactic shortcut instead:
    [tosca : opera]    (read: tosca is an opera)
Rules of thumb for topic types
• Choose an appropriate level of generality
  • “Countries” is better than “Countries in South-East Asia”
    • The domain of the topic map tells you which countries it includes
    • If it doesn’t, an association would be a better solution:
      located-in(Thailand, South-East_Asia)
  • But don’t make it so general as to be useless
    • “Places” instead of “countries” would mix countries and cities
• Keep the name short
  • That makes it easier to display
• Use the singular form
  • Experience shows this to be most useful, so “Country”, not “Countries”
• Use initial capitals
  • A matter of taste, but I think it looks most tidy
Type hierarchies
• Some topic types can be arranged in hierarchies
  • Type hierarchies are a natural way to order parts of the world
  • Humans are quite familiar with tree structures
• Type hierarchies provide
  • more user-friendly navigation
  • more powerful querying/inferencing
  • more compact schemas and ontologies
  • greater clarity about the relationships between types
• Use hierarchies, but beware of two pitfalls:
  • Not all hierarchies are type hierarchies...
  • It’s easy to confuse your ISAs and your AKOs…
Type hierarchies: AKO
• A dog is A Kind Of canine, a canine is A Kind Of mammal, etc.
[Diagram: type hierarchy. Mammal has subtypes Primate and Canine; Primate has subtypes Chimp and Homo sapiens; Canine has subtypes Dog and Wolf.]
Dragon #1: Mixing ISAs and AKOs
• A valid inference:
  • Steve is a homo sapiens (type-instance, ISA)
  • A homo sapiens is a mammal (supertype-subtype, AKO)
  • Therefore: Steve is a mammal
• An invalid inference:
  • Steve is a homo sapiens (type-instance, ISA)
  • Homo sapiens is a species (type-instance, ISA)
  • Therefore: *Steve is a species
Types, subtypes and instances
[Diagram: the types Mammal, Primate, Canine, Dog, Wolf, Chimp and Homo sapiens, with Steve and Nils shown as instances. Legend: supertype-subtype links connect types to types; type-instance links connect types to instances.]
How type hierarchies work
• The superclass-subclass relationship has defined semantics
  • Therefore: make sure you use it correctly
  • Software (tolog, for example) will assume you mean what you say
  • If you abuse the semantics you will get incorrect results!
• If A is a superclass of B, then
  • Both A and B must be classes
  • If C is an instance of B, it must also be an instance of A
  • If C is a subclass of B, it must also be a subclass of A (in which case an instance of C is also an instance of B and an instance of A)
• If in doubt define your own association type
  • merging it with superclass/subclass later is trivial
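The rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two dictionaries and the function names are hypothetical and do not come from tolog or any Topic Maps engine. The sketch follows supertype-subtype (AKO) links transitively but follows type-instance (ISA) links one step only, which is what keeps the "Steve is a species" mistake out.

```python
from collections import defaultdict

# Hypothetical in-memory store, not the API of any Topic Maps engine.
SUPERTYPES = defaultdict(set)  # subtype  -> direct supertypes (AKO)
TYPES = defaultdict(set)       # instance -> direct types (ISA)


def add_subtype(subtype, supertype):
    """Record that `subtype` is A Kind Of `supertype`."""
    SUPERTYPES[subtype].add(supertype)


def add_instance(instance, topic_type):
    """Record that `instance` ISA `topic_type`."""
    TYPES[instance].add(topic_type)


def all_supertypes(topic_type):
    """Follow AKO links transitively and return every supertype."""
    seen = set()
    stack = [topic_type]
    while stack:
        current = stack.pop()
        for parent in SUPERTYPES.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def all_types(instance):
    """Direct types plus all their supertypes. ISA is NOT followed transitively."""
    result = set()
    for direct in TYPES.get(instance, ()):
        result.add(direct)
        result |= all_supertypes(direct)
    return result


add_subtype("homo_sapiens", "primate")
add_subtype("primate", "mammal")
add_instance("steve", "homo_sapiens")
add_instance("homo_sapiens", "species")  # homo sapiens is also an instance

print(sorted(all_types("steve")))
```

With this data, steve comes out as an instance of homo_sapiens, primate and mammal, while species applies only to homo_sapiens itself.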
Being both type and instance
• Most modelling paradigms distinguish between “type” and “instance”
  • In most paradigms something cannot be both
• In Topic Maps something can be both type and instance
  • (or class/category and individual)
• For example, homo sapiens can be both
  • a type (supertype=primate, instance=Steve), and
  • an instance (type=species)
• So be careful!
Representing a type hierarchy
• Use associations between typing topics
  • subtypeOf(homo_sapiens : subtype, primate : supertype)
  • subtypeOf(primate : subtype, mammal : supertype)
• XTM 1.0 defined identifiers for these three subjects
  • subtypeOf (or superclass-subclass): http://www.topicmaps.org/xtm/1.0/core.xtm#superclass-subclass
  • supertype (or superclass): http://www.topicmaps.org/xtm/1.0/core.xtm#superclass
  • subtype (or subclass): http://www.topicmaps.org/xtm/1.0/core.xtm#subclass
• Topic Maps software understands these and implements the semantics for you
Type hierarchies in LTM

/* Techquila hierarchy PSIs */
[hierarchical-relation-type = "Hierarchical relation type"
    @"http://www.techquila.com/psi/hierarchy/#hierarchical-relation-type"]
[superordinate-role-type = "Superordinate role type"
    @"http://www.techquila.com/psi/hierarchy/#superordinate-role-type"]
[subordinate-role-type = "Subordinate role type"
    @"http://www.techquila.com/psi/hierarchy/#subordinate-role-type"]

/* XTM superclass-subclass PSIs */
[subtypeOf : hierarchical-relation-type = "Subtype of"
    = "Supertype of" / supertype
    @"http://www.topicmaps.org/xtm/1.0/core.xtm#superclass-subclass"]
[subtype : subordinate-role-type = "Subtype"
    @"http://www.topicmaps.org/xtm/1.0/core.xtm#subclass"]
[supertype : superordinate-role-type = "Supertype"
    @"http://www.topicmaps.org/xtm/1.0/core.xtm#superclass"]

/* An example type hierarchy */
subtypeOf( composer : subtype , musician : supertype )
subtypeOf( conductor : subtype , musician : supertype )
subtypeOf( cellist : subtype , musician : supertype )
Dragon #2: Non-type hierarchies
• Not all hierarchies are type hierarchies
• For example:
  • geographical containment
  • part of relationships
  • subject classifications
• These relationships are not supertype-subtype
  • located in
  • part of
  • subtopic of
• So again, be careful!
  • An opera is NOT a kind of music...
  • Norway is NOT a kind of Europe...
  • A piston is NOT a kind of submarine...
[Diagram: three hierarchies that are not type hierarchies. Europe: Norway (Oslo, Bergen), Sweden (Stockholm, Göteborg). Submarine: Engine (Piston, Pump), Body (Hatch, Turret). Music: Classical music (Opera, Choral music), Popular music (Rock music, Reggae).]
More about association types – and all about role types
Topics play roles in associations
• Associations have no direction
  • They represent relationships and are inherently multidirectional
    • “Puccini was born in Lucca”
    • “Lucca was the birthplace of Puccini”
  • Two ways to express the same relationship
• Impression of direction caused by use of natural language
  • One of the topics viewed as the subject and the other as the object
• Instead of direction, associations use roles
  • Puccini plays the role of person and Lucca plays the role of place
  • person and place are association role types (or “role types”, for short)
• Labels are assigned based on role perspective
[Diagram: puccini (role: person) and lucca (role: place) joined by a single association, labelled “born in” seen from puccini and “birthplace of” seen from lucca.]
Anatomy of an association
• Role types characterize the nature of the subject’s involvement in the relationship
• They are also topics
[Diagram: a born-in association with two roles, person and place. The person role is played by the topic Puccini (topic type: composer) and the place role by the topic Lucca (topic type: city).]
Role type and topic type
• They are NOT the same thing!
  • Different constructs
  • Different purposes
• Topic type
  • Expresses something universal or essential about the subject
  • e.g. Puccini is a composer
• Role type
  • Expresses the nature of the subject’s involvement in a particular relationship
  • e.g. Puccini plays the role of pupil
• Sometimes they are “the same”
• More usually they are different
[Diagram: the topic puccini, declared as [puccini : composer], plays the role person in “born in” (with lucca), the role composer in “composed by” (with tosca) and the role pupil in “pupil of” (with ponchielli).]
LTM syntax for role types
• Complete syntax:
  born-in( puccini : person, lucca : place )
  pupil-of( puccini : pupil, ponchielli : teacher )
  composed-by( tosca : work, puccini : composer )
• Abbreviated syntax:
  [lucca : city]
  [puccini : composer]
  born-in( puccini, lucca )
  • When the role type is omitted it is “inherited” from the topic type, so the line above is equivalent to:
    born-in( puccini : composer, lucca : city )
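The abbreviated syntax can be pictured as a small expansion step. The Python below is a hedged sketch, not how any LTM parser is implemented: `TOPIC_TYPES` and `expand_association` are hypothetical names, and raising `KeyError` for an untyped topic is a choice made for this example only. Storing the roles as an unordered set also mirrors the point that associations have no direction.

```python
# Topic types as declared by [puccini : composer] and [lucca : city].
# Hypothetical lookup table for this sketch only.
TOPIC_TYPES = {"puccini": "composer", "lucca": "city"}


def expand_association(assoc_type, players):
    """Return (assoc_type, roles) with every missing role type filled in.

    `players` is a list of (topic, role_type) pairs; a role_type of None
    means the role was omitted and is taken from the topic's type.
    Raises KeyError if an omitted role belongs to a topic with no known type.
    """
    roles = set()
    for topic, role_type in players:
        if role_type is None:
            role_type = TOPIC_TYPES[topic]
        roles.add((topic, role_type))
    # A frozenset has no order, so the association has no direction.
    return (assoc_type, frozenset(roles))


short = expand_association("born-in", [("puccini", None), ("lucca", None)])
full = expand_association("born-in", [("lucca", "city"), ("puccini", "composer")])
print(short == full)
```

The two calls build the same association even though the players are listed in a different order and one call spells out the role types.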
Symmetric associations
• Some associations are the same in both directions
  • E.g., if A is a friend of B, then B is (presumably) a friend of A
• In this case the role type is the same
• We call this a symmetric association
[Diagram: a friend-of association in which puccini and mascagni both play the role friend.]
N-ary associations
• Associations can have any number of roles
• Two roles is by far the most common
  • such associations are called binary associations
• However, sometimes you need more than two roles
  • for example, to express parenthood:
    parenthood( steve : child, edna : mother, harry : father )
[Diagram: a parenthood association with three roles: child (steve), mother (edna) and father (harry).]
Unary associations
• You can even have associations with just one role player
• Unary associations represent yes/no conditions
  • cf. binary properties (true/false)
  • e.g. expressing that an opera is unfinished:
    unfinished( turandot : work )
[Diagram: an unfinished association with a single role, work, played by turandot.]
The arity of associations: summary
• Unary associations are not common
  • Useful for representing properties that have boolean values
  • e.g., the property of being “unfinished”
• Binary associations are the most common
  • Often correspond to verb( subject, object ) constructs
• Ternary associations are quite common
  • Often correspond to verb( subject, direct-object, indirect-object ) constructs
• N-ary associations (where n > 3)
  • Less common but sometimes useful
  • Many n-ary associations are better represented as (n-1) binary associations...
Rules of thumb for roles
• Keep the number of roles as low as possible
  • Consider whether introducing an intermediate topic makes sense
• Avoid repeating roles
  • If one role can be played multiple times in the same association this indicates that the association represents a group
  • In these cases, you should probably have a topic for the group
Naming association types
• Nouns
  • expressing the nature of the relationship, e.g., “first performance”
  • compounds created from the role names, e.g., “teacher/pupil”
• Verbs
  • very natural, but they imply direction (subject-verb-object)
• Steve’s recommendation
  • use verbs
  • choose the most natural as the default
    • ‘composed by’ is more natural than ‘composer of’
  • use additional names scoped by role type for the ‘object’
    • the corresponding active/passive form is often the best choice
More about names and occurrences
More about names
• Names are essentially labels that are used to communicate with humans via a user interface
  • Different from identifiers used by computers (see “Identity”, below)
• Topics can have multiple names
  • Names may be typed (new in XTM 2.0)
  • Each name can be scoped
  • Names can have variants
• The question often arises
  • When is it appropriate to use which?
• The answer is by no means clear
  • The Topic Maps community is still gaining experience in this
  • The following contains some pointers
Variant names
• Variant names are essentially variant forms of the same name
• Examples of variants are
  • sort key
  • plural form
  • pronunciation
  • common misspellings/alternative spellings
  • transliterations into other scripts (or original forms)
Name types
• A name type is a set of names that have something in common
• Examples of name types are
  • first name
  • last name
  • country code
  • language code
Scoped names
• Names which are qualified to be used in a certain context
• Typical examples:
  • Names in foreign languages
  • Names relevant for a certain kind of user (e.g. technical vs. non-technical)
Name type or scoped name?
• Rules of thumb:
  • For names in different natural languages use scope (because language is a kind of context)
  • If a name of a certain kind is to be found systematically on (almost) every topic of a certain type, use a name type, e.g.
    • Every person might have a given name and a user name
    • Every language might have a language code
• Your application may leave you no choice!
  • LTM does not currently support typed names
    • So you have to use scope
  • Ontopoly does not currently support scoped names
    • So you have to use name types
The default name
• Another rule of thumb:
  • Always have exactly one name that is neither typed nor scoped
    • This is effectively the default name for the topic
  • Never have more than one name that is both unscoped and untyped
    • Applications will have no way of consistently choosing one name
• In general
  • Keep names as short as possible
  • Or at least do not make them longer than necessary
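To see why exactly one untyped, unscoped name matters, here is a minimal Python sketch of how an application might choose a display name. Everything in it is an assumption made for illustration: the `Name` class, the `pick_name` function, and in particular the rule that a scoped name applies when all of its scoping topics are in the current context. Real Topic Maps software may select names differently.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Name:
    value: str
    name_type: Optional[str] = None      # None means the name is untyped
    scope: FrozenSet[str] = frozenset()  # empty means the unconstrained scope


def pick_name(names: Iterable[Name], context: Iterable[str] = ()) -> str:
    """Pick a display name for the given context, else the default name."""
    names = list(names)
    context = frozenset(context)
    # Assumption: a scoped name applies when all its scoping topics
    # are part of the current context.
    matches = [n for n in names
               if n.name_type is None and n.scope and n.scope <= context]
    if matches:
        # Prefer the most specific scope; sort by value for a stable choice.
        matches.sort(key=lambda n: (-len(n.scope), n.value))
        return matches[0].value
    defaults = [n for n in names if n.name_type is None and not n.scope]
    if len(defaults) != 1:
        # Zero or several candidates: no consistent choice is possible.
        raise ValueError(
            "expected exactly one untyped, unscoped name, found %d" % len(defaults))
    return defaults[0].value


norway = [
    Name("Norway"),                                 # the default name
    Name("Norge", scope=frozenset({"norwegian"})),  # scoped by language
    Name("NO", name_type="country code"),           # typed name
]

print(pick_name(norway), pick_name(norway, {"norwegian"}))
```

If a topic carried two untyped, unscoped names, the fallback would have nothing to decide with, which is the situation the rule of thumb is meant to prevent.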
Scope and identity
Scope: A few more details
• The context within which a statement is valid
  • (Statement = name, variant, occurrence or association)
• Expressed as a set of (zero or more) topics
  • Scope with no topics (the default) is called the unconstrained scope
• General use is for
  • Provenance (“where from”)
  • Opinion (“who says”)
• Names
  • Natural language
• Variants
  • Context of use (e.g. acronym, alternative transliteration)
Identity: The all-important issue
• What makes merging possible?
  • NOT the use of names, which are notoriously unreliable
    • Names are not unambiguous (the homonym problem)
    • Many topics have multiple names (the synonym problem)
• Achievement of the collocation objective
  • Only possible through the use of unique global identifiers
• The issue of identification of subjects is therefore crucial
• If subjects have unique identifiers, people can be free to use whatever names they like – and machines can still aggregate information
Subjects and Topics
• Topics are surrogates, or “proxies” (inside the computer) for the ineffable subjects that you want to talk about, such as Puccini, love, these slides, or the second law of thermodynamics
[Diagram: a subject in the real world, represented by a topic in the computer domain.]
The identity of subjects
• Topics exist in order to allow us to talk about subjects
  • The relationship between the two is sometimes called intentionality
• We need to know exactly which subject a topic represents
  • That is, we need to establish its subject identity
• The collocation objective depends on knowing when applications are talking about the same thing
[Diagram: the topics Tosca, Lucca, Madame Butterfly and Puccini.]
Subject identifiers
• The identity of most subjects can only be established indirectly
• An information resource can provide an indication of the subject’s identity to a human
  • Such a resource is called a subject descriptor
• A subject descriptor has an address, even though the subject it indicates does not
• Computers can use the address of the subject descriptor to establish identity
  • Such addresses are called subject identifiers
• Subject descriptors and subject identifiers are the two sides of the human-computer dichotomy
[Diagram: three domains. In “Life, the Universe and Everything” is the subject. In “The Computer Domain” is the subject descriptor (“Giacomo Puccini, Italian composer, b. Lucca 22nd Dec 1858, d. Brussels, 29th Nov 1924. Best known for his operas, of which Tosca is the most . . .”), whose address, http://psi.ontopedia.net/Puccini, is the subject identifier. In “The Topic Map Domain” is the topic Puccini.]
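Identity-based merging can be sketched as follows. This is a simplified illustration, not the merging procedure of any standard or product: topics are plain dictionaries, the `merge_topics` function is hypothetical, and only subject identifiers and names are considered. Topics merge when they share a subject identifier, never merely because they share a name.

```python
def merge_topics(topics):
    """Merge topics that share at least one subject identifier.

    Each topic is a dict with 'identifiers' (a set of URIs) and
    'names' (a set of strings). Names are never used for matching.
    """
    merged = []
    for topic in topics:
        ids = set(topic["identifiers"])
        names = set(topic["names"])
        keep = []
        for other in merged:
            if ids & other["identifiers"]:
                # Same subject: absorb the existing topic into this one.
                ids |= other["identifiers"]
                names |= other["names"]
            else:
                keep.append(other)
        keep.append({"identifiers": ids, "names": names})
        merged = keep
    return merged


PUCCINI = "http://psi.ontopedia.net/Puccini"

result = merge_topics([
    {"identifiers": {PUCCINI}, "names": {"Puccini"}},
    {"identifiers": {PUCCINI}, "names": {"Giacomo Puccini"}},  # synonym: merged
    {"identifiers": set(), "names": {"Puccini"}},              # homonym: kept apart
])

for topic in result:
    print(sorted(topic["identifiers"]), sorted(topic["names"]))
```

The first two topics collapse into one carrying both names, while the third stays separate because a shared name alone says nothing about identity.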
Published Subjects
• In order for identifiers to be reused, they must be made publicly available
• A subject identifier that has been made available for use outside one particular application is called a published subject identifier (PSI)
  • Its descriptor is called a published subject descriptor (PSD)
• Anyone can publish PSI sets
  • Adoption of PSI sets will be an evolutionary process based on trust
  • It will lead to greater and greater interoperability – between topic map applications, between Topic Maps and RDF, and across information and knowledge management in general
• Check out http://psi.ontopedia.net (under development)
PSIs for machines and humans
Advice on subject identifiers
• Always use them for your typing topics
  • Makes your ontology more portable
• The more serious your application, the more extensively you should use them for instances
  • Merging with other topic maps will not be successful without identifiers
• LTM code for subject identifiers
  • See previous lecture and opera.ltm
  • Example:
    [composer = "Composer" @"http://psi.ontopedia.net/Composer"]
Steve’s conventions for PSIs
• URI prefix:
  • http://psi.ontopedia.net/
  • Note: Not all my identifiers have corresponding descriptors
• URI suffix:
  • Initial cap for topic types and role types (e.g. Composer)
  • Lower case for association, occurrence and name types (e.g. born_in)
  • Wikipedia conventions for instances
  • Replace spaces with underscores
• Check Norwegian Opera for examples
  • Do not use the Italian Opera Topic Map – its conventions are outdated
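These conventions are simple enough to sketch as a helper. The function below is hypothetical and only covers what the slide states: the prefix, the capitalisation rule per kind of topic, and replacing spaces with underscores. It makes no attempt to reproduce Wikipedia's full naming conventions for instances or to escape characters that are not allowed in a URI.

```python
PREFIX = "http://psi.ontopedia.net/"

CAPITALISED = ("topic type", "role type")
LOWER_CASE = ("association type", "occurrence type", "name type")


def psi_for(name, kind):
    """Build a PSI following the conventions listed above.

    `kind` is one of the strings in CAPITALISED or LOWER_CASE, or
    "instance". Hypothetical helper for illustration only.
    """
    suffix = name.strip().replace(" ", "_")
    if kind in CAPITALISED:
        suffix = suffix[:1].upper() + suffix[1:]
    elif kind in LOWER_CASE:
        suffix = suffix.lower()
    elif kind != "instance":
        raise ValueError("unknown kind: %r" % (kind,))
    return PREFIX + suffix


print(psi_for("composer", "topic type"))
print(psi_for("born in", "association type"))
print(psi_for("Puccini", "instance"))
```

For instances the name is passed through unchanged apart from the underscores, so the caller decides on the spelling.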
Wrap Up
Home assignment
• Finish your LTM topic map
  • Read through the slides from this lecture
  • Consider whether your modelling is appropriate
  • Consider whether you have followed recommended conventions
• Send the final result to pepper.steve@gmail.com by Monday September 29
Next lecture
• Monday October 13
  • Same time, same place
• Agenda
  • Ontology-driven editing