Explore the use of taxonomies, thesauri, and topic maps for robust information representation in software engineering and information management. Learn how these semantic models aid in search, visualization, and navigation of data. Discover the power of metadata and context-based representation to organize and classify information resources effectively. Dive into the world of ontology-based information visualization with cluster maps and understand the significance of URI, XML, RDF, and more in the realm of information representation. Uncover the importance of hierarchy, inheritance, similarity, and synonymity in structuring information for optimal use.
Next-Generation User-Centered Information Management • Ontology-based Information Representation • Software Engineering for Business Information Systems (sebis) • Ernst Denert Endowed Chair • Chair for Informatics 19 • Department of Informatics, TU München • wwwmatthes.in.tum.de • 030502-Wi-sebis-Master
Ontology-based Information Representation • Outline • Motivation • Semantic Models for Information Representation • Taxonomy • Thesaurus • Topic Map • Ontology • The Semantic Web • URI, XML, RDF, RDFS, OWL • Jena • Ontology-Based Information Visualization with Cluster Maps • Conclusion
Motivation (1) • Information Representation • Data: information resources described by concepts • Semantic Structure: select, filter, classify, merge... based on terms • Representation: organized information resources • Search for information • Visualize search results • Navigate through search results [Figure: Data, Semantic Structure, Representation; annotations: what, how]
Motivation (2) • Metadata • Information about information resources → object-based information representation • Example: Dublin Core • Best-known vocabulary for metadata, a set of 15 properties describing an information resource • Document management properties: title, creator, publisher, date, language • Semantic properties: subject • Metadata about a document in a simple text field without restrictions? → context-based information representation • Grouping information resources by the subjects they are about → semantic models for information representation
Ontology-based Information Representation • Outline • Motivation • Semantic Models for Information Representation • Taxonomy • Thesaurus • Topic Map • Ontology • The Semantic Web • URI, XML, NS, XMLS • RDF, RDFS, OWL • Jena • Ontology-Based Information Visualization with Cluster Maps • Conclusion
Taxonomy (1) • Taxonomy • Biologically motivated: classification of organisms (Carl von Linné) • Classification that arranges terms into a hierarchy • Based on inheritance (is-a relationship) • [ABiilsma]
Taxonomy (2) • Taxonomy of Visual Elements • [JHugo]
Taxonomy (3) • Person Taxonomy • Child • Adult [Figure: Person taxonomy tree. Levels: Person; Child, Adult; Boy, Girl, Man, Woman; leaves: Baby, Toddler, Student, School-Boy, School-Girl, Pensioner, Employee, with Baby, Toddler and Student repeated under several branches]
Taxonomy (4) • Properties of Taxonomies • Hierarchy based on inheritance (is-a relationship) • A mammal is an animal. • Grouping of related terms • No explicit definition of how terms relate • Synonyms • Terms with some degree of similarity • Redundancy when a subclass belongs to more than one superclass • Baby, Toddler and Student appear more than once in the Person taxonomy.
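The is-a hierarchy and the redundancy problem described on this slide can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name PersonTaxonomy, its methods, and the choice of which terms to link are hypothetical and not taken from the slides. The sketch stores each term once and lets it point to several superclasses, which is one way to avoid repeating Student under every branch of a tree.

```java
import java.util.*;

// Hypothetical sketch: a taxonomy as a directed is-a graph (all names are illustrative).
final class PersonTaxonomy {
    // Maps each term to its direct superclasses; a term may have more than one.
    private final Map<String, Set<String>> parents = new HashMap<>();

    void addIsA(String term, String superTerm) {
        parents.computeIfAbsent(term, k -> new LinkedHashSet<>()).add(superTerm);
        parents.computeIfAbsent(superTerm, k -> new LinkedHashSet<>());
    }

    // True if term equals ancestor or inherits from it through is-a links.
    // Assumes the hierarchy contains no cycles.
    boolean isA(String term, String ancestor) {
        if (term.equals(ancestor)) return true;
        for (String p : parents.getOrDefault(term, Collections.emptySet())) {
            if (isA(p, ancestor)) return true;
        }
        return false;
    }

    static PersonTaxonomy sample() {
        PersonTaxonomy t = new PersonTaxonomy();
        t.addIsA("Child", "Person");
        t.addIsA("Adult", "Person");
        t.addIsA("Boy", "Child");
        t.addIsA("Girl", "Child");
        // Student is stored once but linked to two superclasses,
        // instead of being duplicated under each branch.
        t.addIsA("Student", "Boy");
        t.addIsA("Student", "Girl");
        return t;
    }

    public static void main(String[] args) {
        PersonTaxonomy t = sample();
        System.out.println("Student is-a Person: " + t.isA("Student", "Person"));
        System.out.println("Adult is-a Child: " + t.isA("Adult", "Child"));
    }
}
```

Note that the only relationship this structure can express is is-a; similarity and synonymy between terms stay implicit, which is the limitation the slide points out.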
Thesaurus (1) • Thesaurus • Motivated by linguistics • Classification of terms based on inheritance, similarity and synonymity • ISO standards: ISO 2788 for monolingual and ISO 5964 for multilingual thesauri • [Creighton]
Thesaurus (2) • Example of Thesaurus for „Person“ • Term pairs linked in the figure: Toddler and Baby • Student and School-Girl • Student and School-Boy [Figure: thesaurus graph with nodes Child, Boy, Girl, Student, Baby, Toddler, School-Boy, School-Girl; edge legend: Similarity, Synonym]
Thesaurus (3) • Properties of Thesauri • Hierarchy based on inheritance (is-a relationship): same as taxonomy • Much richer vocabulary for describing relationships • Related term: term with similar meaning • USE: points from a synonym to the preferred term; UF (used for): the inverse • Property: scope note • annotation, a string attached to the term explaining its meaning • Homonyms (same word, different meaning) cannot be distinguished • Still redundancy when a subclass belongs to more than one superclass • Baby, Toddler and Student appear more than once in the thesaurus.
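The relationship vocabulary of a thesaurus (USE, UF, related term, scope note) can be illustrated with a small lookup structure. The following is a minimal sketch, not an implementation of ISO 2788: the class MiniThesaurus and its methods are hypothetical, and the decision to treat School-Boy and School-Girl as synonyms of Student and Toddler as a related term of Baby is an assumption, since the figure legend does not survive in this transcript. The scope note text is invented for illustration.

```java
import java.util.*;

// Illustrative sketch only: class and method names are hypothetical.
final class MiniThesaurus {
    private final Map<String, String> use = new HashMap<>();               // synonym -> preferred term (USE)
    private final Map<String, Set<String>> usedForIndex = new HashMap<>(); // preferred term -> synonyms (UF)
    private final Map<String, Set<String>> related = new HashMap<>();      // related terms, stored in both directions
    private final Map<String, String> scopeNotes = new HashMap<>();        // term -> explanatory note

    void addSynonym(String synonym, String preferred) {
        use.put(synonym, preferred);
        usedForIndex.computeIfAbsent(preferred, k -> new TreeSet<>()).add(synonym);
    }

    void addRelated(String a, String b) {
        related.computeIfAbsent(a, k -> new TreeSet<>()).add(b);
        related.computeIfAbsent(b, k -> new TreeSet<>()).add(a);
    }

    void setScopeNote(String term, String note) { scopeNotes.put(term, note); }

    // USE lookup: a term without a USE entry is its own preferred term.
    String preferredTerm(String term) { return use.getOrDefault(term, term); }

    // UF lookup: the inverse of USE.
    Set<String> usedFor(String preferred) {
        return usedForIndex.getOrDefault(preferred, Collections.emptySet());
    }

    Set<String> relatedTo(String term) {
        return related.getOrDefault(term, Collections.emptySet());
    }

    String scopeNote(String term) { return scopeNotes.get(term); }

    static MiniThesaurus sample() {
        MiniThesaurus t = new MiniThesaurus();
        t.addSynonym("School-Boy", "Student");            // assumed synonym pair
        t.addSynonym("School-Girl", "Student");           // assumed synonym pair
        t.addRelated("Toddler", "Baby");                  // assumed similarity pair
        t.setScopeNote("Baby", "A very young child");     // invented note, for illustration
        return t;
    }

    public static void main(String[] args) {
        MiniThesaurus t = sample();
        System.out.println("USE for School-Boy: " + t.preferredTerm("School-Boy"));
        System.out.println("UF for Student: " + t.usedFor("Student"));
        System.out.println("Related to Baby: " + t.relatedTo("Baby"));
    }
}
```

Because terms are plain strings, two homonyms would collapse into one key, which mirrors the limitation named on the slide.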
Topic Map (1) • Topic Map • Motivated by mathematical models of how long-term memory works • Classification of terms represented by topics based on • Inheritance • Similarity, synonyms • User-defined relationships • XML Topic Maps (XTM) • Standard XML format for topic maps → open vocabulary • www.TopicMaps.org • [TM2]
Topic Map (2) • Information resource optionally identified by a URI • Hierarchy of concepts, each represented by a topic described by • Name, with the properties • Scope: a set of topics representing a context • Type: a set of topics, a kind of association between topics • Occurrences (properties) connect a topic to an information resource; optionally with scope and type • Associations (relationships); optionally with scope and type • [TM3]
Topic Map (3) • Example of Topic Map for „Person“ • Term pairs linked in the figure: Toddler and Baby • Student and School-Girl • Student and School-Boy • Properties: Name • Age [Figure: topic map with topics Person, Child, Adult, Boy, Girl, Student, Baby, Toddler, School-Boy, School-Girl, Name, Age; associations isSiblingOf, isChildOf, hasChild, hasParent; edge legend: Similarity, Synonym]
Topic Map (4) • Properties of Topic Maps • Flexible network of concepts structured by an open vocabulary → more powerful (precise) searches, flexible navigation • Composition, association (user-defined relationship types) possible • Able to distinguish between homonyms by the concept's type • Name and Age on the same conceptual level as Boy and Girl • Disambiguation of homonyms • Paris (France), Paris (Greek mythology) • Still redundancy when a subclass belongs to more than one superclass • Model in its infancy
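The homonym example on this slide can be sketched as follows. This is a deliberately reduced illustration, not the XTM data model: the class TinyTopicMap, the topic identifiers, and the type names City and MythologicalFigure are all assumptions introduced here. It shows how a lookup by name alone returns both topics called Paris, while adding the type separates them.

```java
import java.util.*;

// Illustrative sketch only: identifiers and type names are hypothetical.
final class TinyTopicMap {
    static final class Topic {
        final String id;
        final String name;
        final String type;
        Topic(String id, String name, String type) {
            this.id = id;
            this.name = name;
            this.type = type;
        }
    }

    private final List<Topic> topics = new ArrayList<>();

    void addTopic(String id, String name, String type) {
        topics.add(new Topic(id, name, type));
    }

    // Lookup by name only: homonyms show up as several hits.
    List<String> findByName(String name) {
        List<String> ids = new ArrayList<>();
        for (Topic t : topics) {
            if (t.name.equals(name)) ids.add(t.id);
        }
        return ids;
    }

    // Restricting the lookup by type separates the homonyms.
    List<String> findByNameAndType(String name, String type) {
        List<String> ids = new ArrayList<>();
        for (Topic t : topics) {
            if (t.name.equals(name) && t.type.equals(type)) ids.add(t.id);
        }
        return ids;
    }

    static TinyTopicMap sample() {
        TinyTopicMap m = new TinyTopicMap();
        m.addTopic("paris-france", "Paris", "City");
        m.addTopic("paris-mythology", "Paris", "MythologicalFigure");
        return m;
    }

    public static void main(String[] args) {
        TinyTopicMap m = sample();
        System.out.println("By name: " + m.findByName("Paris"));
        System.out.println("By name and type: " + m.findByNameAndType("Paris", "City"));
    }
}
```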
Ontology (1) • Ontology • Originally motivated by philosophy: „the science of being“ (Aristotle) • Definition: „a formal explicit specification of a shared conceptualization“ (Gruber) • Vocabulary + Structure = Taxonomy • Taxonomy + Relationships, Constraints, Rules = Ontology • „Model for describing the world that consists of • a set of types, • properties, and • a set of relationship types“ (Garshol) • Classification of terms for objects and individuals • Open set of terms • Open language for describing relationships
Ontology (2) • Ontology for „Person“ • Rules • A isChildOf B, B isChildOf C ⇒ A isGrandChildOf C • A isChildOf B ⇒ B isParentOf A • A isChildOf B ⇒ A hasParent B [Figure: ontology graph with classes Person, Adult, Child, Boy, Girl, Baby, Toddler, Student, School-Boy, School-Girl; properties Name, Age; relationships hasChild, hasParent, isSiblingOf, isChildOf; instance John with Name „John Big“ and Age „6 months“]
Ontology (3) • Properties of Ontologies • Clearly defined relationships (inverse, transitive, symmetrical...) • Constraints, rules • Open vocabulary • Benefits: machine-readability; rule-based (logical) inferencing; descriptive power; precise searching, visualization, navigation • Managed redundancy • Easily extensible • Not only meta-model but also instances • Common standard between several parties • Binding data from heterogeneous sources
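Rule-based inferencing over the „Person“ ontology can be sketched with naive forward chaining. The three rules are the ones shown on the Ontology (2) slide; everything else is an assumption made for illustration: the class FamilyRules, the encoding of a fact as a three-element list, and the individual Mary, who does not appear in the slides. A real reasoner would be far more general than this hard-coded loop.

```java
import java.util.*;

// Illustrative sketch only: a hard-coded forward-chaining loop, not a general reasoner.
final class FamilyRules {
    // A fact is encoded as a three-element list: subject, relationship, object.
    static List<String> triple(String s, String p, String o) {
        return Arrays.asList(s, p, o);
    }

    // Applies the rules repeatedly until no new fact appears.
    static Set<List<String>> infer(Set<List<String>> facts) {
        Set<List<String>> known = new LinkedHashSet<>(facts);
        boolean changed = true;
        while (changed) {
            Set<List<String>> derived = new LinkedHashSet<>();
            for (List<String> f : known) {
                if (!f.get(1).equals("isChildOf")) continue;
                String a = f.get(0);
                String b = f.get(2);
                derived.add(triple(b, "isParentOf", a));  // A isChildOf B => B isParentOf A
                derived.add(triple(a, "hasParent", b));   // A isChildOf B => A hasParent B
                for (List<String> g : known) {
                    // A isChildOf B, B isChildOf C => A isGrandChildOf C
                    if (g.get(1).equals("isChildOf") && g.get(0).equals(b)) {
                        derived.add(triple(a, "isGrandChildOf", g.get(2)));
                    }
                }
            }
            changed = known.addAll(derived);  // false once nothing new was derived
        }
        return known;
    }

    public static void main(String[] args) {
        Set<List<String>> facts = new LinkedHashSet<>();
        facts.add(triple("John", "isChildOf", "Jane"));
        facts.add(triple("Jane", "isChildOf", "Mary"));  // Mary is a hypothetical individual
        for (List<String> t : infer(facts)) {
            System.out.println(String.join(" ", t));
        }
    }
}
```

Only the two isChildOf facts are stated; the parent, hasParent and grandchild facts are derived, which is the kind of managed redundancy the slide attributes to ontologies.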
The Semantic Web (1) • Motivation • Extend existing markup with semantic markup • Define a standard web ontology language • Common syntax in order to share semantics • Provide tools and services to help users to • Design and maintain high quality ontologies • Store instances of ontology classes • Query ontology classes and instances • Integrate and align multiple ontologies
The Semantic Web (2) • The Semantic Web • A product of the W3C (World Wide Web Consortium), headed by Tim Berners-Lee • Goal: lead the Web to its full potential • Develop common protocols • Control the evolution of the Web • Maintain interoperability of the Web [Figure labels: Semantics and Reasoning, Relational Data, Data Exchange]
XML (1) • XML and XML Schema • eXtensible Markup Language • Open vocabulary → extensibility • Strict syntax → well-formedness • Separation of content from presentation → different renderings of tree-like documents • XML Schema • Validity • Namespace • URI that a vocabulary is associated with; need not resolve to a document • Uniform Resource Identifier: the set of all addresses that refer to resources • Resource: any object that can be pointed to by a URI • URL: subtype of URI • Result: unambiguous interpretation of identifiers
RDF (1) • RDF • Resource Description Framework: • Standardization of the description of resources • Extensible and flexible hierarchy based on XML • Open vocabulary: classes with properties and relationships • Namespaces: range and domain of properties, must be an existing document • Directed graph built from statements • A statement specifies properties and values of web resources: • John (Object) name (Property) „John Big“ (Value) • John (Object) age (Property) „6 months“ (Value) • John (Object) isChildOf (Property) Jane (Object) • John (Object) isChildOf (Property) Tom (Object)
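The statement model on this slide can be made concrete with a small in-memory graph. The sketch below is plain Java and does not use any RDF library; the class TinyGraph and its match method are hypothetical, and short names stand in for the URIs a real RDF graph would use. It follows the slide's terminology (object, property, value) and treats null as a wildcard when matching statements.

```java
import java.util.*;

// Illustrative sketch only: not an RDF library, short names stand in for URIs.
final class TinyGraph {
    static final class Statement {
        final String object;    // the resource being described (slide terminology)
        final String property;
        final String value;     // a literal or another resource
        Statement(String object, String property, String value) {
            this.object = object;
            this.property = property;
            this.value = value;
        }
        @Override public String toString() {
            return object + " " + property + " " + value;
        }
    }

    private final List<Statement> statements = new ArrayList<>();

    void add(String object, String property, String value) {
        statements.add(new Statement(object, property, value));
    }

    // Returns all statements matching the pattern; null matches anything.
    List<Statement> match(String object, String property, String value) {
        List<Statement> hits = new ArrayList<>();
        for (Statement s : statements) {
            boolean ok = (object == null || object.equals(s.object))
                    && (property == null || property.equals(s.property))
                    && (value == null || value.equals(s.value));
            if (ok) hits.add(s);
        }
        return hits;
    }

    static TinyGraph sample() {
        TinyGraph g = new TinyGraph();
        g.add("John", "name", "John Big");
        g.add("John", "age", "6 months");
        g.add("John", "isChildOf", "Jane");
        g.add("John", "isChildOf", "Tom");
        return g;
    }

    public static void main(String[] args) {
        for (Statement s : sample().match("John", "isChildOf", null)) {
            System.out.println(s);
        }
    }
}
```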
RDF (2) • RDF Document: one description per resource with a list of properties • Description element • may be anonymous (no attributes) • possible attributes for class (object) definition • rdf:about to describe a resource (via URI) or • rdf:ID to define a resource (via a fragment identifier without #) • Fundamental Concepts • Object: resource defined by URI • Property: resource • Value: resource or literal • Only fact-stating: a basic data model of object, property, value • RDF schema vocabulary (RDF Schema Building Blocks)
RDF (3) [Figure: RDF graph. Resources: http://www.person.bgr/john, http://www.person.bgr/jane, http://www.person.bgr/tom, mailto:tom.big@big.bgr; properties: http://www.family.org/isChildOf (twice), http://www.person.bgr/name, http://www.person.bgr/age, http://purl.org/dc/elements/1.1/creator; literals: „John Big“, „6 months“]
RDF (4)
<!-- Literal values are written as element content; rdf:resource is reserved for URI references. -->
<rdf:Description rdf:about="http://www.big.bgr/john">
  <person:name>John Big</person:name>
  <person:age>6 months</person:age>
  <family:isChildOf rdf:resource="http://www.person.org/jane"/>
  <family:isChildOf rdf:resource="http://www.person.org/tom"/>
</rdf:Description>
<rdf:Description rdf:about="http://www.big.bgr" dc:creator="tom.big@big.bgr">
</rdf:Description>
RDFS (1) • Valid RDF • Provides information about the interpretation of RDF statements • Class definition • Subclass definition using rdfs:subClassOf • Subproperty definition using rdfs:subPropertyOf • Domain and range restrictions • Example for Music use: <Music rdf:resource="http://www.music.bgr"/>
RDFS (2)
<!DOCTYPE rdf:RDF [
  <!ENTITY rdf 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
  <!ENTITY rdfs 'http://www.w3.org/2000/01/rdf-schema#'>
]>
<rdf:RDF xmlns:rdf="&rdf;" xmlns:rdfs="&rdfs;">
  <rdf:Description rdf:ID="Music">
    <rdf:type rdf:resource="&rdfs;Class"/>
  </rdf:Description>
  <rdf:Description rdf:ID="Symphony">
    <rdf:type rdf:resource="&rdfs;Class"/>
    <rdfs:subClassOf rdf:resource="#Music"/>
  </rdf:Description>
  <rdf:Description rdf:ID="Concerto">
    <rdf:type rdf:resource="&rdfs;Class"/>
    <rdfs:subClassOf rdf:resource="#Music"/>
  </rdf:Description>
</rdf:RDF>
RDFS (3) • Weaknesses of RDFS: it cannot describe resources in sufficient detail • No localized range and domain constraints; it cannot state that the range of hasChild is • person when applied to person • animal when applied to animal • No cardinality constraints; it cannot state that • a person has exactly two parents • No existence constraints; it cannot state that • all instances of person have a mother that is also a person • No transitive, inverse, or symmetrical properties; it cannot state that • isDescendantOf is a transitive property • isChildOf is the inverse of isParentOf • isSiblingOf is symmetrical
OWL (1) • OWL • Web Ontology Language • General Public Licence • Based on RDF → open vocabulary • Logical combinations of classes (union, intersection, complement) • Extended properties: transitive, symmetrical, inverse • Web Ontology Language Requirements • Easy to understand and use • Formally specified, with adequate expressive power • Providing automated reasoning support
OWL (2) • OWL Types • OWL Full • Greatest expressive power • OWL DL (description logic) • Extension of the DL subset of RDF → well-defined semantics, user-friendly syntax • OWL Lite • Simple syntax, tractable inference • [OWL]
OWL (3) • Example of Ontology for two books about African Lion
OWL (4) • Example of Ontology for „Man“
<owl:Class rdf:ID="Man">
  <rdfs:subClassOf rdf:resource="#Person"/>
  <rdfs:subClassOf rdf:resource="#Adult"/>
  <owl:disjointWith rdf:resource="#Woman"/>
</owl:Class>
• Example of Ontology for Property „isChildOf“
<owl:ObjectProperty rdf:ID="isChildOf">
  <owl:inverseOf rdf:resource="#isParentOf"/>
</owl:ObjectProperty>
OWL (5) • Extension towards including instances • Use of OWL and Ontologies • Data integration → ontology mapping • Minimization of the intellectual effort involved in developing an ontology through re-use • Composition and adoption of ontologies • Data interchange → Jena • Data querying → RDQL • Data visualization → Cluster Maps
Jena (1) • Jena Semantic Web Toolkit (Open Source, HP) • Java framework for writing Semantic Web applications • OWL Lite based on RDF
Jena (2) • Jena Architecture • ModelFactory creates an empty model to which resources, properties, and statements can be added
Model model = ModelFactory.createDefaultModel();
[Class diagram: ModelFactory with createDefaultModel(): Model]
Jena (4) • Model
[Class diagram: Model with createResource(String): Resource, createProperty(String): Property, createStatement(Resource, Property, RDFNode): Statement, listStatements(Resource, Property, RDFNode), listObjectsOfProperty(Property)]
• Creation of resources, properties and statements
// familyURI and relationshipURI are String constants defined elsewhere
Resource john = model.createResource(familyURI + "john");
Resource jane = model.createResource(familyURI + "jane");
Property childOf = model.createProperty(relationshipURI);
// createStatement only builds the statement; model.add stores it in the model
Statement statement = model.createStatement(john, childOf, jane);
model.add(statement);
• Querying of a model
model.listObjectsOfProperty(childOf);
// null is a wildcard; the cast selects the RDFNode overload
model.listStatements(john, childOf, (RDFNode) null);
Jena (5) • Addition of properties to subjects
john.addProperty(childOf, jane);
• Querying of properties
john.listProperties(siblingOf);
[Class diagram: Resource with addProperty(Property, RDFNode), listProperties(Property)]
Jena (6) • RDF Data Query Language (RDQL) • Keywords: SELECT, WHERE, USING
SELECT ?x WHERE (?x, <http://www.family.org/child#>, "John Big")
Result: http://www.big.bgr/john
SELECT ?resource FROM <http://www.big.bgr> WHERE (?resource, info:age, ?age) AND ?age >= 2 USING info FOR <http://www.big.bgr/peopleInfo#>
Result: http://www.big.bgr/jane http://www.big.bgr/tom
Cluster Maps (1) • Clustering based on similarity • Tasks: • Data analysis: different ontologies, same data set • Data comparison: same ontology, multiple data sets • Query relaxation: find result sets for queries that have no exact matches • Data analysis: search on jobs offered by economic sector • Visible size • Differentiation
Cluster Maps (2) • Data Analysis: Search on jobs offered by economic sector • Various overlaps
Cluster Maps (3) • Data Analysis: Search on jobs offered by region • Visible size • Geographical closeness is preserved
Cluster Maps (4) • Data Comparison: services offered by two banks • Same ontology, different data sets
Cluster Maps (5) • Query relaxation: query about a holiday in France • colour intensity for the cases • no exact matches • matches based on query relaxation
Cluster Maps (6) • Clustering based on similarity for search, navigation, visualization • Advantages • Visible and configurable size of the result set • Similarity between the instances of the result set • Intuitive search and navigation process
Conclusion • User-centered Information Management! • Use of Ontologies • Context-dependent information • Information visualization • Information sharing • Personalized information • [Ont15]
Share your opinion ... • Can we expect maturity in the field of ontology engineering in 5, 10, 15 years from now? • Is there a way to make information find you rather than look for it? • Is XML the best format to build on? How does it influence ontologies today?