Ontologies What, Why and How A Tutorial Tim Finin 28 April 2004 325b ITE Building, UMBC. http://ebiquity.umbc.edu/v2.1/resource/html/id/20/. Overview. Three questions about ontologies What are they? Why should we care? How can we build and use them?. Overview -- agenda.
Ontologies: What, Why and How. A Tutorial. Tim Finin, 28 April 2004, 325b ITE Building, UMBC. http://ebiquity.umbc.edu/v2.1/resource/html/id/20/
Overview Three questions about ontologies • What are they? • Why should we care? • How can we build and use them?
Overview -- agenda • Part one: ontologies • Background and history • Kinds of ontologies • Semantic web • Some big ontologies • Ontological engineering • Part two: the semantic web • Introduction • Languages • Tools • Applications • Research frontier • Part three: closing • Part four: demos
What’s an ontology? -- In Philosophy -- • Branch of metaphysics dealing with the nature of being. • An ontology is a theory of what exists. • It lets us experience and operate in the world by, as Plato put it, “carving nature at its joints”. • “Ontology” is from the Greek ontos for being and logos for word. • Aristotle offered an ontology which included 10 categories (from Sowa, after Brentano). • Successful communication requires a shared ontology.
Tree of Porphyry • The oldest known tree diagram is the 3rd century AD work by the Greek philosopher Porphyry in his commentary on Aristotle. • Substance was identified as the supreme genus or the most general supertype.
Why is this funny? In “The analytical language of John Wilkins”*, Jorge Borges writes about a “certain Chinese encyclopaedia” that has the following categorization of animals: (a) belonging to the emperor, (b) embalmed, (c) tame, (d) sucking pigs, (e) sirens, (f) fabulous, (g) stray dogs, (h) included in the present classification, (i) frenzied, (j) innumerable, (k) drawn with a very fine camelhair brush, (l) et cetera, (m) having just broken the water pitcher, (n) that from a long way off look like flies. * http://agents.umbc.edu/misc/johnWilkins.html
What’s an ontology? -- In Organized Societies -- • A dictionary is an ontology of sorts. • But ordinary people seldom need or use a dictionary in everyday life. • Human organizations, like the EPA, do need to develop standards for terms and phrases. • These typically give a specialized meaning that is unambiguous, different from and/or narrower than the ordinary interpretation. • These are usually given as a glossary or thesaurus of specialized terms.
What’s an ontology? -- In Information Systems -- • An explicit formal specification of how to represent the objects, concepts and other domain entities and relationships among them. • Ontologies provide an abstract conceptualization of information to be represented and a vocabulary of terms to use in the representation. • Interoperability between two systems pretty much requires them to share a common ontology. • Common examples: UML diagrams, data dictionaries, DB schemas, API descriptions.
Our Focus • We’re technologists, rather than philosophers or bureaucrats, so our focus is on IT and ontologies. • Making machine understandable ontologies • Exploring how they can be used • Exploring what “machine understandable” means • Supporting other uses of ontologies with IT • Knowledge management for NL ontologies
Top down vs. bottom up • Philosophers build from the top down and are interested in capturing the most general concepts. • Programmers tend to work from the bottom up, supporting a set of applications, with a little generality to help reuse and future development. • Ex: CHAT-80 system (Pereira and Warren, 1982) which answered NL questions about a geographic database. • Example of a microworld ontology that supported NLP, query answering, and generation
Blocks world • The blocks world is a “microworld” used for NLP, vision, planning. • It consists of a table, a set of blocks of different shapes, sizes and colors, and a robot hand. • Some typical domain constraints: • Only one block can be on another block. • Any number of blocks can be on the table. • The hand can only hold one block. • Typical representation: ontable(a) ontable(c) on(b,a) handempty clear(b) clear(c) • [Figure: blocks A and C on the table, with block B on top of A]
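The typical representation on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: facts are encoded as tuples, the `holding` predicate and the helper names `violations` and `pickup_from_block` are hypothetical and do not come from the slides.

```python
# Hypothetical encoding of the slide's state: each fact is a tuple.
state = {
    ("ontable", "a"),
    ("ontable", "c"),
    ("on", "b", "a"),
    ("clear", "b"),
    ("clear", "c"),
    ("handempty",),
}


def violations(facts):
    """Return a list of domain-constraint violations found in a set of facts."""
    problems = []
    # Only one block can be on another block.
    supported = [f[2] for f in facts if f[0] == "on"]
    for block in sorted(set(supported)):
        if supported.count(block) > 1:
            problems.append(f"more than one block on {block}")
        # A block with something on it cannot also be clear.
        if ("clear", block) in facts:
            problems.append(f"{block} is marked clear but supports a block")
    # The hand can only hold one block ("holding" is an assumed predicate).
    held = [f for f in facts if f[0] == "holding"]
    if len(held) > 1:
        problems.append("hand holds more than one block")
    if held and ("handempty",) in facts:
        problems.append("hand is both empty and holding a block")
    return problems


def pickup_from_block(facts, block, below):
    """Pick up `block` from on top of `below`, returning a new set of facts."""
    needed = {("on", block, below), ("clear", block), ("handempty",)}
    if not needed <= facts:
        raise ValueError(f"cannot pick up {block} from {below}")
    return (facts - needed) | {("holding", block), ("clear", below)}


after = pickup_from_block(state, "b", "a")
print(sorted(after))
print(violations(after))
```

Keeping the constraints in one checking function, separate from the actions, is a design choice made here for readability; a planner would normally fold them into operator preconditions.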
Ontologies in Computer Science Ontology : A common vocabulary and agreed upon meanings to describe a subject domain. • This is not a profoundly new idea … • Vocabulary specification • Domain theory • Conceptual schema (for a data base) • Class-subclass taxonomy • Object schema
Importance of ontologies in communication • An example of the importance of ontologies is the fate of NASA’s Mars Climate Orbiter • It crashed into Mars on September 23, 1999 • JPL used metric units in the program controlling thrusters & Lockheed-Martin used imperial units. • Instead of establishing an orbit at an altitude of 140km, it did so at 60km, causing it to burn up in the Martian atmosphere. • A richer representation would have avoided this.
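One way to read "a richer representation" is that a number should never travel without its unit. The sketch below assumes the exchanged quantity is a thruster impulse; the class name `Impulse`, the unit labels and the sample value are made up for illustration and are not taken from the mission software.

```python
from dataclasses import dataclass

# Exact by definition: one pound-force is 4.4482216152605 newtons.
NEWTON_SECONDS_PER_UNIT = {
    "N*s": 1.0,
    "lbf*s": 4.4482216152605,
}


@dataclass(frozen=True)
class Impulse:
    """A thruster impulse that always carries its unit (hypothetical type)."""
    magnitude: float
    unit: str

    def to(self, unit):
        """Convert to another known unit by going through newton-seconds."""
        for name in (self.unit, unit):
            if name not in NEWTON_SECONDS_PER_UNIT:
                raise ValueError(f"unknown unit: {name!r}")
        in_newton_seconds = self.magnitude * NEWTON_SECONDS_PER_UNIT[self.unit]
        return Impulse(in_newton_seconds / NEWTON_SECONDS_PER_UNIT[unit], unit)


reported = Impulse(10.0, "lbf*s")  # made-up value in imperial units
expected = reported.to("N*s")      # what metric software needs to receive
naive = reported.magnitude         # what a bare float exchange passes along

print(expected)
print(naive)
```

The bare float is off by a factor of about 4.45, which is the kind of silent error a shared units ontology is meant to rule out.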
Conceptual Schemas • A conceptual schema specifies the intended meaning of concepts used in a data base. • Data base: table price, with rows (139, 74.50), (140, 77.60), … • Data base schema: price(*stockNo: integer; cost: float) • Conceptual schema: price(x, y) => ∃ (x’, y’) [auto_part(x’) & part_no(x’) = x & retail_price(x’, y’, Value-Inc) & magnitude(y’, US_dollars) = y] • [Figure labels: Auto Product Ontology, Product Ontology, Units & Measures Ontology]
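The conceptual schema rule for price can be read as a recipe for expanding one table row into several ontology-level facts. The following is a rough sketch of that reading; the function name `interpret_price_row` and the generated names such as `part1` are hypothetical placeholders for the primed variables x’ and y’.

```python
import itertools

# Counter used to generate fresh names for the primed variables x' and y'.
_fresh = itertools.count(1)


def interpret_price_row(stock_no, cost):
    """Expand one row of the price table into the facts the rule asserts."""
    part = f"part{next(_fresh)}"      # a fresh name standing for x'
    amount = f"amount{next(_fresh)}"  # a fresh name standing for y'
    return [
        ("auto_part", part),
        ("part_no", part, stock_no),
        ("retail_price", part, amount, "Value-Inc"),
        ("magnitude", amount, "US_dollars", cost),
    ]


facts = interpret_price_row(139, 74.50)
for fact in facts:
    print(fact)
```

The point of the expansion is that the bare float 74.50 becomes a retail price in US dollars for an auto part, which is information the table itself never states.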
Implicit vs. Explicit Ontologies • Systems which communicate and work together must share an ontology. • The shared ontology can be implicit or explicit. • Implicit ontologies are typically represented only by procedures. • Explicit ontologies are (ideally) given a declarative representation in a well defined knowledge representation language.
Conceptualizations, Vocabularies and Axiomatization • Three important aspects to explicit ontologies • Conceptualization involves the underlying model of the domain in terms of objects, attributes and relations. • Vocabulary involves assigning symbols or terms to refer to those objects, attributes and relations. • Axiomatization involves encoding rules and constraints which capture significant aspects of the domain model. • Two ontologies may • be based on different conceptualizations • be based on the same conceptualization but use different vocabularies • differ in how much they attempt to axiomatize the ontologies
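The contrast between implicit and explicit ontologies can be illustrated with a toy example. Everything here is an assumption made for illustration: the fruit terms, the `SUBCLASS_OF` table and both function names are invented, and a Python dictionary stands in for a real knowledge representation language.

```python
# Implicit: the knowledge of what counts as fruit is buried in a procedure.
def is_fruit_implicit(thing):
    return thing in ("apple", "lemon", "orange")


# Explicit: the same knowledge as declarative data that any program can read.
SUBCLASS_OF = {
    "apple": "fruit",
    "lemon": "citrus",
    "orange": "citrus",
    "citrus": "fruit",
}


def is_a(term, ancestor, subclass_of=SUBCLASS_OF):
    """Follow subclass links upward from `term` looking for `ancestor`."""
    seen = set()
    while term in subclass_of and term not in seen:
        seen.add(term)
        term = subclass_of[term]
        if term == ancestor:
            return True
    return False


print(is_fruit_implicit("lemon"))  # answers one fixed question
print(is_a("lemon", "fruit"))      # derived from the declared links
print(is_a("lemon", "citrus"))     # a question the procedure cannot answer
```

The declarative table can be inspected, exchanged and extended without touching the code that reasons over it, which is the practical argument for making an ontology explicit.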
Simple examples • [Figure: several small fruit taxonomies. Node labels: fruit; apple, lemon, orange; pomme, citron, orange; citrus, lime, pear; tropical, temperate]
Ontology languages vary in expressivity • [Figure: a spectrum running from simple taxonomies to expressive ontologies: Catalog/ID; Terms/glossary; Thesauri “narrower term” relation; Informal is-a; Formal is-a; Formal instance; Frames (properties); Value restriction; Inverse, Disjointness, part of…; General logical constraints. Systems shown on the diagram: Wordnet, UMLS, DB Schema, OO, RDF, RDFS, DAML, OWL, CYC, IEEE SUO. A bracket marks the “space of current interest”.] After Deborah L. McGuinness (Stanford)
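The case of one conceptualization with two vocabularies can be sketched with the English and French fruit terms from this slide. The concept identifiers such as `c_apple` and the function `translate` are hypothetical names introduced only for this example.

```python
# Two vocabularies over one conceptualization. Each term points at a
# neutral concept identifier (the identifiers are invented for this sketch).
ENGLISH = {
    "fruit": "c_fruit",
    "apple": "c_apple",
    "lemon": "c_lemon",
    "orange": "c_orange",
}
FRENCH = {
    "fruit": "c_fruit",
    "pomme": "c_apple",
    "citron": "c_lemon",
    "orange": "c_orange",
}


def translate(term, source, target):
    """Map a term between vocabularies by going through the shared concept."""
    concept = source[term]
    for candidate, candidate_concept in target.items():
        if candidate_concept == concept:
            return candidate
    raise KeyError(f"no term for {concept} in the target vocabulary")


print(translate("pomme", FRENCH, ENGLISH))
print(translate("lemon", ENGLISH, FRENCH))
```

Translation is easy here because both vocabularies share the same underlying concepts. If one side grouped lemon and orange under an extra concept such as citrus, the conceptualizations themselves would differ and a term-for-term lookup would no longer be enough.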
Ontology Library and Editing Tools • Ontolingua is a language for building, publishing, and sharing ontologies. • A web-based interface to a browser/editor server is at http://ontolingua.stanford.edu/ and mirror sites. • Ontologies can be translated into a number of content languages, including KIF, LOOM, Prolog, CLIPS, etc. • Chimaera is a tool for merging existing ontologies. • [Figure: a shared library of ontologies. Basic representation concepts: sets, sequences, arrays, quantities, probabilities. Common ontologies & theories: models of time, models of space, actions & causality, situations & contexts, physical objects, geography & terrain. Lexicons & skeleton ontologies: WordNet, Penman Ontology, CYC Upper Ontology. Domain-specific ontologies & theories: operations, logistics, sensor management, battlefield situations, command and control. Editing tools: browse, compare, compose, extend, check.]
Big Ontologies • There are several large, general ontologies that are freely available. • Some examples are: • Cyc - Original general purpose ontology • OntoSem - a lexical KR system and ontology • WordNet - a large, on-line lexical reference system • World Fact Book - 5Meg of KIF sentences! • UMLS - NLM’s Unified Medical Language System • SUMO - Suggested Upper Merged Ontology
Cyc • CYC is a large KB which has been under continual development since ~1985. • The CYC KB is a formalized representation of a vast quantity of fundamental human knowledge: facts, rules of thumb, and heuristics for reasoning about the objects and events of everyday life. • CYC is encoded in the KR language CycL. • The Upper CYC Ontology contains approximately 3,000 terms “capturing the most general concepts of human consensus reality”. • http://www.cyc.com/
openCyc • http://www.opencyc.org/ • 6,000 concepts: an upper ontology for all of human consensus reality. • 60K assertions about the 6K concepts, interrelating, constraining, and in effect (partially) defining them. • A compiled version of the Cyc Inference Engine and the Cyc Knowledge Base Browser. • A specification of CycL, Cyc’s KR language. • A specification of the Cyc API. • Sample programs that demonstrate use of the Cyc API for application development.
OntoSem ontology for Language Understanding • UMBC’s OntoSem is a large ontology and KR system for language understanding tasks. • Browse online at http://ilit.umbc.edu/ • Intended to represent the meaning of NL text and guide its computation.
WordNet • WordNet® is an on-line lexical reference system whose design is inspired by psycholinguistic theories of human lexical memory. • English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept. • Synsets: {board, plank} {board, committee} • Different relations link the synonym sets (e.g. antonyms, generalizations, etc.) • ~140K words • Developed by the Cognitive Science Laboratory at Princeton and available online • Although linguistically motivated, many groups have used it as a general ontology of concepts. • http://www.cogsci.princeton.edu/~wn/
IEEE Standard Upper Ontology • An IEEE standards working group • “This standard will specify an upper ontology that will enable computers to utilize it for applications such as data interoperability, information search and retrieval, automated inferencing, and natural language processing.” • http://suo.ieee.org/ • See site for documents and archives of mailing list discussions • Two “starter documents” for SUOs: SUMO (http://ontology.teknowledge.com/) and IFF
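The synset idea can be shown with the two example sets from this slide. This is a plain-Python sketch and not the WordNet API; the identifiers `s1` and `s2` and the helper names are made up.

```python
# The two synsets from the slide, keyed by invented identifiers.
SYNSETS = {
    "s1": {"board", "plank"},
    "s2": {"board", "committee"},
}


def senses(word):
    """Return the identifiers of every synset that contains `word`."""
    return sorted(sid for sid, words in SYNSETS.items() if word in words)


def synonyms(word):
    """Return all words that share at least one synset with `word`."""
    found = set()
    for sid in senses(word):
        found |= SYNSETS[sid]
    found.discard(word)
    return found


print(senses("board"))
print(synonyms("board"))
print(synonyms("plank"))
```

The word "board" belongs to two synsets, so it has two senses, while "plank" and "committee" are never synonyms of each other. Treating the synset, not the word, as the unit is what lets WordNet double as a rough ontology of concepts.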
Ontologies: Things to Read • D. McGuinness, Ontologies come of age, 2003. • J. Sowa, Knowledge Representation: Logical, Philosophical, and Computational Foundations, Brooks Cole Pub. Co., Pacific Grove CA, 2000. • N. Noy, D. McGuinness, Ontology Development 101: A Guide to Creating your First Ontology, 2001. • D. Lenat and R. Guha, Building Large Knowledge-Based Systems: Representation and Inference in CYC, CACM, pp.82-126, 149-240.
Ontology Conclusions • Shared ontologies are essential for increasing the level of automation (agents, autonomic computing, language understanding, etc.) • Ontology tools and standards are important • Good research has been done and is ready for exploitation • RDF and OWL will get ontologies out of the lab • Small ontologies are in use today • See next section on the semantic web • And large general ontologies are available • Cyc, WFB, WordNet, …
Overview • Introduction • Opening thoughts, Motivation, History • Languages • RDF, RDFS, OWL • Tools • Editors, APIs, reasoners, … • Applications • RSS, FOAF, Web sites, agents, IR, … • On the research frontier • Open problems, current research, … • Closing • Speculations, for more info
“XML is Lisp's bastard nephew, with uglier syntax and no semantics. Yet XML is poised to enable the creation of a Web of data that dwarfs anything since the Library at Alexandria.” -- Philip Wadler, Et tu XML? The fall of the relational empire, VLDB, Rome, September 2001.
“The web has made people smarter. We need to understand how to use it to make machines smarter, too.” -- Michael I. Jordan (UC Berkeley), paraphrased from a talk at AAAI, July 2002
“The Semantic Web will globalize KR, just as the WWW globalized hypertext” -- Tim Berners-Lee
IMHO • The web is like a universal acid, eating through and consuming everything it touches. • Web principles and technologies are equally good for wireless/pervasive computing. • The semantic web is our first serious attempt to provide semantics for XML sublanguages. • It will provide mechanisms for people and machines (agents, programs, web services) to come together. • In all kinds of networked environments: wired, wireless, ad hoc, wearable, etc.
Origins of the Semantic Web • Tim Berners-Lee’s (TBL) original 1989 WWW proposal described a web of relationships among named objects, unifying many info. management tasks. • http://www.w3.org/History/1989/proposal.html • Capsule history • Guha’s MCF (~94) • XML+MCF=>RDF (~96) • RDF+OO=>RDFS (~99) • RDFS+KR=>DAML+OIL (00) • W3C’s SW activity (01) • W3C’s OWL (03)
W3C’s Semantic Web Goals Focus on machine consumption: "The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." -- Berners-Lee, Hendler and Lassila, The Semantic Web, Scientific American, 2001
TBL’s semantic web vision • “The Semantic Web will globalize KR, just as the WWW globalized hypertext” -- Tim Berners-Lee • [Figure: diagram with a “we are here” marker]
Why is this hard? after Frank van Harmelen and Jim Hendler
What a web page looks like to a machine… And understanding natural language is easier than images! “Webscraping” is mostly done by hand crafted rules or rules generated by supervised learning Either way, the rules can break when the page structure changes. after Frank van Harmelen and Jim Hendler
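The fragility of hand-crafted scraping rules is easy to demonstrate. The two page fragments and the rule below are invented for this sketch; only the standard `re` module is used.

```python
import re

# A hand-crafted rule: the speaker is whatever follows a bold "Speaker:" label.
SPEAKER_RULE = re.compile(r"<b>Speaker:</b>\s*([^<]+)")


def extract_speaker(html):
    """Apply the rule and return the speaker name, or None if it does not match."""
    match = SPEAKER_RULE.search(html)
    return match.group(1).strip() if match else None


# The page as it looked when the rule was written (made-up markup).
old_page = "<p><b>Speaker:</b> Tim Finin</p>"
# The same information after a cosmetic redesign (made-up markup).
new_page = '<p><span class="label">Speaker</span>: Tim Finin</p>'

print(extract_speaker(old_page))
print(extract_speaker(new_page))
```

Nothing about the talk changed between the two pages, yet the rule silently stops working, because it encodes the layout and not the meaning.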
OK, so HTML is not helpful Could we tell the machine what the different parts of the text represent? title speaker time location abstract biosketch host after Frank van Harmelen and Jim Hendler
XML to the rescue? XML fans propose creating an XML tag set to use for each application. For talks, we can choose <title>, <speaker>, etc. <title> </title> <speaker> </speaker> <time> </time> <location> </location> <abstract> </abstract> <biosketch> </biosketch> <host> </host> after Frank van Harmelen and Jim Hendler
XML ≠ machine accessible meaning But, to your machine, the tags still look like this… The tag names carry no meaning. XML DTDs and Schemas have little or no semantics. <title> </title> <speaker> </speaker> <time> </time> <location> </location> <abstract> </abstract> <biosketch> </biosketch> <host> </host> after Frank van Harmelen and Jim Hendler
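To see why tag names alone carry no meaning for a program, consider two documents that describe the same talk with different tag sets. Both documents and the function `find_speaker` are invented for this sketch; parsing uses the standard library's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Two made-up documents describing the same talk with different tag sets.
doc_a = "<talk><title>Ontologies</title><speaker>Tim Finin</speaker></talk>"
doc_b = "<seminar><name>Ontologies</name><presenter>Tim Finin</presenter></seminar>"


def find_speaker(xml_text):
    """Look for a <speaker> child; the program knows only that literal string."""
    root = ET.fromstring(xml_text)
    node = root.find("speaker")
    return node.text if node is not None else None


print(find_speaker(doc_a))
print(find_speaker(doc_b))
```

A person sees at once that `presenter` and `speaker` mean the same thing. The parser sees two unrelated strings, so the second lookup comes back empty.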
XML Schema helps XML Schemas provide a simple mechanism to define shared vocabularies. [Figure: the talk tags <title>, <speaker>, <time>, <location>, <abstract>, <biosketch> and <host> linked to an XML Schema file] XML Schema file:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="author" type="xs:string"/>
        <xs:element name="character" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name" type="xs:string"/>
              <xs:element name="friend-of" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
              <xs:element name="since" type="xs:date"/>
              <xs:element name="qualification" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="isbn" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
after Frank van Harmelen and Jim Hendler
But there are many schemas [Figure: XML Schema file 1 … XML Schema file 42, each shown as a copy of the book schema from the previous slide and linked to the same talk tags <title>, <speaker>, <time>, <location>, <abstract>, <biosketch> and <host>] after Frank van Harmelen and Jim Hendler
There’s no way to relate schemas Either manually or automatically. XML Schema is weak on semantics. [Figure: XML Schema file 1 and XML Schema file 42, each shown as a copy of the book schema and linked to the talk tags <title>, <speaker>, <time>, <location>, <abstract>, <biosketch> and <host>, with nothing connecting the two schema files]
An Ontology level is needed • We need a way to define ontologies in XML • So we can relate them • So machines can understand (to some degree) their meaning • Ontologies add • Structure • Constraints • Mappings • [Figure: XML Ontology 1, XML Ontology 42 and XML Ontology 256, each shown as a schema file, connected by links labeled imports, = and <>]
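A very small taste of what mappings buy is sketched below: a declared equivalence between two tag vocabularies lets one query work on documents written against either of them. The document, the `EQUIVALENT_TO` table and the function `normalize` are all hypothetical, and a Python dictionary is only a stand-in for a real ontology language.

```python
import xml.etree.ElementTree as ET

# A made-up document written against a second tag vocabulary.
doc_b = "<seminar><name>Ontologies</name><presenter>Tim Finin</presenter></seminar>"

# Hypothetical equivalence mappings from that vocabulary to a shared one.
EQUIVALENT_TO = {
    "seminar": "talk",
    "name": "title",
    "presenter": "speaker",
}


def normalize(xml_text, mapping=EQUIVALENT_TO):
    """Rename tags to the shared vocabulary so one query fits both sources."""
    root = ET.fromstring(xml_text)
    for element in root.iter():  # iter() visits the root element as well
        element.tag = mapping.get(element.tag, element.tag)
    return root


talk = normalize(doc_b)
print(talk.tag)
print(talk.find("speaker").text)
```

Renaming tags handles only the equivalence case. Structure and constraints, the other two items on the slide, need a language that can state them, which is where RDF and OWL come in later in the tutorial.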