Reasoning with Expressive Description Logics

Logical Foundations for the Semantic Web Ian Horrocks <horrocks@cs.man.ac.uk> University of Manchester, Manchester, UK

Talk Outline • Introduction to Description Logics • The Semantic Web: Killer App for (DL) Reasoning? • Semantic Web Background • Ontology Languages for the Semantic Web • Reasoning with OWL • OilEd Demo (if time) • Description Logic Reasoning • Research Challenges

Introduction to Description Logics

What Are Description Logics? • A family of logic based Knowledge Representation formalisms • Descendants of semantic networks and KL-ONE • Describe domain in terms of concepts (classes), roles (relationships) and individuals • Distinguished by: • Formal semantics (typically model theoretic) • Decidable fragments of FOL • Closely related to Propositional Modal & Dynamic Logics • Provision of inference services • Sound and complete decision procedures for key problems • Implemented systems (highly optimised)

DL Architecture • Knowledge Base • Tbox (schema): Man ≡ Human ⊓ Male; Happy-Father ≡ Man ⊓ ∃has-child.Female ⊓ … • Abox (data): John : Happy-Father; ⟨John, Mary⟩ : has-child; John : ≤1 has-child • Inference System • Interface

Short History of Description Logics Phase 1: • Incomplete systems (Back, Classic, Loom, …) • Based on structural algorithms Phase 2: • Development of tableau algorithms and complexity results • Tableau-based systems for PSpace logics (e.g., Kris, Crack) • Investigation of optimisation techniques Phase 3: • Tableau algorithms for very expressive DLs • Highly optimised tableau systems for ExpTime logics (e.g., FaCT, DLP, Racer) • Relationship to modal logic and decidable fragments of FOL

Latest Developments Phase 4: • Mature implementations • Mainstream applications and tools • Databases • Consistency of conceptual schemata (EER, UML etc.) • Schema integration • Query subsumption (w.r.t. a conceptual schema) • Ontologies and Semantic Web, Grid and e-Science • Ontology engineering (design, maintenance, integration) • Reasoning with ontology-based markup (meta-data) • Service description and discovery • Commercial implementations • Cerebra system from Network Inference Ltd

Semantic Web: Killer App for DL Reasoning?

History of the Semantic Web • Web was “invented” by Tim Berners-Lee (amongst others), a physicist working at CERN • His vision of the Web was much more ambitious than the reality of the existing (syntactic) Web: “… a plan for achieving a set of connected applications for data on the Web in such a way as to form a consistent logical web of data …” “… an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation …” • This vision of the Web has become known as the Semantic Web

Scientific American, May 2001: • Realising the complete “vision” is too hard for now (probably) • Can make a start by adding semantic annotation to web resources • Already seeing exciting applications of technology in e-Science Beware of the Hype!

Where we are Today: the Syntactic Web • A place where computers do the presentation (easy) and people do the linking and interpreting (hard) • Why not get computers to do more of the hard work?

Hard Work using the Syntactic Web… Find images of Peter Patel-Schneider, Frank van Harmelen and Alan Rector… Rev. Alan M. Gates, Associate Rector of the Church of the Holy Spirit, Lake Forest, Illinois

Impossible (?) using the Syntactic Web… • Complex queries involving background knowledge • Find information about “animals that use sonar but are neither bats nor dolphins”, e.g., Barn Owl • Locating information in data repositories • Travel enquiries • Prices of goods and services • Results of human genome experiments • Finding and using “web services” • Visualise surface interactions between two proteins • Delegating complex tasks to web “agents” • Book me a holiday next weekend somewhere warm, not too far away, and where they speak French or English

What is the Problem? • Consider a typical web page: • Markup consists of: • rendering information (e.g., font size and colour) • Hyper-links to related content • Semantic content is accessible to humans, but not (easily) to computers… • Requires (at least) NL understanding

Solution(?): Add “Semantic Markup” • Annotations added to web pages (and other web accessible resources) • “Semantics” given by ontologies • Ontologies provide a vocabulary of terms used in annotations • New terms can be formed by combining existing ones • Meaning (semantics) of such terms is formally specified • Need to agree on a standard web ontology language

Ontology Languages for the Semantic Web

RDF and RDFS • RDF stands for Resource Description Framework • It is a W3C candidate recommendation (http://www.w3.org/RDF) • RDF is a graphical formalism ( + XML syntax + semantics) • for representing metadata • for describing the semantics of information in a machine-accessible way • RDFS extends RDF with “schema vocabulary”, e.g.: • Class, Property • type, subClassOf, subPropertyOf • range, domain

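RDF Syntax: Triples • A triple consists of Subject, Property and Object, e.g.: ex:subject ex:property ex:object • Nodes shown in the figure: named resources (ex:subject, ex:object), blank nodes (_:xxx, _:yyy) and literals (“plain literal”, “lexical”^^datatype) • Figure credit: Jean-François Baget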

RDF Syntax: Graphs • _:xxx rdf:type ex:Person • _:xxx ex:name “Ian Horrocks” • _:xxx ex:member-of _:yyy • _:yyy rdf:type ex:Organisation • _:yyy ex:name “University of Manchester” • Figure credit: Jean-François Baget

RDFS • RDFS vocabulary adds constraints on models, e.g.: • ∀x,y,z type(x,y) and subClassOf(y,z) ⇒ type(x,z) • E.g., ex:Person rdfs:subClassOf ex:Animal and ex:John rdf:type ex:Person together imply ex:John rdf:type ex:Animal
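The constraint on this slide can be read as a forward-chaining rule over triples. The following is a minimal Python sketch of that reading, not how any particular RDFS reasoner works; the function name `rdfs_type_closure` and the representation of triples as plain string tuples are assumptions made for illustration.

```python
def rdfs_type_closure(triples):
    """Apply type(x,y) and subClassOf(y,z) => type(x,z) until nothing new is derived."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        types = [(s, o) for (s, p, o) in inferred if p == "rdf:type"]
        subclass = [(s, o) for (s, p, o) in inferred if p == "rdfs:subClassOf"]
        for x, y in types:
            for y2, z in subclass:
                new = (x, "rdf:type", z)
                if y == y2 and new not in inferred:
                    inferred.add(new)
                    changed = True
    return inferred


# The example graph from the slide.
graph = {
    ("ex:Person", "rdfs:subClassOf", "ex:Animal"),
    ("ex:John", "rdf:type", "ex:Person"),
}
closure = rdfs_type_closure(graph)
print(("ex:John", "rdf:type", "ex:Animal") in closure)  # True
```

The loop repeats until a fixed point is reached, so chains of several subClassOf steps are also followed.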

RDFS • RDFS allows arbitrary use of schema vocabulary • Can be used/abused to say very strange things! • (Figure: example graph built from the labels ex:Person, rdf:type, rdfs:subPropertyOf and rdfs:subClassOf)

RDF/RDFS Semantics • RDF has “non-standard” semantics given by RDF Model Theory (MT) • IR, a non-empty set of resources • IS, a mapping from V into IR • IP, a distinguished subset of IR (the properties) • IEXT, a mapping from IP into the powerset of IR × IR • Class interpretation ICEXT induced by IEXT(IS(type)) • ICEXT(C) = {x | (x,C) ∈ IEXT(IS(type))} • RDFS adds constraints on models • {(x,y), (y,z)} ⊆ IEXT(IS(subClassOf)) ⇒ (x,z) ∈ IEXT(IS(subClassOf))
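To make the model-theoretic definitions concrete, here is a small Python sketch of a finite interpretation. It is an illustration under simplifying assumptions: resources are plain strings, IEXT is a dictionary from property resources to sets of pairs, and the helper names `icext` and `satisfies_subclass_transitivity` are invented for this example.

```python
def icext(iext, type_resource, cls):
    """ICEXT(C) = {x | (x, C) in IEXT(IS(type))}."""
    return {x for (x, c) in iext.get(type_resource, set()) if c == cls}


def satisfies_subclass_transitivity(iext, subclass_resource):
    """Check the RDFS constraint that IEXT(IS(subClassOf)) is transitive."""
    ext = iext.get(subclass_resource, set())
    return all((x, z) in ext for (x, y) in ext for (y2, z) in ext if y == y2)


# A toy interpretation: IS maps vocabulary names to resources in IR.
IS = {
    "ex:John": "john", "ex:Person": "person", "ex:Animal": "animal",
    "rdf:type": "type", "rdfs:subClassOf": "subClassOf",
}
IR = set(IS.values())
IP = {"type", "subClassOf"}  # the properties, a subset of IR
IEXT = {
    "type": {("john", "person")},
    "subClassOf": {("person", "animal")},
}

print(icext(IEXT, IS["rdf:type"], IS["ex:Person"]))                  # {'john'}
print(satisfies_subclass_transitivity(IEXT, IS["rdfs:subClassOf"]))  # True
```

Because classes and properties are themselves elements of IR, nothing in this structure prevents a class from being an instance of itself, which is what makes the semantics non-standard.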

Problems with RDFS • RDFS too weak to describe resources in sufficient detail • No localised range and domain constraints • Can’t say that the range of hasChild is person when applied to persons and elephant when applied to elephants • No existence/cardinality constraints • Can’t say that all instances of person have a mother that is also a person, or that persons have exactly 2 parents • No transitive, inverse or symmetrical properties • Can’t say that isPartOf is a transitive property, that hasPart is the inverse of isPartOf or that touches is symmetrical • … • Difficult to provide reasoning support • No “native” reasoners for non-standard semantics • May be possible to reason via FO axiomatisation
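For comparison, the statements that RDFS cannot express can be written compactly in Description Logic notation. The block below is an illustrative rendering only; the property names hasMother and hasParent are assumed here, since the slide does not name them.

```latex
\begin{align*}
&\mathit{Person} \sqsubseteq \forall \mathit{hasChild}.\mathit{Person}, \quad
 \mathit{Elephant} \sqsubseteq \forall \mathit{hasChild}.\mathit{Elephant}
 && \text{(localised range)}\\
&\mathit{Person} \sqsubseteq \exists \mathit{hasMother}.\mathit{Person}
 && \text{(existence)}\\
&\mathit{Person} \sqsubseteq (= 2\ \mathit{hasParent})
 && \text{(cardinality)}\\
&\mathsf{Trans}(\mathit{isPartOf}), \quad
 \mathit{hasPart} \equiv \mathit{isPartOf}^{-}, \quad
 \mathit{touches} \equiv \mathit{touches}^{-}
 && \text{(transitive, inverse, symmetric)}
\end{align*}
```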

From RDF to OWL • Two languages developed by extending (part of) RDF • OIL: developed by group of (largely) European researchers (several from EU OntoKnowledge project) • DAML-ONT: developed by group of (largely) US researchers (in DARPA DAML programme) • Efforts merged to produce DAML+OIL • Development was carried out by “Joint EU/US Committee on Agent Markup Languages” • Extends (“DL subset” of) RDF • DAML+OIL submitted to W3C as basis for standardisation • Web-Ontology (WebOnt) Working Group formed • WebOnt group developed OWL language based on DAML+OIL • OWL language now a W3C Proposed Recommendation

OWL Language • Three species of OWL • OWL Full is union of OWL syntax and RDF • OWL DL restricted to FOL fragment (≈ DAML+OIL) • OWL Lite is “simpler” subset of OWL DL • Semantic layering • OWL DL ≈ OWL Full within DL fragment • OWL DL based on SHIQ Description Logic • In fact it is equivalent to SHOIN(Dn) DL • OWL DL benefits from many years of DL research • Well defined semantics • Formal properties well understood (complexity, decidability) • Known reasoning algorithms • Implemented systems (highly optimised)

OWL Class Constructors • XMLS datatypes as well as classes in ∀P.C and ∃P.C • E.g., ∃hasAge.nonNegativeInteger (see work by Zhiming Pan) • Arbitrarily complex nesting of constructors • E.g., Person ⊓ ∀hasChild.(Doctor ⊔ ∃hasChild.Doctor)
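The meaning of nested constructors can be illustrated by evaluating a class expression over one small, finite interpretation. The Python sketch below does this for the nested example Person ⊓ ∀hasChild.(Doctor ⊔ ∃hasChild.Doctor). The tuple encoding of concepts, the function name `extension` and the individuals ann, bob, cat and dan are all assumptions made for this example. It covers only the Boolean constructors and ∀/∃, not datatypes.

```python
def extension(concept, domain, classes, roles):
    """Return the set of domain elements that belong to the given concept."""
    tag = concept[0]
    if tag == "atom":
        return set(classes.get(concept[1], set()))
    if tag == "not":
        return domain - extension(concept[1], domain, classes, roles)
    if tag == "and":
        return (extension(concept[1], domain, classes, roles)
                & extension(concept[2], domain, classes, roles))
    if tag == "or":
        return (extension(concept[1], domain, classes, roles)
                | extension(concept[2], domain, classes, roles))
    if tag in ("some", "all"):
        role = roles.get(concept[1], set())
        filler = extension(concept[2], domain, classes, roles)
        result = set()
        for x in domain:
            successors = {y for (a, y) in role if a == x}
            if tag == "some" and successors & filler:
                result.add(x)   # at least one successor is in the filler
            if tag == "all" and successors <= filler:
                result.add(x)   # every successor (possibly none) is in the filler
        return result
    raise ValueError(f"unknown constructor: {tag}")


domain = {"ann", "bob", "cat", "dan"}
classes = {"Person": {"ann", "bob", "cat", "dan"}, "Doctor": {"bob"}}
roles = {"hasChild": {("ann", "bob"), ("cat", "dan")}}

# Person ⊓ ∀hasChild.(Doctor ⊔ ∃hasChild.Doctor)
concept = ("and", ("atom", "Person"),
           ("all", "hasChild",
            ("or", ("atom", "Doctor"), ("some", "hasChild", ("atom", "Doctor")))))
result = extension(concept, domain, classes, roles)
print(sorted(result))  # ['ann', 'bob', 'dan']
```

Here cat is excluded because her child dan is neither a Doctor nor the parent of one, while bob and dan qualify vacuously because they have no children.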

RDFS Syntax E.g., Person ⊓ ∀hasChild.(Doctor ⊔ ∃hasChild.Doctor):

    <owl:Class>
      <owl:intersectionOf rdf:parseType="Collection">
        <owl:Class rdf:about="#Person"/>
        <owl:Restriction>
          <owl:onProperty rdf:resource="#hasChild"/>
          <!-- ∀hasChild.(…): owl:allValuesFrom in the OWL vocabulary -->
          <owl:allValuesFrom>
            <owl:Class>
              <owl:unionOf rdf:parseType="Collection">
                <owl:Class rdf:about="#Doctor"/>
                <owl:Restriction>
                  <owl:onProperty rdf:resource="#hasChild"/>
                  <!-- ∃hasChild.Doctor: owl:someValuesFrom -->
                  <owl:someValuesFrom rdf:resource="#Doctor"/>
                </owl:Restriction>
              </owl:unionOf>
            </owl:Class>
          </owl:allValuesFrom>
        </owl:Restriction>
      </owl:intersectionOf>
    </owl:Class>

OWL Axioms • Axioms (mostly) reducible to inclusion (⊑) • C ≡ D iff both C ⊑ D and D ⊑ C • Obvious FOL equivalences • E.g., C ≡ D ⇔ ∀x.C(x) ↔ D(x), C ⊑ D ⇔ ∀x.C(x) → D(x)

Reasoning with OWL

OWL and Description Logic • OWL DL corresponds to SHOIN(Dn) Description Logic • Provides well defined semantics • Formal properties well understood (complexity, decidability) • Facilitates provision of reasoning services (using DL systems) Why do we want/need reasoning services for the Semantic Web?

Philosophical Reasons • Semantic Web aims at “machine understanding” • Understanding closely related to reasoning • Recognising semantic similarity in spite of syntactic differences • Drawing conclusions that are not explicitly stated

Practical Reasons • Given key role of ontologies in e-Science and Semantic Web, it is essential to provide tools and services to help users: • Design and maintain high quality ontologies, e.g.: • Meaningful— all named classes can have instances • Correct— captured intuitions of domain experts • Minimally redundant— no unintended synonyms • Richly axiomatised— (sufficiently) detailed descriptions • Store (large numbers) of instances of ontology classes, e.g.: • Annotations from web pages (or gene product data) • Answer queries over ontology classes and instances, e.g.: • Find more general/specific classes • Retrieve annotations/pages matching a given description • Integrate and align multiple ontologies

Why Decidable Reasoning? • OWL constructors/axioms restricted so reasoning is decidable • Consistent with Semantic Web's layered architecture • XML provides syntax transport layer • RDF(S) provides basic relational language and simple ontological primitives • OWL provides powerful but still decidable ontology language • Further layers (e.g. SWRL) will extend OWL • Will almost certainly be undecidable • Facilitates provision of reasoning services • “Practical” algorithms for sound and complete reasoning • Several implemented systems • Evidence of empirical tractability

Why Sound & Complete Reasoning? • Important for ontology design • Ontologists need to have complete confidence in reasoner • Otherwise they will cease to trust results • Doubting unexpected results makes reasoner useless • Important for ontology deployment • Many realistic web applications will be agent ↔ agent • No human intervention to spot glitches in reasoning • Incomplete reasoning might be OK in 3-valued system • But “don’t know” typically treated as “no”

Basic Inference Tasks • Knowledge is correct (captures intuitions) • Does C subsume D w.r.t. ontology O? (in every model I of O, D^I ⊆ C^I) • Knowledge is minimally redundant (no unintended synonyms) • Is C equivalent to D w.r.t. O? (in every model I of O, C^I = D^I) • Knowledge is meaningful (classes can have instances) • Is C satisfiable w.r.t. O? (there exists some model I of O s.t. C^I ≠ ∅) • Querying knowledge • Is x an instance of C w.r.t. O? (in every model I of O, x^I ∈ C^I) • Is ⟨x,y⟩ an instance of R w.r.t. O? (in every model I of O, (x^I, y^I) ∈ R^I) • Above problems can be solved using highly optimised DL reasoners
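In logics that are closed under negation, these tasks are usually reduced to a single (un)satisfiability test, which is why one optimised procedure can serve all of them. The standard reductions are sketched below, reading "C subsumes D" as: every instance of D is an instance of C. The slide itself does not spell these reductions out.

```latex
\begin{align*}
C \text{ subsumes } D \text{ w.r.t. } O
  &\iff D \sqcap \neg C \text{ is unsatisfiable w.r.t. } O\\
C \equiv D \text{ w.r.t. } O
  &\iff C \text{ subsumes } D \text{ and } D \text{ subsumes } C \text{ w.r.t. } O\\
C \text{ is satisfiable w.r.t. } O
  &\iff \bot \text{ does not subsume } C \text{ w.r.t. } O\\
x \text{ is an instance of } C \text{ w.r.t. } O
  &\iff O \cup \{x : \neg C\} \text{ is inconsistent}
\end{align*}
```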

E.g.: Reasoning Support for Ontology Design

E.g.: Reasoning Support for Instance Retrieval

DL Reasoning: Highly Optimised Implementations • DL reasoning based on tableaux algorithms • Naive implementation → effective non-termination • Modern systems include MANY optimisations • Optimised classification (compute partial ordering) • Enhanced traversal (exploits information from previous tests) • Use structural information to select classification order • Optimised subsumption testing (search for models) • Normalisation and simplification of concepts • Absorption (simplification) of axioms • Dependency directed backtracking • Caching of satisfiability results and (partial) models • Heuristic ordering of propositional and modal expansion • …
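For orientation, a naive tableau procedure can be sketched in a few lines of Python. This is an illustration only: it handles concept satisfiability for the small logic ALC without an ontology, it includes none of the optimisations listed above, and it is not the algorithm used in FaCT, DLP or Racer. The tuple encoding of concepts and the function names `neg`, `satisfiable` and `subsumes` are assumptions made for this example.

```python
# Concepts are nested tuples in negation normal form (NNF):
#   ("atom", A)   ("not", A)   ("and", C, D)   ("or", C, D)
#   ("some", r, C) for ∃r.C    ("all", r, C) for ∀r.C

def neg(c):
    """Return the NNF of the negation of concept c."""
    tag = c[0]
    if tag == "atom":
        return ("not", c[1])
    if tag == "not":
        return ("atom", c[1])
    if tag == "and":
        return ("or", neg(c[1]), neg(c[2]))
    if tag == "or":
        return ("and", neg(c[1]), neg(c[2]))
    if tag == "some":
        return ("all", c[1], neg(c[2]))
    if tag == "all":
        return ("some", c[1], neg(c[2]))
    raise ValueError(f"unknown constructor: {tag}")


def satisfiable(label):
    """Naive tableau test: can one individual satisfy every concept in label?"""
    label = set(label)
    # ⊓-rule: add both conjuncts until nothing changes.
    todo = list(label)
    while todo:
        c = todo.pop()
        if c[0] == "and":
            for part in c[1:]:
                if part not in label:
                    label.add(part)
                    todo.append(part)
    # Clash: A and ¬A on the same node.
    if any(c[0] == "atom" and ("not", c[1]) in label for c in label):
        return False
    # ⊔-rule: branch on the first disjunction with neither disjunct present.
    for c in label:
        if c[0] == "or" and c[1] not in label and c[2] not in label:
            return satisfiable(label | {c[1]}) or satisfiable(label | {c[2]})
    # ∃-rule with ∀-propagation: each successor node is checked independently.
    for c in label:
        if c[0] == "some":
            successor = {c[2]} | {d[2] for d in label
                                  if d[0] == "all" and d[1] == c[1]}
            if not satisfiable(successor):
                return False
    return True


def subsumes(c, d):
    """c subsumes d iff d ⊓ ¬c is unsatisfiable."""
    return not satisfiable({("and", d, neg(c))})


doctor_parent = ("and", ("atom", "Person"), ("some", "hasChild", ("atom", "Doctor")))
professional_parent = ("some", "hasChild", ("or", ("atom", "Doctor"), ("atom", "Lawyer")))
print(subsumes(professional_parent, doctor_parent))  # True
print(subsumes(doctor_parent, professional_parent))  # False
```

Even in this tiny version, the ⊔-rule shows where the search space comes from: every disjunction may double the number of branches, which is the behaviour that techniques such as dependency directed backtracking and caching are designed to tame.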

Research Challenges • Increased expressive power • Existing DL systems implement (at most) SHIQ • OWL extends SHIQ with datatypes and nominals (SHOIN(Dn)) • Future (undecidable) extensions such as SWRL • Scalability • Very large ontologies • Reasoning with (very large numbers of) individuals • Other reasoning tasks • Querying • Matching • Least common subsumer • ... • Tools and Infrastructure • Support for large scale ontological engineering and deployment

Summary 1 • DLs are a family of object oriented KR formalisms related to frames and semantic networks • Distinguished by formal semantics and inference services • Semantic Web aims to make web resources accessible to automated processes • Ontologies will play key role by providing vocabulary for semantic markup • OWL is a DL based ontology language designed for the Web • Exploits existing standards: XML, RDF(S) • Adds KR idioms from object oriented and frame systems • W3C recommendation and already widely adopted in e-Science • DL provides formal foundations and reasoning support

Summary 2 • Reasoning is important because • Understanding is closely related to reasoning • Essential for design, maintenance and deployment of ontologies • Reasoning support based on DL systems • Sound and complete reasoning • Highly optimised implementations • Challenges remain • Reasoning with full OWL language • (Convincing) demonstration(s) of scalability • New reasoning tasks • Development of (more) high quality tools and infrastructure

Acknowledgements Thanks to the many people who I have worked with, in particular: • Dieter Fensel • Frank van Harmelen • Zhiming Pan • Peter Patel-Schneider • Alan Rector • Uli Sattler

Resources • Slides from this talk • http://www.cs.man.ac.uk/~horrocks/Slides/ICIIP • FaCT system (open source) • http://www.cs.man.ac.uk/FaCT/ • OilEd (open source) • http://oiled.man.ac.uk/ • Protégé • http://protege.stanford.edu/plugins/owl/ • W3C Web-Ontology (WebOnt) working group (OWL) • http://www.w3.org/2001/sw/WebOnt/ • DL Handbook, Cambridge University Press • http://books.cambridge.org/0521781760.htm

Select Bibliography • Ian Horrocks, Peter F. Patel-Schneider, and Frank van Harmelen. From SHIQ and RDF to OWL: The making of a web ontology language. Journal of Web Semantics, 2003. • Franz Baader, Ian Horrocks, and Ulrike Sattler. Description logics as ontology languages for the semantic web. In Festschrift in honor of Jörg Siekmann, LNAI. Springer, 2003. • I. Horrocks and U. Sattler. Ontology reasoning in the SHOQ(D) description logic. In Proc. of IJCAI 2001. All available from http://www.cs.man.ac.uk/~horrocks/Publications/
