
OWL Datatypes: Design and Implementation

OWL Datatypes: Design and Implementation. Boris Motik and Ian Horrocks University of Oxford. Contents. Introduction The Datatype System of OWL 2 The Datatypes of OWL 2 A Modular Datatype Checker Conclusion. Problems with Datatypes in OWL 1.


Presentation Transcript


  1. OWL Datatypes: Design and Implementation. Boris Motik and Ian Horrocks, University of Oxford

  2. Contents • Introduction • The Datatype System of OWL 2 • The Datatypes of OWL 2 • A Modular Datatype Checker • Conclusion

  3. Problems with Datatypes in OWL 1 • Datatypes of OWL 1 are based on XML Schema (XSD) • Problems with OWL 1 datatypes: • too few normative ones • no user-defined datatypes (e.g., intervals) • reasoning with some XSD datatypes is difficult • some XSD datatypes have an inappropriate semantics • there are datatype-less constants • certain semantic aspects are unclear • reasoning algorithms are unclear

  4. Motivation • OWL 2: a new version of OWL • considerably improves the datatype system of OWL • Our results ensure that… • …the datatype system of OWL 2 is extensible • …certain language extensions are correctly defined • …OWL 2 supports datatypes that are practically feasible • …we know how to implement the datatypes of OWL 2 ⇒ Make datatypes in OWL 2 better ⇒ Provide guidance for implementors

  5. Contents • Introduction • The Datatype System of OWL 2 • The Datatypes of OWL 2 • A Modular Datatype Checker • Conclusion

  6. Datatype Map • Each datatype d is described by: • a URI – gives the name of the datatype • a set of constants N_C(d) • a set of facet pairs N_F(d) • a value space (d)^D • a data value (c)^D ∈ (d)^D for each constant c • a facet value (f)^D ⊆ (d)^D for each facet f • Example: real • facets: < x, > x, ≤ x, ≥ x, int • Example: str • facets: ⟨minLength n⟩, ⟨maxLength n⟩, ⟨length n⟩, ⟨pattern "regExp"⟩

  7. Data Ranges • Facet expression: Boolean formula over facets • e.g., ≥5 ∧ ≤10 • Datatype restriction: d[φ] • d is a datatype and φ is a facet expression for d • e.g., real[ int ∧ ≥5 ∧ ≤10 ] • OWL 2 Syntax: DatatypeRestriction( xsd:integer xsd:minInclusive "5"^^xsd:integer xsd:maxInclusive "10"^^xsd:integer ) • Data range: ⊤_D, d[φ], { v1, …, vn }, ¬dr • will be extended in OWL 2 to all Boolean connectives
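
As a minimal sketch of how a facet expression can be read as a Boolean formula over facets, the Python fragment below models each facet as a predicate on values and builds the slide's example real[ int ∧ ≥5 ∧ ≤10 ]. The helper names (`ge`, `le`, `is_int`, `conj`) are illustrative assumptions, not part of OWL 2 or of any reasoner API, and rationals stand in for the reals.

```python
from fractions import Fraction
from typing import Callable

# A facet is modelled as a predicate over values (hypothetical representation).
Facet = Callable[[Fraction], bool]

def ge(bound: int) -> Facet:
    """Facet '>= bound' (minInclusive)."""
    return lambda v: v >= bound

def le(bound: int) -> Facet:
    """Facet '<= bound' (maxInclusive)."""
    return lambda v: v <= bound

def is_int(v: Fraction) -> bool:
    """Facet 'int': the value is a whole number."""
    return v.denominator == 1

def conj(*facets: Facet) -> Facet:
    """Conjunction of facets: a value must satisfy every one of them."""
    return lambda v: all(f(v) for f in facets)

# real[ int AND >=5 AND <=10 ]
five_to_ten = conj(is_int, ge(5), le(10))

# Probe the half-integers 0, 1/2, 1, ..., 12 and keep the members.
members = [v for v in (Fraction(n, 2) for n in range(25)) if five_to_ten(v)]
```

Here `members` holds exactly the integers 5 through 10, which matches the extension one would expect for the DatatypeRestriction shown on the slide.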

  8. Using Data Ranges in Restrictions • New datatype constructs: • qualified number restrictions • disjoint data properties • Semantics is defined w.r.t. a datatype domain M_D

  9. Openness of the Datatype Domain • M_D is usually fixed in DL reasoning • datatype groups: M_D is exactly the union of all value spaces • Problem: adding new datatypes can change the meaning of certain axioms • Example: ⊤ ⊑ ∀U.<5 ⊔ ∃U.real • if real is the only datatype, then this axiom is a tautology • if we have both real and str, it is not a tautology ⇒ We do not fix M_D in OWL 2 • an ontology is satisfiable iff there exists an M_D that contains at least the value spaces of all datatypes and for which all axioms are satisfied • Proposition: consequences of OWL 2 ontologies are independent of the supported set of datatypes

  10. Naming Data Ranges • Teens ≡ real[ int ∧ >12 ∧ <20 ] • semantics: (Teens)^D = (real[ int ∧ >12 ∧ <20 ])^D • use Teens as a shortcut • e.g., Teenager ≡ ∃hasAge.Teens • Problem: we can write axioms about datatypes • A ≡ real and A ≡ ⊤_D • fixes M_D to (real)^D ⇒ prevents us from extending the set of datatypes ⇒ Make such axioms acyclic • each data range name can be defined only once and its definition cannot refer to itself ⇒ allows for simple unfolding of data range names
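
The unfolding of acyclic data range names can be sketched as follows. This is an illustration under assumptions: expressions are either strings (a name or an atomic data range) or tuples `(operator, arg, ...)`, the dictionary guarantees that each name is defined only once, and the name `Minors` is invented for the example.

```python
def unfold(expr, definitions, active=frozenset()):
    """Replace data range names by their definitions; reject cyclic definitions."""
    if isinstance(expr, tuple):
        # Compound expression: keep the operator, unfold each argument.
        op, *args = expr
        return (op, *(unfold(a, definitions, active) for a in args))
    if expr not in definitions:
        # An atomic data range (e.g. a datatype restriction): nothing to unfold.
        return expr
    if expr in active:
        raise ValueError(f"cyclic definition of data range name {expr!r}")
    return unfold(definitions[expr], definitions, active | {expr})

# Each name has exactly one definition because dictionary keys are unique.
definitions = {
    "Teens": "real[int and >12 and <20]",
    "Minors": ("or", "Teens", "real[int and >=0 and <=12]"),  # hypothetical name
}

unfolded = unfold("Minors", definitions)
```

After unfolding, `unfolded` no longer mentions `Teens`, so a reasoner only ever sees data ranges built from datatypes and facets. A definition such as `{"A": ("not", "A")}` raises `ValueError`, which mirrors the acyclicity requirement.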

  11. Datatype Reasoning • Datatype checker decides satisfiability of conjunctions over assertions dr(t) and t1 ≉ t2 • each t_i is a variable or a constant • example: { 5 }(x1) ∧ int[ >4 ∧ <6 ](x2) ∧ x1 ≉ x2 • Datatype checker can be integrated with a (hyper)tableau algorithm as usual • Proposition: datatype checking is NP-hard • uses data property disjointness • seems like an innocuous feature! ⇒ even small additions to the language add complexity

  12. Contents • Introduction • The Datatype System of OWL 2 • The Datatypes of OWL 2 • A Modular Datatype Checker • Conclusion

  13. Numeric Datatypes • The following ontology is unsatisfiable: • ⊤ ⊑ ∀hasWeight.xsd:double • hasWeight(Paul, "76"^^xsd:integer) ⇒ in XSD, the integer 76 is not contained in xsd:double ⇒ no notion of typecasts in OWL • XML Schema does not have real numbers ⇒ OWL 2 redefines XSD numeric datatypes • owl:realPlus = owl:real ∪ { -0, +inf, -inf, NaN } • owl:real is the set of all real numbers • all XSD numeric datatypes are subsets of owl:real • facets: • minExclusive, maxExclusive, minInclusive, maxInclusive

  14. String Datatypes • Plain RDF literals with a language tag do not belong to any XSD datatype • "datatype"@en vs. "Datentyp"@de ⇒ OWL 2 uses a new rdf:text datatype • value space contains pairs ⟨string, languageTag⟩ • will be used in RIF as well ⇒ xsd:string was retrofitted to rdf:text • value space contains pairs ⟨string, ""⟩ ⇒ The set of characters is assumed to be infinite • E.g., ≥ n U.(str[ length 1 ])(a) is satisfiable iff n ≤ m, where m is the number of characters • m will change in the future, which could change the meaning of this axiom

  15. Other Datatypes • Date/time: • many XSD date/time datatypes are difficult to reason with • e.g., xsd:gMonthDay represents a recurring point in time, but recurrences are irregular due to leap seconds and leap years • XSD supports dates without time zones ⇒ OWL 2 supports only xsd:dateTime with a required time zone • facets: minExclusive, maxExclusive, minInclusive, maxInclusive • xsd:boolean • xsd:hexBinary and xsd:base64Binary • xsd:anyURI • disjoint with xsd:string

  16. Contents • Introduction • The Datatype System of OWL 2 • The Datatypes of OWL 2 • A Modular Datatype Checker • Conclusion

  17. Modular Datatype Checking • We assume that all datatypes are disjoint • xsd:integer is understood as a facet of owl:real ⇒ provides us with a natural modularization boundary • Each datatype d needs a datatype handler: • minc_d(d[φ], n) • true iff (d[φ])^D contains at least n elements • enu_d(d[φ]) • defined only if (d[φ])^D is finite • enumerates the extension of d[φ] • in_d(c, d[φ]) • true iff c^D ∈ (d[φ])^D • eq_d(c1, c2) • true iff c1^D = c2^D
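
To make the four handler operations concrete, here is a hedged sketch for a toy datatype whose restrictions are integer intervals. The class `IntRange` and the function names are assumptions made for illustration (the slides only name the operations minc, enu, in and eq), and constants are assumed to be already parsed into Python integers.

```python
import math
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class IntRange:
    """Hypothetical normalized restriction: integers in [low, high]; None = unbounded."""
    low: Optional[int]
    high: Optional[int]

def size(r: IntRange) -> float:
    """Number of integers in the range (math.inf if a bound is missing)."""
    if r.low is None or r.high is None:
        return math.inf
    return max(0, r.high - r.low + 1)

def minc(r: IntRange, n: int) -> bool:
    """minc: true iff the range contains at least n elements."""
    return size(r) >= n

def enu(r: IntRange) -> List[int]:
    """enu: enumerate the extension; defined only for finite ranges."""
    if r.low is None or r.high is None:
        raise ValueError("enu is defined only for finite data ranges")
    return list(range(r.low, r.high + 1))

def contains(c: int, r: IntRange) -> bool:
    """in: true iff the value of constant c lies in the range."""
    return (r.low is None or r.low <= c) and (r.high is None or c <= r.high)

def eq(c1: int, c2: int) -> bool:
    """eq: true iff two constants denote the same value."""
    return c1 == c2

r = IntRange(5, 10)
```

For `r` above, `minc(r, 6)` holds while `minc(r, 7)` does not, and `enu(r)` lists 5 through 10. An unbounded range such as `IntRange(0, None)` satisfies `minc` for any n but has no enumeration.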

  18. The Algorithm • Input: a conjunction of assertions • Output: true iff the conjunction is satisfiable • Normalize the conjunction such that each variable x in it occurs in exactly one assertion d[φ](x) • Simplify the conjunction • delete from it the assertions containing certain variables • in all remaining assertions of the form d[φ](x), the data range d[φ] is finite • Replace d[φ](x) with D(x) for D = enu_d(d[φ]) • Guess values for all variables • Check whether the guess satisfies the conjunction ⇒ Can be reduced to SAT
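
The final "guess and check" steps can be sketched as below. This is not the HermiT implementation: the nondeterministic guess is replaced by exhaustive enumeration, every variable is assumed to have already been given a finite list of candidate values (the result of enu), and variables are represented as strings while constants are integers. The names `satisfiable`, `domains` and `inequalities` are hypothetical.

```python
from itertools import product

def satisfiable(domains, inequalities):
    """Decide a conjunction of finite-domain assertions and inequalities.

    domains: dict mapping each variable name (str) to its finite list of values.
    inequalities: pairs (t1, t2) meaning 't1 differs from t2'; each term is
    a variable name or an integer constant.
    """
    variables = list(domains)
    # Try every assignment in place of a nondeterministic guess.
    for values in product(*(domains[v] for v in variables)):
        guess = dict(zip(variables, values))

        def value(term):
            # Variables are looked up in the guess; constants denote themselves.
            return guess.get(term, term)

        if all(value(a) != value(b) for a, b in inequalities):
            return True
    return False

# Slide 11 example: {5}(x1), int[>4 and <6](x2), x1 differs from x2.
slide_example = satisfiable({"x1": [5], "x2": [5]}, [("x1", "x2")])
```

In the slide 11 example both variables can only take the value 5, so the inequality cannot hold and `slide_example` is false. Widening the range of `x2` to `[5, 6]` makes the conjunction satisfiable.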

  19. The Simplification Step • If the conjunction contains a variable x such that • x occurs in exactly one assertion d[φ](x), • x occurs in m assertions of the form x ≉ x', • x occurs in n assertions of the form x ≉ c, and • minc_d(d[φ], m+n+1) = true, then delete all assertions containing x ⇒ If | (d[φ])^D | ≥ m+n+1, then we can satisfy x for any choice of values for x' • the constraints on x are irrelevant for the satisfiability of the conjunction • Key to practical reasoning: • data ranges in practice are likely to be large (even infinite)
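
A minimal sketch of the simplification step, under the same assumptions as before (variables are strings, constants are integers, one data range per variable after normalization). The function name `simplify` and the shape of the `minc` callback are illustrative; any handler that answers "does this range have at least n elements?" would do.

```python
def simplify(ranges, inequalities, minc):
    """Drop every variable whose data range is large enough to always be satisfiable.

    ranges: dict variable -> data range (exactly one per variable).
    inequalities: list of pairs (t1, t2) meaning 't1 differs from t2'.
    minc(data_range, n): true iff the range contains at least n elements.
    """
    ranges = dict(ranges)
    inequalities = list(inequalities)
    changed = True
    while changed:
        changed = False
        for x in list(ranges):
            touching = [(a, b) for a, b in inequalities if x in (a, b)]
            if (x, x) in touching:
                continue  # 'x differs from x' is never satisfiable; keep it
            # len(touching) is m + n: inequalities of x with variables and constants.
            if minc(ranges[x], len(touching) + 1):
                del ranges[x]
                inequalities = [p for p in inequalities if x not in p]
                changed = True  # removing x may unblock other variables
    return ranges, inequalities

# Toy run: ranges are plain lists, so minc is a length comparison.
remaining, remaining_ineqs = simplify(
    {"x": list(range(100)), "y": [5], "z": [5]},
    [("x", "y"), ("y", "z")],
    lambda data_range, n: len(data_range) >= n,
)
```

In the toy run, `x` has 100 candidate values but takes part in a single inequality, so it is removed. `y` and `z` each have one candidate and one inequality, so they stay for the guess-and-check phase.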

  20. Handling Numbers and Strings • Numbers: • represent facets as intervals of the form dt(low, high) ⇒ facet expressions can be normalized using a suitable interval algebra • Strings: • represent facets as regular languages ⇒ facet expressions can be normalized using standard results for Boolean operations with regular languages • caveat: the underlying alphabet is infinite ⇒ need to adapt Boolean operations on regular languages • In both cases, datatype handlers are easily implemented for normalized expressions
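
For the numeric case, the interval idea can be illustrated with a deliberately restricted sketch: only conjunctions of the four bound facets over integers are handled, so the result is a single closed interval. Disjunction and negation would require unions of intervals and are left out. The function names are assumptions, not taken from the slides.

```python
import math

def facet_to_interval(facet, bound):
    """Turn one integer bound facet into a closed interval (low, high)."""
    if facet == "minInclusive":
        return (bound, math.inf)
    if facet == "minExclusive":
        return (bound + 1, math.inf)  # over integers, '> b' means '>= b + 1'
    if facet == "maxInclusive":
        return (-math.inf, bound)
    if facet == "maxExclusive":
        return (-math.inf, bound - 1)  # over integers, '< b' means '<= b - 1'
    raise ValueError(f"unsupported facet: {facet}")

def intersect(a, b):
    """Conjunction of two intervals is their intersection."""
    return (max(a[0], b[0]), min(a[1], b[1]))

def normalize(facets):
    """Normalize a conjunction of (facet, bound) pairs into one interval.

    The interval is empty when its low end exceeds its high end.
    """
    result = (-math.inf, math.inf)
    for facet, bound in facets:
        result = intersect(result, facet_to_interval(facet, bound))
    return result

# Teens from slide 10: int and >12 and <20.
teens = normalize([("minExclusive", 12), ("maxExclusive", 20)])
```

With the expression in this normal form, the handler operations reduce to arithmetic on the two end points: the Teens range becomes the interval from 13 to 19, which contains seven integers.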

  21. Contents • Introduction • The Datatype System of OWL 2 • The Datatypes of OWL 2 • A Modular Datatype Checker • Conclusion

  22. Conclusion • The algorithm has been implemented in the HermiT reasoner • a new OWL 2 reasoner based on hypertableau • http://www.hermit-reasoner.com/ • No formal evaluation yet, but… • Supporting datatypes did not noticeably change classification times • data ranges used in practice are often “large enough”
