Three Theses of Representation in the Semantic Web Ian Horrocks University of Manchester Manchester, UK horrocks@cs.man.ac.uk Peter F. Patel-Schneider Bell Labs Research Murray Hill, NJ, USA pfps@research.bell-labs.com
Semantic Web Languages • SemWeb aims to make content accessible to automated processes • Add semantic markup (meta-data) describing content/function of resources • Need a common way of providing meta-data so that: • It can be understood and manipulated by automated processes (“agents”) • Agents can integrate meta-data from different sources • Proposed solution is famous language “layer cake”:
Language Architecture • Relationship between adjacent layers not clear • XML ↔ RDF relationship purely syntactic • RDF ↔ Ontology layer relationship should be something more? • RDF is proposed as base for SemWeb languages • Used to add metadata annotations to resources • Also used to define syntax and semantics of subsequent layers • Not clear that RDF is appropriate for all these functions • Limited set of syntax constructs (triples) • Not possible to extend syntax (as it is, e.g., when using XML) • Uniform semantic treatment of triple syntax • Non-standard KR thesis and model theory • May facilitate development of SemWeb to use more standard KR thesis…
Ontology Language Layer • Ontologies set to play key role in SemWeb • Source of shared and precisely defined terms for use in meta-data • RDF already extended to RDFS • Hierarchies of classes and properties • Domain and range constraints on properties • More expressive ontology languages clearly required • With logical connectives, quantifiers, transitive properties, etc. • E.g., OIL, DAML+OIL, and now OWL • Possible choices for language layering: • Base ontology language layer(s) on RDF(S) • Base ontology language layer(s) on “classical” FOL • Base ontology language layer(s) on SKIF/Lbase/CL languages
Semantics and Model Theories • Ontology/KR languages aim to model (part of) world • Constructs in language correspond to entities in world • Meaning given by mapping to some formal system • E.g., a logic such as FOL with its own well defined semantics • or a data model such as XQuery data model for XML • or (for more expressive languages) a Model Theory (MT) • MT defines relationship between syntax and interpretations • Can be many interpretations (models) of one piece of syntax • Models supposed to be analogue of (part of) world • E.g., elements of model correspond to objects in world • Formal relationship between syntax and models • Structure of models must reflect relationships specified in syntax • Inference (e.g., entailment) defined in terms of MT • E.g., A ⊨ B iff every model of A is also a model of B
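The entailment definition on this slide can be illustrated with a brute-force check. This is only a sketch under a strong simplifying assumption: it uses propositional logic, where the interpretations are finitely many truth assignments, rather than any Semantic Web language. The names `models` and `entails` are made up for the illustration.

```python
from itertools import product

def models(formula, atoms):
    """Yield every truth assignment over `atoms` that satisfies `formula`."""
    for values in product([False, True], repeat=len(atoms)):
        interp = dict(zip(atoms, values))
        if formula(interp):
            yield interp

def entails(a, b, atoms):
    """A entails B iff every model of A is also a model of B."""
    return all(b(m) for m in models(a, atoms))

atoms = ["p", "q"]
a = lambda m: m["p"] and m["q"]   # A = p AND q
b = lambda m: m["p"] or m["q"]    # B = p OR q

print(entails(a, b, atoms))  # True: the single model of A also satisfies B
print(entails(b, a, atoms))  # False: p=True, q=False is a model of B but not of A
```

For first-order languages the set of interpretations is infinite, so enumeration is not an option; that is one reason the later slides care about which fragments have practical reasoners.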
FOL Thesis • Base SW languages on established FO hierarchy • Propositional logic • Decidable FOL subsets (e.g., DL, Horn) • Undecidable FOL subsets • Full FOL (and even HOL) • Higher layers extend syntax • Upwards compatibility, i.e., syntax retains same meaning in higher layers • Semantics via FOL mapping or standard FO model theory • Individual i → element of domain (i^I ∈ D) • Class C → set of elements (C^I ⊆ D) • Property P → binary relation on D (P^I ⊆ D × D)
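The standard first-order interpretation listed above can be written down directly for a toy finite domain. The following is a minimal sketch; the domain elements and the names `Tweety`, `Bird`, `Animal` and `eats` are invented for illustration and do not come from the slides.

```python
# Domain D and an interpretation I (all names are hypothetical examples).
domain = {"tweety", "rex", "worm"}
individuals = {"Tweety": "tweety", "Rex": "rex"}           # i^I is an element of D
classes = {"Bird": {"tweety"},                             # C^I is a subset of D
           "Animal": {"tweety", "rex", "worm"}}
properties = {"eats": {("tweety", "worm")}}                # P^I is a subset of D x D

def is_interpretation(domain, individuals, classes, properties):
    """Check the three typing conditions from the slide."""
    return (all(e in domain for e in individuals.values())
            and all(ext <= domain for ext in classes.values())
            and all(x in domain and y in domain
                    for rel in properties.values() for (x, y) in rel))

def satisfies_type(i, c):
    """I satisfies C(i) iff i^I is in C^I."""
    return individuals[i] in classes[c]

def satisfies_subclass(c, d):
    """I satisfies 'forall x. C(x) -> D(x)' iff C^I is a subset of D^I."""
    return classes[c] <= classes[d]

print(is_interpretation(domain, individuals, classes, properties))  # True
print(satisfies_subclass("Bird", "Animal"))                         # True
```

Note that a class is interpreted as a set and never as an element of `domain`, which is why this reading gives no classes as instances without moving to HOL.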
(Dis)advantages of FOL Thesis • Pros • Based on well known and extensively studied formalism • Wealth of theoretical knowledge and practical experience • Family of sub-languages with well known formal properties • E.g., decidability, complexity • Highly optimised reasoners for FOL and many sub-languages • E.g., DL reasoners, Horn (rule) reasoners, FOL provers • Mapping to FOL provides easy integration, e.g., of DL and Horn languages • FO subset of RDFS fits well in this framework • Cons • No classes as instances (unless extended to HOL) • Relatively poor fit with full RDFS • Can be axiomatised in FOL, but may damage semantic interoperability and computational properties
Axiomatisation • An axiomatisation can be used to embed RDFS in FOL, e.g.: • Triple x P y translated as holds2(P, x, y) • Axioms capture semantics of language, e.g.: • Problems with axiomatisations include • May require large and complex set of axioms • Difficult to prove semantics have been correctly captured • Axiomatisation may greatly increase computational complexity • RDFS → undecidable (subset of) FOL • No interoperability unless all languages similarly axiomatised • E.g., in DAML+OIL, C subClassOf D equivalent to ∀x. C(x) → D(x) • But have to axiomatise as holds2(subClass, C, D)
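To make the holds2 encoding concrete, here is a small forward-chaining sketch. The two rules it applies (transitivity of subClassOf and propagation of type along subClassOf) correspond to standard RDFS entailment rules, but they are chosen for illustration; they are not the axioms shown on the original slide, and they are far from a complete axiomatisation. The function name `closure` and the example facts are assumptions.

```python
TYPE, SUB = "rdf:type", "rdfs:subClassOf"

def closure(facts):
    """Saturate a set of (P, x, y) tuples, each read as holds2(P, x, y)."""
    facts = set(facts)
    while True:
        derived = set()
        for (p, c, d) in facts:
            if p != SUB:
                continue
            for (q, x, y) in facts:
                # holds2(SUB, c, d) and holds2(SUB, d, y) give holds2(SUB, c, y)
                if q == SUB and x == d:
                    derived.add((SUB, c, y))
                # holds2(TYPE, x, c) and holds2(SUB, c, d) give holds2(TYPE, x, d)
                if q == TYPE and y == c:
                    derived.add((TYPE, x, d))
        if derived <= facts:      # nothing new: fixpoint reached
            return facts
        facts |= derived

facts = {(SUB, "Dog", "Mammal"), (SUB, "Mammal", "Animal"), (TYPE, "rex", "Dog")}
for fact in sorted(closure(facts)):
    print(fact)
```

In this encoding subClassOf is just another argument of holds2, so a reasoner sees no class predicates such as Dog(x) at all. That is the interoperability problem the slide points out when comparing with the direct reading ∀x. C(x) → D(x).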
SKIF/Lbase/CL Thesis • Base SW languages on SKIF/Lbase/CL • Similar to FOL thesis, but FOL replaced with CL • Higher layers extend syntax • Upwards compatibility, i.e., syntax retains same meaning in higher layers • Semantics via mapping into CL • CL provides model theory • Individual i → element of domain (i^V ∈ D) • Class C → element of domain (C^V ∈ D) • Property P → element of domain (P^V ∈ D) • Second mapping (ext) • Class elt w → set of elts (ext(w) ⊆ D) • Prop elt k → binary rel (ext(k) ⊆ D × D)
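The two-stage interpretation above can be sketched for a toy domain as follows. Every name is first mapped to a domain element by V, and a second mapping ext gives class and property elements their extensions. The domain elements and the names `Rex`, `Dog`, `Species` and `owns` are invented for illustration, and `class_ext` / `prop_ext` are hypothetical names for the two halves of ext.

```python
domain = {"d_rex", "d_dog", "d_species", "d_owns", "d_ann"}

# First mapping V: individuals, classes and properties all denote elements of D.
V = {"Rex": "d_rex", "Ann": "d_ann", "Dog": "d_dog",
     "Species": "d_species", "owns": "d_owns"}

# Second mapping ext: class elements get subsets of D,
# property elements get subsets of D x D.
class_ext = {"d_dog": {"d_rex"}, "d_species": {"d_dog"}}
prop_ext = {"d_owns": {("d_ann", "d_rex")}}

def holds_type(i, c):
    """i is an instance of c iff V(i) is in ext(V(c))."""
    return V[i] in class_ext.get(V[c], set())

def holds_prop(p, x, y):
    """x p y holds iff (V(x), V(y)) is in ext(V(p))."""
    return (V[x], V[y]) in prop_ext.get(V[p], set())

print(holds_type("Rex", "Dog"))        # True
print(holds_type("Dog", "Species"))    # True: a class used as an instance
print(holds_prop("owns", "Ann", "Rex"))  # True
```

Because `Dog` denotes an ordinary element of the domain, it can itself be a member of `Species` without any higher-order machinery.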
(Dis)advantages of CL Thesis • Pros • Classes as individuals without HOL extension • Can use as a basis for a family of sub-languages • Mapping to CL provides easy integration of sub-languages • Better fit with RDFS • Cons • Relatively new and untried • Little known about CL sub-languages • Confusion w.r.t. FOL compatibility • RDFS still requires axiomatisation due, e.g., to rdf:type being in domain of discourse • Still no direct semantic interoperability with RDFS • Computational pathway only via (performance-damaging) FOL mapping
Confusion w.r.t. FOL Compatibility • SKIF/Lbase/CL use same syntax as FOL • But allow variables to occur in predicate positions • Originally asserted that SKIF semantics coincide with FOL for well formed FOL sentences • Subsequently shown to be wrong for FOL with equality • E.g., • Moral of the story • May confuse users more familiar with classical FOL • Easy to make mistakes with complex new formalisms • Risky to base future of SemWeb on such a new formalism
RDF Thesis • All SW languages based on triples • Triple based syntax • Semantics compatible with semantics of triples as defined by RDF MT • Upwards & downwards compatibility • Syntax retains same meaning in higher layers • Higher layer syntax is valid in lower layers • Semantics via RDF model theory • Similar to CL, but only binary predicates • Language syntax also in domain of discourse • Higher layers impose additional constraints on models • Syntax must be encoded as triples • Awkward for complex constructs • Resulting triples also have meaning
(Dis)advantages of RDF Thesis • Pros • (Supposed) interoperability between language layers • RDF tools can be used to parse all SW languages into triples • Large ontologies/KBs can be stored in triple DBs • Cons • Achieving real (semantic) interoperability may be difficult or impossible • E.g., efforts to layer OWL on top of RDF(S) • Triple encoding of complex languages such as OWL is very clumsy • Triples introduced by encodings have semantic consequences • E.g., first-rest triples used in list syntax have same consequences as ground facts (even though ordering of list may be arbitrary) • Not clear if technique can be extended to more expressive languages • E.g., full FOL • Computational pathway only via (performance-damaging) FOL mapping
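The point about first-rest triples can be seen in a small sketch of the list encoding. It uses the real RDF vocabulary terms rdf:first, rdf:rest and rdf:nil, but the function `encode_list`, the blank-node naming scheme and the class names `A` and `B` are assumptions made for the example.

```python
from itertools import count

def encode_list(items, fresh):
    """Return (head, triples) encoding `items` as an rdf:first/rdf:rest chain."""
    triples = []
    head = "rdf:nil"
    for item in reversed(items):
        node = f"_:b{next(fresh)}"          # fresh blank node for this list cell
        triples.append((node, "rdf:first", item))
        triples.append((node, "rdf:rest", head))
        head = node
    return head, triples

# The same two operands (e.g. of a union, where order is irrelevant),
# written in two different orders.
head_ab, triples_ab = encode_list(["A", "B"], count())
head_ba, triples_ba = encode_list(["B", "A"], count())

print(triples_ab)
print(triples_ba)
print(set(triples_ab) == set(triples_ba))  # False: different ground facts
```

Under the RDF thesis each of these triples is also a fact about the world, so the arbitrary ordering chosen by whoever wrote the list ends up asserted as part of the knowledge base.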
Summary • Formal meaning of SW languages crucial to interoperability • Common semantic underpinning facilitates layered architecture • Widely assumed that RDF will provide this underpinning • But layering on top of RDF(S) may be difficult/impossible and does not lead to any direct computational pathway • Moreover, benefits are not clear • Alternative would be to use standard FOL as underpinning • Well established and well understood • Established family of languages capturing different trade-offs • Direct computational pathway for FOL and many sub-languages • FO subset of RDF(S) would fit well in this framework • Third approach is to use CL as underpinning • Relatively new and untested • May not solve problems with RDF(S)
Perhaps we should consider recalling the Semantic Web bandwagon in order to carry out a safety modification on the RDF component!