RDF Semantics. Acknowledgement. This presentation is based mostly on the W3C Recommendation RDF Semantics available at http://www.w3.org/TR/rdf-mt/ . Much of the material in this presentation is verbatim from the above Web site. Outline of the Presentation.
RDF Semantics Knowledge Technologies Manolis Koubarakis
Acknowledgement • This presentation is based mostly on the W3C Recommendation RDF Semantics, available at http://www.w3.org/TR/rdf-mt/. • Much of the material in this presentation is taken verbatim from that document.
Outline of the Presentation • Knowledge representation languages: syntax, semantics (model theory) and proof-theory. • The semantics of RDF: • Interpretation • Simple interpretation • RDF-interpretation • RDFS-interpretation • Truth, satisfaction, entailment, validity • Some interesting theorems that hold about these semantic concepts.
KR Languages: Syntax and Semantics • A KR language is defined by specifying its syntax and semantics. • The syntax of a KR language specifies the well-formed formulas and sentences. • The semantics of a KR language defines a correspondence between formulas/sentences of the language and facts in the world to which these formulas/sentences refer. • A sentence of a KR language does not mean anything by itself. The semantics or meaning of a sentence must be provided by its writer by means of an interpretation. • The notion of world (captured by the formal notion of an interpretation) is crucial here.
Some Important Notions • Truth and Satisfaction. A sentence will be called true under a particular interpretation (equivalently: the interpretation satisfies the sentence, or the interpretation is a model of the sentence) if the state of affairs it represents is the case. • Entailment. We will say that the sentences of KB entail the sentence Φ (equivalently: Φ logically follows from the sentences of KB) if the following holds: whenever the sentences of KB are true (i.e., in any interpretation in which the sentences of KB are true), the sentence Φ is also true. • Satisfiability. We will say that a sentence Φ is satisfiable if there is at least one interpretation that satisfies it. Otherwise, Φ will be called unsatisfiable. • Validity. We will say that a sentence Φ is valid if it is satisfied by every interpretation.
Some Important Notions (cont’d) • The notions of interpretation, truth, satisfaction, entailment, satisfiability and validity are the core notions of the model theory of a knowledge representation language. • Question: Given a knowledge base KB and a sentence Φ, how do we design an algorithm that verifies whether KB entails Φ?
Inference, proof and proof-theory • Inference is the process of mechanically (i.e., algorithmically) deriving sentences entailed by a knowledge base. • An inference mechanism is called sound if it derives only sentences that are entailed. • An inference mechanism is called complete if it derives all the sentences that are entailed. • The sequence of steps used to derive a sentence Φ from a set of sentences KB is called a proof. • A proof theory is a set of rules for deriving the entailments of a set of sentences.
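Soundness and completeness can be stated compactly in symbols. The notation below is an assumption for illustration: KB ⊢ Φ stands for "the inference mechanism derives Φ from KB", and KB ⊨ Φ for "KB entails Φ".

```latex
\begin{align*}
\text{Soundness:}    &\quad KB \vdash \Phi \;\Longrightarrow\; KB \models \Phi \\
\text{Completeness:} &\quad KB \models \Phi \;\Longrightarrow\; KB \vdash \Phi
\end{align*}
```

A mechanism that is both sound and complete derives exactly the entailed sentences.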
Some KR Languages • In introductory Logic, Artificial Intelligence or Logic Programming courses, you might have seen: • Propositional Logic • First-order Logic • For both of these, you should have seen: • Syntax • Semantics • Proof theory • Now we will explore the same themes for RDF and RDF Schema.
Outline of the Presentation • Knowledge representation languages: syntax, semantics (model theory) and proof-theory. • The semantics of RDF: • Interpretation • Simple interpretation • RDF-interpretation • RDFS-interpretation • Truth, satisfaction, entailment, validity • Some interesting theorems that hold about these semantic concepts.
RDF Semantics • RDF and RDF Schema are languages for representing knowledge on the Web. • In the previous lecture, we presented several syntaxes of RDF and RDF Schema, and discussed their semantics informally. • We will now give semantics for RDF and RDF Schema in a formal way, following the traditional way of giving semantics to a knowledge representation (KR) language.
RDF Syntax • We have already presented the syntax of RDF by introducing notation for URI references, literals (plain, plain with language information, typed), blank nodes, triples and graphs. • We also presented serializations of RDF graphs: RDF/XML, Turtle, etc.
RDF Semantics: What it Does not Capture • Several aspects of meaning in RDF are ignored by the semantics we will give: • We treat URI references as simple names, ignoring aspects of meaning encoded in particular URI forms. • We do not provide any analysis of time-varying data or of changes to URI references. • Some parts of the RDF and RDFS vocabularies are not assigned any formal meaning. • In some cases, notably the reification and container vocabularies, we assign less meaning than one might expect.
Initial Definitions • Let s be a URI reference or a blank node, p be a URI reference, and o be a URI reference, a blank node or a literal. Then s p o . is an RDF triple. • An RDF graph, or simply a graph, is a set of RDF triples. • A subgraph of an RDF graph is a subset of the triples in the graph. A proper subgraph is a proper subset of the triples in the graph. • A ground RDF graph is a graph with no blank nodes. • A name is a URI reference or a literal. Names are the simplest kind of expressions that need to be assigned a meaning by an interpretation. • A typed literal contains two names: itself and its internal type URI reference. • A set of names is referred to as a vocabulary. • The vocabulary of a graph is the set of names which occur as the subject, predicate or object of any triple in the graph. • Note: URI references which occur only inside typed literals are not required to be in the vocabulary of the graph.
Interpretations • All interpretations will be relative to a set of names, called the vocabulary of the interpretation (so, strictly speaking, we give an interpretation of an RDF vocabulary, rather than of RDF itself). • Some interpretations may assign special meanings to the symbols in a particular vocabulary (e.g., to the symbol rdf:type in the vocabulary of RDF). • Interpretations which share the special meaning of a particular vocabulary will be named for that vocabulary. We will consider only • rdf-interpretations and • rdfs-interpretations in this presentation, but one can imagine other interpretations being defined based on new vocabulary that is introduced and needs to be treated specially. • An interpretation with no particular extra conditions on a vocabulary (including the RDF vocabulary itself) will be called a simple interpretation, or simply an interpretation.
Simple Interpretations A simple interpretation I of a vocabulary V is defined by: • A non-empty set IR of resources, called the domain or universe of I. • A distinguished subset LV of IR, called the set of literal values, which contains all the plain literals in V. • A set IP, called the set of properties of I. IP is not necessarily disjoint from IR. • A mapping IS from URI references in V into (IR union IP). • A mapping IL from typed literals in V into IR. • A mapping IEXT from IP into the powerset of IR x IR, i.e., the set of sets of pairs <x,y> with x and y in IR.
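The six components of a simple interpretation map naturally onto a small data structure. The following is a minimal sketch, not part of the original slides: the class name `SimpleInterpretation`, the method `problems`, and the example resources (`thing_a`, `Likes`, and so on) are all hypothetical, and resources are modelled as plain Python strings.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleInterpretation:
    IR: set                                    # non-empty universe of resources
    LV: set                                    # literal values, a subset of IR
    IP: set                                    # properties; may overlap IR
    IS: dict = field(default_factory=dict)     # URI reference -> element of IR or IP
    IL: dict = field(default_factory=dict)     # typed literal -> element of IR
    IEXT: dict = field(default_factory=dict)   # property -> set of (x, y) pairs over IR

    def problems(self, plain_literals=()):
        """List the conditions of the definition that this structure violates."""
        found = []
        if not self.IR:
            found.append("IR is empty")
        if not self.LV <= self.IR:
            found.append("LV is not a subset of IR")
        if not set(plain_literals) <= self.LV:
            found.append("LV misses a plain literal of V")
        if not set(self.IS.values()) <= self.IR | self.IP:
            found.append("IS maps outside IR union IP")
        if not set(self.IL.values()) <= self.IR:
            found.append("IL maps outside IR")
        if set(self.IEXT) != self.IP:
            found.append("IEXT is not defined exactly on IP")
        for pairs in self.IEXT.values():
            if any(x not in self.IR or y not in self.IR for x, y in pairs):
                found.append("IEXT contains a pair outside IR x IR")
        return found


# Hypothetical vocabulary: ex:a, ex:b, ex:likes and the plain literal "hello".
I = SimpleInterpretation(
    IR={"thing_a", "thing_b", '"hello"'},
    LV={'"hello"'},
    IP={"Likes"},
    IS={"ex:a": "thing_a", "ex:b": "thing_b", "ex:likes": "Likes"},
    IEXT={"Likes": {("thing_a", "thing_b")}},
)
print(I.problems(plain_literals={'"hello"'}))   # prints []
```

Keeping IP separate from IR, with IEXT attached to elements of IP, mirrors the definition directly: a property is an object in its own right, and its extension is looked up through it.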
Discussion: Literal Values • The assumption that LV is a subset of IR amounts to saying that literal values are thought of as real entities that 'exist'. • This amounts to saying that literal values are resources. • However, this does not imply that literals should be identified with URI references. • Note that LV may contain other items in addition to plain literals (e.g., for typed literals).
Discussion: Properties • Since properties are resources, the set of properties IP will typically be a subset of IR. • The set of properties IP is non-standard (e.g., compared with the notion of interpretation in FOL). Why do we need it?
Higher Order Features of RDF • RDF does not impose any logical restriction on the application of a property or its value. Since properties are resources, properties can be applied to properties. In particular, a property may be applied to itself. Examples: rdf:type rdfs:domain rdfs:Resource . and rdf:type rdf:type rdf:Property . • In RDFS, classes may contain themselves. Example: rdfs:Class rdf:type rdfs:Class . • Such membership loops can violate the axiom of foundation, one of the axioms of standard (Zermelo-Fraenkel) set theory, which forbids infinitely descending chains of set membership. • How can we avoid these higher-order features?
Higher-Order Syntax but First-Order Semantics • We distinguish between: • Properties or classes considered as objects. This is done by having the set IP. • Extensions of properties or classes (i.e., the sets of object-value pairs which satisfy the property, or the things that are in the class). This is done by having the mapping IEXT for properties and ICEXT for classes (to be given later when rdfs-interpretations are defined). • In this way, the extension of a property or class can contain the property or class itself without violating the axiom of foundation. • This technique has been known to logicians for quite some time. See: • Herbert B. Enderton. A Mathematical Introduction to Logic, Academic Press, 1972. (Section 4.4).
Example
@prefix dc: <http://dublincore.org/2008/01/14/dcterms.rdf#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex: <http://example.org/> .
ex:atwood rdf:type foaf:Person ;
    foaf:name "Margaret Atwood" ;
    foaf:homepage <http://margaretatwood.ca/> ;
    dc:creator <http://en.wikipedia.org/wiki/The_Handmaid's_Tale> .
Example (cont’d) Let us consider only the following triples:
ex:atwood foaf:name "Margaret Atwood" .
ex:atwood foaf:homepage <http://margaretatwood.ca/> .
ex:atwood dc:creator <http://en.wikipedia.org/wiki/The_Handmaid's_Tale> .
The vocabulary of the above triples is V = { ex:atwood, foaf:name, "Margaret Atwood", foaf:homepage, <http://margaretatwood.ca/>, dc:creator, <http://en.wikipedia.org/wiki/The_Handmaid's_Tale> }.
A simple interpretation I of V • I is defined by specifying the sets IR, LV and IP and the mappings IS, IL and IEXT.
A simple interpretation I (cont’d) • IR = { [image], [image], [image], "Margaret Atwood" }
A simple interpretation I (cont’d) • LV = { "Margaret Atwood" } • IP = { Namehood, Homepagehood, Creatorhood } • We have not included IP in IR; we do not need this in our example. • The mappings IS, IL and IEXT are defined as in the following slides.
IS • IS(ex:atwood) = [image] • IS(<http://margaretatwood.ca/>) = [image]
IS (cont’d) • IS(<http://en.wikipedia.org/wiki/The_Handmaid's_Tale>) = [image]
IS (cont’d) • IS(foaf:name) = Namehood • IS(foaf:homepage) = Homepagehood • IS(dc:creator) = Creatorhood
IL and IEXT • We have no typed literals, so nothing needs to be said about IL. • IEXT(Namehood) = { <[image], "Margaret Atwood"> }
IEXT (cont’d) • IEXT(Homepagehood) = { <[image], [image]> }
IEXT (cont’d) • IEXT(Creatorhood) = { <[image], [image]> }
Denotations • We will now define how an interpretation of a vocabulary determines the truth-value of any RDF graph, by a recursive definition of the denotation (i.e., the semantic value) of any RDF expression in terms of those of its immediate subexpressions. • RDF has two kinds of denotation: names denote things in the universe, and sets of triples denote truth-values.
Denotations of Literals and URI References • If E is a plain literal "aaa" in V then I(E) = "aaa". This means that plain literals denote themselves. • If E is a plain literal carrying language information "aaa"@ttt in V then I(E) = <"aaa", ttt>. • If E is a typed literal in V then I(E) = IL(E). • If E is a URI reference in V then I(E) = IS(E).
Denotations of Ground Triples and Graphs • If E is a ground triple s p o . then: • If s, p and o are in V, I(p) is in IP and <I(s),I(o)> is in IEXT(I(p)) then I(E) = true. • Otherwise, I(E) = false. • If E is a ground RDF graph then: • If I(E') = false for some triple E' in E, then I(E) = false. • Otherwise, I(E) = true.
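The truth conditions for ground triples and graphs translate almost line by line into code. The sketch below is illustrative only and rests on several assumptions: an interpretation is a plain dictionary with keys `V`, `IP`, `IS`, `IL` and `IEXT`, terms are strings, literals are recognised by a leading double quote, language-tagged literals are left out, and the vocabulary and resources are hypothetical.

```python
def denote(I, name):
    """I(E) for a name: typed literals go through IL, plain literals denote
    themselves, URI references go through IS."""
    if name in I["IL"]:
        return I["IL"][name]
    if name.startswith('"'):
        return name
    return I["IS"].get(name)


def triple_is_true(I, triple):
    s, p, o = triple
    if not all(term in I["V"] for term in triple):
        return False                      # a name outside V makes the triple false
    prop = denote(I, p)
    if prop not in I["IP"]:
        return False                      # I(p) must be a property
    return (denote(I, s), denote(I, o)) in I["IEXT"].get(prop, set())


def graph_is_true(I, graph):
    # A ground graph is false as soon as one of its triples is false.
    return all(triple_is_true(I, t) for t in graph)


# Hypothetical interpretation of the vocabulary {ex:a, ex:b, ex:likes, "hello"}.
I = {
    "V": {"ex:a", "ex:b", "ex:likes", '"hello"'},
    "IP": {"Likes"},
    "IS": {"ex:a": "thing_a", "ex:b": "thing_b", "ex:likes": "Likes"},
    "IL": {},
    "IEXT": {"Likes": {("thing_a", "thing_b")}},
}
print(triple_is_true(I, ("ex:a", "ex:likes", "ex:b")))   # True
print(triple_is_true(I, ("ex:b", "ex:likes", "ex:a")))   # False: pair not in IEXT
print(triple_is_true(I, ("ex:c", "ex:likes", "ex:b")))   # False: ex:c is not in V
```

Note that a triple using a name outside V is false rather than undefined, which is exactly the "Otherwise" clause of the definition.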
An Alternative Notation • In other presentations of semantics, the symbol of the interpretation is shown as a superscript on the vocabulary element when a denotation is written down: • E^I instead of I(E)
Example (cont’d) • Which of the following ground triples are true for the simple interpretation I specified earlier, and why? • ex:atwood foaf:name "Margaret Atwood" . • ex:atwood foaf:homepage <http://margaretatwood.ca/> . • ex:atwood foaf:homepage "Margaret Atwood" . • ex:margaret foaf:name "Margaret Atwood" .
A Note on Typed Literals • Simple interpretations treat typed literals differently than plain literals because typed literals carry extra semantics that an interpretation will typically capture. • Example: The typed literal "27"^^http://www.w3.org/2001/XMLSchema#integer will typically be mapped to the integer 27 by IL and 27 will be included in IR.
Semantics for Blank Nodes • Blank nodes are like existentially quantified variables in FOL: they indicate the existence of a thing, without using, or saying anything about, the name of that thing. • How can we specify the truth-value of a graph containing blank nodes using interpretations?
Semantics for Blank Nodes (cont’d) • Let I be an interpretation and A be a mapping from some set of blank nodes to the universe IR of I. • Define I+A to be an extended interpretation which is like I except that it uses A to give the interpretation of blank nodes. • Define blank(E) to be the set of blank nodes in graph E.
Semantics for Blank Nodes (cont’d) • If N is a blank node and A(N) is defined then [I+A](N) = A(N). • If E is an RDF graph with blank nodes then I(E) = true if [I+A'](E) = true for some mapping A' from blank(E) to IR, otherwise I(E) = false.
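The existential reading of blank nodes can be made concrete by searching for a suitable mapping A'. The sketch below is a brute-force illustration under stated assumptions: the universe IR is finite, blank nodes are strings beginning with `_:`, interpretations are dictionaries, the check that names belong to V and the treatment of typed literals are omitted for brevity, and the example vocabulary is hypothetical.

```python
from itertools import product


def is_blank(term):
    return term.startswith("_:")


def denote(I, A, term):
    """[I+A](term): blank nodes go through A, names are interpreted as in I."""
    if is_blank(term):
        return A[term]
    if term.startswith('"'):
        return term                       # plain literals denote themselves
    return I["IS"].get(term)


def triple_is_true(I, A, triple):
    s, p, o = triple
    prop = denote(I, A, p)
    pair = (denote(I, A, s), denote(I, A, o))
    return prop in I["IP"] and pair in I["IEXT"].get(prop, set())


def graph_is_true(I, graph):
    """I(E) = true iff [I+A'](E) = true for some A' from blank(E) to IR."""
    blanks = sorted({t for triple in graph for t in triple if is_blank(t)})
    for values in product(list(I["IR"]), repeat=len(blanks)):
        A = dict(zip(blanks, values))     # one candidate mapping A'
        if all(triple_is_true(I, A, t) for t in graph):
            return True
    return False


# Hypothetical interpretation with a two-element universe.
I = {
    "IR": {"thing_a", "thing_b"},
    "IP": {"Likes"},
    "IS": {"ex:a": "thing_a", "ex:b": "thing_b", "ex:likes": "Likes"},
    "IEXT": {"Likes": {("thing_a", "thing_b")}},
}
print(graph_is_true(I, {("_:x", "ex:likes", "ex:b")}))   # True: A'(_:x) = thing_a
print(graph_is_true(I, {("ex:b", "ex:likes", "_:y")}))   # False: no value works
```

For a graph without blank nodes the loop runs once with an empty mapping, so the function agrees with the definition for ground graphs.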
Satisfaction and Entailment • Definition. Interpretation I simply satisfies graph E (notation: I ╞ E) if I(E) = true. • Definition. A set S of RDF graphs simply entails a graph E (notation: S ╞ E) if every interpretation which satisfies every member of S also satisfies E. • Note: In the above definitions, we write “simply” because we will define stronger notions of satisfaction and entailment later on after introducing stronger notions of interpretation.
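Written out with explicit quantifiers, the two definitions read as follows. This is only a restatement in LaTeX; the variable G, ranging over the members of S, is introduced here for illustration.

```latex
\[
I \models E \iff I(E) = \mathrm{true},
\qquad
S \models E \iff \text{for every interpretation } I:\
\bigl(\forall G \in S:\ I \models G\bigr) \Rightarrow I \models E .
\]
```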
Some Results • Lemma. Given a non-empty RDF graph, there is always a simple interpretation that satisfies it. • There are no simply unsatisfiable non-empty RDF graphs.
Proof • Let G be a non-empty RDF graph. We construct a simple Herbrand interpretation of G, written Herb(G), as we show on the next slide. • The idea of Herbrand interpretations is known from first-order logic.
Proof (cont’d) • LV_Herb(G) is the set of all plain literals in G. • IR_Herb(G) is the set of all names and blank nodes which occur in a subject or object position in a triple in G. • IP_Herb(G) is the set of URI references which occur in the property position of a triple in G. • IEXT_Herb(G) is defined by: for each p in IP_Herb(G), IEXT_Herb(G)(p) is the set {<s,o> : G contains a triple s p o . } • IS_Herb(G) and IL_Herb(G) are both identity mappings on the appropriate parts of the vocabulary of G.
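The Herbrand construction is mechanical enough to write down directly. The sketch below assumes triples are tuples of strings, blank nodes start with `_:`, literals start with a double quote, and typed literals look like `"27"^^xsd:integer`; the function names and the sample graph are hypothetical.

```python
def is_blank(term):
    return term.startswith("_:")


def is_literal(term):
    return term.startswith('"')


def is_typed(term):
    return is_literal(term) and '"^^' in term


def herbrand(graph):
    """Build Herb(G): every term of G stands for itself."""
    nodes = {t for s, _, o in graph for t in (s, o)}      # subject/object terms
    props = {p for _, p, _ in graph}                      # property-position URIs
    uris = {t for t in nodes | props if not is_blank(t) and not is_literal(t)}
    return {
        "LV": {t for t in nodes if is_literal(t) and not is_typed(t)},
        "IR": nodes,
        "IP": props,
        "IEXT": {p: {(s, o) for s, q, o in graph if q == p} for p in props},
        "IS": {u: u for u in uris},                       # identity on URI references
        "IL": {t: t for t in nodes if is_typed(t)},       # identity on typed literals
    }


def satisfied_by_herbrand(graph):
    H = herbrand(graph)
    # Blank nodes are mapped to themselves (they are elements of IR), so every
    # term denotes itself and each triple lies in the extension of its property.
    return all(p in H["IP"] and (s, o) in H["IEXT"][p] for s, p, o in graph)


G = {("ex:a", "ex:p", "_:x"), ("_:x", "ex:q", '"hi"')}
print(satisfied_by_herbrand(G))   # True
```

The check succeeds by construction for any graph, which is the content of the lemma: the graph itself supplies a universe in which it is true.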
Some Results • Empty Graph Lemma. The empty set of triples is simply entailed by any graph, and does not simply entail any graph except itself. • Subgraph Lemma. A graph simply entails all its subgraphs.
Instances of Graphs • Definition. Let M be a mapping from a set of blank nodes to some set of literals, blank nodes and URI references. Then, any graph obtained from a graph G by replacing some or all of the blank nodes N in G by M(N) is an instance of G. • Easy results: • Any graph is an instance of itself. • An instance of an instance of G is an instance of G. • If H is an instance of G then every triple in H is an instance of some triple in G. • Definition. A proper instance of a graph G is an instance in which a blank node has been replaced by a name, or two blank nodes in the graph have been mapped into the same node in the instance (in other words, an instance with fewer blank nodes than G).
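Taking an instance is a substitution on blank nodes, which a few lines of code can illustrate. In this sketch, graphs are sets of string triples, blank nodes start with `_:`, and the helper names and example graph are hypothetical; properness is tested with the "fewer blank nodes" characterisation from the definition.

```python
def is_blank(term):
    return term.startswith("_:")


def blank_nodes(graph):
    return {t for triple in graph for t in triple if is_blank(t)}


def instance(graph, M):
    """Replace each blank node N that M covers by M(N); leave all else unchanged."""
    return {tuple(M.get(t, t) if is_blank(t) else t for t in triple)
            for triple in graph}


def is_proper_instance(graph, M):
    # Proper: the instance has fewer blank nodes than the original graph.
    return len(blank_nodes(instance(graph, M))) < len(blank_nodes(graph))


G = {("_:x", "ex:p", "_:y")}
print(instance(G, {"_:x": "ex:a"}))             # {('ex:a', 'ex:p', '_:y')}
print(is_proper_instance(G, {"_:x": "ex:a"}))   # True: a blank node became a name
print(is_proper_instance(G, {"_:x": "_:y"}))    # True: two blank nodes were identified
print(is_proper_instance(G, {"_:x": "_:z"}))    # False: only a renaming
```

With the empty mapping, `instance(G, {})` returns G itself, matching the first of the easy results.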
Some Results (cont’d) • Instance Lemma. A graph is simply entailed by any of its instances. • Is the converse true?
Simple Equivalence • Definition. Two RDF graphs G1 and G2 will be called simply equivalent if and only if G1 (simply) entails G2 and G2 (simply) entails G1. • Proposition: Two RDF graphs are simply equivalent if and only if each is an instance of the other but neither is a proper instance. • Simply equivalent graphs differ only in the identity of their blank nodes.
The Merge of Two Graphs • A merge of a set of RDF graphs is defined as follows: • If the graphs in the set have no blank nodes in common, then the union of the graphs is a merge. • If the graphs share blank nodes, then a merge is the union of a set of graphs that is obtained by replacing the graphs in the set by equivalent graphs that share no blank nodes. This is often described by saying that the blank nodes have been standardized apart. • Result: Any two merges are simply equivalent. • So we will refer to the merge, following the convention on equivalent graphs. Using the convention on equivalent graphs and identity, any graph in the original set can be considered to be a subgraph of the merge.
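Standardizing apart can be done by renaming blank nodes per graph before taking the union. The sketch below makes one simplifying choice that goes beyond the definition: it renames the blank nodes of every graph, whether or not they clash, by appending a hypothetical suffix `_g<index>`. Each renamed graph is still equivalent to the original, so the result is a merge.

```python
def is_blank(term):
    return term.startswith("_:")


def merge(graphs):
    """Union of the graphs after giving each graph its own blank-node names."""
    merged = set()
    for index, graph in enumerate(graphs):
        for triple in graph:
            merged.add(tuple(f"{t}_g{index}" if is_blank(t) else t for t in triple))
    return merged


G1 = {("_:x", "ex:p", "ex:a")}
G2 = {("_:x", "ex:q", "ex:b")}
print(sorted(merge([G1, G2])))
# [('_:x_g0', 'ex:p', 'ex:a'), ('_:x_g1', 'ex:q', 'ex:b')]
```

A plain union of G1 and G2 would wrongly claim that the same unnamed thing has both properties; after renaming, the merge only says that something has ex:p and something has ex:q.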
Relationship Between Merging and Entailment • Merging lemma. The merge of a set S of RDF graphs is simply entailed by S, and simply entails every member of S. • This means that a set of graphs can be treated as simply equivalent to its merge, i.e. a single graph, as far as the model theory is concerned.
The Interpolation Lemma • Interpolation Lemma. Let S be a set of RDF graphs. S simply entails a graph E if and only if a subgraph of the merge of S is an instance of E.
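The Interpolation Lemma turns simple entailment into a purely syntactic test: look for a mapping M on the blank nodes of E such that the resulting instance of E is contained in the merge of S. The sketch below does this by brute force, so it is exponential in the number of blank nodes of E and is meant only as an illustration. The string encoding of terms, the `_g<index>` renaming scheme and the function names are assumptions.

```python
from itertools import product


def is_blank(term):
    return term.startswith("_:")


def merge(graphs):
    """Union of the graphs after giving each graph its own blank-node names."""
    merged = set()
    for index, graph in enumerate(graphs):
        for triple in graph:
            merged.add(tuple(f"{t}_g{index}" if is_blank(t) else t for t in triple))
    return merged


def simply_entails(S, E):
    """True iff some subgraph of the merge of S is an instance of E."""
    merged = merge(S)
    terms = sorted({t for triple in merged for t in triple})
    blanks = sorted({t for triple in E for t in triple if is_blank(t)})
    for values in product(terms, repeat=len(blanks)):
        M = dict(zip(blanks, values))                 # candidate mapping on blank(E)
        candidate = {tuple(M.get(t, t) for t in triple) for triple in E}
        if candidate <= merged:                       # the instance is a subgraph
            return True
    return False


ground = {("ex:a", "ex:p", "ex:b")}
with_blank = {("_:x", "ex:p", "ex:b")}
print(simply_entails([ground], with_blank))   # True: map _:x to ex:a
print(simply_entails([with_blank], ground))   # False: no subgraph matches
print(simply_entails([ground], set()))        # True: the empty graph is always entailed
```

The second call shows that a graph need not entail its instances, while the third agrees with the Empty Graph Lemma.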