Annotated RDF Octavian Udrea Diego Reforgiato Recupero V.S. Subrahmanian University of Maryland
Motivation • Many RDF extensions target specific scenarios: • Temporal (Gutierrez et al. 2005) • Uncertainty (Dubois et al. 2005, Straccia et al. 2005) • Provenance (Carroll et al. 2005) • Can we construct a common syntax and semantics for RDF extensions? • Together with an efficient query mechanism
Foundations of aRDF • Annotations are partial orders (A, ≤) • A_fuzzy, A_time, A_time-intervals, A_pedigree • Cartesian products can generate others • Such as A_fuzzy-time = A_fuzzy × A_time • Builds on annotated logic (Kifer et al. 1992)
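The product construction can be illustrated with a minimal sketch. It assumes that A_fuzzy is a certainty value in [0, 1] and A_time is a year, both under the usual numeric order; the names `Annotation`, `leq` and `comparable` are hypothetical and not taken from the slides.

```python
from typing import Tuple

# Assumed annotation shape: (certainty in [0, 1], year).
Annotation = Tuple[float, int]

def leq(a: Annotation, b: Annotation) -> bool:
    """Componentwise order on A_fuzzy x A_time: a <= b iff both components are <=."""
    return a[0] <= b[0] and a[1] <= b[1]

def comparable(a: Annotation, b: Annotation) -> bool:
    """In a partial order, two elements may be unrelated in either direction."""
    return leq(a, b) or leq(b, a)

print(leq((0.8, 2002), (0.9, 2003)))         # ordered pair
print(comparable((0.9, 2002), (0.7, 2004)))  # incomparable pair
```

The second call shows why the product is only a partial order: neither annotation dominates the other in both components.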
aRDF syntax Set of annotated triples (r,p:a,v)
aRDF syntax • Example: we are 0.9 sure that Max had Adam as an advisor until 2004
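One possible in-memory representation of an annotated triple (r,p:a,v) is sketched below. The class name `AnnotatedTriple`, the property name `hasAdvisor` and the (certainty, year) encoding of the annotation are assumptions made for illustration.

```python
from typing import NamedTuple, Tuple

class AnnotatedTriple(NamedTuple):
    """An aRDF triple (r, p:a, v); the annotation a is attached to the property."""
    r: str                 # resource (subject)
    p: str                 # property
    a: Tuple[float, int]   # assumed annotation: (certainty, year)
    v: str                 # value (object)

    def __str__(self) -> str:
        return f"({self.r}, {self.p}:{self.a}, {self.v})"

# "We are 0.9 sure that Max had Adam as an advisor until 2004"
t = AnnotatedTriple("Max", "hasAdvisor", (0.9, 2004), "Adam")
print(t)
```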
aRDF satisfying interpretation • We consider transitive properties as a simple inference capability • An interpretation I is a mapping from the universe of possible triples (r,p,v) to A • A satisfying interpretation I for an aRDF ontology O has: • For all (r,p:a,v) in O, a ≤ I(r,p,v) • For every path from r to v on a transitive property p, every lower bound of the set of annotations on the path is ≤ I(r,p,v) • Entailment is defined in the usual way
Satisfying interpretation example (0.9,2003) ≤ I(Max,hasSupervisor,William)
Satisfying interpretation example No matter what we assign to I(Mary,hasSupervisor,William), I will not satisfy O
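The two conditions for a satisfying interpretation can be checked mechanically on a small finite annotation domain. The sketch below is illustrative only: the ontology, the domain, and the names `paths`, `satisfies` and `make_I` are hypothetical; subproperties are ignored; and only simple paths are enumerated, so cycles returning to the start node are skipped.

```python
from itertools import product
from typing import Callable, Dict, Iterable, Iterator, List, Set, Tuple

Ann = Tuple[float, int]              # assumed annotation: (certainty, year)
Triple = Tuple[str, str, Ann, str]   # (r, p, a, v)

def leq(a: Ann, b: Ann) -> bool:
    return a[0] <= b[0] and a[1] <= b[1]

def paths(O: List[Triple], p: str, start: str) -> Iterator[Tuple[str, List[Ann]]]:
    """Yield (end node, annotations along the path) for each simple p-path from start."""
    stack = [(start, [], {start})]
    while stack:
        node, anns, seen = stack.pop()
        for (r, q, a, v) in O:
            if r == node and q == p and v not in seen:
                yield v, anns + [a]
                stack.append((v, anns + [a], seen | {v}))

def satisfies(I: Callable[[str, str, str], Ann], O: List[Triple],
              transitive: Set[str], domain: Iterable[Ann]) -> bool:
    domain = list(domain)
    # Condition 1: every asserted annotation is below the interpretation.
    if not all(leq(a, I(r, p, v)) for (r, p, a, v) in O):
        return False
    # Condition 2: every lower bound of a path's annotations is below I(r, p, v).
    for p in transitive:
        for r in {t[0] for t in O if t[1] == p}:
            for v, anns in paths(O, p, r):
                for lb in domain:
                    if all(leq(lb, a) for a in anns) and not leq(lb, I(r, p, v)):
                        return False
    return True

def make_I(assign: Dict[Tuple[str, str, str], Ann], default: Ann):
    """Build a total interpretation from a partial table plus a default value."""
    return lambda r, p, v: assign.get((r, p, v), default)

O = [("Max", "hasSupervisor", (0.9, 2003), "Adam"),
     ("Adam", "hasSupervisor", (0.8, 2004), "William")]
DOMAIN = list(product([0.7, 0.8, 0.9], [2003, 2004]))
base = {("Max", "hasSupervisor", "Adam"): (0.9, 2003),
        ("Adam", "hasSupervisor", "William"): (0.8, 2004)}
good = make_I({**base, ("Max", "hasSupervisor", "William"): (0.8, 2003)}, (0.7, 2003))
bad = make_I(base, (0.7, 2003))

print(satisfies(good, O, {"hasSupervisor"}, DOMAIN))
print(satisfies(bad, O, {"hasSupervisor"}, DOMAIN))
```

In this toy ontology the interpretation `bad` fails because (0.8, 2003) is a lower bound of both annotations on the path from Max to William, yet it is not below the default value assigned to that derived triple.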
aRDF consistency • The existence of a satisfying I: • For the triples (r,p:a_i,v), the set {a_i} has an upper bound • Let A_k(r,p,v) be the set of annotations on the k-th p-path from r to v (for transitive p) • The set B = {LB(A_k)} has an upper bound
aRDF consistency results • All RDF instances annotated with partial orders with top elements are consistent • For general partial orders, consistency verification runs in O(p · (n^3 · e + n · a^2))
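The role of a top element can be seen in a small sketch of the first consistency condition only (the annotations asserted for one (r,p,v) must share an upper bound). The provenance-style domain, its ordering and the function name `has_upper_bound` are hypothetical; the path-based condition is not covered here.

```python
from typing import Callable, Iterable, List

def has_upper_bound(anns: Iterable[str], domain: Iterable[str],
                    leq: Callable[[str, str], bool]) -> bool:
    """True if some element of the finite domain is above every annotation."""
    anns = list(anns)
    return any(all(leq(a, u) for a in anns) for u in domain)

# Assumed toy order: "unverified" is below both sources; the sources are unrelated.
ORDER = {("unverified", "sourceA"), ("unverified", "sourceB")}

def leq(a: str, b: str) -> bool:
    return a == b or b == "top" or (a, b) in ORDER

DOMAIN: List[str] = ["unverified", "sourceA", "sourceB"]
DOMAIN_WITH_TOP: List[str] = DOMAIN + ["top"]

# Two triples on the same (r, p, v) annotated with unrelated sources:
print(has_upper_bound(["sourceA", "sourceB"], DOMAIN, leq))
print(has_upper_bound(["sourceA", "sourceB"], DOMAIN_WITH_TOP, leq))
```

Without a top element the two annotations have no common upper bound, so no interpretation can sit above both; adding a top element removes the conflict.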
aRDF atomic queries • (R,P:A,V) where at most one component is a variable • Examples: • (Max, ?p:(0.8,2002), William) • (Mary, hasSupervisor:(0.7,2002), ?v) • (r,p:a,v) and (r’,p’:a’,v’) are semi-unifiable if there is a substitution θ such that: • r θ = r’ θ, p θ = p’ θ, v θ = v’ θ
aRDF atomic query answers • The answer to (R,P:A,V) is the set of (r,p:a,v) such that: • (r,p:a,v) is semi-unifiable with (R,P:A,V) and A ≤ a (where applicable) • (r,p:a,v) is entailed by the aRDF ontology • (r,p:a,v) is not entailed by a subset of the answer • The minimal (w.r.t. entailment) set of triples entailed by the theory that semi-unifies with the query
aRDF atomic query examples Query: (Max,?p:(0.8,2002), William) Answer: {(Max, hasSupervisor:(0.9,2003), William)}
aRDF atomic query examples Query: (Mary,hasSupervisor:(0.7,2002), ?v) Answer: {(Mary, hasAdvisor:(0.7,2003), William)}
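The matching step behind these examples (semi-unification plus the test A ≤ a) can be sketched as a simple filter. This covers only that one condition: it does not compute entailment or the minimal answer set. A variable is encoded as `None`, and the triple list and the name `candidates` are hypothetical.

```python
from typing import List, Optional, Tuple

Ann = Tuple[float, int]                                    # assumed: (certainty, year)
Triple = Tuple[str, str, Ann, str]                         # (r, p, a, v)
Query = Tuple[Optional[str], Optional[str], Ann, Optional[str]]

def leq(a: Ann, b: Ann) -> bool:
    return a[0] <= b[0] and a[1] <= b[1]

def candidates(query: Query, triples: List[Triple]) -> List[Triple]:
    """Triples that semi-unify with the query and whose annotation is above A."""
    R, P, A, V = query

    def unifies(q: Optional[str], t: str) -> bool:
        return q is None or q == t   # None stands for a variable such as ?p

    return [(r, p, a, v) for (r, p, a, v) in triples
            if unifies(R, r) and unifies(P, p) and unifies(V, v) and leq(A, a)]

triples = [("Max", "hasSupervisor", (0.9, 2003), "William"),
           ("Max", "hasSupervisor", (0.7, 2001), "Adam"),
           ("Mary", "hasSupervisor", (0.9, 2003), "William")]

# (Max, ?p:(0.8,2002), William)
print(candidates(("Max", None, (0.8, 2002), "William"), triples))
```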
aRDF theory closure • At each step, add to O one of: • For (r,p:a_1,v), (r,p’:a_2,v), where p’ is a subProperty of p (or p = p’), add (r,p:a,v), where a is a minimal upper bound for a_1, a_2 • Add (r,p:a,v) for (r,p’:a_1,r’), (r’,p’’:a_2,v), where • p’, p’’ are subProperty* of p • For all a’, (a’ ≤ a_1) and (a’ ≤ a_2) ⇒ (a’ ≤ a) • Monotonic operator ⇒ there exists a fixpoint lfp(O)
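A naive fixpoint iteration for the two closure rules might look like the sketch below. It makes several simplifying assumptions: annotations are (certainty, year) pairs forming a lattice, so the minimal upper bound is the componentwise maximum and the path annotation is the componentwise minimum; subproperties are ignored (only p = p’ = p’’ is handled); and the second rule is applied only to properties listed as transitive. The function names are hypothetical.

```python
from typing import Iterable, Set, Tuple

Ann = Tuple[float, int]
Triple = Tuple[str, str, Ann, str]   # (r, p, a, v)

def join(a: Ann, b: Ann) -> Ann:
    """Least upper bound in the assumed product lattice."""
    return (max(a[0], b[0]), max(a[1], b[1]))

def meet(a: Ann, b: Ann) -> Ann:
    """Greatest lower bound in the assumed product lattice."""
    return (min(a[0], b[0]), min(a[1], b[1]))

def closure(O: Iterable[Triple], transitive: Set[str]) -> Set[Triple]:
    closed = set(O)
    while True:
        new = set()
        for (r, p, a1, v) in closed:
            for (r2, p2, a2, v2) in closed:
                # Rule 1: same (r, p, v) with two annotations -> add their upper bound.
                if (r, p, v) == (r2, p2, v2):
                    new.add((r, p, join(a1, a2), v))
                # Rule 2: chain r -> v == r2 -> v2 on a transitive property.
                if p == p2 and p in transitive and v == r2:
                    new.add((r, p, meet(a1, a2), v2))
        if new <= closed:        # nothing new was derived: fixpoint reached
            return closed
        closed |= new

O = {("Max", "hasSupervisor", (0.9, 2003), "Adam"),
     ("Adam", "hasSupervisor", (0.8, 2004), "William")}
for t in sorted(closure(O, {"hasSupervisor"})):
    print(t)
```

The loop terminates on finite input because every derived annotation is built from component values already present in O.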
Naïve query answer algorithm • Compute closure lfp(O) • Choose semi-unifiable triples with annotations “above” the query’s • Eliminate any triples entailed by subsets
atomicAnswerX algorithms • lfp(O) can be exponentially large • But the minimal set we want as the answer is not • atomicAnswerV computes the answer to (R,P:A,V?) queries • atomicAnswerP computes the answer to (R,P?:A,V) queries • conjunctAnswer answers conjunctions of atomic queries
atomicAnswerX algorithms • atomicAnswerV: For the maximal transitive p-paths starting at r, compute: • The lower bound(s) on the sets of annotations • The least upper bound(s) of the previous set • atomicAnswerP: Similar approach for the maximal paths between r and v
atomicAnswerX complexity • atomicAnswerV (and R) run in time O(n^2 · e + n · e · a^2) • O(n^2 · e + n · e · a^2) when the annotation is a complete lattice • atomicAnswerP has the same worst-case complexity • atomicAnswerA is O(n · e · a^2) • Complexity results are given for finite partial orders • For lattices, the “a” factors disappear
Experimental results • Existing RDF ontologies with randomly generated annotations • Synthetically generated data up to 100,000 nodes • Also varied number of properties, node degree, number of transitive properties, etc.
Applications • We have started using aRDF on the STORY project • http://om.umiacs.umd.edu • An online aRDF system will be released in August 2006 • Features such as graphical editing and annotation, custom annotations, view maintenance
Conclusions • We have presented a general framework for extending RDF • Based on annotated logic • Simple syntax and semantics • Query algorithms are very efficient in practice