Query languages II: equivalence & containment (Motivation: rewriting queries using views)
• Conjunctive queries (CQ’s)
• Extensions of CQ’s
conjunctive-ii

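Conjunctive queries: equivalence & containment
For CQ’s q1, q2 with the same head predicate:
• Containment: q1 ⊆ q2 if q1(D) ⊆ q2(D) for every db D
• Equivalence: q1 ≡ q2 if q1(D) = q2(D) for every db D
Decision problems: given q1 and q2, decide whether q1 ⊆ q2; decide whether q1 ≡ q2
The two problems are equivalent: solve one, and you have solved the other
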
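A solution for containment gives a solution for equivalence: q1 ≡ q2 iff q1 ⊆ q2 and q2 ⊆ q1
A solution for equivalence gives a solution for containment: for
q1: p(x) :- r1(..), …, rk(..)
q2: p(x) :- s1(..), …, sm(..)
let q1 ∧ q2 be p(x) :- r1(..), …, rk(..), s1(..), …, sm(..), with the non-head variables of q2 renamed apart from those of q1. Then q1 ⊆ q2 iff q1 ≡ q1 ∧ q2
(here, the ri and sj are db predicates, not necessarily different)
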
Characterizations for containment: assume q1, q2 are given
A mapping h from the variables of q2 to the variables/constants of q1 (extended naturally to constants, as the identity, and to atoms) is a homomorphism from q2 to q1 if
• It maps head(q2) to head(q1) (when the heads are identical, h is the identity on head variables)
• It maps each atom of q2 to an atom of q1
• If there are constraints on the side, Ci in qi, then h(C2) is implied by C1
Notation: h : q2 → q1

Thm: The following are equivalent, for CQ’s w/o built-in preds:
(i) q1 ⊆ q2
(ii) there is a homomorphism h from q2 to q1
Proof: (ii) ⇒ (i) is easy (and holds even with b.i. preds): every valuation v from q1 into a db D can be composed with h to give a valuation v∘h from q2 into D that produces the same head fact. Hence, every answer of q1 on D is also an answer of q2 on D
(Diagram: q2 --h--> q1 --v--> D)

For (i) ⇒ (ii): The body of a CQ (w/o b.i.’s) can be viewed as a db:
• consider each variable as a constant, different from all constants in the CQ and from the other variables
• or, replace each variable x by a distinct constant cx
Denote this db by db(q)
Obviously, q(db(q)) contains the head of q (or its image)
Example:
Q: q(d) :- movies(t,d,a), directory('Plaza', t, 19:30)
db(Q): movies(ct,cd,ca), directory('Plaza', ct, 19:30)
Applying Q to this db, one obtains q(cd) (use the “identity” valuation)

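The construction of db(q) can be tried out directly. The following is a minimal sketch, not a definitive implementation: it assumes a hypothetical encoding in which an atom is a tuple (predicate, term, …), variables are strings starting with "?", and the frozen constant for ?x is written "c_x" (assumed not to clash with constants already in the query). The names freeze, canonical_db and evaluate are illustrative only.

```python
def is_var(term):
    # Assumed convention: variables are strings starting with "?"
    return isinstance(term, str) and term.startswith("?")

def freeze(term):
    # Replace variable ?x by the constant c_x; constants stay unchanged
    return "c_" + term[1:] if is_var(term) else term

def canonical_db(body):
    """db(q): the body atoms with every variable frozen into a constant."""
    return {(atom[0], *map(freeze, atom[1:])) for atom in body}

def evaluate(head, body, db):
    """Return the set of head facts the CQ derives on db (simple nested joins)."""
    valuations = [{}]
    for atom in body:
        extended = []
        for v in valuations:
            for fact in db:
                if fact[0] != atom[0] or len(fact) != len(atom):
                    continue
                w = dict(v)
                # Variables must be bound consistently, constants must match
                if all((w.setdefault(s, t) if is_var(s) else s) == t
                       for s, t in zip(atom[1:], fact[1:])):
                    extended.append(w)
        valuations = extended
    return {(head[0], *(v[t] if is_var(t) else t for t in head[1:]))
            for v in valuations}

# The example Q from the slide
head = ("q", "?d")
body = [("movies", "?t", "?d", "?a"), ("directory", "Plaza", "?t", "19:30")]
db = canonical_db(body)
answers = evaluate(head, body, db)
```

Here answers contains the frozen head ("q", "c_d"), obtained with the identity valuation.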
(i) ⇒ (ii), continued (q2 contains q1 ⇒ there is a homomorphism from q2 to q1)
Clearly, q1(db(q1)) contains head(q1)
Since q1 ⊆ q2, q2(db(q1)) contains head(q1)
The valuation from q2 to db(q1) that yields this answer is a homomorphism (after replacing each constant cx back by the variable x)
Example:
q1: p(d) :- movies(t,d,'Jane'), directory('Plaza', t, 19:30), location('Plaza', a, 01-58776655)
q2: p(z) :- movies(t,z,a), directory('Plaza', t, 19:30)
q1 is contained in q2, with h: t ↦ t, z ↦ d, a ↦ 'Jane', which maps the two atoms of body(q2) to the first two atoms of body(q1), and head(q2) to head(q1)

Because of this characterization, such a homomorphism is also called a containment mapping from q2 to q1
Intuition: q1 is contained in q2 iff
• q1 has ‘the same or more atoms’
• q1 may have constants (or repeated variables) where q2 has variables

Another characterization: For a rule p(..) :- r1(..), …, rk(..), a model is a set of facts over p, r1, …, rk that satisfies the rule as a logical formula (assuming all variables are universally quantified)
Thm: the following are equivalent:
(i) q1 ⊆ q2
(ii) every model of q2 is also a model of q1
The important & useful characterization: homomorphism, i.e., containment mapping

Algorithm and complexity:
• To decide if q1 is contained in q2, search for a containment mapping from the variables of q2 to the variables and constants of q1: easy & fast in many cases, exponential in the worst case
• Containment is in NP: given a mapping on the variables of q2, it is easy (polynomial time) to check that it is a homomorphism to q1

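The search for a containment mapping can be sketched as a plain backtracking procedure. This is only an illustration under assumed conventions: a query is a pair (head, body), an atom is a tuple (predicate, term, …), variables are strings starting with "?", and the terms of q1 are treated as fixed targets. The function names are hypothetical; built-in predicates are not handled.

```python
def is_var(term):
    # Assumed convention: variables are strings starting with "?"
    return isinstance(term, str) and term.startswith("?")

def unify(atom2, atom1, h):
    """Extend h so that h(atom2) == atom1; return None if impossible."""
    if atom2[0] != atom1[0] or len(atom2) != len(atom1):
        return None
    h = dict(h)
    for s, t in zip(atom2[1:], atom1[1:]):
        if is_var(s):
            if h.setdefault(s, t) != t:   # s already mapped elsewhere
                return None
        elif s != t:                      # constants map to themselves
            return None
    return h

def find_containment_mapping(q2, q1):
    """Return a homomorphism from q2 to q1 as a dict, or None."""
    (head2, body2), (head1, body1) = q2, q1

    def search(i, h):
        if i == len(body2):
            return h
        for target in body1:              # try every atom of q1 as the image
            extended = unify(body2[i], target, h)
            if extended is not None:
                result = search(i + 1, extended)
                if result is not None:
                    return result
        return None                       # backtrack

    h0 = unify(head2, head1, {})
    return None if h0 is None else search(0, h0)

def contained_in(q1, q2):
    # q1 is contained in q2 iff there is a containment mapping from q2 to q1
    return find_containment_mapping(q2, q1) is not None

# The movies example from the slides
q1 = (("p", "?d"), [("movies", "?t", "?d", "Jane"),
                    ("directory", "Plaza", "?t", "19:30"),
                    ("location", "Plaza", "?a", "01-58776655")])
q2 = (("p", "?z"), [("movies", "?t", "?z", "?a"),
                    ("directory", "Plaza", "?t", "19:30")])
mapping = find_containment_mapping(q2, q1)
```

On this example, mapping sends ?t to ?t, ?z to ?d and ?a to 'Jane', and there is no mapping in the other direction because the constant 'Jane' cannot be mapped to a variable. Checking a given mapping is the cheap part; the loop over candidate target atoms is where the worst-case exponential cost comes from.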
Containment is also NP-hard: a graph G is 3-colorable iff there is a homomorphism from G (represented as an edge relation) to the 3-clique. One can represent G as the body of q2 (using distinct variables for distinct nodes) and the 3-clique as the body of q1; for both, the head can be q( ). Then G is 3-colorable iff q1 is contained in q2
• Hence, containment & equivalence are NP-complete (even for queries with no head variables*)
Note: this is expression complexity, not data complexity (here there is actually no db)
* (when such a query is applied to a db, it returns either {()} or {})

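The reduction from 3-colorability can be illustrated with a small brute-force sketch. It assumes a hypothetical encoding: the graph is a list of node pairs, each edge becomes one atom e(?u, ?v) of q2, and the 3-clique is the body of q1 over three variables ?c1, ?c2, ?c3 with edges in both directions. The function names are illustrative, and the code handles only bodies whose terms are all variables.

```python
from itertools import product

def boolean_cq_from_graph(edges):
    # One atom e(?u, ?v) per edge; distinct nodes give distinct variables
    return [("e", f"?{u}", f"?{v}") for u, v in edges]

# Body of q1: the 3-clique, with both directions of every edge
COLORS = ("?c1", "?c2", "?c3")
K3 = [("e", a, b) for a in COLORS for b in COLORS if a != b]

def exists_homomorphism(body2, body1):
    """Brute force: try every assignment of body2's variables to body1's terms."""
    variables = sorted({t for atom in body2 for t in atom[1:]})
    targets = sorted({t for atom in body1 for t in atom[1:]})
    facts = set(body1)
    for image in product(targets, repeat=len(variables)):
        h = dict(zip(variables, image))
        # Checking one candidate mapping is polynomial; there are
        # exponentially many candidates.
        if all((atom[0], *(h[t] for t in atom[1:])) in facts for atom in body2):
            return True
    return False

def three_colorable(edges):
    # G is 3-colorable iff q1 (the 3-clique) is contained in q2 (the graph)
    return exists_homomorphism(boolean_cq_from_graph(edges), K3)

triangle = [(1, 2), (2, 3), (1, 3)]
k4 = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```

With this encoding, three_colorable(triangle) holds while three_colorable(k4) does not, matching the fact that the complete graph on four nodes needs four colors.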
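Minimization of CQ’s: For q, define a minimal equivalent query as any equivalent q’ with a minimal number of body atoms
Thm: the minimal equivalent query of q
• is unique up to isomorphism,
• and can be obtained by removing some atoms from body(q)
Proof (sketch): Let q’ be a minimal equivalent query, with containment mappings h from q to q’ and g from q’ to q. The atoms g(body(q’)) form a subset of body(q) with at most as many atoms as q’. The query q’’ with this body is equivalent to q (the identity maps q’’ into q, and g∘h maps q into q’’), so it is minimal. If q’ and q’’ are two minimal equivalent queries, the containment mappings between them must be bijective on atoms (otherwise a smaller equivalent query would exist), hence the two queries are isomorphic
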
Thus, for every CQ Q, there is a subset of the body that gives a minimal equivalent query. It is called a core of Q
It is not necessarily unique (different subsets may yield cores), but all cores are isomorphic

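One way to compute a core, sketched here under assumptions rather than taken from the slides, is to try dropping body atoms one at a time and keep a removal whenever the original query still maps into the smaller body. The encoding (atoms as tuples, variables as strings starting with "?") and the names maps_into and minimize are hypothetical.

```python
def is_var(term):
    # Assumed convention: variables are strings starting with "?"
    return isinstance(term, str) and term.startswith("?")

def maps_into(body, target, h):
    """True if h extends to a mapping sending every atom of body into target."""
    if not body:
        return True
    first, rest = body[0], body[1:]
    for atom in target:
        if atom[0] != first[0] or len(atom) != len(first):
            continue
        g = dict(h)
        if all((g.setdefault(s, t) if is_var(s) else s) == t
               for s, t in zip(first[1:], atom[1:])):
            if maps_into(rest, target, g):
                return True
    return False

def minimize(head, body):
    """Greedily drop body atoms while the query stays equivalent."""
    fixed = {t: t for t in head[1:] if is_var(t)}  # identity on head variables
    core = list(body)
    for atom in body:
        candidate = [a for a in core if a != atom]
        # Dropping atoms can only enlarge the answer, so it suffices to check
        # that the original query still maps into the smaller body.
        if maps_into(body, candidate, fixed):
            core = candidate
    return core

# p(x) :- r(x,y), s(y), r(x,z): the atom r(x,z) is redundant (map z to y)
core1 = minimize(("p", "?x"), [("r", "?x", "?y"), ("s", "?y"), ("r", "?x", "?z")])
# p(x) :- r(x,y), r(x,z): either atom alone is a core
core2 = minimize(("p", "?x"), [("r", "?x", "?y"), ("r", "?x", "?z")])
```

The second example shows the non-uniqueness mentioned above: this sketch happens to keep r(x,z), while dropping the atoms in another order would keep r(x,y); the two results are isomorphic.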
Containment & equivalence for extensions of CQ’s
Extension to UCQ’s: let
q1: r1: p(x) :- body1,1
    …
    rk: p(x) :- body1,k
q2: s1: p(x) :- body2,1
    …
    sm: p(x) :- body2,m
Thm: q1 ⊆ q2 iff for each ri there is an sj such that ri ⊆ sj
Proof: (⇐) is obvious
(⇒): if q1 is contained in q2, then each ri is contained in q2
• q2(db(ri)) contains p(x)
• for some sj, sj(db(ri)) contains p(x)
• hence sj contains ri

Containment algorithm: For each ri, loop over the sj, and search for a containment mapping from sj to ri
Still exponential in the size (of both queries)
Complexity: The containment problem is now
Explanation: A relation R(..) is ptime if membership can be verified in ptime

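The double loop of the containment algorithm can be sketched as follows. This is an illustration under assumed conventions, not the author's code: a UCQ is a list of rules, each rule is a pair (head, body), atoms are tuples, and variables are strings starting with "?". The helper names are hypothetical.

```python
def is_var(term):
    # Assumed convention: variables are strings starting with "?"
    return isinstance(term, str) and term.startswith("?")

def extend(h, src_atom, dst_atom):
    """Extend mapping h so that h(src_atom) == dst_atom, or return None."""
    if src_atom[0] != dst_atom[0] or len(src_atom) != len(dst_atom):
        return None
    h = dict(h)
    for s, t in zip(src_atom[1:], dst_atom[1:]):
        if (h.setdefault(s, t) if is_var(s) else s) != t:
            return None
    return h

def has_mapping(src, dst):
    """True if there is a containment mapping from rule src to rule dst."""
    (src_head, src_body), (dst_head, dst_body) = src, dst

    def search(i, h):
        if h is None:
            return False
        if i == len(src_body):
            return True
        return any(search(i + 1, extend(h, src_body[i], a)) for a in dst_body)

    return search(0, extend({}, src_head, dst_head))

def ucq_contained_in(q1, q2):
    # Every rule r_i of q1 needs some rule s_j of q2 with a mapping s_j -> r_i
    return all(any(has_mapping(s, r) for s in q2) for r in q1)

P = ("p", "?x")
q1 = [(P, [("a", "?x"), ("b", "?x")]), (P, [("c", "?x"), ("a", "?x")])]
q2 = [(P, [("a", "?x")])]
q3 = [(P, [("b", "?x")]), (P, [("c", "?x")])]
```

In this example q1 is contained in both q2 and q3, but q2 is not contained in q1, since its single rule has no b or c atom for the rules of q1 to map to.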
For a UCQ Q we can also consider canonical dbs: for each rule r of Q, db(r) is the body of r viewed as a db
Here also:
Thm: Q1 is contained in Q2 iff for each rule r of Q1, Q2(db(r)) contains head(r)
(the rules are tested one db at a time: if the bodies of all the rules were put together in a single db, a rule of Q2 could match atoms coming from different rules of Q1)
(this also gives an algorithm for checking containment, which boils down to finding containment mappings)

Another extension of CQ’s: b.i. preds in the body
Example:
Q1: p(x, y) :- q(x, y), r(u, v), u <= v
Q2: p(x, y) :- q(x, y), r(u, v), r(v, u)
Is Q2 contained in/equivalent to Q1?
No single containment mapping from Q1 to Q2 works, since Q2 has no constraint that implies h(u) <= h(v). However, Q2 is equivalent to the union of
Q2,1: p(x, y) :- q(x, y), r(u, v), r(v, u), u <= v
Q2,2: p(x, y) :- q(x, y), r(u, v), r(v, u), v < u
Clearly, Q2,1 and Q2,2 are both contained in Q1 (for Q2,2, map u ↦ v, v ↦ u), so Q2 is contained in Q1. The two are not equivalent: Q1 does not require r(v, u)
This can be generalized to an algorithm that reduces containment to that of UCQ’s (omitted)

Containment of a UCQ Q and a (recursive) Datalog program P:
Still decidable, but double exponential time (upper & lower bound)
Here also:
Thm: P contains Q iff for each rule r of Q, P(db(r)) contains head(r)
This gives an algorithm for checking containment: apply P to db(r) for each rule r of Q, and see if you obtain head(r)
(do you see exponentials in this algorithm?)
Containment of Datalog programs: undecidable

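The test "apply P to the canonical db and look for the head" can be sketched for a single CQ with a naive bottom-up evaluator. Everything here is an assumption made for illustration: rules are pairs (head, body), atoms are tuples, variables are strings starting with "?", the frozen constant for ?x is "c_x", and the names matches, naive_eval and cq_contained_in_program are hypothetical. For a UCQ, the same test would be repeated for each of its rules.

```python
def is_var(term):
    # Assumed convention: variables are strings starting with "?"
    return isinstance(term, str) and term.startswith("?")

def matches(body, facts, v=None):
    """Yield every valuation of the body atoms into the set of facts."""
    v = v or {}
    if not body:
        yield v
        return
    first, rest = body[0], body[1:]
    for fact in facts:
        if fact[0] != first[0] or len(fact) != len(first):
            continue
        w = dict(v)
        if all((w.setdefault(s, t) if is_var(s) else s) == t
               for s, t in zip(first[1:], fact[1:])):
            yield from matches(rest, facts, w)

def naive_eval(program, db):
    """Least fixpoint of the rules over db (naive bottom-up evaluation)."""
    facts = set(db)
    while True:
        derived = {(head[0], *(v.get(t, t) for t in head[1:]))
                   for head, body in program for v in matches(body, facts)}
        if derived <= facts:          # nothing new: fixpoint reached
            return facts
        facts |= derived

def freeze(atoms):
    # Replace each variable ?x by the constant c_x
    return [(a[0], *("c_" + t[1:] if is_var(t) else t for t in a[1:]))
            for a in atoms]

def cq_contained_in_program(q, program):
    head, body = q
    (frozen_head,) = freeze([head])
    return frozen_head in naive_eval(program, freeze(body))

# P: transitive closure t of the edge relation e
P = [(("t", "?x", "?y"), [("e", "?x", "?y")]),
     (("t", "?x", "?y"), [("e", "?x", "?z"), ("t", "?z", "?y")])]
two_steps = (("t", "?x", "?y"), [("e", "?x", "?z"), ("e", "?z", "?y")])
reversed_edge = (("t", "?x", "?y"), [("e", "?y", "?x")])
```

Here the two-step path query is contained in P, while the reversed-edge query is not. As a hint for the question above: the fixpoint may hold up to n^k facts for a predicate of arity k over n constants, so the arities appearing in P end up in the exponent.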