1 / 22

Containment

Containment. CSE 590 DB Rachel Pottinger. Outline. Introduction Motivation Formal definition Algorithms for different complexities An application: rewriting queries using views. Containment, what is it?.

Download Presentation

Containment

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Containment CSE 590 DB Rachel Pottinger

  2. Outline • Introduction • Motivation • Formal definition • Algorithms for different complexities • An application: rewriting queries using views

  3. Containment, what is it? • For two queries, Q1 and Q2, if all of the answers to Q1 are a subset of those for Q2 for all databases, then Q1 is contained in Q2. • Denoted as Q1 Q2. • For general datalog, this is undecidable (by reduction from decision problems for context free languages)

  4. Why should I care? • Containment is useful in a number of situations, including: • Query minimization • Independence of queries using updates • Rewriting queries using views • Interesting logic problem

  5. More definitions • Equivalence of queries: Q1Q2 if they return the same answers for all databases. This is the same as Q1 Q2 and Q2 Q1 • Conjunctive query - a query that is formed only of conjunctions of predicates. • Q(X,Y):- e(X,Z),e(Z,Y)

  6. Containment Mapping • Let Q1 and Q2 be two conjunctive queries • Q1: I :- J1, …, Jl • Q2: H :- G1, …, Gk • A symbol mapping h is said to be a containment mapping if h turns Q2 into Q1; that is, h(H)= I, and for each i = 1,2,…,k, there is some j such that h(Gi)=Jj. There is no requirement that each Jj be the target of some Gi

  7. Proof Sketch • If there’s a containment mapping from Q2 to Q1, then Q1 Q2 • Suppose  maps Vars(Q2)Vars(Q1) • Let D be a database and  be an answer •  is a mapping from Vars(Q1) D • •  Vars(Q2) D • The rest of the proof follows later

  8. Example of homomorphism rules • Q1: fp(X,Y) :- e(Y,X), e(X,Z) • Q2: fp(A,B) :- e(B,A), e(C,A),e(A,D) • For Q1 Q2, map from Q2 to Q1

  9. Test for containment of a conjunctive query (Q1Q2) • Freeze the body of Q1, and put this into a canonical database • Apply Q2 to the canonical database • If Q1 can be derived from Q2 on the canonical database, then Q1 Q2, otherwise not

  10. A chilling example Q1: p(X,Z) :- a(X,Y), a(Y,Z) Q2: p(X,Z) :- a(X,U), a(V,Z) Canonical Database of Q1

  11. Proof continued • If Q1 Q2,then there is a containment mapping • Since Q1 Q2, we know that if we apply Q2 to the canonical database formed from Q1, we’ll get back the same fact we got from applying it to Q1, which makes a mapping from Q2 to Q1.

  12. Conjunctive queries with negation • Negation in the heads of the subgoals, ie: Q(X,Y):- e(X,Z),e(Z,Y) • The Levy and Sagiv test looks at an exponential number of canonical databases, thus is P2 complete • Consider all partitions of Q1; form canonical databases for all of them, D1, … Dk • For each database Di, see if the database makes all subgoals of Q1 true. • For all Di’s passing step 2, see if it the head of Q1 can be derived by applying Q2 • If so, then Q1 Q2, else not

  13. A negative example • Q1: p(X,Z):-a(X,Y), a(Y,Z), a(X,Z) • Q2: p(A,C):-a(A,B),a(B,C), a(A,D)

  14. Conjunctive Queries with Arithmetic Comparisons • Q(X,Y):-e(X,Z),e(Z,Y), Z < Y • Treat the same as the negated subgoals, only a check must be made for each ordering of each partition • Also P2 complete for dense domain such as reals

  15. Example with arithmetic comparisons • Q1:p(X,Z):-a(X,Y), a(Y,Z), X < Y • Q2:p(A,C):-A(A,B),A(B,C), A < C • false, see x = z = 0, y = 1

  16. Other complexity results •  queries restricted to queries Q1 and Q2 such that all database predicates have arity at most 2 and every database predicate occurs at most three times in the body of Q1 - P2 • Conjunctive queries where Q1 is fixed- NP complete • Conjunctive queries where Q2 is fixed - polynomial • Conjunctive query containment where Q2 is an acyclic query - polynomial time • Conjunctive queries where every database predicate occurs at most twice in the body of Q1 - linear time

  17. Rewriting Queries Using Views • Useful in query optimization • Good for query minimization • Needed to make the best use of cached information • Necessary in data integration

  18. Views • A view is a relation that is not part of the conceptual model, but is visible to the user. • Useful for common expressions, or protecting data • Example: If you had faculty(name, office, ssn) you may want students to access faculty_office(name, office)

  19. Views (con’t.) • Views can be either materialized or virtual • In data integration, data sources can be thought of as views

  20. An example of rewriting queries using views • Suppose you had two databases: • One has famous people and whether they are right or left handed • One has the birthdays of famous people • You want the birthdays of all of the lefties

  21. Containment in rewriting • Query of q(X):-e(X,Y), e(Y,X) • View of v(A,B):- e(A,C),e(C,B)

  22. A more complicated example • Q(x,u):-p(x,y),p0(y,z),p1(x,w),p2(w,u) • V1(a,b):-p(a,c),p0(c,b),p1(a,d) • V2(a,b):-p1(a,b) • V3(a,b):-p2(a,b)

More Related