
Minimizing View Sets without Losing Query-Answering Power

Chen Li, Stanford University. Joint work with Mayank Bawa and Jeff Ullman.


Presentation Transcript


  1. Minimizing View Sets without Losing Query-Answering Power. Chen Li, Stanford University. Joint work with Mayank Bawa and Jeff Ullman. ICDT 2001, London, UK.

  2. A web-caching scenario
     [Figure: a client with a cache and a server. Diagram labels: user query, source query, answer.]

  3. Cached query results at the client
     Source relation: Book(Title, Author, Pub, Price)
     Q1(T,A,Pr) :- book(T,A,Pub,Pr)
     Q2(T,A,Pr) :- book(T,A,prenhall,Pr)
     Q3(A1,A2) :- book(T,A1,prenhall,Pr1), book(T,A2,prenhall,Pr2)

  4. What query results to remove?
     Book(Title, Author, Pub, Price)
     Cached query results:
     Q1(T,A,Pr) :- book(T,A,Pub,Pr)
     Q2(T,A,Pr) :- book(T,A,prenhall,Pr)
     Q3(A1,A2) :- book(T,A1,prenhall,Pr1), book(T,A2,prenhall,Pr2)
     • Q2 ⊆ Q1
     • Remove Q2? Then we cannot answer the query below, because Q1 projects out Pub and the prenhall selection can no longer be applied:
       Q(T,Pr) :- book(T,smith,prenhall,Pr)

  5. How about removing Q3?
     Book(Title, Author, Pub, Price)
     Cached query results:
     Q1(T,A,Pr) :- book(T,A,Pub,Pr)
     Q2(T,A,Pr) :- book(T,A,prenhall,Pr)
     Q3(A1,A2) :- book(T,A1,prenhall,Pr1), book(T,A2,prenhall,Pr2)
     Compute Q3 using Q2:
     Q3(A1,A2) :- Q2(T,A1,Pr1), Q2(T,A2,Pr2)
     We are not losing any query-answering power!

  6. Observations:
     • Traditional query containment [Chandra and Merlin, 1977] does not help.
     • We should consider query-answering power.
     • General questions:
       • How to describe “query-answering power”?
       • How to minimize a view set without losing its query-answering power?

  7. Rest of the talk • Answering queries using views • Query-answering power • p-containment • Relationship with traditional query containment • Minimizing a view set • p-containment relative to a set of queries • Conclusion and open problems

  8. Answering queries using views
     • Conjunctive queries and views: h(X) :- g1(X1),…,gn(Xn)
     • Example:
       V1(T,A,Pr) :- book(T,A,Pub,Pr)
       V2(T,A,Pr) :- book(T,A,prenhall,Pr)
       V3(A1,A2) :- book(T,A1,prenhall,Pr1), book(T,A2,prenhall,Pr2)

  9. Query answerability
     • A query Q is answerable by a view set V if we can rewrite Q using views in V [LMSS95].
     • Example:
       V2(T,A,Pr) :- book(T,A,prenhall,Pr)
       V3(A1,A2) :- book(T,A1,prenhall,Pr1), book(T,A2,prenhall,Pr2)
       V3 is answerable by V2:
       V3(A1,A2) :- V2(T,A1,Pr1), V2(T,A2,Pr2)
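The rewriting on slide 9 can be checked on a small instance. The sketch below is only an illustration: the rows of `book` are made up, and agreement on one toy database does not prove the rewriting equivalent in general. It computes V3 once from the base relation and once from V2 alone.

```python
# Toy instance of book(Title, Author, Pub, Price); the rows are invented for illustration.
book = {
    ("db", "smith", "prenhall", 50),
    ("db", "jones", "prenhall", 50),
    ("ai", "lee", "mit", 70),
}

# V2(T,A,Pr) :- book(T,A,prenhall,Pr)
v2 = {(t, a, pr) for (t, a, pub, pr) in book if pub == "prenhall"}

# V3(A1,A2) :- book(T,A1,prenhall,Pr1), book(T,A2,prenhall,Pr2)
v3_direct = {
    (a1, a2)
    for (t1, a1, pub1, _) in book
    for (t2, a2, pub2, _) in book
    if t1 == t2 and pub1 == "prenhall" and pub2 == "prenhall"
}

# Rewriting from the slide: V3(A1,A2) :- V2(T,A1,Pr1), V2(T,A2,Pr2)
v3_from_v2 = {(a1, a2) for (t1, a1, _) in v2 for (t2, a2, _) in v2 if t1 == t2}

print(v3_direct == v3_from_v2)
```

Both sets pair every two authors of the same Prentice Hall title, so the cached result of V2 is enough to reproduce V3 on this instance.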

  10. Algorithms
      • Bucket algorithm [LRO96]
      • Inverse-rule algorithm [DG97,Qia96]
      • MiniCon algorithm [PL00]
      • SVB algorithm [Mit99]
      • CoreCover algorithm [ALU00]
      Testing whether a query is answerable by a set of views is NP-complete.

  11. Views are expensive to maintain • Require storage space. • Need to be kept up-to-date. We want to minimize a given view set while keeping its query-answering power.

  12. p-containment
      • A view set V is p-contained in another view set W if W can answer all the queries that are answerable by V.
      • “p” stands for “power.”
      • Denoted: V ⪯p W
      • Two view sets V and W are equipotent if V ⪯p W and W ⪯p V.
      • Equipotent view sets have the same power to answer queries.

  13. Example:
      V1(T,A,Pr) :- book(T,A,Pub,Pr)
      V2(T,A,Pr) :- book(T,A,prenhall,Pr)
      V3(A1,A2) :- book(T,A1,prenhall,Pr1), book(T,A2,prenhall,Pr2)
      {V1,V2,V3} ⪯p {V1,V2}
      {V1,V2} ⪯p {V1,V2,V3}
      Therefore {V1,V2,V3} and {V1,V2} are equipotent.

  14. Lemma: V ⪯p W iff each view in V can be answered by W.
      • Implies an algorithm for testing p-containment.
      • Assuming view sets are finite.
      • Theorem: Testing V ⪯p W is NP-complete.
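The lemma on slide 14 reduces p-containment to one answerability test per view. The following is a minimal sketch under stated assumptions, not the algorithm from the paper or any of the algorithms cited on slide 10: it assumes conjunctive queries without comparisons, set semantics, and "answerable" meaning "has an equivalent rewriting". It freezes the query body into a canonical database, applies every view to it, expands the resulting view tuples back into base atoms, and looks for a containment mapping from the query into that expansion. All function names are hypothetical.

```python
from itertools import count

def is_var(term):
    # Convention used on the slides: variables start with an uppercase letter.
    return term[0].isupper()

def homomorphisms(atoms, facts, subst=None):
    """Yield every substitution that maps all atoms into the given set of facts."""
    subst = subst or {}
    if not atoms:
        yield dict(subst)
        return
    (pred, args), rest = atoms[0], atoms[1:]
    for fpred, fargs in facts:
        if fpred != pred or len(fargs) != len(args):
            continue
        new, ok = dict(subst), True
        for a, f in zip(args, fargs):
            if is_var(a):
                ok = new.setdefault(a, f) == f   # variable: bind or check binding
            else:
                ok = a == f                      # constant: must match exactly
            if not ok:
                break
        if ok:
            yield from homomorphisms(rest, facts, new)

def freeze(term):
    # Turn a variable into a distinct constant; leave real constants alone.
    return "_" + term if is_var(term) else term

def answerable(query, views):
    """Sketch: does the conjunctive query have an equivalent rewriting over views?"""
    (_, qargs), qbody = query
    canonical_db = {(p, tuple(freeze(a) for a in args)) for p, args in qbody}
    fresh = count()
    expansion = set()
    for (_, vargs), vbody in views:
        for h in homomorphisms(vbody, canonical_db):
            tag = next(fresh)
            def image(t):
                if not is_var(t):
                    return t
                # Head variables keep their image; others become fresh constants.
                return h[t] if t in vargs else f"_f{tag}_{t}"
            expansion |= {(p, tuple(image(a) for a in args)) for p, args in vbody}
    fixed = {a: freeze(a) for a in qargs if is_var(a)}   # head must map to itself
    return any(True for _ in homomorphisms(qbody, expansion, fixed))

def p_contained(v_set, w_set):
    # Lemma from slide 14: V is p-contained in W iff W answers each view in V.
    return all(answerable(v, w_set) for v in v_set)

V1 = (("V1", ("T", "A", "Pr")), [("book", ("T", "A", "Pub", "Pr"))])
V2 = (("V2", ("T", "A", "Pr")), [("book", ("T", "A", "prenhall", "Pr"))])
V3 = (("V3", ("A1", "A2")), [("book", ("T", "A1", "prenhall", "Pr1")),
                             ("book", ("T", "A2", "prenhall", "Pr2"))])

print(p_contained([V1, V2, V3], [V1, V2]), p_contained([V2], [V1]))
```

The search in `homomorphisms` is exponential in the worst case, which is consistent with the NP-completeness stated on the slide.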

  15. p-containment and query containment
      V1(T,A,Pr) :- book(T,A,Pub,Pr)
      V2(T,A,Pr) :- book(T,A,prenhall,Pr)
      V3(A1,A2) :- book(T,A1,prenhall,Pr1), book(T,A2,prenhall,Pr2)
      • Query containment does not imply p-containment: V2 is contained in V1, but {V2} is not p-contained in {V1}.
      • p-containment does not imply query containment: {V3} is p-contained in {V2}, but V3 is not contained in V2.

  16. Minimizing a view set
      • Keep removing views from the view set while retaining equipotence.
      • There might be multiple equipotent minimal subsets:
        V1(A) :- r(A,B)
        V2(B) :- r(A,B)
        V3(A,B) :- r(A,X), r(Y,B)
        {V1,V2,V3} has two equipotent minimals: {V1,V2} and {V3}
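The removal loop on slide 16 can be sketched as a greedy pass. This is an assumption about how "keep removing views" might be implemented, not the authors' procedure. The function `minimize` and its `answers(remaining, view)` parameter are hypothetical; `answers` stands for any answerability test, and here it is replaced by a hand-coded table for the three views on the slide.

```python
def minimize(views, answers):
    """Greedily drop each view that the remaining views can still answer.

    `answers(remaining, view)` is an assumed answerability oracle.
    The result is minimal, but which minimal set comes out depends on the order.
    """
    kept = list(views)
    for v in list(views):
        rest = [w for w in kept if w != v]
        if answers(rest, v):
            kept = rest
    return kept

def slide16_oracle(remaining, view):
    # Hand-coded answerability facts for the slide's example only.
    have = set(remaining)
    if view in have:
        return True
    if view == "V3":
        return {"V1", "V2"} <= have   # V3 is the cross product of V1 and V2
    return "V3" in have               # V1 and V2 are projections of V3

print(minimize(["V1", "V2", "V3"], slide16_oracle))
print(minimize(["V3", "V2", "V1"], slide16_oracle))
```

Trying the views in a different order reaches the other minimal set, which matches the observation that equipotent minimals need not be unique.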

  17. p-containment relative to queries Queries: Q={Q1,Q2,…} V = {V1,V2,…,Vm} W = {W1,W2,…,Wn} V is p-contained in W w.r.t. Q if the queries in Q that are answerable by V are also answerable by W.

  18. Example of relative p-containment
      Relations: car(Make,Dealer), loc(Dealer,City)
      Queries:
      Q1(D,C) :- car(toyota,D), loc(D,C)
      Q2(D,C) :- car(honda,D), loc(D,C)
      Views:
      V = {V1,V2}, V1 = Q1, V2 = Q2
      W = {W1}
      W1(M,D,C) :- car(M,D), loc(D,C)
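The slide does not spell out the conclusion of this example. One reading is that W answers both queries through a selection on W1, for instance Q1(D,C) :- W1(toyota,D,C), so V is p-contained in W with respect to {Q1, Q2}. The sketch below checks that rewriting on invented data; the rows of `car` and `loc` are assumptions for illustration.

```python
# Toy instances of car(Make, Dealer) and loc(Dealer, City); rows are invented.
car = {("toyota", "d1"), ("honda", "d2"), ("ford", "d3")}
loc = {("d1", "sf"), ("d2", "pa"), ("d3", "la")}

def query(make):
    # Q(D,C) :- car(make,D), loc(D,C), evaluated on the base relations.
    return {(d, c) for (m, d) in car for (d2, c) in loc if m == make and d == d2}

# W1(M,D,C) :- car(M,D), loc(D,C)
w1 = {(m, d, c) for (m, d) in car for (d2, c) in loc if d == d2}

def query_from_w1(make):
    # Rewriting over the view only: Q(D,C) :- W1(make,D,C)
    return {(d, c) for (m, d, c) in w1 if m == make}

print(query("toyota") == query_from_w1("toyota"))
print(query("honda") == query_from_w1("honda"))
```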

  19. Testing relative p-containment • Q is finite: test by the definition. • Q is infinite?

  20. Parameterized queries
      • Motivation: web search forms.
      • A parameterized query (PQ) is a conjunctive query with placeholders.
      • Example: q(D) :- car($M,D), loc(D,$C)
      • The placeholders $M and $C are replaced by constants.
      • Instances:
        q(D) :- car(toyota,D), loc(D,sf)
        q(D) :- car(honda,D), loc(D,pa)
      • The domain of each placeholder is infinite.
      • Thus a PQ represents an infinite number of queries.

  21. Q: q(D) :- car($M,D), loc(D,$C)
      • v1(M,D,C) :- car(M,D), loc(D,C)
        Answers all instances of Q.
      • v2(M,D) :- car(M,D), loc(D,sf)
        Answers some instances of Q. The answerable instances of Q are the instances of:
        q(D) :- car($M,D), loc(D,sf)
      • v3(M) :- car(M,D), loc(D,sf)
        Answers no instances of Q.

  22. Assume queries are generated by one PQ.
      • Results are easily extendable to the case with a finite set of PQs.
      • Complete answerability of a PQ using views: V can answer all instances of a PQ Q.
      • Example:
        q(D) :- car($M,D), loc(D,$C)
        v1(M,D,C) :- car(M,D), loc(D,C)

  23. An algorithm for testing complete answerability
      • Replace each placeholder with a new distinct constant to get a canonical instance I.
      • Test if I is answerable by V.
      Example:
      PQ: q(D) :- car($M,D), loc(D,$C)
      View: v1(M,D,C) :- car(M,D), loc(D,C)
      Canonical instance: q(D) :- car(m0,D), loc(D,c0)
      Rewriting: q(D) :- v1(m0,D,c0)
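The first step of the test on slide 23, building the canonical instance, is mechanical. The sketch below treats a PQ as a plain string with `$Name` placeholders, which is an assumption about representation and not something the slides specify. `instantiate` and `canonical_instance` are hypothetical helper names, and the `m0`, `c0` naming simply mirrors the slide. The second step, testing whether the instance is answerable by V, is not shown.

```python
import re
from itertools import count

# A placeholder is "$" followed by an identifier, as in $M or $C.
PLACEHOLDER = re.compile(r"\$([A-Za-z]\w*)")

def instantiate(pq, binding):
    """Replace each placeholder $X in the PQ by the constant binding['X']."""
    return PLACEHOLDER.sub(lambda m: binding[m.group(1)], pq)

def canonical_instance(pq):
    """Replace each placeholder by a new constant not already used in the PQ."""
    used = set(re.findall(r"\w+", pq))      # identifiers already present
    binding = {}
    for name in PLACEHOLDER.findall(pq):
        if name in binding:
            continue                        # same placeholder, same constant
        for i in count():
            const = f"{name.lower()}{i}"
            if const not in used:
                break
        used.add(const)
        binding[name] = const
    return instantiate(pq, binding)

pq = "q(D) :- car($M,D),loc(D,$C)"
print(instantiate(pq, {"M": "toyota", "C": "sf"}))
print(canonical_instance(pq))
```

Choosing constants that occur nowhere else in the query keeps the canonical instance from accidentally matching a constant in a view definition, such as `sf` in v2.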

  24. Partial answerability
      • Some instances of Q are answerable by V:
        q(D) :- car($M,D), loc(D,$C)
        v2(M,D) :- car(M,D), loc(D,sf)
      • Theorem: All the answerable instances of a PQ using V are instances of a finite set of PQs, s.t. each of them is completely answerable by V. In the example, that set contains the single PQ:
        q(D) :- car($M,D), loc(D,sf)

  25. An algorithm for finding the finite set of PQs.
      [Figure: within all instances of a parameterized query Q, the instances answerable by V = {V1,…,Vn} are covered by the PQs PQ1, PQ2, …, PQk.]

  26. Testing p-containment w.r.t. a PQ
      • Find the PQs whose instances are all the instances of Q that are answerable by V.
      • For each of these PQs, test if it is completely answerable by W.
      • Details are in the paper.

  27. Conclusion • Introduced p-containment, which is different from query containment. • Showed how to minimize a view set without losing query-answering power. • Developed an algorithm for testing relative p-containment w.r.t. instances of PQs. • Extended to MCR-containment.

  28. Open problems • Find a view subset with lowest “cost.” • If views are not given, find the best views to materialize.
