Query Folding. Xiaolei Qian. Presented by Ram Kumar Vangala. Query folding refers to the activity of determining if and how a query can be answered using a given set of resources. Resources can be views or cached results of previous queries.
Query Folding Xiaolei Qian Presented by Ram Kumar Vangala
Query Folding • Query Folding refers to the activity of determining if and how a query can be answered using a given set of resources. • Resources can be views or cached results of previous queries.
Why Query Folding • The base relations referred to in a query might be stored remotely, and accessing them might be expensive. • Accessing the database might not be possible because of network problems (for example, the client is disconnected). • The database might be conceptual and not physically available.
Query Folding Is Used For • Query optimization in centralized databases • Query processing in distributed databases • Query answering in federated databases
Example • Patients (patient_id, clinic, dob, insurance) • Physician (physician_id, clinic, pager_no) • Drugs (drug_name, generic) • Notes (note_id, patient_id, physician_id, note_text) • Allergy (note_id, drug_name, allergy_text) • Prescription (note_id, drug_name, prescription_text)
Suppose that the database maintains a materialized view defined as • CREATE VIEW Drug_Allergy (patient_id, drug_name, text) AS SELECT patient_id, drug_name, allergy_text FROM Notes, Allergy WHERE Notes.note_id = Allergy.note_id
General query • A user might use the following query to get the IDs and allergy notes of Palo Alto patients who are allergic to drug xd_2001. • SELECT Patients.patient_id, allergy_text FROM Patients, Notes, Allergy WHERE Patients.patient_id = Notes.patient_id AND Notes.note_id = Allergy.note_id AND clinic = 'palo_alto' AND drug_name = 'xd_2001'
Folded Query Using View • SELECT Patients.patient_id, text FROM Patients, Drug_Allergy WHERE Patients.patient_id = Drug_Allergy.patient_id AND clinic = 'palo_alto' AND drug_name = 'xd_2001' • This query can be cheaper than the original query because the join between Notes and Allergy is already stored in the materialized view.
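To check that the folded query returns the same answer as the general query, the two can be run side by side. The following is a minimal sketch using Python's built-in sqlite3 module. The sample rows are invented for illustration, and SQLite views are not materialized, so this checks equality of results only, not the efficiency gain.

```python
import sqlite3

# General query over the base relations.
ORIGINAL = """
    SELECT Patients.patient_id, allergy_text
    FROM Patients, Notes, Allergy
    WHERE Patients.patient_id = Notes.patient_id
      AND Notes.note_id = Allergy.note_id
      AND clinic = 'palo_alto' AND drug_name = 'xd_2001'
"""

# Folded query that uses the Drug_Allergy view instead of Notes and Allergy.
FOLDED = """
    SELECT Patients.patient_id, text
    FROM Patients, Drug_Allergy
    WHERE Patients.patient_id = Drug_Allergy.patient_id
      AND clinic = 'palo_alto' AND drug_name = 'xd_2001'
"""

def build_demo_db():
    """Create an in-memory database with made-up rows (illustration only)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Patients (patient_id, clinic, dob, insurance);
        CREATE TABLE Notes (note_id, patient_id, physician_id, note_text);
        CREATE TABLE Allergy (note_id, drug_name, allergy_text);
        INSERT INTO Patients VALUES (1, 'palo_alto', '1970-01-01', 'a'),
                                    (2, 'menlo_park', '1980-02-02', 'b'),
                                    (3, 'palo_alto', '1990-03-03', 'c');
        INSERT INTO Notes VALUES (10, 1, 100, 'n1'),
                                 (11, 2, 100, 'n2'),
                                 (12, 3, 101, 'n3');
        INSERT INTO Allergy VALUES (10, 'xd_2001', 'rash'),
                                   (11, 'xd_2001', 'nausea'),
                                   (12, 'other_drug', 'itching');
        -- Column names are set with aliases; the view is an ordinary (virtual) view.
        CREATE VIEW Drug_Allergy AS
            SELECT Notes.patient_id AS patient_id,
                   Allergy.drug_name AS drug_name,
                   Allergy.allergy_text AS text
            FROM Notes, Allergy
            WHERE Notes.note_id = Allergy.note_id;
    """)
    return con

if __name__ == "__main__":
    con = build_demo_db()
    print(sorted(con.execute(ORIGINAL).fetchall()))
    print(sorted(con.execute(FOLDED).fetchall()))
```

On this sample data both queries select only patient 1, the single Palo Alto patient with an xd_2001 allergy note.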
Query containment is a special case of query folding • The containment problem for conjunctive queries is known to be NP-complete. • NP-complete: the hardest problems in NP; no polynomial-time algorithm is known for any of them.
Conjunctive Queries • Queries that result from project-select-join expressions in which the selection conditions are restricted to equality. • Conjunctive query form: h :- p1, …, pn where h, p1, …, pn are atomic formulas whose arguments are variables or constants, h is the head, and p1, …, pn is the body.
Variables in the head are distinguished and also appear in the body. • X, Y distinguished variables • W, U other variables • A, B constants • Example of conjunctive query • q(X,Y) :- patients(X,palo_alto,W1,W2), notes(W3,X,W4,W5), allergy(W3,xd_2001,Y)
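Containment of conjunctive queries can be decided with the standard containment-mapping test: Q1 is contained in Q2 if the variables of Q2 can be mapped to terms of Q1 so that the head of Q2 becomes the head of Q1 and every body atom of Q2 becomes a body atom of Q1. The following is a minimal sketch of that test, not the algorithm from the presentation. It assumes atoms are (predicate, arguments) tuples, that terms starting with an uppercase letter are variables, and that both queries use the same head predicate; the more general query GENERAL is a hypothetical example.

```python
def is_var(term):
    """Assumed convention: uppercase initial letter means variable."""
    return term[0].isupper()

def extend(mapping, src_atom, dst_atom):
    """Extend mapping so that src_atom maps onto dst_atom; return None on failure."""
    (src_pred, src_args), (dst_pred, dst_args) = src_atom, dst_atom
    if src_pred != dst_pred or len(src_args) != len(dst_args):
        return None
    new_mapping = dict(mapping)
    for s, d in zip(src_args, dst_args):
        if is_var(s):
            # A variable must map to the same term everywhere.
            if new_mapping.setdefault(s, d) != d:
                return None
        elif s != d:
            # A constant can only map to itself.
            return None
    return new_mapping

def contained_in(q1, q2):
    """Return True if q1 is contained in q2 (containment mapping from q2 to q1)."""
    (head1, body1), (head2, body2) = q1, q2
    start = extend({}, head2, head1)
    if start is None:
        return False

    def search(i, mapping):
        # Backtracking search: exponential in the worst case.
        if i == len(body2):
            return True
        for atom in body1:
            candidate = extend(mapping, body2[i], atom)
            if candidate is not None and search(i + 1, candidate):
                return True
        return False

    return search(0, start)

# The example query from the slide above.
Q = (("q", ("X", "Y")),
     [("patients", ("X", "palo_alto", "W1", "W2")),
      ("notes", ("W3", "X", "W4", "W5")),
      ("allergy", ("W3", "xd_2001", "Y"))])

# Hypothetical, more general query: any clinic, any drug.
GENERAL = (("q", ("X", "Y")),
           [("notes", ("W3", "X", "W4", "W5")),
            ("allergy", ("W3", "D", "Y"))])

if __name__ == "__main__":
    print(contained_in(Q, GENERAL))
    print(contained_in(GENERAL, Q))
```

The backtracking search tries every way of matching body atoms, which is consistent with the NP-completeness of the problem.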
Hypergraph Representation • A hypergraph is a set of nodes together with a set of hyperedges. • It generalizes a graph: each hyperedge can connect any number of nodes. • A conjunctive query can be represented by a hypergraph (one hyperedge per conjunct, connecting the variables of that conjunct). • A conjunctive query is said to be acyclic if its hypergraph is acyclic. • Example: q(X,Y) :- notes(W1,X,W2,W3), allergy(W1,Y,W4), notes(W5,X,W6,W7), prescription(W5,Y,W8)
The example computes patients X and drugs Y such that X is prescribed Y and has a recorded allergy to Y.
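Acyclicity of a hypergraph is commonly tested with the GYO reduction: repeatedly delete nodes that occur in only one hyperedge and hyperedges contained in another hyperedge; the hypergraph is acyclic if nothing but an empty hyperedge remains. The following is a minimal sketch under the assumption that nodes are the variables of the query body and that each body atom contributes one hyperedge. Whether the presentation's reduction(GQ) is defined in exactly this way is not stated in the slides.

```python
def is_var(term):
    """Assumed convention: uppercase initial letter means variable."""
    return term[0].isupper()

def gyo_reduce(body):
    """Apply the GYO reduction to a query body; return the remaining hyperedges."""
    edges = [set(t for t in args if is_var(t)) for _, args in body]
    changed = True
    while changed:
        changed = False
        # Rule 1: delete nodes that occur in exactly one hyperedge.
        for edge in edges:
            for node in list(edge):
                if sum(node in other for other in edges) == 1:
                    edge.discard(node)
                    changed = True
        # Rule 2: delete one hyperedge that is contained in another hyperedge.
        for i, edge in enumerate(edges):
            if any(i != j and edge <= other for j, other in enumerate(edges)):
                del edges[i]
                changed = True
                break
    return edges

def is_acyclic(body):
    """A query body is acyclic if the reduction leaves no non-empty hyperedge."""
    return all(len(edge) == 0 for edge in gyo_reduce(body))

# Body of the example query q(X,Y) from the slide above.
EXAMPLE_BODY = [
    ("notes", ("W1", "X", "W2", "W3")),
    ("allergy", ("W1", "Y", "W4")),
    ("notes", ("W5", "X", "W6", "W7")),
    ("prescription", ("W5", "Y", "W8")),
]

if __name__ == "__main__":
    print(is_acyclic(EXAMPLE_BODY))
    print(gyo_reduce(EXAMPLE_BODY))
```

Under these assumptions the example query comes out cyclic: after the unshared variables are removed, its four atoms form a cycle through W1, X, W5, and Y.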
Query-Folding Problem • Folding Rules • Let Q be a query, and R = {R1, …, Rn} be a set of resources. We assume that no two resources have the same resource predicate, and that there are no variables in common between Q and Ri, or between Ri and Rj for i ≠ j, where 1 ≤ i, j ≤ n.
Folding Types • Partial folding • Strong folding • Partial folding: a partial folding of Q using R is a conjunctive query Q’ such that Q’ ⊆ Q and the body of Q’ contains one or more resource predicates defined in R.
Strong Folding • A strong folding of Q using R is a partial folding Q’ of Q using R such that Q ⊆ Q’. • That is, a strong folding of a query is a partial folding that contains the original query.
Example • r1(X1,X2,X3) :- notes(U1,X1,U2,U3), allergy(U1,X2,X3) • r2(Y1,Y2,Y3,Y4) :- notes(V1,Y1,Y2,V2), prescription(V1,Y3,V3), drugs(Y3,Y4) • Here X and Y are distinguished variables, and U and V are other variables. • A complete folding for the above example is: q(X,Y) :- r1(X,Y,W), r2(X,W1,Y,W2)
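One way to inspect a folding is to unfold it, that is, to replace each resource atom by the body of its definition and compare the result with the original query. The following is a minimal sketch of that unfolding step. The helper name unfold and the suffix scheme for renaming non-distinguished variables are illustrative assumptions, not part of the presentation.

```python
# Resource definitions from the slide above: name -> (head arguments, body atoms).
RESOURCES = {
    "r1": (("X1", "X2", "X3"),
           [("notes", ("U1", "X1", "U2", "U3")),
            ("allergy", ("U1", "X2", "X3"))]),
    "r2": (("Y1", "Y2", "Y3", "Y4"),
           [("notes", ("V1", "Y1", "Y2", "V2")),
            ("prescription", ("V1", "Y3", "V3")),
            ("drugs", ("Y3", "Y4"))]),
}

# Body of the complete folding q(X,Y) :- r1(X,Y,W), r2(X,W1,Y,W2).
FOLDING_BODY = [("r1", ("X", "Y", "W")), ("r2", ("X", "W1", "Y", "W2"))]

def unfold(body, resources):
    """Replace every resource atom in body by the body of its definition."""
    result = []
    for n, (pred, args) in enumerate(body):
        if pred not in resources:
            result.append((pred, args))
            continue
        params, resource_body = resources[pred]
        subst = dict(zip(params, args))
        for r_pred, r_args in resource_body:
            new_args = []
            for term in r_args:
                if term in subst:
                    # Distinguished variable: replace by the actual argument.
                    new_args.append(subst[term])
                elif term[0].isupper():
                    # Other variable: rename apart with a per-atom suffix.
                    new_args.append(f"{term}_{n}")
                else:
                    # Constant: keep unchanged.
                    new_args.append(term)
            result.append((r_pred, tuple(new_args)))
    return result

if __name__ == "__main__":
    for atom in unfold(FOLDING_BODY, RESOURCES):
        print(atom)
```

The unfolded body contains the notes, allergy, notes, and prescription atoms of the earlier hypergraph example query (up to variable renaming), plus an extra drugs(Y, W2) atom contributed by r2. The extra atom only restricts the answer, so the unfolded folding is contained in that query.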
Query Folding Algorithm • Let Q be a query, GQ be the hypergraph representing Q, and F be a set of folding rules. The query folding algorithm then computes the complete or partial foldings of Q using F. • Two steps: • Initialization • Folding Generation
Initialization • Compute labels for every hyperedge in GQ. • Given a hyperedge e ∈ GQ and the conjunct p associated with e, its label Le is a relation with attributes var(p). For every f ∈ F such that p unifies with head(f) with most general unifier θ, there is a tuple in Le consisting of two parts: the tuple θ(var(p)) and the expression θ(body(f)), where the second part is used to store the folding of p.
Folding Generation • Construct the set of foldings by u-joining the labels of all the hyperedges in an arbitrary order.
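The initialization step can be sketched as unifying each query conjunct with the heads of the folding rules and recording one label tuple per successful unification. The following is a simplified illustration only: it handles function-free atoms, it assumes query and rule variables are already renamed apart, and the two rules in RULES are hypothetical, since the slides do not show the actual folding rules. The u-join of the folding generation step is not covered.

```python
def is_var(term):
    """Assumed convention: uppercase initial letter means variable."""
    return term[0].isupper()

def walk(term, subst):
    """Follow variable bindings until an unbound variable or a constant is reached."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term

def unify(atom_a, atom_b):
    """Most general unifier of two function-free atoms, or None if they do not unify."""
    (pred_a, args_a), (pred_b, args_b) = atom_a, atom_b
    if pred_a != pred_b or len(args_a) != len(args_b):
        return None
    subst = {}
    for x, y in zip(args_a, args_b):
        x, y = walk(x, subst), walk(y, subst)
        if x == y:
            continue
        if is_var(y):
            subst[y] = x          # bind the rule-side variable when possible
        elif is_var(x):
            subst[x] = y
        else:
            return None           # two different constants
    return subst

def label(conjunct, rules):
    """Label of a conjunct p: one (theta(var(p)), theta(body(f))) pair per unifying rule f."""
    variables = tuple(dict.fromkeys(t for t in conjunct[1] if is_var(t)))
    rows = []
    for head, (body_pred, body_args) in rules:
        theta = unify(conjunct, head)
        if theta is not None:
            bound_vars = tuple(walk(v, theta) for v in variables)
            folded = (body_pred, tuple(walk(t, theta) for t in body_args))
            rows.append((bound_vars, folded))
    return rows

# Hypothetical folding rules, written as (head atom, body atom) pairs.
RULES = [
    (("allergy", ("U1", "X2", "X3")), ("r1", ("X1", "X2", "X3"))),
    (("prescription", ("V1", "Y3", "V3")), ("r2", ("Y1", "Y2", "Y3", "Y4"))),
]

if __name__ == "__main__":
    print(label(("allergy", ("W1", "Y", "W4")), RULES))
```

A conjunct whose predicate matches no rule head, such as a drugs atom with these two rules, gets an empty list in this sketch.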
Query Folding for Acyclic Queries • Existence of Folding • Pairwise consistency is necessary but not sufficient for the existence of foldings of cyclic queries. • Example: q(X,Y) :- patients(W1,W2,W3,W4), notes(X,W1,W5,Y), physician(W5,W2,W6) • with resources • r1(X1,X2) :- patients(B1,A1,U1,U2), notes(X1,B1,C1,X2), physician(C1,A2,U3) • r2(Y1,Y2) :- patients(B2,A2,V1,V2), notes(Y1,B2,C2,Y2), physician(C2,A1,V3)
Theorem: There exists a complete folding of acyclic query Q using folding rules F iff no hyperedges in reduction(GQ) have empty labels.
Example: consider an acyclic query that computes notes with allergic reactions, together with the clinics of the patients concerned. • q(X,Y) :- allergy(X,W1,W2), drugs(W1,W3), notes(X,W4,W5,W6), patients(W4,Y,W7,W8) • Resources: • r1(X1,X2) :- allergy(X1,U1,U2), drugs(U1,X2), notes(X1,U3,U4,U5) • r2(Y1,Y2) :- notes(Y1,V1,V2,V3), patients(V1,Y2,V4,V5), drugs(V6,V7)
Theorem: There does not exist a partial folding of acyclic query Q using folding rules F iff every hyperedge in reduction(GQ) has a singleton label.
Conclusion • Query folding can be used in centralized databases. • Queries can be answered using views instead of base relations. • With multiple queries, the result of one query can be used to partially answer another query. • In client-server applications, views can be cached.