950 likes | 1.14k Views
Adaptive Query Processing with Eddies. Amol Deshpande University of Maryland. Roadmap. Adaptive Query Processing: Motivation Eddies [AH’00] STAIRs [DH’04] and SteMs [RDH’03] Experimental Study Implementation in PostgreSQL [Des’03] Continuous queries [MSHR’02] (very briefly) Open problems.
E N D
Adaptive Query Processing with Eddies Amol Deshpande University of Maryland
Roadmap • Adaptive Query Processing: Motivation • Eddies [AH’00] • STAIRs [DH’04] and SteMs [RDH’03] • Experimental Study • Implementation in PostgreSQL [Des’03] • Continuous queries [MSHR’02] (very briefly) • Open problems
Query Processing in Database Systems Declarative Query Database System Results We will focus on traditional select-project-join queries
Query Processing: Example select * from students, enrolled, courses where students.name = enrolled.name and enrolled.course = courses.course Database System Students Enrolled Courses
Students Enrolled Enrolled Courses Query Processing: Example select * from students, enrolled, courses where students.name = enrolled.name and enrolled.course = courses.course Courses Students Enrolled
ES EC CE SE Example Query: Execution Plans SEC SEC CE SE S C Students Courses C S E E Courses Enrolled Students Enrolled A Query Execution Plan An alternate Execution Plan
EC SE Cost-based Query Optimization Estimate cost of each plan and choose the best SEC Cost = g(|SE|, |C|, R) Input sizes + SE C Cost = f(|S|, |E|, R) Courses = S E Students Enrolled Runtime Parameters Cost (Plan) A Query Execution Plan
Cost-based Query Optimization Results Query Optimizer Query Executor Compiled Query Plan Declarative Query Disk(s)
Network Cost-based Query Optimization Results Query Optimizer Query Executor Compiled Query Plan Declarative Query Wide area data sources: e.g. remote tables, web data sources Disk(s)
Network Disk(s) Cost-based Query Optimization Results Query Optimizer Query Executor Compiled Query Plan Declarative Query Streaming data e.g. Stock tickers Network logs Sensor networks
EC Erroneous estimation of intermediate result sizes Input sizes may not be available SE Estimation Errors Cost = g(|SE|, |C|, R) SEC SE C Courses S E Students Enrolled A Query Execution Plan
EC Effect on the cost function may be unpredictable SE Estimation Errors Cost = g(|SE|, |C|, R) SEC SE Unknown runtime parameters C Courses S E Students Enrolled A Query Execution Plan
How to solve this problem ? • More sophisticated estimation techniques • Sophisticated summary structures • e.g. MHists [PI’97], Wavelets [VWI’98] • Feedback loop in the optimization process • e.g. [SLMK’01, BC’02] • Adaptive query processing • Can’t always build and maintain synopses • Runtime environments can be very unpredictable • So…adapt query plans mid-way during execution
Eddies: Extreme Adaptivity • Telegraph & TelegraphCQ (at UC Berkeley) • Eddies [AH’00] • SteMs [RDH’03] • Continuous queries [MSHR’02, CF’02, C+’03, K+’03] • Implementation in PostgreSQL [Des04] • Fault-tolerance and load balancing [SHB’04] • STAIRs [DH’04] • Other work • Distributed eddies, Content-based Routing [BB’05] per tuple static plans inter- operator late binding intra- operator Dynamic QEP, Parametric, Competitive Query Scrambling, MidQuery Re-opt Traditional DBMS XJoin, DPHJ Convergent QP Eddies
Roadmap • Adaptive Query Processing: Motivation • Eddies [AH’00] • STAIRs [DH’04] and SteMs [RDH’03] • Experimental Study • Implementation in PostgreSQL [Des’03] • Continuous queries [MSHR’02] (very briefly) • Open problems
Eddies [AH’00] select * from S where pred1(S) and pred2(S) Plans considered by the optimizer pred1(S) pred2(S) S Output pred2(S) pred1(S) S Output Decision made apriori based on statistics Sort by (1-s)/c, where s = selectivity, c = cost Once this decision is made, all tuples are processed using the same order
pred2(S) Eddy Output S pred1(S) Eddies [AH’00] select * from S where pred1(S) and pred2(S) Executing the query using an Eddy • An eddy operator • Intercepts tuples from source(s) and output tuples from operators • Query executed by routing tuples between the operators • Uses feedback from the operators to route Change routing ==> Change query execution plan used
Per-tuple State select * from S where pred1(S) and pred2(S) Executing the query using an Eddy pred2(S) Eddy Output S Two Bitmaps Ready bits - which operators can a tuple be routed to next Done bits - which operators has a tuple already been through pred1(S) Example: Ready(t1) = [1, 1] - can be routed to either Done(t1) = [0, 0] - not done either Example: Ready(t2) = [1, 0] - can be routed to pred1 Done(t2) = [0, 1] - done pred2 For selection queries, ready is a bit-complement of done
Eddies: Routing Policy • Choosing which operator to route a given tuple to • The brain of the eddy Pred2 is more selective Send here 99% of the time Send to the other operator 1% of the time Lottery Scheduling [Avnur 00] Simplified Description 1. Maintain for each operator: tuples sent tuples returned cost per tuple 2. Choose (roughly) based on the above 3. Explore by randomly sending tuples in the wrong orders sent = 100 received = 2 pred2(S) Eddy Output S pred1(S) sent = 30 received = 20
Students Enrolled Enrolled Courses A Join Query select * from students, enrolled, courses where students.name = enrolled.name and enrolled.course = courses.course Courses Students Enrolled
Output SE EC SE EC Eddy S Output E C C S E Eddies [AH’00] Query execution using an eddy A traditional query plan A key difference: Tuples can’t be arbitrarily routed to any operator E.g. S tuples can’t be routed to E Join C Use ready bits to identify this
Eddies w/ Joins • Traditional join operators typically consume one relation entirely and then start reading the second relation • E.g. hash join operator builds a hash table on one relation first, and then reads in the other relation • This is problematic for eddies • An eddy needs to see tuples from different relations in order to make its routing decisions • Also, if the inner relations are pre-decided, not much options left for adapting the join order • [Avnur, Hellerstein 00] discusses this issue in detail for traditional join operators
SE HashTable E.Name HashTable S.Name Symmetric Hash Join • We will use a new join operator called symmetric hash join operator • Also called doubly pipelined • Other variants include ripple joins, Xjoins (disk-based) When a new S tuple arrives: It is built into S.name hashtable Probed into E.name hash table to find matches with already arrived E tuples Matches are immediately output Symmetric Operation !! S E
SE EC HashTable E.Name HashTable S.Name HashTable C.Course HashTable E.Course Query Execution using Eddies Probe to find matches Insert with key hash(joe) Eddy S E Output C No matches; Eddy processes the next tuple
SE EC HashTable E.Name HashTable S.Name HashTable C.Course HashTable E.Course Query Execution using Eddies Probe Insert Eddy S E Output C
SE EC HashTable E.Name HashTable S.Name HashTable C.Course HashTable E.Course Query Execution using Eddies Probe Eddy S E Output C Probe
Per-tuple State • Here also we need to keep track of what operators a tuple has already been through • Again use • Ready bits - operators that can be applied next • Done bits - operators that have already been applied • Unlike selections, these are not bit-complements of each other
SE EC HashTable E.Name HashTable S.Name HashTable C.Course HashTable E.Course Per-tuple State Eddy S E Output C
SE EC HashTable E.Name HashTable S.Name HashTable C.Course HashTable E.Course Per-tuple State Eddy S E Output C
SE EC HashTable E.Name HashTable S.Name HashTable C.Course HashTable E.Course Per-tuple State Eddy S E Output C
Can we talk about what exactly the eddy did during the execution ? Yes ! Execution Postmortem
Execution Postmortem Output Output EC E S SE Courses CE Students Students Enrolled Courses Enrolled Eddy executes different query execution plans for different parts of data
Can we talk about what exactly the eddy did during the execution ? Yes ! Eddy executes different plans for different parts of data This is where the adaptivity comes from Execution Postmortem
Routing policy • Lottery scheduling unfortunately doesn’t work well with joins • Just because a join operator does not return tuples right now doesn’t mean it won’t return more tuples later • In other words, a join operator is state-ful • Selection operators are state-less
|S E| |EC| ES EC CE SE Example: Delayed Data Sources SETUP: >> Execution plan 1 Execution plan 2 SEC SEC CE SE S C C E S E Cost (Plan 1) > Cost (Plan 2)
|S E| |EC| Example: Delayed Data Sources SETUP: E and Carrive early; Sis delayed >> S E C time
|S E| |EC| SE HashTable E.Name HashTable S.Name Eddy S E Output C HashTable C.Course HashTable E.Course Eddy decides to route E to EC EC SETUP: E and Carrive early; Sis delayed >> sent and received suggested (so far) that S Join E is better option for E tuples S0 S S E S0 S –S0 E C time C S0E SE (S –S0)E Eddy learns the correct sizes Too Late !!
State got embedded as a result of earlier routing decisions |S E| |EC| SE HashTable E.Name HashTable S.Name EC Eddy S E Output C HashTable C.Course HashTable E.Course SE EC SETUP: E and Carrive early; Sis delayed >> S E C C SE S E Execution Plan Used Query is executed using the worse plan. Too Late !!
Joins and Lottery Scheduling • Lottery scheduling doesn’t work well with joins • Not clear how any routing policy can work without reasonable knowledge of future • Whatever the current state in the join operators, an adversary can send tuples to make it look very bad • Two possible solutions: • Allow manipulation of state (STAIRs) [DH’04] • Don’t embed state in the operators (SteMs) [RDH’03]
Roadmap • Adaptive Query Processing: Motivation • Eddies [AH’00] • STAIRs [DH’04] and SteMs [RDH’03] • Experimental Study • Implementation in PostgreSQL [Des’03] • Continuous queries [MSHR’02] (very briefly) • Open problems
STAIRs [DH’04] • Expose join state to the eddy • Provide state management primitives • That guarantee correctness of execution • That can be used to manipulate embedded state in the operators • Also allow support for cyclic queries etc
New Operator: STAIR SE HashTable E.Name HashTable S.Name Eddy S Output E C HashTable C.Course HashTable E.Course EC
S.Name STAIR HashTable E.Name STAIR HashTable Eddy S Output E C HashTable HashTable E.Course STAIR C.Course STAIR New Operator: STAIR Storage, Transformation and Access for Intermediate Results
S.Name STAIR Build into S.Name STAIR HashTable E.Name STAIR HashTable Eddy S Output E C HashTable HashTable E.Course STAIR C.Course STAIR Query execution using STAIRS Similar to using Join Operators Probe into E.Name STAIR s1 s1 s1 s1
STAIR: Operations • Build (insert): • Insert the given tuple into the STAIR • Probe (lookup): • Find matching tuples for the given tuple • State Management Operations: • Demotion • Promotion
e1 e1 e2c1 e2 s1e1 e2c1 e2 s1e1 State Management Primitive: Demotion Replace a tuple in a STAIR with a projection of that tuple S.Name STAIR HashTable E.Name STAIR s1 Demoting e2c1 toe2 HashTable e1 e2 e2c1 Eddy S E Output C HashTable e2 s1e1 HashTable c1 E.Course STAIR Can be thought of as undoing work C.Course STAIR
Promotinge1 using EC e1 e1 e1c1 e1 e1c1 State Management Primitive: Promotion Replace a tuple in a STAIR with the result of joining it with other tuples S.Name STAIR • Two arguments: • A tuple • A join to be used to promote this tuple HashTable E.Name STAIR s1 HashTable e1 e1c1 e2c1 Eddy S E Output C HashTable e2 s1e1 HashTable c1 e1 E.Course STAIR Can be thought of as precomputation of work C.Course STAIR
STAIRs: Correctness • Theorem: For any sequence of applications of the state management operations, STAIRs will produce the correct query output. • STAIRs will produce every result tuple • There will be no spurious duplicates
|S E| |EC| SE HashTable E.Name HashTable S.Name Eddy S E Output C HashTable C.Course HashTable E.Course Eddy decides to route E to EC EC SETUP: E and Carrive early; Sis delayed >> S0 S E S0 E C time C S0E Eddy learns the correct selectivities