Adaptively Processing Remote Data
Zachary G. Ives
University of Pennsylvania
CIS 650 – Database & Information Systems
February 28, 2005
Administrivia
Next reading assignment:
• Doan et al. – LSD
• Recall that the midterm will be due 3/16
• You can go ahead and choose a topic – let me know which one
Sources of Query Answering Cost
• Regular computation (this has a minimum cost)
• But we can get held up by inflexible query plans:
  • Delays – we "stall" waiting for I/O
    • Solutions: query scrambling, pipelined hash joins
  • Bad query plans
    • Bad estimation of intermediate result sizes – the focus of the Kabra and DeWitt paper
  • Insufficient source information
    • Eddies and ADP
    • Exploration vs. exploitation; extrapolating performance
Kabra and DeWitt Recap
• Provides significant potential for improvement without adding much overhead
• Biased towards exploitation, with very limited information-gathering
• A great way to retrofit an existing system
  • In SIGMOD 2004, IBM published a paper that did this in DB2
• But not appropriate for remote data:
  • Relies on us knowing the (rough) cardinalities of the sources
  • Query plans aren't pipelined
A Second Issue: Delays
• May have a very computationally inexpensive plan, but slow sources
• The query plan might get held up waiting for data
• Solution 1: query scrambling
  • Rescheduling – while delayed, find a non-executing part of the query plan and start it
  • Operator synthesis – when nothing can be rescheduled, we might tinker with the original plan
  • Want to do this in a cost-based way
Cost-Based Query Scrambling
• Divide the plan into runnable subtrees based on the schedule; schedule those with 75% efficiency
• Each subtree may be run out of order if it materializes its result
  • Cost = materialization write + processing + materialization read
  • Efficiency = (M – MR) / (P + MW): savings / cost (a worked example follows this slide)
• When there are no more runnable subtrees, we need to do something else
• Operator synthesis: try to find a computation that will mask the delay – but how much should we do?
  • PAIR: only do a single join over a pair of relations
  • IN: include delay; chooses to defer the delayed access to the end
  • ED: estimated delay; progressively increases the delay estimate
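To make the efficiency metric concrete, here is a tiny illustrative computation. The variable names follow the slide's formula, but the numbers and the reading of M as "delay time masked" are assumptions, not taken from the paper:

```python
def efficiency(M, MR, MW, P):
    """Savings / cost for running a subtree early and materializing it:
    M  -- delay time masked by running the subtree now (assumed meaning)
    MR -- cost of reading the materialized result back later
    MW -- cost of writing (materializing) the result
    P  -- processing cost of the subtree itself"""
    return (M - MR) / (P + MW)

# Masking 10s of delay for 4s of work, 1s of writing, 0.5s of re-reading:
print(efficiency(M=10.0, MR=0.5, MW=1.0, P=4.0))  # 1.9 -- well worth it
```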
Overall Assessment
• Hard to estimate how much work to do!
• A trade-off between aggressive and conservative strategies
• If there are enough resources, it doesn't matter…
Solution 2: Pipelined Hash Joins
• They can dynamically adapt to delays, alleviating the need for techniques like query scrambling (except between different pipeline stages) – see the sketch after this slide
• Disadvantages:
  • Memory – what happens if we run out?
    • The focus of [Ives+99], [Urhan+00]
  • Cost – a query plan with a sub-optimal join order will still be very expensive!
• The central piece of the next two techniques we'll look at…
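For reference, a minimal in-memory sketch of the symmetric (pipelined) hash join idea. All names here are illustrative, and a real implementation must also spill to disk when memory runs out:

```python
import itertools
from collections import defaultdict

def interleave(r, s):
    """Tag tuples with their source and alternate between the two
    streams -- a stand-in for whatever order data arrives from the
    network."""
    tagged_r = (('R', t) for t in r)
    tagged_s = (('S', t) for t in s)
    for pair in itertools.zip_longest(tagged_r, tagged_s):
        for item in pair:
            if item is not None:
                yield item

def pipelined_hash_join(stream_r, stream_s, key_r, key_s):
    """Symmetric hash join: hash tables on BOTH inputs; each arriving
    tuple is inserted into its own table and probes the other, so
    results are produced as soon as matching data trickles in."""
    table_r, table_s = defaultdict(list), defaultdict(list)
    for side, tup in interleave(stream_r, stream_s):
        if side == 'R':
            table_r[key_r(tup)].append(tup)
            for match in table_s.get(key_r(tup), []):
                yield (tup, match)
        else:
            table_s[key_s(tup)].append(tup)
            for match in table_r.get(key_s(tup), []):
                yield (match, tup)

# Results appear in arrival order, without waiting for either source:
R = [(1, 'a'), (2, 'b')]
S = [(2, 'x'), (1, 'y')]
print(list(pipelined_hash_join(R, S, lambda t: t[0], lambda t: t[0])))
# [((2, 'b'), (2, 'x')), ((1, 'a'), (1, 'y'))]
```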
Extreme Adaptivity: Eddies
• The basic idea:
  • Query processing consists of sending tuples through a series of operators
  • Why not treat it like a routing problem?
  • Rely on "back-pressure" (i.e., queue overflow) to tell us where to send tuples
• Part of the ongoing Telegraph project at Berkeley
  • Large-scale federated, shared-nothing data stream engine
  • Variations in data transfer rates
  • Little knowledge of data sources
Telegraph Architecture
• Simple "pre-optimizer" to generate the initial plan
  • Creates operators, e.g.:
    • Select: predicate(sourceA)
    • Join: predicate(sourceA, sourceB)
    • (No support for aggregation, union, etc.)
  • Chooses implementations – generally relies on pipelined hash joins
• Goal: dataflow-driven scheduling of operations
  • A tuple comes into the system
  • Adaptively routed through operators in "eddies" – may be combined, discarded, etc.
The Eddy
• Represents a set of possible orderings of a subplan
• Each tuple may "flow" through a different ordering (which may be constrained)
• N-ary module consisting of query operators
• Basic unit of adaptivity
• Subplan with select, project, join operators
Example Join Subplan Alternatives
[Figure: alternative join trees for R1 ⋈ R2 ⋈ R3 – e.g., joining R1 and R3 on R1.x = R3.x first and then joining with R2, versus other join orders over the same predicates]
Naïve Eddy
• Given a tuple, route it to an operator that's ready
• Analogous to fluid dynamics
• Adjusts to operator costs
• Ignores operator selectivity
Lottery-Based Eddy
• Need to favor more selective operators
• Ticket given per tuple input, returned per tuple output
• Lottery scheduling based on number of tickets
• Now handles both selectivity and cost
Enhancing Adaptivity: Sliding Window
• Tickets were for the entire query
• Weighted sliding window approach (see the sketch below):
  • "Escrow" tickets earned during the current window
  • "Banked" tickets from the previous window
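Putting the last two slides together, a rough Python sketch of lottery-based routing with escrowed and banked ticket windows. The class and its policy details are hypothetical simplifications, not Telegraph's actual code:

```python
import random

class LotteryEddy:
    """Illustrative lottery-based tuple router (not Telegraph's code).
    Operators that consume many tuples and emit few (cheap, selective
    ones) accumulate tickets and win the routing lottery more often."""

    def __init__(self, operators, window=1000):
        self.operators = list(operators)
        self.window = window                            # tuples per window
        self.seen = 0
        self.escrow = {op: 1 for op in self.operators}  # current window
        self.banked = {op: 1 for op in self.operators}  # previous window

    def tickets(self, op):
        # Weight recent behavior (escrow) together with the prior window.
        return 2 * self.escrow[op] + self.banked[op]

    def choose(self, eligible):
        # Lottery draw among operators this tuple hasn't visited yet.
        weights = [self.tickets(op) for op in eligible]
        return random.choices(eligible, weights=weights, k=1)[0]

    def credit_input(self, op):
        self.escrow[op] += 1          # one ticket per tuple routed in

    def debit_output(self, op):
        # Ticket returned per output tuple: unselective operators give
        # most of their tickets back and are routed to later.
        self.escrow[op] = max(1, self.escrow[op] - 1)

    def tick(self):
        self.seen += 1
        if self.seen % self.window == 0:  # slide the window
            self.banked = self.escrow
            self.escrow = {op: 1 for op in self.operators}

# Hypothetical usage: route the next tuple among two operators.
eddy = LotteryEddy(operators=['sel_A', 'join_AB'])
op = eddy.choose(['sel_A', 'join_AB'])
eddy.credit_input(op)
```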
What about Delays?
• Problems here: we don't know when a join buffers vs. "discards" tuples
[Figure: join over sources R, S, and T, with one source (R) slow]
Eddy Pros and Cons
• Mechanism for adaptively re-routing queries
  • Makes the optimizer's task simpler
  • Can do nearly as well as a well-optimized plan in some cases
  • Handles variable costs, variable selectivities
• But doesn't really handle joins very well – attempts to address this in follow-up work:
  • STeMs – break a join into separate data structures; requires re-computation at each step
  • STAIRs – create intermediate state and shuffle it back and forth
• Open questions: pre-optimization? Distribution (DeWitt et al.)? Beyond joins?
Generalizing Adaptive Query Processing
• We've seen a range of different adaptive techniques
• How do they fit together?
• Can we choose points between eddies and mid-query re-optimization?
Types of Adaptive Query Processing
• Adaptive scheduling (query scrambling [UF98], dynamic pipeline scheduling [UF01], XJoin [UF00], pipelined hash join [RS86][I+99], ripple join [HH99])
  • Changes CPU scheduling to improve feedback or reduce delays
• Redundant computation (competitive execution [AZ96])
  • Compare two or more ways of executing the query
• Plan partitioning ([S+76][KD98][I+99][M+04])
  • Break the plan into stages; re-optimize future stages as necessary
• Adaptive information passing ([IT05 sub.])
  • Pass data between parts of an executing plan
• Adaptive data partitioning
  • Break the data into subsets; use a different plan for each subset
  • The only way to reduce overall computation with fine granularity
  • The first (and only) implementation has been eddies [AH00][R+03][DH04]
Eddies Combine Adaptive Scheduling and Data Partitioning Decisions
• Intuitively, each tuple gets its own query plan
  • Route to the next operator based on the speed and selectivity of each operator
  • Elegant and simple to implement
• But performing a join creates sub-results at the next level!
  • Local, greedy choices may result in state that needs to join with all future data!
• Consider the long-term effects of decisions before making them – separate CPU scheduling from plan selection
Focusing Purely on Adaptive Data Partitioning
• Use adaptively scheduled operators to "fill CPU cycles"
• Now a query optimizer problem: choose a plan that minimizes long-term cost (in CPU cycles)
• To allow multiple plans, distribute union through join (and select, project, etc.):
  If R1 = R11 ∪ R12 and R2 = R21 ∪ R22, then:
  R1 ⋈ R2 = (R11 ∪ R12) ⋈ (R21 ∪ R22)
          = (R11 ⋈ R21) ∪ (R12 ⋈ R22) ∪ (R11 ⋈ R22) ∪ (R12 ⋈ R21)
• This generalizes to n joins and other SPJ + GU operators… (the identity is sanity-checked below)
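A quick sanity check of the identity on toy relations (illustrative only; the join is an equijoin on the first attribute):

```python
def join(r, s):
    # Toy equijoin on the first attribute of each tuple.
    return {(a, b) for a in r for b in s if a[0] == b[0]}

R11, R12 = {(1, 'r1')}, {(2, 'r2')}             # R1 = R11 ∪ R12
R21, R22 = {(1, 's1')}, {(2, 's2'), (3, 's3')}  # R2 = R21 ∪ R22

whole = join(R11 | R12, R21 | R22)
pieces = (join(R11, R21) | join(R12, R22)       # per-partition plans
          | join(R11, R22) | join(R12, R21))    # cross terms
assert whole == pieces                          # union distributes over join
```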
Adaptive Data Partitioning: Routing Data across Different Plans
[Figure: R ⋈ S ⋈ T executed in phases – one plan over partitions R0, S0, T0, a second plan over R1, S1, T1, and a stitch-up plan over the unions that excludes the results already produced (R0⋈S0⋈T0 and R1⋈S1⋈T1)]
Options for combining across phases:
• New results always injected into the old plan
• Old results injected into the new plan
• Wait until the end – a "stitch-up" plan based on the best statistics (see the sketch below)
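A minimal sketch of the "stitch-up" idea on toy partitions: only the cross-partition combinations are computed, excluding results the per-phase plans already produced. The data and the helper are hypothetical:

```python
def join(r, s):
    # Toy equijoin on the first attribute of each tuple.
    return {(a, b) for a in r for b in s if a[0] == b[0]}

R0, R1 = {(1, 'r0')}, {(2, 'r1')}             # phase 0 / phase 1 data for R
S0, S1 = {(1, 's0'), (2, 's2')}, {(2, 's1')}  # ... and for S

phase0 = join(R0, S0)                 # produced by the first plan
phase1 = join(R1, S1)                 # produced by the second plan
stitch = join(R0, S1) | join(R1, S0)  # stitch-up joins ONLY the cross
                                      # partitions, excluding R0⋈S0, R1⋈S1
assert join(R0 | R1, S0 | S1) == phase0 | phase1 | stitch
```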
Special Architectural Features for ADP
A monitoring and re-optimization thread runs alongside execution:
• System-R-like optimizer with aggregation support; uses the most current selectivity estimates
• Periodic monitoring and re-optimization revises selectivity estimates and recomputes expected costs
Query execution with "smart router" operators
Special support for efficient stitch-up plans:
• Uses intermediate results from previous plans (a special case of answering queries using views [H01])
• Join-over-union ("stitch-up-join") operator that excludes certain results
ADP Application 1: Correcting Cost Mis-estimates
Goal: react to plans that are obviously bad
• Don't spend cycles searching for a slightly better plan
• Try to avoid paths that are unlikely to be promising
The monitor/re-optimizer thread watches the cardinalities of sub-results:
• Re-estimates plan cost and compares it to the projected costs of alternatives, using several techniques and heuristics (see paper)
• Our experiments: re-estimate every 1 sec.
The "smart router" operator does the following (schematic sketch below):
• Waits for the monitor/re-optimizer to suggest a replacement plan
• Re-routes source data into the new plan
• The new plan's output is unioned with the output of the previous plan; this is fed into any final aggregation operations
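A schematic sketch of how a smart router plus monitor thread might be wired together. The structure and names are assumptions for illustration, not the actual system's interfaces:

```python
import threading, time, queue

class SmartRouter:
    """Schematic smart router: source tuples flow into whichever plan
    is currently preferred, while a monitor thread periodically
    re-estimates costs and may install a replacement plan. Outputs
    of old and new plans are simply unioned downstream."""

    def __init__(self, initial_plan, reoptimize, interval=1.0):
        self.plan = initial_plan      # callable: tuple -> iterable of results
        self.reoptimize = reoptimize  # callable: stats -> new plan or None
        self.stats = {'tuples_in': 0, 'tuples_out': 0}
        self.output = queue.Queue()   # union of all plans' outputs
        threading.Thread(target=self._monitor, args=(interval,),
                         daemon=True).start()

    def _monitor(self, interval):
        while True:
            time.sleep(interval)      # e.g., re-estimate every 1 sec
            better = self.reoptimize(dict(self.stats))
            if better is not None:
                self.plan = better    # only NEW data takes the new plan

    def route(self, tup):
        self.stats['tuples_in'] += 1
        for result in self.plan(tup):
            self.stats['tuples_out'] += 1
            self.output.put(result)
```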
Correcting for Unexpected Selectivities
[Chart: experimental results on a Pentium IV 3.06 GHz running Windows XP]
ADP Application 2: Optimizing for Order
The most general ADP approach:
• Pre-generate plans for the general case and for each "interesting order"
• A "smart router" sends each tuple to the plan whose ordering constraint the tuple follows
• But with multiple joins, there are MANY plans
Instead: do ADP at the operator level
• "Complementary join pair"
  • Does its own stitch-up internally
  • Easier to optimize for!
• Can also do "partial sorting" at the router (priority queue) – see the sketch below
[Figure: complementary join pair – routers split R and S between a merge join and hash joins h(R), h(S), with queues feeding each operator]
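An illustrative sketch of partial sorting at the router: a bounded priority queue recovers local order, in-order tuples are tagged for the merge-join side, and out-of-order stragglers for the hash-join side. The design here is a made-up stand-in for the operator described in the slide:

```python
import heapq

def order_router(stream, key, lookahead=1024):
    """Buffer up to 'lookahead' tuples in a priority queue to recover
    partial sortedness; tuples that emerge in order go to the merge
    join, out-of-order stragglers to the hash join."""
    heap, last = [], None

    def emit(k, t):
        nonlocal last
        if last is None or k >= last:
            last = k
            return ('merge', t)   # still in order: cheap merge join
        return ('hash', t)        # fell out of order: hash join side

    for i, tup in enumerate(stream):
        heapq.heappush(heap, (key(tup), i, tup))  # i breaks key ties
        if len(heap) > lookahead:
            k, _, t = heapq.heappop(heap)
            yield emit(k, t)
    while heap:                   # drain the buffer at end of stream
        k, _, t = heapq.heappop(heap)
        yield emit(k, t)

# Mostly-sorted input: only the late '1' is diverted to the hash side.
print(list(order_router(iter([5, 6, 7, 1, 8]), key=lambda t: t, lookahead=2)))
# [('merge', 5), ('hash', 1), ('merge', 6), ('merge', 7), ('merge', 8)]
```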
Exploiting Partial Order in the Data
[Chart: experimental results on a Pentium IV 3.06 GHz running Windows XP, 1024-tuple buffer]
ADP Over "Windows": Optimizing for Aggregation
• Group-by optimization [CS94]:
  • May be able to "pre-aggregate" some tuples before joining
  • Why: aggregates can be applied over union
  • But once we insert pre-aggregation, we're stuck (and it's not pipelined)
• Our solution:
  • "Adjustable window pre-aggregation" (see the sketch below)
  • Change the window size depending on how effectively we can aggregate
  • Also allows data to propagate through the plan – better information for adaptivity, early answers
[Figure: pushing SUM(T.y) GROUP BY T.x partially below the join – pre-aggregate T with SUM(T.y) GROUP BY T.x, T.joinAttrib, join with R, then take the SUM of the partial sums GROUP BY T.x]
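A rough sketch of adjustable-window pre-aggregation: partial SUMs are emitted at the end of each window so data keeps flowing, and the window grows or shrinks based on how well aggregation compresses the input. The exact adaptation policy below is a made-up stand-in, not the paper's:

```python
from collections import defaultdict

def windowed_preaggregate(stream, group_key, value, window=1000):
    """Emit partial SUMs every 'window' input tuples so data keeps
    moving through the plan; grow the window when pre-aggregation
    compresses well, shrink it when it doesn't. Downstream, a final
    SUM re-aggregates the partial sums, as in the slide's plan."""
    sums, seen = defaultdict(float), 0
    for tup in stream:
        sums[group_key(tup)] += value(tup)
        seen += 1
        if seen >= window:
            yield dict(sums)                   # partial sums downstream
            compression = len(sums) / seen     # distinct groups per tuple
            if compression < 0.5:
                window *= 2                    # aggregating well: batch more
            else:
                window = max(64, window // 2)  # not compressing: stay pipelined
            sums, seen = defaultdict(float), 0
    if sums:
        yield dict(sums)                       # flush the last window
```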
Adaptive QP in Summary
A variety of different techniques, focusing on:
• Scheduling
• Comparison & competition
• Information passing
• Data + plan partitioning
A field that is still fairly open – missing:
• Effective exploration methods
• A true theory!
  • What's possible? What kinds of queries make sense to adapt?
  • Guarantees of optimality and convergence (perhaps under certain assumptions)