Dynamically Adaptive Distributed System for Processing CompleX Continuous Queries
Bin Liu, Yali Zhu, Mariana Jbantova, Brad Momberger, and Elke A. Rundensteiner
Presented by Yali Zhu, Department of Computer Science, Worcester Polytechnic Institute, U.S.A.
VLDB’05, August 31st, 2005
Uncertainties in Stream Query Processing
[Figure: Streaming Data flows into the Stream Query Engine, where continuous queries are registered, and leaves as the Streaming Result.]
• Registered continuous queries: high query workload; real-time and accurate responses are required.
• Streaming data: may have time-varying rates and high volumes.
• Stream query engine: the resources available for executing each operator may vary over time; memory and CPU resources are limited.
• Distribution and run-time adaptation are required.
DAX (D-CAPE) System Architecture
[Figure: architecture diagram. Component labels: Distribution Manager, Query Processor, CAPE (Continuous Query Processing Engine), Global Adaptation Controller, Global Plan Migrator, Local Adaptation Controller, Local Plan Migrator, Local Statistics Gatherer, Connection Manager, Query Plan Manager, Runtime Monitor, Repository, Data Distributor, Data Receiver, Stream Generator, Streaming Data, Network, Application Server, End User.]
Distributed Adaptation Techniques
• Workload Relocation
  • Operator-level
  • Partition-level
• Query Plan Reshaping
• Data Spilling
Initial Distribution
[Figure: the Distribution Manager places the operators of a query plan (numbered 1 to 8) on Machine 1 and Machine 2; data flows from the Stream Source through the operators to the Application.]
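Workload Relocation – Operator-level
[Figure: the same query plan (operators 1 to 8) on Machine 1 and Machine 2 under the Distribution Manager, between the Stream Source and the Application, illustrating whole operators being relocated between machines.]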
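As a rough illustration of operator-level relocation, the sketch below greedily moves whole operators from the busiest machine to the least loaded one until the load gap is small. It is not the policy used by D-CAPE, which the slides do not specify; the function name `relocate_operators`, the per-operator `load` estimates, and the `threshold` parameter are all assumptions made for this example.

```python
from typing import Dict, List


def relocate_operators(placement: Dict[int, str],
                       load: Dict[int, float],
                       machines: List[str],
                       threshold: float = 0.2) -> Dict[int, str]:
    """Greedy operator-level relocation (illustrative assumption only).

    placement: operator id -> machine name
    load:      operator id -> estimated processing cost (hypothetical)
    threshold: tolerated load gap, as a fraction of the total load
    """
    placement = dict(placement)  # leave the caller's mapping untouched
    while True:
        totals = {m: 0.0 for m in machines}
        for op, machine in placement.items():
            totals[machine] += load[op]
        busiest = max(machines, key=lambda m: totals[m])
        idlest = min(machines, key=lambda m: totals[m])
        gap = totals[busiest] - totals[idlest]
        if gap <= threshold * sum(totals.values()):
            return placement
        # Only moves that strictly shrink the gap are considered.
        candidates = [op for op, machine in placement.items()
                      if machine == busiest and 0 < load[op] < gap]
        if not candidates:
            return placement
        # Pick the operator whose load is closest to half the gap.
        op = min(candidates, key=lambda o: abs(load[o] - gap / 2))
        placement[op] = idlest


# Example: three operators start on m1, one on m2.
before = {1: "m1", 2: "m1", 3: "m1", 4: "m2"}
costs = {1: 4.0, 2: 3.0, 3: 2.0, 4: 1.0}
after = relocate_operators(before, costs, ["m1", "m2"])
print(after)  # {1: 'm2', 2: 'm1', 3: 'm1', 4: 'm2'}
```

The sketch only decides where operators should run. Moving an operator in a real system also means moving its state, which is the cost the partition-level approach tries to reduce.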
Workload Relocation – Partition-level
• Problem with operator-level adaptation:
  • Operators have large states.
  • Moving them across machines can be expensive.
• Solution: partition-level adaptation
  • Partition state-intensive operators [Gra90, SH03, LR05].
  • Distribute the partitioned plan across multiple machines.
• How to partition and relocate multi-way joins at run time?
[Figure: input streams A, B, and C each pass through a split operator (SplitA, SplitB, SplitC) that feeds join partitions on machines m1 and m2.]
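One way to picture partition-level relocation is a join state that is hash-partitioned on the join key, with a routing table from partition to machine that the split operators share. The sketch below assumes that design; the class `PartitionedJoinState`, its methods, and the `(key, payload)` row layout are hypothetical and are not taken from the D-CAPE implementation.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, str]  # (join key, payload); hypothetical row layout


class PartitionedJoinState:
    """State of a multi-way join, hash-partitioned on the join key."""

    def __init__(self, num_partitions: int, owner: Dict[int, str]):
        self.num_partitions = num_partitions
        self.owner = dict(owner)  # partition id -> machine name
        # machine -> partition id -> stream name -> stored rows
        self.state: Dict[str, Dict[int, Dict[str, List[Row]]]] = {}

    def partition_of(self, key: int) -> int:
        # Every input stream uses the same function, so rows that can
        # join with each other always land in the same partition.
        return key % self.num_partitions

    def insert(self, stream: str, row: Row) -> str:
        """Store a row and return the machine that received it."""
        pid = self.partition_of(row[0])
        machine = self.owner[pid]
        (self.state.setdefault(machine, {})
                   .setdefault(pid, {})
                   .setdefault(stream, [])
                   .append(row))
        return machine

    def relocate(self, pid: int, dst: str) -> None:
        """Move one partition (all input streams) to another machine."""
        src = self.owner[pid]
        if src == dst:
            return
        moved = self.state.get(src, {}).pop(pid, {})
        self.state.setdefault(dst, {})[pid] = moved
        self.owner[pid] = dst  # later rows are routed to the new machine


js = PartitionedJoinState(2, {0: "m1", 1: "m1"})
js.insert("A", (1, "a1"))
js.insert("B", (1, "b1"))
js.insert("C", (2, "c2"))
js.relocate(1, "m2")
print(js.insert("C", (3, "c3")))  # m2
```

Moving a partition together with the matching state of all three inputs keeps joinable rows on one machine, and only a fraction of the operator state crosses the network.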
Dynamic Plan Reshaping and Migration
[Figure: the Distribution Manager coordinates a migration protocol (labeled "11-way handshaking" on the slide) between machines M1 and M2; successive panels show operators op1 to op4 being rearranged across the two machines.]
• How does the protocol guarantee correct query results?
• How to integrate with across-machine workload relocation?
State Spill
[Figure: the state of a join over inputs A, B, and C, with part of that state moved to secondary storage.]
• Push part of the operator state onto disk.
• Quick relief for memory overflow.
• How to keep run-time query throughput high?
• How to integrate with across-machine workload relocation?
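The idea of pushing part of an operator's state to disk can be sketched as follows. This is a minimal illustration under stated assumptions, not the spill policy of D-CAPE: the class `SpillableState`, the row-count limit, and the choice to spill the largest partition first are all hypothetical.

```python
import os
import pickle
import tempfile
from typing import Dict, List


class SpillableState:
    """Operator state that spills whole partitions to disk when full."""

    def __init__(self, max_rows_in_memory: int, spill_dir: str):
        self.max_rows = max_rows_in_memory
        self.spill_dir = spill_dir
        self.memory: Dict[int, List[tuple]] = {}  # partition id -> rows
        self.spilled: Dict[int, str] = {}         # partition id -> file path

    def rows_in_memory(self) -> int:
        return sum(len(rows) for rows in self.memory.values())

    def insert(self, pid: int, row: tuple) -> None:
        self.memory.setdefault(pid, []).append(row)
        while self.rows_in_memory() > self.max_rows:
            self.spill_largest()

    def spill_largest(self) -> None:
        # Assumed policy: evict the partition holding the most rows.
        pid = max(self.memory, key=lambda p: len(self.memory[p]))
        rows = self.memory.pop(pid)
        path = os.path.join(self.spill_dir, f"partition_{pid}.pkl")
        with open(path, "ab") as f:  # append: a partition may spill repeatedly
            pickle.dump(rows, f)
        self.spilled[pid] = path

    def read_spilled(self, pid: int) -> List[tuple]:
        """Read back every batch spilled so far for one partition."""
        rows: List[tuple] = []
        path = self.spilled.get(pid)
        if path is None:
            return rows
        with open(path, "rb") as f:
            while True:
                try:
                    rows.extend(pickle.load(f))
                except EOFError:
                    break
        return rows


spill_dir = tempfile.mkdtemp()
st = SpillableState(max_rows_in_memory=3, spill_dir=spill_dir)
for value in (1, 2, 3):
    st.insert(0, (value,))
st.insert(1, (4,))  # fourth row exceeds the limit, so partition 0 spills
```

In this sketch, rows on disk are no longer probed by new arrivals, so results involving spilled partitions would have to be produced later by reading them back with `read_spilled`.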
Summary
• Key Words
  • Distributed system
  • Continuous queries (multi-way joins)
  • Various unique run-time adaptation techniques
• Demo Sessions: Wednesday 2-3:30, Friday 9-10