Query Processing in Connectivity-Challenged Environments
Priyanka Puri, Sharma Chakravarthy, Gururaj Poornima, Mohan Kumar
Information Technology Laboratory
Computer Science and Engineering Department
The University of Texas at Arlington, Arlington, TX 76009
Email: sharma@cse.uta.edu
URL: http://itlab.uta.edu/sharma
• This effort is supported by AFRL under Contract Number FA8750-09-2-0199
• Sanjay Madria and Raytheon (Waseem Naqvi) are also involved in this project
Sharma: AF Mobility Workshop
Query Processing
• Has been addressed in the context of centralized DBMSs
• Has been addressed in the context of distributed DBMSs
• Cost-based plan generation is typically used
• So, is there anything more/new to do?
[Figure: network of UAV 1, UAV 2, UAV 3, UAV 4, UAV 5 and Ground Controllers 1, 2, ..., n]
[Figure: network of UAV 1, UAV 2, UAV 3, UAV 5, UAV 6 and Ground Controllers 1, 2, ..., n]
Currently
• Data is dumped into a central server and queried
• Bandwidth and QoS issues are not addressed
• No collaboration among nodes
• No continuous query processing, notification, fusion, context usage, or real-time or near-real-time support
Proposed long-term Architecture
[Figure: architecture diagram. Labels recovered from the slide, in source order: Limited Resources; Mobility; Heterogeneity; Disconnections; Network of computing nodes (unmanned vehicles, sensors, robots, PCs, servers, ground controlling devices); Queries, Tasks, Requests, Continuous Queries; Publish/Subscribe; SOA Distributed Middleware; Task planning; Join computation; Composition; pub/sub; Context-aware Notification; Resource Management; Data management; Context/Knowledge Base; Fault Tolerance; Services; Query Capability; Publish Subscribe Capability; Local fusion/Materialization; Raw data / fused data / data from other nodes]
Query Processing
MyObjects Table at each node
• Cardinality (number of tuples), selectivity, and replication sites of the data are known (part of the metadata)
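The metadata listed above can be pictured as a small per-relation record. The sketch below is illustrative only: the class name `RelationMeta`, its field names, and the sample values are assumptions, not the project's schema. It estimates the size of a local selection result as cardinality times selectivity, which is the kind of estimate a cost-based planner needs.

```python
from dataclasses import dataclass, field


@dataclass
class RelationMeta:
    """Hypothetical metadata record kept for one relation (not the project's schema)."""
    name: str
    cardinality: int      # number of tuples
    selectivity: float    # fraction of tuples expected to pass the local selection
    replica_sites: list = field(default_factory=list)  # sites holding a copy

    def estimated_selected_size(self) -> int:
        # Standard estimate: tuples surviving the selection.
        return round(self.cardinality * self.selectivity)


# Illustrative values only.
meta = RelationMeta("R1", cardinality=1000, selectivity=0.8, replica_sites=[1])
estimate = meta.estimated_selected_size()
print(estimate)
```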
Query Plan Format
Operations in Plan format
Plan using Semijoin chains
[Figure: R1 [1000], R2 [5000], R3 [3000] at Sites 1, 2, 3; select/project yields R11 [800], R21 [3000], R31 [600]; the result J is delivered to Site 7]
Plan:
  SELECT c1 R1
  MOVE R11 To Site2
  SELECT c2 R2
  SJ R11 R21 : J1
  MOVE J1 To Site3
  SELECT c3 R3
  SJ J1 R31 : J2
  MOVE J2 To Site2
  SJ J2 R21 : J3
  MOVE J3 To Site1
  SJ J3 R11 : J4
  COPY R To Site7 : J
Intermediate result sizes: J1 [1200], J2 [240], J3 [1200], J4 [320]
Attributes shipped (figure labels): [lat], [long], [long,nodeid], [lat,nodeid]
Transfer costs (figure labels): 3200, 4800, 1920, 4800 for the semijoin chain; 32000 for the final copy
Total Cost = 14720 + 32000 = 46720
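The total above can be checked with a few lines of arithmetic. In this sketch, the pairing of each cost with a particular transfer step is an assumption read off the order of the labels in the figure; only the individual cost values and the total come from the slide.

```python
# Transfer steps of the semijoin-chain plan with the cost labels from the slide.
# ASSUMPTION: costs are paired with MOVE steps in the order they appear in the figure.
semijoin_transfers = [
    ("MOVE R11 To Site2", 3200),
    ("MOVE J1 To Site3", 4800),
    ("MOVE J2 To Site2", 1920),
    ("MOVE J3 To Site1", 4800),
]
final_copy = ("COPY R To Site7", 32000)

# Cost of shipping the reduced intermediates along the chain.
semijoin_cost = sum(cost for _, cost in semijoin_transfers)
# Adding the final copy of the result to the requesting site.
total_cost = semijoin_cost + final_copy[1]
print(semijoin_cost, total_cost)
```

The final copy accounts for most of the total, so the semijoin chain itself is the cheaper part of this plan.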
Semi-join/join plan generation
• We are developing algorithms that generate the plan space and prune it to find the “best” (or a “good”) plan for each input query (expressed as a join query)
• The algorithm is cost-based, building on the System R and SDD approaches and extended to include connectivity and bandwidth issues
• The complexity of plan generation is k^n, where n is the number of joins and k is the number of alternatives for each join
• Assuming fewer than 5 joins in a query
• Integrate replication into the algorithm
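The k^n growth of the plan space can be seen in a minimal exhaustive enumeration. This is a sketch under simplifying assumptions, not the algorithm being developed: the alternative names and costs are made up, costs are treated as additive and independent across joins, and no pruning is applied.

```python
from itertools import product


def enumerate_plans(alternatives_per_join):
    """Yield every combination of one alternative per join (k^n combinations)."""
    return product(*alternatives_per_join)


def best_plan(alternatives_per_join, cost_of):
    """Exhaustively pick the cheapest combination (no pruning)."""
    return min(enumerate_plans(alternatives_per_join), key=cost_of)


# Hypothetical example: n = 3 joins, k = 2 alternatives each, made-up costs.
alternatives = [
    [("semijoin", 40), ("join", 70)],
    [("semijoin", 90), ("join", 60)],
    [("semijoin", 30), ("join", 35)],
]
plans = list(enumerate_plans(alternatives))
cheapest = best_plan(alternatives, cost_of=lambda plan: sum(c for _, c in plan))
print(len(plans))
print(cheapest)
```

With fewer than 5 joins and a handful of alternatives per join, the space stays small enough to enumerate, which is presumably why the bound on joins matters.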
Plan Generation Alternatives
• A Query Plan (QP) is a numbered sequence of operations for executing a query
• A QP includes how data is moved as part of execution
• Plan generation alternatives
  • Static plan: generated once and executed in a distributed manner
  • Dynamic plan: generated incrementally at each node as the query progresses, using current connectivity information
  • Parallel plan: partial plans are executed in parallel
  • Interactive plan: obtain estimates by asking the nodes that have relevant data
Static plan
• The generated physical plan contains node information for data propagation
• The physical layer maps this to “actual connectivity” for execution
• The connectivity a plan assumes may no longer exist by the time the plan is executed
• In that case, either a new plan is generated (using the same algorithm with current metadata) or an alternative approach is used to modify the plan incrementally
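The fallback described above can be sketched as a connectivity check before execution. Everything here is hypothetical: the step dictionaries, the `links` set of (source, destination) pairs, and the `regenerate` callback, which stands in for rerunning the planner with current metadata.

```python
def broken_moves(plan, links):
    """Return the MOVE steps whose (source, destination) link is currently down."""
    return [step for step in plan
            if step["op"] == "MOVE" and (step["src"], step["dst"]) not in links]


def execute_or_replan(plan, links, regenerate):
    """Keep the static plan if all its links exist; otherwise ask for a new plan."""
    if broken_moves(plan, links):
        return regenerate(links)
    return plan


static_plan = [
    {"op": "MOVE", "src": 1, "dst": 2},
    {"op": "SJ"},
    {"op": "MOVE", "src": 2, "dst": 3},
]
current_links = {(1, 2)}  # the link from site 2 to site 3 has disappeared
fallback = [{"op": "MOVE", "src": 1, "dst": 2}, {"op": "SJ"}]  # placeholder new plan
chosen = execute_or_replan(static_plan, current_links, regenerate=lambda links: fallback)
print(broken_moves(static_plan, current_links))
```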
Dynamic plan
• Generate the plan for the first join and defer the rest of the plan
• Join plans are generated one at a time
• Current connectivity information can be used
• Result size estimation will also be more accurate
• Query execution and (partial) plan generation are intertwined
• Does not increase the complexity of plan generation or plan execution (compared to static)
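The interleaving of planning and execution can be sketched as a loop that plans one join, executes it, and only then plans the next. The function and parameter names (`run_dynamic`, `connectivity_now`, `plan_one_join`, `execute`) are assumptions for illustration; the toy callbacks only record what they were given.

```python
def run_dynamic(joins, connectivity_now, plan_one_join, execute):
    """Plan and execute one join at a time, re-reading connectivity before each step."""
    intermediate = None
    trace = []
    for join in joins:
        links = connectivity_now()                       # fresh connectivity snapshot
        step = plan_one_join(join, intermediate, links)  # plan only this join
        intermediate = execute(step)                     # actual result feeds the next step
        trace.append(step)
    return intermediate, trace


# Toy usage: connectivity changes between the first and second join.
snapshots = iter([{(1, 2)}, {(2, 3)}])
result, trace = run_dynamic(
    joins=["R1-R2", "R2-R3"],
    connectivity_now=lambda: next(snapshots),
    plan_one_join=lambda join, prev, links: (join, sorted(links)),
    execute=lambda step: f"result({step[0]})",
)
print(result, trace)
```

Because each step sees the previous step's actual result rather than an estimate, size estimation for later joins can be more accurate, as the slide notes.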
Parallel plan
• All local operations/computations (select, project, and even some joins) can be done in parallel
• Join plans are still generated one at a time
• Increases message/information exchange
• Current connectivity information can be used
• Result size estimation will also be more accurate
• Dealing with responses, plan generation, and execution may be slightly more complicated than in the previous cases
Interactive plan
• When a query comes in, send out requests for local processing and get processing time and size information
• Use the above to generate partial plans
• Join plans are still generated using information obtained interactively
• Increases message/information exchange
• Current connectivity information can be used
• Result size estimation will also be more accurate
• Combines dynamic and parallel execution in an interactive manner
Replication Issues
• Algorithm for replication
  • Single-copy replication that “minimizes” the data transmission cost and “maximizes” the number of paths (to deal with connectivity)
• Algorithm for replication utilization
  • Given a replication, determine the utility of that replica in terms of query evaluation cost for a reasonable load
• Reconcile the above two to come up with a replication strategy that balances the competing tradeoffs
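One way to picture the tradeoff between transmission cost and number of paths is a weighted score per candidate site. This is only a sketch: the scoring function, the `path_weight` parameter, and the candidate values are assumptions, not the replication algorithm proposed here.

```python
def choose_replica_site(candidates, path_weight=1.0):
    """Pick the site with the lowest (cost - path_weight * paths) score.

    ASSUMPTION: a linear score is used purely for illustration.
    """
    return min(
        candidates,
        key=lambda site: candidates[site]["cost"] - path_weight * candidates[site]["paths"],
    )


# Hypothetical candidates: transmission cost to place the copy, and number of
# distinct paths by which other nodes could reach it.
candidates = {
    "site2": {"cost": 4800, "paths": 1},
    "site3": {"cost": 5000, "paths": 4},
}
cost_only = choose_replica_site(candidates, path_weight=0)
balanced = choose_replica_site(candidates, path_weight=100)
print(cost_only, balanced)
```

With no weight on paths the cheaper site wins; once reachability is valued, the better-connected site is preferred despite its higher cost.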
Thank You!