A Fault-Tolerant Environment for Large-Scale Query Processing

  1. A Fault-Tolerant Environment for Large-Scale Query Processing. Mehmet Can Kurt, Gagan Agrawal. Department of Computer Science and Engineering, The Ohio State University. HiPC'12, Pune, India.

  2. Motivation • the "big data" problem • Walmart handles 1 million customer transactions every hour; the estimated data volume is 2.5 petabytes • Facebook handles more than 40 billion images • LSST generates 6 petabytes every year • massive parallelism is the key

  3. Motivation • Mean Time To Failure (MTTF) decreases as clusters grow • typical first year for a new cluster*: • 1000 individual machine failures • 1 PDU failure (~500-1000 machines suddenly disappear) • 20 rack failures (40-80 machines disappear, 1-6 hours to get back) * taken from Jeff Dean's talk at Google I/O (http://perspectives.mvdirona.com/2008/06/11/JeffDeanOnGoogleInfrastructure.aspx)

  4. Our Work • supporting fault-tolerant query processing and data analysis for massive scientific datasets • focusing on two specific query types: • Range Queries on Spatial datasets • Aggregation Queries on Point datasets • supported failure types: single-machine failures and rack failures* * rack: a number of machines connected to the same hardware (network switch, …)

  5. Our Work • Primary Goals: • high efficiency of execution when there are no failures (indexing where applicable, ensuring load balance) • handling failures efficiently, up to a certain number of failed nodes (low-overhead fault tolerance through data replication) • a modest slowdown in processing times after recovering from a failure (preserving load balance)

  6. Range Queries on Spatial Data [Figure: master/worker architecture; the master routes queries to workers, each holding a portion of the data] • nature of the task: • each data object is a rectangle in 2D space • each query is defined by a rectangle • return the data rectangles that intersect the query • computational model: • master/worker model • the master serves as coordinator • each worker is responsible for a portion of the data
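The per-worker filtering step on this slide amounts to an axis-aligned rectangle intersection test. Below is a minimal C sketch of that test; the rect_t layout and function names are illustrative choices, not taken from the paper.

```c
#include <stdio.h>

/* Illustrative rectangle type: lower-left and upper-right corners. */
typedef struct { double xlo, ylo, xhi, yhi; } rect_t;

/* Two axis-aligned rectangles intersect iff they overlap on both axes. */
int rects_intersect(const rect_t *a, const rect_t *b)
{
    return a->xlo <= b->xhi && b->xlo <= a->xhi &&
           a->ylo <= b->yhi && b->ylo <= a->yhi;
}

/* Scan the objects held by this worker and report every match. */
void answer_range_query(const rect_t *query,
                        const rect_t *objects, int n_objects)
{
    for (int i = 0; i < n_objects; i++)
        if (rects_intersect(query, &objects[i]))
            printf("object %d intersects the query\n", i);
}
```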

  7. Range Queries on Spatial Data [Figure: 2D space divided into chunks 1-4, distributed across two workers] • data organization: • a chunk is the smallest data unit • create chunks by grouping data objects together • assign chunks to workers in round-robin fashion * the actual number of chunks depends on the chunk size parameter
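A hedged sketch of the data organization just described: consecutive objects (in sorted order) are packed into fixed-size chunks, and chunk i goes to worker i mod W. The helper names and the exact chunk-size handling are assumptions for illustration.

```c
/* Which chunk a data object falls into, given its position in the sorted
 * order and the chunk size parameter. */
int chunk_of_object(int object_rank, int chunk_size)
{
    return object_rank / chunk_size;
}

/* Round-robin assignment: chunk i belongs to worker (i mod num_workers). */
int owner_of_chunk(int chunk_id, int num_workers)
{
    return chunk_id % num_workers;
}
```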

  8. Range Queries on Spatial Data [Figure: objects o1-o8 ordered along a Hilbert curve; the sorted order o1, o3, o8, o6, o2, o7, o4, o5 is packed into chunks 1-4] • ensuring load balance: • enumerate and sort data objects according to a Hilbert space-filling curve, then pack the sorted objects into chunks • spatial index support: • a Hilbert R-Tree deployed on the master node • leaf nodes correspond to data chunks • initial filtering at the master tells each worker which chunks to examine
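To make the load-balancing step concrete, here is the standard conversion from a 2D grid cell to its position along a Hilbert curve (the classic xy2d routine). The paper presumably computes such an index from each object's location before sorting, but the exact variant it uses is not stated, so treat this as illustrative.

```c
/* Distance of grid cell (x, y) along a Hilbert curve covering an n x n grid,
 * where n is a power of two. Sorting objects by this value follows the curve,
 * so consecutive objects are spatially close, which keeps chunks compact. */
unsigned long hilbert_index(unsigned long n, unsigned long x, unsigned long y)
{
    unsigned long rx, ry, d = 0;
    for (unsigned long s = n / 2; s > 0; s /= 2) {
        rx = (x & s) > 0;
        ry = (y & s) > 0;
        d += s * s * ((3 * rx) ^ ry);
        /* rotate the quadrant so the curve remains continuous */
        if (ry == 0) {
            if (rx == 1) {
                x = s - 1 - x;
                y = s - 1 - y;
            }
            unsigned long t = x; x = y; y = t;
        }
    }
    return d;
}
```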

  9. Range Queries on Spatial Data • Fault-Tolerance Support - Sub-chunk Replication: • step 1: divide each data chunk into k sub-chunks • step 2: distribute the sub-chunks in round-robin fashion [Figure: with k = 2, each of chunks 1-4 on workers 1-4 is split into sub-chunks chunk i,1 and chunk i,2, which are replicated across the other workers] * for rack failures: same approach, but distribute the sub-chunks to nodes in a different rack
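A minimal sketch of one possible placement rule behind sub-chunk replication: sub-chunk j of a chunk lands on the (j+1)-th worker after the chunk's primary owner, so no single machine holds both the chunk and all of its sub-chunk replicas. This is an illustration, not the paper's exact rule; for rack failures the same idea applies with the candidate set restricted to nodes in a different rack.

```c
/* Primary owner of a chunk under round-robin assignment. */
int primary_owner(int chunk_id, int num_workers)
{
    return chunk_id % num_workers;
}

/* Placement of sub-chunk j (0 <= j < k, assuming k < num_workers): the
 * (j+1)-th worker after the primary, skipping the primary itself, so losing
 * the primary never destroys all copies of the chunk's data. */
int sub_chunk_owner(int chunk_id, int j, int num_workers)
{
    return (primary_owner(chunk_id, num_workers) + 1 + j) % num_workers;
}
```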

  10. Range Queries on Spatial Data • Fault-Tolerance Support - Bookkeeping: • add a sub-leaf level to the bottom of the Hilbert R-Tree • the Hilbert R-Tree then serves both as a filtering structure and as a failure-management tool
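One way to picture the added sub-leaf level: each leaf of the Hilbert R-Tree (one data chunk) keeps an entry per sub-chunk recording which worker currently holds it, so the master can redirect work after a failure. The field names and layout below are hypothetical; the paper does not show the concrete structure.

```c
/* Hypothetical bounding box used for filtering at the master. */
typedef struct { double xlo, ylo, xhi, yhi; } mbr_t;

/* One entry in the added sub-leaf level: a sub-chunk and its current holder. */
typedef struct {
    int   sub_chunk_id;    /* index within the parent chunk, 0..k-1 */
    int   holder;          /* worker currently holding this sub-chunk */
    mbr_t mbr;             /* bounding rectangle of the sub-chunk's objects */
} sub_leaf_t;

/* A leaf of the Hilbert R-Tree, i.e. one data chunk, with its sub-leaves. */
typedef struct {
    int         chunk_id;
    int         primary_worker;
    int         k;            /* number of sub-chunks */
    sub_leaf_t *sub_leaves;   /* the sub-leaf level used for failure handling */
} leaf_t;
```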

  11. Aggregation Queries on Point Data [Figure: with M = 4, the 2D space is split into four partitions, one per worker; each worker computes a partial result] • nature of the task: • each data object is a point in 2D space • each query is defined by a dimension (X or Y) and an aggregation function (SUM, AVG, …) • computational model: • master/worker model • divide the space into M partitions • no indexing support • standard two-phase algorithm: local and global aggregation
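Since the system is implemented in C with MPI, the standard two-phase algorithm can be pictured as a local pass followed by a collective reduction. The bucket layout, names, and the choice of MPI_Reduce below are assumptions for illustration; the paper does not show its actual aggregation code.

```c
#include <mpi.h>
#include <string.h>

#define NUM_BUCKETS 1024  /* hypothetical number of groups along the query axis */

/* values[i] is the value of the i-th local point; bucket_of[i] is the group
 * it falls into along the query dimension (X or Y). */
void aggregate_sum(const double *values, const int *bucket_of, int n_local,
                   double *global /* significant on rank 0 only */)
{
    double local[NUM_BUCKETS];
    memset(local, 0, sizeof(local));

    /* phase 1: local aggregation over this worker's partition */
    for (int i = 0; i < n_local; i++)
        local[bucket_of[i]] += values[i];

    /* phase 2: global aggregation; rank 0 (the master) gets the final sums */
    MPI_Reduce(local, global, NUM_BUCKETS, MPI_DOUBLE, MPI_SUM,
               0, MPI_COMM_WORLD);
}
```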

  12. Aggregation Queries on Point Data • reducing communication volume: • the initial partitioning scheme has a direct impact • use insights about the data and query workload: • P(X) and P(Y) = probability of aggregation along the X and Y axes • |rx| and |ry| = range of the X and Y coordinates • the expected communication volume Vcomm is expressed in terms of these quantities and the partitioning parameters • Goal: choose a partitioning scheme (cv and ch) that minimizes Vcomm
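A sketch of how the stated goal could be realized: enumerate every grid shape with cv * ch = M and keep the one with the smallest expected communication volume. The slide's Vcomm formula is not reproduced in this transcript, so it is left as a function assumed to be implemented from the paper's definition; everything else is illustrative.

```c
#include <float.h>

/* Assumed to implement the paper's expected-communication-volume formula in
 * terms of P(X), P(Y), |rx|, |ry| and the grid shape (cv, ch). */
double vcomm(double p_x, double p_y, double r_x, double r_y, int cv, int ch);

/* Pick the grid (cv x ch = M) that minimizes the expected volume. */
void choose_partitioning(int M, double p_x, double p_y, double r_x, double r_y,
                         int *best_cv, int *best_ch)
{
    double best = DBL_MAX;
    for (int cv = 1; cv <= M; cv++) {
        if (M % cv != 0)
            continue;                   /* only exact cv x ch grids */
        int ch = M / cv;
        double cost = vcomm(p_x, p_y, r_x, r_y, cv, ch);
        if (cost < best) {
            best = cost;
            *best_cv = cv;
            *best_ch = ch;
        }
    }
}
```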

  13. Aggregation Queries on Point Data • Fault-Tolerance Support - Sub-partition Replication: • step 1: divide each partition evenly into M' sub-partitions • step 2: send each of the M' sub-partitions to a different worker node • Important questions: • how many sub-partitions (M')? • how to divide a partition (cv' and ch')? • where to send each sub-partition? (random vs. rule-based) • a better distribution reduces communication overhead • rule-based selection: assign sub-partitions to nodes that share the same coordinate range [Figure: example with M' = 4, ch' = 2, cv' = 2]
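To illustrate the rule-based option, the sketch below assumes workers are laid out in a ch x cv grid over the 2D space (an assumption; the paper's exact layout is not shown) and sends a sub-partition to another worker in the same column, i.e. one sharing the same X coordinate range; an analogous rule along rows covers the Y range.

```c
/* Worker in row 'row', column 'col' of a grid with cv columns. */
int grid_worker(int row, int col, int cv)
{
    return row * cv + col;
}

/* Rule-based target for a sub-partition owned by worker (row, col): another
 * worker in the same column, so the replica holder already covers the same
 * X range and re-aggregation after a failure needs little extra shuffling. */
int rule_based_target(int row, int col, int ch, int cv)
{
    int next_row = (row + 1) % ch;   /* any other row would do; pick the next */
    return grid_worker(next_row, col, cv);
}
```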

  14. Experiments • setup: • local cluster; each node has two quad-core 2.53 GHz Xeon(R) processors and 12 GB RAM • entire system implemented in C using the MPI library • range queries: • comparison with a chunk replication scheme • 32 GB spatial dataset • 1000 queries are run and the aggregate time is reported • aggregation queries: • comparison with a partition replication scheme • 24 GB point dataset • 64 nodes used, unless noted otherwise

  15. Experiments: Range Queries - Execution Times with No Replication and No Failures [Figures: optimal chunk size selection; scalability (chunk size = 10000)]

  16. Experiments: Range Queries • Execution Times under Failure Scenarios (64 workers in total) • k is the number of sub-chunks per chunk [Figures: single-machine failure; rack failure]

  17. Experiments: Aggregation Queries [Figures: effect of partitioning scheme on normal execution; single-machine failure; settings P(X) = P(Y) = 0.5 with |rx| = |ry| = 10000 and |rx| = |ry| = 100000]

  18. Conclusion • a fault-tolerant environment that can process: • range queries on spatial data and aggregation queries on point data • the proposed approaches can be extended to other types of queries and analysis tasks • high efficiency under normal execution • sub-chunk and sub-partition replication: • preserves load balance in the presence of failures, and hence • outperforms traditional replication schemes

  19. Thank you for listening … Questions?
