Optimizing Reduction Computations In a Distributed Environment Tahsin Kurc, Feng Lee, Gagan Agrawal, Umit Catalyurek, Renato Ferreira, Joel Saltz Biomedical Informatics Department and Computer and Information Science Department Ohio State University
Roadmap • Data Intensive Computation • Generalized Reduction Operations • Range Aggregation Queries • Runtime Environment: DataCutter • Execution Strategies • Replicated Filter State • Partitioned Filter State • Hybrid • Experimental Results • Conclusions and Future Work
Data Intensive, Distributed Computing • Large data collections • Multi-resolution, multi-dimensional • data elements correspond to points in multi-dimensional attribute space • medical images, satellite data, hydrodynamics data, oil reservoir simulation, seismic data, etc. • Data exploration and analysis • subsets of one or more datasets • Data subset is defined by a multi-dimensional window • A spatial index can be used to speed up data lookup (e.g., R-tree, quad-tree, etc.) • A data product is generated by processing the data subset; generally results in data reduction • Such queries are referred to as range aggregation queries • Data generated and processed in a distributed environment
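As a concrete illustration of a range aggregation query, the minimal Python sketch below selects the data elements whose coordinates fall inside a multi-dimensional window and reduces them to a single value. The helper names (in_window, range_aggregate) and the averaging aggregate are illustrative assumptions, not details from the slides.

    # Illustrative sketch of a range aggregation query (not the authors' code).
    # Each data element is a point in a multi-dimensional attribute space plus a value.
    def in_window(point, window):
        # window is a list of (low, high) bounds, one per dimension
        return all(lo <= x <= hi for x, (lo, hi) in zip(point, window))

    def range_aggregate(elements, window):
        # Select the subset inside the window, then reduce it to a data product.
        selected = [value for point, value in elements if in_window(point, window)]
        return sum(selected) / len(selected) if selected else None

    elements = [((0.2, 0.7), 3.0), ((0.5, 0.5), 5.0), ((0.9, 0.1), 7.0)]
    print(range_aggregate(elements, window=[(0.0, 0.6), (0.4, 1.0)]))  # averages the two points inside -> 4.0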
Generalized Reduction Operations
// Selection = range query
DU = Output; DI = Select(Input, R);
// Initialization
for (ue in DU) { get ue; ae = Initialize(ue); A = A ∪ {ae}; }
// Reduction
for (ie in DI) { read ie; it = Transform(ie); SA = Map(it, A); for (ae in SA) { ae = Aggregate(it, ae); } }
// Output
for (ae in A) { ue = Finalize(ae); output ue; }
• A is the accumulator
• Aggregation involves per-element operations defined by some spatial relationship; it can be expensive
• Aggregation is order independent
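Rendered as runnable Python, the loop above could look like the sketch below. The placeholder callables (select, initialize, transform, map_to_accs, aggregate, finalize) and the toy binning example are assumptions for illustration only.

    # Executable rendering of the generalized-reduction loop (a sketch).
    def generalized_reduction(input_elems, output_keys, select, initialize,
                              transform, map_to_accs, aggregate, finalize):
        # Initialization: one accumulator element per output element
        A = {ue: initialize(ue) for ue in output_keys}
        # Reduction: per-element, order-independent aggregation
        for ie in select(input_elems):
            it = transform(ie)
            for key in map_to_accs(it, A):      # accumulator elements this input contributes to
                A[key] = aggregate(it, A[key])
        # Output: finalize each accumulator element
        return {ue: finalize(ae) for ue, ae in A.items()}

    # Toy use: sum input values into the output bin given by their integer part.
    result = generalized_reduction(
        input_elems=[0.2, 0.7, 1.5, 2.1],
        output_keys=[0, 1, 2],
        select=lambda xs: (x for x in xs if x < 2.0),   # stands in for the range query
        initialize=lambda ue: 0.0,
        transform=lambda ie: ie,
        map_to_accs=lambda it, A: [int(it)],
        aggregate=lambda it, ae: ae + it,
        finalize=lambda ae: ae,
    )
    print(result)   # approximately {0: 0.9, 1: 1.5, 2: 0.0}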
Many Applications • Water Contamination Studies • Satellite Data Processing • Pathology • Reservoir Simulation and Seismic Data Analysis • Visualization
Generalized Reductions • Dataset is partitioned into data chunks • Chunks are distributed across disks • [Figure: source data elements → reduction function → intermediate data elements (accumulator elements) → result data elements]
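A minimal sketch of chunking and declustering, assuming a fixed chunk size and round-robin placement across disks (both are illustrative choices, not details from the slides):

    def make_chunks(elements, chunk_size):
        # split the dataset into fixed-size data chunks
        return [elements[i:i + chunk_size] for i in range(0, len(elements), chunk_size)]

    def distribute(chunks, num_disks):
        # round-robin declustering of chunks across disks
        disks = [[] for _ in range(num_disks)]
        for i, chunk in enumerate(chunks):
            disks[i % num_disks].append(chunk)
        return disks

    chunks = make_chunks(list(range(12)), chunk_size=4)
    print(distribute(chunks, num_disks=2))
    # [[[0, 1, 2, 3], [8, 9, 10, 11]], [[4, 5, 6, 7]]]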
Runtime Environment: DataCutter • Component Framework for Combined Task/Data Parallelism • User defines sequence of pipelined components (filters and filter groups) • User directive tells preprocessor/runtime system to generate and instantiate copies of filters • Stream based communication • Multiple filter groups can be active simultaneously • Flow control between transparent filter copies • Replicated individual filters • Transparent: single stream illusion
DataCutter-based Implementation • Implement each operation as a filter • Components can be merged • Order of some components can be changed • [Figure: two filter pipelines, each Read → Transform → Map → Aggregate → Merge → Output]
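The filter-pipeline idea can be illustrated with plain Python generators. This is not the DataCutter API, just a sketch of filters connected by streams, with placeholder transform and aggregate operations.

    def read_filter(chunks):
        # read: stream the elements of each chunk downstream
        for chunk in chunks:
            for element in chunk:
                yield element

    def transform_filter(stream):
        # transform: placeholder per-element operation
        for element in stream:
            yield element * 2

    def aggregate_filter(stream):
        # aggregate: placeholder order-independent reduction into one accumulator value
        total = 0
        for element in stream:
            total += element
        yield total

    pipeline = aggregate_filter(transform_filter(read_filter([[1, 2], [3, 4]])))
    print(list(pipeline))   # [20]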
Roadmap • Data Intensive Computation • Generalized Reduction Operations • Range Aggregation Queries • Runtime Environment: DataCutter • Execution Strategies • Replicated Filter State • Partitioned Filter State • Hybrid • Experimental Results • Conclusions and Future Work
Execution Strategies: Partitioned Filter State • Partition the accumulator across computing nodes • Retrieve and send the input data elements (data chunks) to the corresponding partitions • Parallel computation of accumulator pieces • Good use of aggregate memory space • Drawbacks: load imbalance and communication overhead
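A minimal sketch of the partitioned strategy, assuming a hypothetical hash-style ownership function (owner) that maps each accumulator key to the node holding that slice:

    def owner(acc_key, num_nodes):
        # hypothetical ownership function: which node holds this accumulator entry
        return acc_key % num_nodes

    def partitioned_reduction(inputs, num_nodes):
        local_acc = [dict() for _ in range(num_nodes)]
        for acc_key, value in inputs:
            node = owner(acc_key, num_nodes)          # "send" the element to its owner
            local_acc[node][acc_key] = local_acc[node].get(acc_key, 0) + value
        # the slices are disjoint, so no merge phase is needed
        result = {}
        for piece in local_acc:
            result.update(piece)
        return result

    print(partitioned_reduction([(0, 1), (1, 2), (4, 3), (1, 5)], num_nodes=2))   # {0: 1, 4: 3, 1: 7}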
Partitioned Filter State • Partition in one dimension • Jagged 2D partition • Recursive Bisection • Graph/Hypergraph partitioning
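As one illustration of the first option, a one-dimensional partition can be built by cutting the accumulator into row strips with roughly equal load; the per-row load values in the sketch below are made up.

    def partition_1d(row_loads, num_parts):
        # cut the rows of the accumulator into num_parts strips with roughly equal load
        target = sum(row_loads) / num_parts
        parts, current, acc = [], [], 0.0
        for row, load in enumerate(row_loads):
            current.append(row)
            acc += load
            if acc >= target and len(parts) < num_parts - 1:
                parts.append(current)
                current, acc = [], 0.0
        parts.append(current)
        return parts

    print(partition_1d([5, 1, 1, 5, 2, 2, 4], num_parts=3))   # [[0, 1, 2], [3, 4], [5, 6]]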
Execution Strategies: Replicated Filter State • Replicate the accumulator on each computing node • Each node retrieves data and performs local aggregation • Data is assigned to nodes in a demand-driven fashion • Merge the partial results • Drawbacks: merge overheads; poor use of distributed memory, since the accumulator can be very large
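A minimal sketch of the replicated strategy; the even, strided split of the input below stands in for demand-driven assignment, which would be dynamic in practice.

    def replicated_reduction(inputs, num_nodes):
        # each node holds a full accumulator copy and aggregates its share of the input
        partials = []
        for node in range(num_nodes):
            local = {}
            for acc_key, value in inputs[node::num_nodes]:   # stand-in for demand-driven assignment
                local[acc_key] = local.get(acc_key, 0) + value
            partials.append(local)
        # merge phase: combine the partial copies into the final accumulator
        merged = {}
        for partial in partials:
            for acc_key, value in partial.items():
                merged[acc_key] = merged.get(acc_key, 0) + value
        return merged

    print(replicated_reduction([(0, 1), (1, 2), (0, 3), (1, 5)], num_nodes=2))   # {0: 4, 1: 7}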
Replicated Filter State: Merge Phase • Partitioned Merge • Hierarchical Merge
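A hierarchical merge can be sketched as pairwise combination of partial accumulator copies over roughly log2(P) rounds; merge_pair is an illustrative helper, not part of the slides.

    def merge_pair(a, b):
        # combine two partial accumulator copies entry by entry
        out = dict(a)
        for key, value in b.items():
            out[key] = out.get(key, 0) + value
        return out

    def hierarchical_merge(partials):
        # merge pairwise until a single accumulator remains
        while len(partials) > 1:
            merged = [merge_pair(partials[i], partials[i + 1])
                      for i in range(0, len(partials) - 1, 2)]
            if len(partials) % 2:
                merged.append(partials[-1])   # odd copy advances to the next round
            partials = merged
        return partials[0]

    print(hierarchical_merge([{0: 1}, {0: 2, 1: 1}, {1: 4}, {2: 3}]))   # {0: 3, 1: 5, 2: 3}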
Execution Strategies: Hybrid • A combination of the two extreme cases • Partition the accumulator • Replicate some of the sub-accumulator regions • More adaptable to load and environment
Several Ways to Hybrid • Hybrid is the most flexible of the strategies; there are many ways to implement it • How to combine partitioning and replication • How to place partitioned and replicated pieces • Partition into N, replicate each piece by M • N × M = number of processors • Choice of N • N is small (approaching RFS) – e.g., if input is much larger than output • N is large (approaching PFS) – e.g., if input is comparable to output • Placement of replicated pieces • Assign pieces to nodes to minimize input communication (more suitable for machines with low communication-computation ratio) • Assign pieces to nodes to achieve load balance (more suitable for configurations with low computing power)
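A minimal sketch of the N-by-M layout, assuming the processor count is divisible by the number of partitions and using a simple block assignment of (partition, replica) pairs to processors; both assumptions are illustrative only.

    def hybrid_layout(num_procs, n_partitions):
        # split the accumulator into n_partitions pieces and replicate each piece
        # on num_procs / n_partitions processors
        assert num_procs % n_partitions == 0
        m_replicas = num_procs // n_partitions
        layout = {}
        for proc in range(num_procs):
            partition = proc // m_replicas   # which accumulator piece this processor holds
            replica = proc % m_replicas      # which copy of that piece
            layout[proc] = (partition, replica)
        return layout

    print(hybrid_layout(num_procs=8, n_partitions=2))
    # piece 0 is replicated on processors 0-3, piece 1 on processors 4-7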
Several Ways to Hybrid • Partition nodes into groups • Each group has sufficient aggregate memory space for the accumulator • Replicate accumulator to each group • Partition within a group • Partition into N • Adaptively replicate the most loaded pieces • Assign them to least loaded processors
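The adaptive variant can be sketched as repeatedly replicating the most loaded piece onto the least loaded processor; the load numbers and the assumption that a new replica takes over half of a piece's remaining load are illustrative only.

    def adaptive_replication(piece_loads, proc_loads, num_extra_replicas):
        # repeatedly give the most loaded piece an extra replica on the least loaded processor
        assignments = []
        for _ in range(num_extra_replicas):
            piece = max(piece_loads, key=piece_loads.get)   # most loaded accumulator piece
            proc = min(proc_loads, key=proc_loads.get)      # least loaded processor
            assignments.append((piece, proc))
            piece_loads[piece] /= 2            # assume the new replica takes half the load
            proc_loads[proc] += piece_loads[piece]
        return assignments

    print(adaptive_replication({"A": 10.0, "B": 4.0}, {0: 1.0, 1: 2.0, 2: 0.5}, 2))
    # [('A', 2), ('A', 0)]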
Experimental Results • Hybrid • Partition into N, replicate by M • N is small if input is large • N is larger if input is smaller than or comparable to output • N is typically 2 in our case • Place replicated pieces • Sort the storage nodes based on how much data they retrieve for each of the N pieces • Place the pieces in the sorted order (sketched below) • Goals • The performance of the three strategies as the volume of input data and the size of the accumulator are scaled proportionately • The relative performance of the techniques in a distributed environment, where data is hosted on one set of machines and another cluster is used for processing the data • The scalability of the techniques in a cluster environment
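The placement step described above might be sketched as follows: for each of the N pieces, rank the nodes by how much input they retrieve for that piece and place that piece's replicas on the top-ranked nodes. The retrieval volumes and node names below are made up for illustration.

    def place_pieces(retrieved, replicas_per_piece):
        # retrieved[piece][node] = volume of input data that node retrieves for that piece
        placement = {}
        for piece, per_node in retrieved.items():
            ranked = sorted(per_node, key=per_node.get, reverse=True)
            placement[piece] = ranked[:replicas_per_piece]   # keep the top-ranked nodes
        return placement

    retrieved = {
        "piece0": {"node0": 120, "node1": 30, "node2": 80},
        "piece1": {"node0": 10, "node1": 90, "node2": 60},
    }
    print(place_pieces(retrieved, replicas_per_piece=2))
    # {'piece0': ['node0', 'node2'], 'piece1': ['node1', 'node2']}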
Experimental Results • Application Emulators • Satellite data processing • Skewed mapping of input to output • More compute intensive • Virtual Microscope • Data intensive • Regular mapping • Water Contamination Studies • More balanced • Regular mapping • Hardware Configuration • Pentium III cluster • 16 nodes • 300 GB disk space per node • 512 MB memory per node
Scalability Results: Titan [Charts: Small Query/Large Accumulator, Large Query/Small Accumulator]
Scalability Results: VM [Charts: Small Query/Large Accumulator, Large Query/Small Accumulator]
Distributed Execution: Titan [Charts: Small Query/Large Accumulator, Large Query/Small Accumulator]
Distributed Execution: VM [Charts: Small Query/Large Accumulator, Large Query/Small Accumulator]
Conclusions • Performance of the strategies depends on application and platform characteristics • The runtime environment should support multiple strategies • The replicated strategy works well when sufficient memory is available and the aggregation operation is expensive enough to offset the cost of merging • Hybrid is attractive: it achieves the best performance, or close to it, and it is more flexible • It may not be possible to estimate the relative performance of the different strategies in advance • Future work: automated selection of strategies and of different hybrid approaches • Dynamic adaptation to the characteristics of the environment • Dynamic replication or partitioning of the accumulator as the data is processed