1 / 20

CUDA Performance Study on Hadoop MapReduce Clusters

CUDA Performance Study on Hadoop MapReduce Clusters. CSE 930 Advanced Computer Architecture @ Fall 2010. Chen He Peng Du che@cse.unl.edu pdu@cse.unl.edu University of Nebraska-Lincoln. Overview. Introduction Methodology Evaluation Conclusions Future work. Introduction.

fadhila
Download Presentation

CUDA Performance Study on Hadoop MapReduce Clusters

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CUDA Performance Study on Hadoop MapReduce Clusters CSE 930 Advanced Computer Architecture @ Fall 2010 Chen He Peng Du che@cse.unl.edupdu@cse.unl.edu University of Nebraska-Lincoln

  2. Overview • Introduction • Methodology • Evaluation • Conclusions • Future work

  3. Introduction • Hadoop MapReduce

  4. Introduction CPU+GPU Architecture

  5. Introduction • Questions • Can we introduce CUDA into Hadoop MapReduce Clusters? • Mechanism and implementation • Is this reasonable? • Effects and Costs

  6. Methodology • Question-1:Can we introduce CUDA into Hadoop ?

  7. Methodology • Test cases • SDK programs • Data intensive: Matrix Multiplication • Computation intensive: Monte Carlo • MDMR • Pure Java program • Introduce Jcuda

  8. Methodology • Port CUDA programs onto Hadoop • GPU (CUDA-C) vs CPU (C) • Approach • MapRed (processHadoopData & cudaCompute) • Main (Hadoop Pipes) • Scripts (runbase.sh, run-<prog>-CPU/GPU.sh) • Input data generators

  9. Methodology Hadoop DFS .c .c data Input Generator generates void generate(..) Hadoop-enabled MonteCarlo CUDA MonteCarlo MonteCarlo.cpp void processHadoopData(..) void cudaCompute(..) extracted .c .c .c extracted .cu MapRed.cpp .cu .cu class Mapper class Reducer

  10. Methodology • MDMR • Time Complexity by using CPU • We can simply employ GPU to parallel the n-squre portion to reduce the time complexity to linear

  11. Evaluation Environment Head: 2xAMD 2.2GHz, 4GB DDR400 RAM, 800GB HD 3 PCs (AMD 2.3G CPU, 2G DDR2-667 RAM, 400GB HD, 1Gbps Ethernet) XFX 9400GT 64bit 512MB DDR3 ($20) CUDA 3.2 Toolkit Hadoop 0.20.3 ServerTech CWG-CDU power distribution unit (for the power consumption monitoring) Factors Speedup Power consumption Cost

  12. Evaluation Matrix Multiplication (Execution time)

  13. Evaluation Matrix Multiplication (Power consumption)

  14. Evaluation

  15. Evaluation

  16. Evaluation • MDMR • Execution time

  17. Evaluation • MDMR • Power consumption

  18. Conclusions • Introduced GPU into MapReduce cluster and obtained up to 20 times speedup. • Reduced up to 19/20 power consumption with the current preliminary solution and work load. • With as little as $20 per node, compared with upgrading CPUs and adding more nodes, deploying GPU on Hadoop has high cost-to-benefit ratio. • Provided practical implementations for people wanting to construct MapReduce clusters with GPUs.

  19. Future Work • Port more CUDA programs onto Hadoop. • Incorporate reducers into the experiments • Support heterogeneous clusters which mixed GPU-nodes and non-GPU nodes.

  20. Thank You!Questions?

More Related