
Performance and Energy Efficiency Evaluation of Big Data Systems

Presented by Yingjie Shi, Institute of Computing Technology, CAS, 2013-10-31.


Presentation Transcript


  1. Performance and Energy Efficiency Evaluation of Big Data Systems • Presented by Yingjie Shi, Institute of Computing Technology, CAS • 2013-10-31

  2. Goals of Big Data Systems

  3. Tradeoff: Performance vs. Energy Efficiency • Faster & More Powerful (performance): more servers, bigger clusters, powerful processors, sophisticated processing algorithms, … • Greener & Cheaper (energy efficiency): lightweight servers, efficient processors, simpler processing algorithms, … • Evaluation

  4. Evaluation of Performance & Energy Efficiency Tradeoff • How to measure? • AxPUE: Application Level Metrics for Power Usage Effectiveness in Big Data Systems • How to get balance? • The Implications from Benchmarking Three Big Data Systems

  5. Motivation • "If you cannot measure it, you cannot improve it." – Lord Kelvin • PUE (Power Usage Effectiveness): a measure of how efficiently a computer data center uses its power; specifically, how much of the power is actually used by the information technology equipment.
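The PUE definition above can be made concrete with a small worked example. This is a minimal sketch that assumes the commonly used ratio of total facility energy to IT equipment energy; the function name and the sample readings are hypothetical and do not come from the presentation.

```python
def pue(total_facility_energy_kwh: float, it_equipment_energy_kwh: float) -> float:
    """Return PUE, assumed here to be total facility energy / IT equipment energy."""
    if it_equipment_energy_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_energy_kwh / it_equipment_energy_kwh


# Hypothetical readings: the facility draws 1500 kWh in total,
# of which 1000 kWh reaches the IT equipment.
example_pue = pue(1500.0, 1000.0)
print(example_pue)
```

A value of 1.0 would mean that all facility energy reaches the IT equipment; larger values mean more overhead for cooling, power distribution, and similar infrastructure.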

  6. PUE & Its Variants

  7. Motivation • Scenario 1 (data management researcher): An improved data classification algorithm. Does it contribute to greening the data center? • Run the algorithms on the data center and compare the PUEs: no obvious variations! • PUE cannot measure the effectiveness of any changes made upon the data center infrastructure!

  8. Motivation • Scenario 2 (data center administrators): Give a budget plan of the data center energy consumption in the next year • Estimate the data volume based on the business development • How to estimate the energy increase? • PUE provides little reference information for data center planning according to data scale and application complexity

  9. Calculation Framework AxPUE PUE

  10. Definition - ApPUE • ApPUE (Application Performance Power Usage Effectiveness): a metric that measures the power usage effectiveness of IT equipment; specifically, how much of the power entering the IT equipment is used to improve application performance. • Computation formula: ApPUE = (data processing performance of applications) / (average rate of IT equipment energy consumed)

  11. Definition - AoPUE • AoPUE (Application Overall Power Usage Effectiveness): a metric that measures the power usage effectiveness of the overall data center system; specifically, how much of the total facility power is used to improve application performance. • Computation formula: AoPUE = (data processing performance of applications) / (average rate of total facility energy used)
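The two AxPUE definitions can be sketched in a few lines. This is an illustrative sketch, not the authors' tooling: it assumes performance is expressed as megabytes processed per second, that both metrics share the same performance numerator, and that energy is measured in joules over the run. All function names and sample numbers are hypothetical.

```python
def ap_pue(data_processed_mb: float, run_time_s: float, it_energy_j: float) -> float:
    """Application performance divided by the average rate of IT equipment energy."""
    performance_mb_per_s = data_processed_mb / run_time_s
    it_power_w = it_energy_j / run_time_s  # average rate of energy use, in watts
    return performance_mb_per_s / it_power_w


def ao_pue(data_processed_mb: float, run_time_s: float, facility_energy_j: float) -> float:
    """Application performance divided by the average rate of total facility energy."""
    performance_mb_per_s = data_processed_mb / run_time_s
    facility_power_w = facility_energy_j / run_time_s
    return performance_mb_per_s / facility_power_w


# Hypothetical run: 1000 MB processed in 100 s, 20 kJ used by the IT equipment,
# 30 kJ used by the whole facility (so PUE would be 1.5 under the usual ratio).
ap = ap_pue(1000.0, 100.0, 20_000.0)
ao = ao_pue(1000.0, 100.0, 30_000.0)
print(ap, ao)
```

Under these assumptions the two metrics differ only in the denominator, so AoPUE equals ApPUE divided by PUE for the same run.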

  12. Acquisition – Application Performance

  13. Acquisition – Benchmark • Requirements of Benchmarks • Provide representative workloads for big data applications • Provide a scalable data generation tool • BigDataBench • A big data benchmark suite open-sourced recently and publicly available • All the requirements are well fulfilled

  14. Experiment Overview • Testbed • Data center of 18 racks, 362 servers • Sample 8 servers • Workloads • Two experiments • Different Applications • Different Implementation Algorithms

  15. Experiments on Different Applications [Figure: results for the BigDataBench workloads SVM, Sort, and Grep, and for Linpack; surviving value labels: 17.2, 11.5, 269.9, 179.7]

  16. Experiments on Different Algorithms • Two Implementations for Sort • Several reducers with random sampling partitioning • One reducer without partitioning
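The difference between the two Sort implementations can be illustrated with a small single-process simulation. This is a sketch under stated assumptions, not the code used in the experiments: "reducers" are modeled as plain Python lists, the split points come from a sorted random sample, and the function names, sample size, and seed are all hypothetical.

```python
import bisect
import random


def sort_single_reducer(records):
    """One reducer without partitioning: a single task sorts everything."""
    return sorted(records)


def sort_sampled_partitions(records, num_reducers, sample_size=100, seed=0):
    """Several reducers with random-sampling range partitioning (simulated)."""
    if num_reducers < 1:
        raise ValueError("num_reducers must be at least 1")
    if not records:
        return []
    rng = random.Random(seed)
    # Draw a random sample and sort it to estimate the key distribution.
    sample = sorted(rng.sample(records, min(sample_size, len(records))))
    # Pick num_reducers - 1 split points at evenly spaced sample positions.
    splits = [sample[(i * len(sample)) // num_reducers] for i in range(1, num_reducers)]
    partitions = [[] for _ in range(num_reducers)]
    for record in records:
        # Route each record to the key range that contains it.
        partitions[bisect.bisect_right(splits, record)].append(record)
    # Each simulated reducer sorts its own range; ranges are already ordered,
    # so concatenating them yields a globally sorted result.
    result = []
    for partition in partitions:
        result.extend(sorted(partition))
    return result


data = [random.Random(1).randrange(10_000) for _ in range(1)] + list(range(500, 0, -1))
print(sort_sampled_partitions(data, num_reducers=4) == sort_single_reducer(data))
```

Both functions return the same sorted output; the sampled version spreads the sorting work across several key ranges, which is what allows the reducers to run in parallel on a real cluster.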

  17. Conclusions • We analyze the requirements of application-level energy effectiveness metrics AxPUE in data centers. • We propose two novel application-level metrics ApPUE and AoPUE to measure the energy consumed to improve the application performance. • The experiment results show that AxPUE could provide meaningful guidance to data center design and optimization.

  18. Evaluation of Performance & Energy Efficiency Tradeoff • How to measure? • AxPUE: Application Level Metrics for Power Usage Effectiveness in Data Centers • How to get balance? • The Implications from Benchmarking Three Big Data Systems

  19. New Solutions ……

  20. Experimental Platforms • Xeon (common processor) • Atom (low-power processor) • Tilera (many-core processor) [Tables: Basic Information; Brief Comparison]

  21. Benchmark Selection • BigDataBench • A big data benchmark suite from big data applications • Representative applications • An innovative data generation tool

  22. Metrics • Performance: data processed per second (DPS) • Energy efficiency: Application Performance Power Usage Effectiveness, measured as data processed per joule (DPJ) • DPS = Data Input Size / Run Time • DPJ = Data Input Size / Energy Consumption
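The two metrics are simple ratios, as the following sketch shows. It assumes the input size is given in megabytes, the run time in seconds, and the energy consumption in joules; the function names and the sample run are hypothetical and are not measurements from the presentation.

```python
def dps(data_input_size_mb: float, run_time_s: float) -> float:
    """Data processed per second: input size divided by run time."""
    return data_input_size_mb / run_time_s


def dpj(data_input_size_mb: float, energy_consumption_j: float) -> float:
    """Data processed per joule: input size divided by energy consumption."""
    return data_input_size_mb / energy_consumption_j


# Hypothetical run: 10240 MB of input, 128 s run time, 25600 J consumed.
print(dps(10240.0, 128.0))    # MB per second
print(dpj(10240.0, 25600.0))  # MB per joule
```

Because both metrics share the same numerator, a platform can lead on DPS while trailing on DPJ if its power draw is disproportionately high.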

  23. General Observations

  24. General Observations • Data scale has a significant impact on the performance and energy efficiency of big data systems. • The performance and energy efficiency trends of different applications are diverse. [Figure panels: Xeon, Atom, Tilera]

  25. Xeon VS Atom – DPS

  26. Xeon VS Atom – DPJ

  27. Xeon VS Atom – DPS & DPJ • Xeon is more powerful than Atom in processing capacity. • Atom is more energy-saving than Xeon when dealing with applications with simple computation logic.

  28. Xeon VS Atom -- Summary • Xeon is more powerful than Atom in processing capacity. • Atom is more energy-saving than Xeon when dealing with applications with simple computation logic. • Atom doesn't show an energy advantage when dealing with complex applications.

  29. Xeon VS Tilera – DPS

  30. Xeon VS Tilera – DPJ

  31. Xeon VS Tilera – DPS & DPJ • Xeon is more powerful than Tilera in processing capacity. • Tilera is more energy-saving than Xeon when dealing with applications with simple computation logic and with I/O-intensive applications. • Tilera doesn't show an energy advantage when dealing with complex applications.

  32. Xeon VS Tilera [Figure panels: the DPS of Atom, the DPS of Tilera, the DPS of Xeon]

  33. Xeon VS Tilera • Tilera is more suitable for processing I/O-intensive applications. [Figure: the DPS of Tilera]

  34. Xeon VS Tilera -- Summary • Xeon is more powerful than Tilera in processing capacity. • Tilera is more energy-saving than Xeon when dealing with applications with simple computation logic and with I/O-intensive applications. • Tilera doesn't show an energy advantage when dealing with complex applications. • Tilera is more suitable for processing I/O-intensive applications.

  35. Implications • The performance of a big data system depends not only on the hardware itself, but also on the application type and data volume of the workloads. • Weak processors aren't suitable for complex applications. Even though they have a lower TDP, they don't show an energy cost advantage.

  36. Implications Cont. • Xeon generally has better processing capacity, accompanied by high energy consumption, especially for some light scale-out applications. • Atom and Tilera show an energy consumption advantage when dealing with light scale-out applications. • Tilera shows an energy advantage when processing I/O-intensive applications.
