Application-driven Energy-efficient Architecture Explorations for Big Data. Authors: Xiaoyan Gu, Rui Hou, Ke Zhang, Lixin Zhang, Weiping Wang (Institute of Computing Technology, Chinese Academy of Sciences). Reviewed by: Siddharth Bhave (University of Washington, Tacoma).
Big Data • What is Big Data? • Problems with Big Data • Energy consumption • Velocity (operation latency and throughput) • Volume (storage capacity) • Variety • Managing Big Data problems • Storage technologies • Partitioning • Multithreading • Parallel processing • Efficient architecture • Hadoop, MapReduce, Mahout • Finding the bottleneck
Introduction • Big data management at the architecture level • Two cluster architectures • Xeon-based cluster • Atom-based (micro-server) cluster • Comparison based on: • Energy consumption • Execution time
Motivation • Ever-increasing data volumes • Energy/time tradeoff between Xeon-based and Atom-based clusters • Bottleneck caused by compression/decompression • Stateless data processing
Mastiff • Mastiff - Targeted application for performance analysis • Big data processing engine • Columnar store policy
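To illustrate what a columnar store policy means in general, here is a minimal sketch that converts row-oriented records into per-column lists, so that a query touching one attribute reads only that column. The table, its field names, and its values are made-up placeholders; this is not Mastiff's actual storage format, which the slides do not describe.

```python
# Hypothetical rows; the field names and values are illustrative only.
rows = [
    {"order_id": 1, "price": 10.0, "status": "O"},
    {"order_id": 2, "price": 20.0, "status": "F"},
    {"order_id": 3, "price": 12.5, "status": "O"},
]


def to_columns(records):
    """Regroup row-oriented records into one list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one attribute scans a single column
# instead of every field of every row.
total_price = sum(columns["price"])
```

Storing values of the same attribute together also tends to make them compress well, which is one reason compression cost shows up later in the evaluation.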
Methodology • TPC-H benchmark (queries and concurrent data modifications) • 1 TB of verification data • 2 cases: data load and data query • Power measured with a Fluke NORMA 4000 power analyzer • Average and median results are reported
Power and Performance Evaluation • Three configurations compared on time and energy consumption • 31-node Atom cluster (30 worker nodes + 1 master node) • 31-node Xeon cluster (30 worker nodes + 1 master node) • 16-node Xeon cluster (15 worker nodes + 1 master node)
Power and Performance Evaluation (cont’d) • Energy consumption: 30-node Atom cluster vs. 30-node Xeon cluster
Power and Performance Evaluation (cont’d) • Energy consumption: 30-node Atom cluster vs. 15-node Xeon cluster
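The energy comparisons above rest on the relation energy = average power × execution time, so a slower cluster can still use less energy if it draws sufficiently less power. A minimal sketch of that arithmetic follows; the function names are hypothetical and the wattage and runtime figures are placeholders, not measurements from the paper.

```python
def energy_joules(avg_power_watts, exec_time_s):
    """Energy consumed by a run: average power multiplied by duration."""
    return avg_power_watts * exec_time_s


def relative_energy(power_a, time_a, power_b, time_b):
    """Energy of configuration A as a fraction of configuration B."""
    return energy_joules(power_a, time_a) / energy_joules(power_b, time_b)


# Placeholder numbers only: A draws less power but runs longer than B.
ratio = relative_energy(power_a=600.0, time_a=2000.0,
                        power_b=3000.0, time_b=800.0)
# A ratio below 1 means A used less energy overall despite the longer run.
```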
Power and Performance Evaluation (cont’d) Time Breakdown in Map Phase
Power and Performance Evaluation (cont’d) Time Breakdown in Reduce phase
Findings • The Atom platform is more power-efficient • Data compression and decompression occupy a significant percentage of execution time • Compression and decompression can be done in a software-pipelined fashion, i.e., with multiple interleaved operations
Propositions • Heterogeneous architecture • Accelerators to perform data compression/decompression • Multiple interleaved compression/decompression
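One way to picture interleaved, software-pipelined compression is to keep a few block compressions in flight on worker threads while the producer keeps supplying new blocks. The sketch below assumes `zlib` as the codec and a pipeline depth of 2; both are illustrative choices, and this is not the mechanism the authors implemented.

```python
import zlib
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def pipelined_compress(blocks, depth=2):
    """Yield compressed blocks in input order, with up to `depth` in flight."""
    with ThreadPoolExecutor(max_workers=depth) as pool:
        in_flight = deque()
        for block in blocks:
            # Start compressing this block while earlier ones are still running.
            in_flight.append(pool.submit(zlib.compress, block))
            if len(in_flight) >= depth:
                yield in_flight.popleft().result()
        # Drain whatever is still being compressed.
        while in_flight:
            yield in_flight.popleft().result()


# Hypothetical input: five 1000-byte blocks.
blocks = [bytes([i]) * 1000 for i in range(5)]
compressed = list(pipelined_compress(blocks))
restored = [zlib.decompress(c) for c in compressed]
```

The same structure applies to decompression by swapping `zlib.compress` for `zlib.decompress`; a hardware accelerator, as proposed above, would take the place of the worker threads.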
Strengths • A much-needed, innovative concept • Well organized • Detailed description of the energy and time investigation • Propositions already implemented
Weaknesses • Not enough power meters to monitor all nodes • 2 assumptions • Power of every network router is divided evenly among the nodes • Energy consumption of each node is similar • Results obtained on Hadoop are generalized even though they might not hold for every application • Vague on how the propositions are implemented
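The two assumptions listed above can be made concrete: if only some nodes are metered, cluster power is extrapolated from their average, and router power is split evenly across nodes. The following sketch shows that estimate; the function name and all numbers are hypothetical placeholders rather than values from the paper.

```python
def estimate_cluster_power(metered_node_watts, total_nodes, router_watts):
    """Extrapolate total cluster power from a metered subset of nodes.

    Assumption 1: every node draws about the same power as the metered average.
    Assumption 2: router power is shared evenly by all nodes.
    """
    avg_node_watts = sum(metered_node_watts) / len(metered_node_watts)
    router_share_watts = router_watts / total_nodes
    return (avg_node_watts + router_share_watts) * total_nodes


# Placeholder readings from three metered nodes in a 30-node cluster.
estimate = estimate_cluster_power([40.0, 42.0, 44.0],
                                  total_nodes=30, router_watts=60.0)
```

If node loads are skewed, the first assumption breaks down and the extrapolated figure can be off in either direction, which is the concern raised in this slide.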
FAWN: A Fast Array of Wimpy Nodes Authors: David G. Andersen Jason Franklin Michael Kaminsky Amar Phanishayee Lawrence Tan Vijay Vasudevan (Carnegie Mellon University)
Introduction • High-performance, energy-efficient storage system • Large number of small, low-performance (hence "wimpy") nodes with moderate amounts of local storage • 2 parts: FAWN-DS (data store) and FAWN-KV (key-value) • Motivation • Traditional architectures consume too much power • I/O bottleneck caused by the limitations of current storage
Features • Pairs low-power embedded nodes with flash storage • FAWN-DS is the back end, consisting of a large number of nodes • Each node has some RAM and flash • FAWN-KV is a consistent, replicated, highly available, high-performance key-value storage system
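As a rough illustration of how a node with a little RAM and some flash can serve key-value lookups, the sketch below keeps an append-only log (standing in for flash) and a small in-memory index that maps each key to the location of its latest value. This follows the general log-structured idea described in the FAWN paper, heavily simplified; the class and method names are hypothetical and it is not the FAWN-DS implementation.

```python
class LogStore:
    """Toy log-structured key-value store for a single node."""

    def __init__(self):
        self.log = bytearray()  # stands in for an append-only file on flash
        self.index = {}         # in RAM: key -> (offset, length) of latest value

    def put(self, key, value):
        """Append the value to the log and point the index at it."""
        offset = len(self.log)
        self.log += value
        self.index[key] = (offset, len(value))

    def get(self, key):
        """Return the latest value for key, or None if it was never stored."""
        entry = self.index.get(key)
        if entry is None:
            return None
        offset, length = entry
        return bytes(self.log[offset:offset + length])


store = LogStore()
store.put("a", b"1")
store.put("a", b"22")  # an overwrite appends; the old bytes remain in the log
latest = store.get("a")
```

Writes are sequential appends, which suits flash, and a lookup costs one index probe plus one read; reclaiming the space held by stale values would need a separate compaction pass that this sketch omits.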
Efficient Data Streaming with On-chip Accelerators: Opportunities and Challenges Authors: Rui Hou Lixin Zhang Michael C. Huang Kun Wang Hubertus Franke Yi Ge Xiaotao Chang (University of Rochester)
Motivation • Transistor density keeps increasing • Many cores are integrated on a single die • Advantages of an on-chip accelerator over one attached via PCI
Features • 3 types of accelerators • Crypto accelerators • Decompression accelerators • Network offload accelerators • Common characteristics of the data streams in the 3 accelerator types • Optimizing the power and performance of the accelerators