The Hadoop Distributed Filesystem: Balancing Portability and Performance. Jeffrey Shafer, Scott Rixner and Alan L. Cox. Presented by: Bhavani Sankar Ikkurthi, CS 775, Spring 2011, Old Dominion University.
Bottlenecks • Software Architectural Bottlenecks • Portability Limitations • Portability Assumptions
Components • MapReduce Engine • Hadoop Distributed File System • HDFS Replication
Evaluation • Experimental Setup • 5-node cluster • 4 nodes: computation and storage • 1 node: scheduler and storage manager • 2-processor Opteron, 2.4 GHz, 4 GB RAM • UFS2 filesystem, 16 kB block size • Hadoop replication disabled
Solutions • Portable • Application Disk Scheduling
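Application disk scheduling can be sketched roughly as follows. This is a minimal illustration, not the HDFS implementation: it assumes the idea is to let only one stream at a time issue a chunk-sized request to a given disk, so that concurrent readers do not interleave small requests and force extra seeks. The names `DiskScheduler`, `read_chunk`, `read_stream`, and the 64 kB `CHUNK_BYTES` value are hypothetical.

```python
import threading
from collections import defaultdict

CHUNK_BYTES = 64 * 1024  # hypothetical request size, not taken from the slides


class DiskScheduler:
    """Serializes chunk-sized reads per disk (illustrative only, not HDFS code)."""

    def __init__(self):
        self._guard = threading.Lock()                   # protects the lock table
        self._disk_locks = defaultdict(threading.Lock)   # one lock per disk id

    def _lock_for(self, disk_id):
        with self._guard:
            return self._disk_locks[disk_id]

    def read_chunk(self, disk_id, path, offset, size=CHUNK_BYTES):
        # Only one thread per disk performs I/O at any moment.
        with self._lock_for(disk_id):
            with open(path, "rb") as f:
                f.seek(offset)
                return f.read(size)


def read_stream(scheduler, disk_id, path, out, key):
    """Reads a whole file chunk by chunk through the scheduler."""
    data = bytearray()
    offset = 0
    while True:
        chunk = scheduler.read_chunk(disk_id, path, offset)
        if not chunk:
            break
        data += chunk
        offset += len(chunk)
    out[key] = bytes(data)
```

Several `read_stream` threads can share one `DiskScheduler`; streams on different disk ids proceed in parallel, while streams on the same disk take turns one chunk at a time. Because the scheduling happens in application code, nothing here depends on a particular operating system.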
Solutions • Non-portable • OS Hints • Filesystem Selection • Cache Bypass • Local Filesystem Elimination
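To illustrate why OS hints count as non-portable, the sketch below uses `posix_fadvise`, which exists only on POSIX-style systems. It is an assumption that hints of this kind are what the slide refers to; the function name `read_with_hints` and the 1 MB chunk size are hypothetical. The `POSIX_FADV_DONTNEED` call only approximates the effect of cache bypass and is not a true direct-I/O path.

```python
import os


def read_with_hints(path, chunk_bytes=1 << 20):
    """Sequentially reads a file, passing access-pattern hints to the OS when available."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # posix_fadvise is missing on some platforms, hence the portability cost.
        has_fadvise = hasattr(os, "posix_fadvise")
        if has_fadvise:
            # Hint: the whole file (length 0 means "to end") will be read sequentially.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        offset = 0
        while True:
            chunk = os.read(fd, chunk_bytes)
            if not chunk:
                break
            total += len(chunk)
            if has_fadvise:
                # Hint: pages just read will not be reused, so the cache may drop them.
                os.posix_fadvise(fd, offset, len(chunk), os.POSIX_FADV_DONTNEED)
            offset += len(chunk)
    finally:
        os.close(fd)
    return total
```

On a platform without `posix_fadvise` the function still reads the file correctly but issues no hints, which is the trade-off the slide points at: the tuning helps only where the OS supports it.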
Conclusions • Interactions between Hadoop and storage are characterized • Bottlenecks found in HDFS implementation • Solutions are proposed to escape bottlenecks • Maintain Hadoop portability whenever possible
Thank You Questions?