The Hadoop Distributed File System PaoMin Wu University at Buffalo
ARCHITECTURE
1. NameNode
• stores the metadata of the system
• keeps the entire namespace in RAM
2. DataNode
• holds block replicas
• stores application data
3. HDFS Client
• user applications access the file system using the HDFS client
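The division of labor on this slide can be illustrated with a toy, in-memory model. This is only a sketch under stated assumptions, not how HDFS is implemented: the class names, the `write`/`read` methods, the round-robin placement, and the tiny `block_size` and `replication` values are all hypothetical and chosen for readability.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Stores application data as block replicas (block_id -> bytes)."""
    name: str
    blocks: dict = field(default_factory=dict)


@dataclass
class NameNode:
    """Stores metadata only, all in memory: path -> [(block_id, [datanode names])]."""
    namespace: dict = field(default_factory=dict)


class Client:
    """Hypothetical client: asks the NameNode where blocks live, moves bytes via DataNodes."""

    def __init__(self, namenode, datanodes, block_size=4, replication=2):
        # Toy values; assumes replication <= len(datanodes) so replicas land on distinct nodes.
        self.namenode = namenode
        self.datanodes = datanodes
        self.block_size = block_size
        self.replication = replication
        self._next = 0  # round-robin cursor (an assumption, not the real placement policy)

    def write(self, path, data):
        block_list = []
        for offset in range(0, len(data), self.block_size):
            block_id = f"{path}#{offset // self.block_size}"
            chunk = data[offset:offset + self.block_size]
            targets = [
                self.datanodes[(self._next + k) % len(self.datanodes)]
                for k in range(self.replication)
            ]
            self._next += 1
            for dn in targets:
                dn.blocks[block_id] = chunk  # each target keeps one replica
            block_list.append((block_id, [dn.name for dn in targets]))
        # Only metadata reaches the NameNode; file contents never do.
        self.namenode.namespace[path] = block_list

    def read(self, path):
        by_name = {dn.name: dn for dn in self.datanodes}
        parts = []
        for block_id, locations in self.namenode.namespace[path]:
            parts.append(by_name[locations[0]].blocks[block_id])  # read the first replica
        return b"".join(parts)
```

In this sketch the NameNode never touches file contents, which mirrors the slide's split between metadata (NameNode) and application data (DataNode).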
ARCHITECTURE
4. Image and Journal
• Namespace image = file system metadata
• Persistent record of the image = checkpoint
5. CheckpointNode (NameNode)
• Protects file system metadata
6. BackupNode (NameNode)
• Capable of creating periodic checkpoints
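How a periodic checkpoint might be produced can be sketched as follows, assuming the journal is a log of namespace changes recorded since the last checkpoint. The operation names (`create`, `add_block`, `delete`) and the dictionary layout of the image are hypothetical; real HDFS uses its own on-disk formats.

```python
def apply_op(image, op):
    """Apply one journal record to an in-memory namespace image (path -> block ids)."""
    kind, path = op[0], op[1]
    if kind == "create":
        image[path] = []
    elif kind == "add_block":
        image[path].append(op[2])
    elif kind == "delete":
        image.pop(path, None)
    else:
        raise ValueError(f"unknown journal operation: {kind}")


def checkpoint(old_checkpoint, journal):
    """Replay the journal on a copy of the last checkpoint.

    Returns the new checkpoint and an empty journal, since every
    logged change is now reflected in the new image.
    """
    image = {path: list(blocks) for path, blocks in old_checkpoint.items()}
    for op in journal:
        apply_op(image, op)
    return image, []
```

Replaying onto a copy leaves the old checkpoint untouched, so a failure partway through the replay does not damage the last good record.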
Future Work
Problem: the NameNode contains all important information
Solution: allow multiple namespaces (and NameNodes) to share the physical storage within a cluster
MapReduce: Simplified Data Processing on Large Clusters PaoMin Wu University at Buffalo
Introduction
• key/value pair
• execution across a set of machines
• handling machine failures
• managing the required inter-machine communication
• runs on a large cluster
• powerful interface
• automatic parallelization
• distribution of large-scale computations
Programming Model
• Map, written by the user, takes an input pair and produces a set of intermediate key/value pairs.
• Reduce, also written by the user, accepts an intermediate key and a set of values for that key, and merges those values into a possibly smaller set of values.
• The intermediate values are supplied to the user's reduce function via an iterator.
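A word count makes the two functions concrete. The sketch below is a single-process illustration, not the distributed runtime: `run_mapreduce` is a hypothetical driver that only groups intermediate pairs by key, and the names `map_fn` and `reduce_fn` are chosen for this example.

```python
from collections import defaultdict


def map_fn(key, value):
    """key: document name, value: document text. Emits (word, 1) for each word."""
    for word in value.split():
        yield word, 1


def reduce_fn(key, values):
    """key: a word, values: an iterator over its counts. Returns the total."""
    return sum(values)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Hypothetical in-memory driver: map every input, group by key, then reduce."""
    intermediate = defaultdict(list)
    for key, value in inputs:
        for out_key, out_value in map_fn(key, value):
            intermediate[out_key].append(out_value)
    # Values reach reduce_fn through an iterator, as the model describes.
    return {key: reduce_fn(key, iter(values)) for key, values in intermediate.items()}


counts = run_mapreduce([("d1", "a b a"), ("d2", "b c")], map_fn, reduce_fn)
print(counts)
```

In a real cluster the grouping step is where the data moves between machines; here it is just a dictionary.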
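Conclusions
• Restricting the programming model is beneficial: it makes computations easy to parallelize, distribute, and make fault-tolerant
• Network bandwidth is a scarce resource
• Redundant execution can reduce the impact of slow machines and handle machine failures and data loss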
References:
• Konstantin Shvachko, Hairong Kuang, Sanjay Radia, Robert Chansler. The Hadoop Distributed File System. Yahoo!, Sunnyvale, California, USA. {Shv, Hairong, SRadia, Chansler}@Yahoo-Inc.com
• Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. Google, Inc. jeff@google.com, sanjay@google.com