Explore the basic concepts and key characteristics of distributed file systems (DFS), focusing on naming, transparency, fault tolerance, and replication. The current project section covers the Hadoop Distributed File System (HDFS): its naming, synchronization, replication, fault tolerance, and security. Future work emphasizes the robustness of the data sharing model and security. References are provided for in-depth understanding.
Distributed File System By Manshu Zhang
Outline • Basic Concepts • Current project • Hadoop Distributed File System • Future work • Reference
DFS A distributed implementation of the classical time-sharing model of a file system, where multiple users share files and storage resources.
Key Characteristics of DFS • Dispersion • Clients and files • Multiplicity • Clients and files
Primary issues of DFS • Naming and Transparency • Fault Tolerance
Naming • Naming: mapping between logical and physical objects. • Multilevel mapping. • Transparency: the existence of replicas and their locations are hidden from the user.
Naming Schemes: Three Main Approaches • Host name + local name • Guarantees a unique system-wide name. • Mount remote directories to local directories • Once mounted, files can be referenced in a location-transparent manner. • Total integration of the component file systems • A single global name structure. • If a server is unavailable, some arbitrary set of directories on different machines also becomes unavailable.
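The first two naming schemes can be contrasted with a minimal sketch. Everything here is illustrative and not taken from any real file system: `parse_host_local`, `MountTable`, the host `serverA`, and the paths are hypothetical names chosen for the example.

```python
# Hypothetical sketch of two naming schemes; all names are illustrative.

def parse_host_local(name):
    """Scheme 1: 'host:/local/path' is unique system-wide but reveals the location."""
    host, _, local_path = name.partition(":")
    return host, local_path


class MountTable:
    """Scheme 2: remote directories are attached under local directory names."""

    def __init__(self):
        self._mounts = {}  # local mount point -> (host, remote directory)

    def mount(self, mount_point, host, remote_dir):
        self._mounts[mount_point.rstrip("/")] = (host, remote_dir.rstrip("/"))

    def resolve(self, path):
        """Map a local-looking path to (host, path on that host)."""
        # Try the longest mount point first so nested mounts resolve correctly.
        for mount_point in sorted(self._mounts, key=len, reverse=True):
            if path == mount_point or path.startswith(mount_point + "/"):
                host, remote_dir = self._mounts[mount_point]
                return host, remote_dir + path[len(mount_point):]
        return "localhost", path  # not under any mount point: a local file


table = MountTable()
table.mount("/home/shared", "serverA", "/export/projects")

print(parse_host_local("serverA:/export/projects/report.txt"))
print(table.resolve("/home/shared/report.txt"))  # the client never names serverA
print(table.resolve("/etc/hosts"))
```

With the mount table, the client uses `/home/shared/report.txt` without knowing which host stores it, which is the location transparency the second approach provides once a directory is mounted.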
Transparency (1) • Login Transparency: A user can log in at any host with a uniform login procedure and perceive a uniform view of the file system. • Access Transparency: A client process on a host has a uniform mechanism to access all files in the system, regardless of whether the files are on a local or remote host. • Location Transparency: The names of the files do not reveal their physical location.
Transparency (2) • Concurrency Transparency: An update to a file should not affect the correct execution of other processes that are concurrently sharing the file. • Replication Transparency: Files may be replicated to provide redundancy for availability and also to permit concurrent access for efficiency.
Fault Tolerance • Stateful vs. Stateless • Whether the server maintains information about its clients • File Replication
Distinctions Between Stateful & Stateless Service • Failure Recovery. • A stateful server loses all its volatile state in a crash. • With a stateless server, the effects of server failure and recovery are almost unnoticeable.
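The failure-recovery difference can be illustrated with a small sketch. It assumes a toy in-memory model: `StatelessServer`, `StatefulServer`, and `crash_and_restart` are hypothetical names, and the `files` dictionary stands in for durable storage that survives a crash.

```python
# Hypothetical toy servers; the dict of files stands in for durable storage.

class StatelessServer:
    """Every request carries the file name and offset; no per-client state."""

    def __init__(self, files):
        self.files = files

    def read(self, name, offset, count):
        return self.files[name][offset:offset + count]

    def crash_and_restart(self):
        pass  # nothing volatile to lose


class StatefulServer:
    """The server remembers open files and read positions in volatile memory."""

    def __init__(self, files):
        self.files = files
        self.open_table = {}  # handle -> [name, position]
        self.next_handle = 0

    def open(self, name):
        handle = self.next_handle
        self.next_handle += 1
        self.open_table[handle] = [name, 0]
        return handle

    def read(self, handle, count):
        name, position = self.open_table[handle]  # KeyError if the state was lost
        data = self.files[name][position:position + count]
        self.open_table[handle][1] = position + len(data)
        return data

    def crash_and_restart(self):
        self.open_table = {}  # the volatile state is gone


files = {"log.txt": b"hello distributed world"}

stateless = StatelessServer(files)
stateless.read("log.txt", 0, 5)
stateless.crash_and_restart()
print(stateless.read("log.txt", 5, 12))  # the client simply repeats its request

stateful = StatefulServer(files)
handle = stateful.open("log.txt")
stateful.read(handle, 5)
stateful.crash_and_restart()
try:
    stateful.read(handle, 12)
except KeyError:
    print("handle unknown after restart; the client must reopen the file")
```

The stateless client pays for this resilience by sending a longer request each time, while the stateful server can keep requests short at the cost of a recovery protocol after a crash.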
File Replication • Several copies of a file's contents at different locations enable multiple servers to share the load of providing the service. • The naming scheme maps a replicated file name to a particular replica. • Updates
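A minimal sketch of the replication ideas above follows. It makes two assumptions that the slide does not state: reads are spread over replicas in round-robin order, and an update is applied to every replica before it completes (a simple write-all policy). `ReplicatedFile` and the server names are hypothetical.

```python
import itertools


class ReplicatedFile:
    """One logical file name backed by copies on several servers (toy model)."""

    def __init__(self, name, servers):
        self.name = name
        self.replicas = {server: b"" for server in servers}
        self._next_server = itertools.cycle(servers)  # assumed round-robin choice

    def read(self):
        """Map the file name to one particular replica and read from it."""
        server = next(self._next_server)
        return server, self.replicas[server]

    def write(self, data):
        """Assumed write-all policy: every replica receives the update."""
        for server in self.replicas:
            self.replicas[server] = data


notes = ReplicatedFile("notes.txt", ["s1", "s2", "s3"])
notes.write(b"v1")
for _ in range(4):
    print(notes.read())  # successive reads are served by s1, s2, s3, s1
```

Because every replica holds the same contents after `write`, the client sees the same data no matter which server answers, which is what keeps replication transparent.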
Current Project • HDFS: Hadoop Distributed File System • A distributed, parallel, fault-tolerant file system. • Designed to reliably store very large files across machines in a large cluster. • Efficient, reliable, and open source.
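One way to picture storing a very large file across machines is to cut it into fixed-size blocks and keep several copies of each block on different nodes. The sketch below is a simplification under stated assumptions: the block size, the replication factor, the node names, and the round-robin placement are all illustrative choices, and real HDFS placement also takes rack topology into account.

```python
# Hypothetical sketch: split a file into blocks and place replicas round-robin.

def split_into_blocks(file_size, block_size):
    """Return (offset, length) pairs that cover the whole file."""
    return [
        (offset, min(block_size, file_size - offset))
        for offset in range(0, file_size, block_size)
    ]


def place_replicas(num_blocks, nodes, replication):
    """Assign each block to `replication` distinct nodes (assumed policy)."""
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication")
    return {
        block: [nodes[(block + i) % len(nodes)] for i in range(replication)]
        for block in range(num_blocks)
    }


# Sizes in MB; 300 and 128 are illustrative values only.
blocks = split_into_blocks(file_size=300, block_size=128)
placement = place_replicas(len(blocks), ["n1", "n2", "n3", "n4"], replication=3)

print(blocks)  # [(0, 128), (128, 128), (256, 44)]
for block, holders in placement.items():
    print(block, holders)
```

Losing any single node in this layout still leaves at least two copies of every block, which is the property that lets the system treat machine failure as routine.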
• Naming: central metadata server • Synchronization: write-once-read-many; locks on objects are given to clients using leases • Consistency and replication: server-side replication, asynchronous replication, checksums • Fault tolerance: failure as the norm • Security: no dedicated security mechanism
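The synchronization bullet can be sketched as a lease table kept by the central metadata server. This is a toy model rather than HDFS code: `LeaseManager`, its methods, the 60-second lease length, and the explicit `now` argument (used instead of a real clock to keep the example deterministic) are all assumptions made for illustration.

```python
# Hypothetical lease table illustrating write-once-read-many with leases.

class LeaseManager:
    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.leases = {}     # path -> (client, expiry time)
        self.closed = set()  # paths already written and closed

    def acquire(self, path, client, now):
        """Grant a write lease unless the file is closed or validly leased."""
        if path in self.closed:
            return False  # write-once: a closed file cannot be rewritten
        holder = self.leases.get(path)
        if holder and holder[0] != client and holder[1] > now:
            return False  # another client holds an unexpired lease
        self.leases[path] = (client, now + self.lease_seconds)
        return True

    def close(self, path, client):
        """The lease holder finishes writing; the file becomes read-only."""
        holder = self.leases.get(path)
        if holder and holder[0] == client:
            del self.leases[path]
            self.closed.add(path)
            return True
        return False


manager = LeaseManager(lease_seconds=60)
print(manager.acquire("/data/a", "client1", now=0))    # True
print(manager.acquire("/data/a", "client2", now=10))   # False, lease still valid
print(manager.acquire("/data/a", "client2", now=61))   # True, old lease expired
print(manager.close("/data/a", "client2"))             # True
print(manager.acquire("/data/a", "client1", now=100))  # False, file is closed
```

Expiring leases matter for fault tolerance: if a writer crashes, its lock is released automatically once the lease runs out, so no separate cleanup message is needed.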
Future Work • Robustness of data sharing model • The preceding section, architecture, naming, synchronization, availability, heterogeneity and support for databases • Security
Reference
[1] Thanh, T.D.; Mohan, S.; Choi, E.; SangBum Kim; Pilsung Kim. "A Taxonomy and Survey on Distributed File Systems". Networked Computing and Advanced Information Management, 2008.
[2] Randy Chow. Distributed Operating Systems & Algorithms. 1997.
[3] Eliezer Levy, Abraham Silberschatz. "Distributed file systems: concepts and examples". Computing Surveys (CSUR), Volume 22, Issue 4, December 1990.
[4] http://hadoop.apache.org/common/docs/current/hdfs_design.html#Introduction
[5] http://www.snia.org/events/wintersymp2009/cloud/dhruba_hadoop_snia.pdf
[6] http://en.wikipedia.org/wiki/List_of_file_systems#Distributed_file_systems
[7] http://en.wikipedia.org/wiki/Hadoop#Hadoop_Distributed_File_System
[8] http://www.cs.gsu.edu/~cscyqz/courses/aos/slides08/ch6.1-Fall08.pptx