6.1 Characteristics of a DFS. Yichuan Wang ywang38@student.gsu.edu. Files. Computing system Persistent data objects File system. DFS. Transparency Directory service (name service) Caching and replication Access control and security. DFS Characteristics. Dispersion[Randy chow,1997]
6.1 Characteristics of a DFS Yichuan Wang ywang38@student.gsu.edu
Files • Computing system • Persistent data objects • File system
DFS • Transparency • Directory service (name service) • Caching and replication • Access control and security
DFS Characteristics • Dispersion [Randy Chow, 1997] • Multiplicity
Transparency [3] • Access transparency: Client programs should be unaware of the distribution of files. • Location transparency: Client programs should see a uniform namespace. Files should be relocatable without changing their path names. • Mobility transparency: Neither client programs nor system administration tables in the client nodes should need to change when files are moved, whether automatically or by the system administrator. • Performance transparency: Client programs should continue to perform well while the load varies within a specified range. • Scaling transparency: Growth in storage and network size should be transparent to clients.
Name Service • Name space: a collection of names • Name resolution: mapping a name to an object • Clients may see the same or different views of the directory hierarchy • Three traditional ways to name files in a distributed environment: • Concatenate the host name with the names of files stored on that host, e.g., /machine/usr/foo: system-wide uniqueness is guaranteed and a file is simple to locate; however, the names are neither network transparent nor location independent • Mount remote directories onto local directories: once mounted, files can be referenced in a location-transparent manner • Provide a single global directory: requires a unique name for every file and is location independent, but cannot encompass heterogeneous environments or wide geographical areas
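The second naming scheme (mounting remote directories) can be sketched as a small name-resolution routine. This is only an illustration: the mount table, the host names, and the resolve function are hypothetical and do not come from the slides or from any real DFS.

```python
from typing import Dict, Tuple

# Hypothetical mount table: local mount point -> (server host, exported directory).
MOUNT_TABLE: Dict[str, Tuple[str, str]] = {
    "/home": ("fileserver1", "/export/home"),
    "/projects/docs": ("fileserver2", "/export/docs"),
}


def resolve(path: str, mounts: Dict[str, Tuple[str, str]]) -> Tuple[str, str]:
    """Map a client path to (host, path on that host).

    The longest matching mount point wins; unmatched paths stay local.
    """
    best = ""
    for mount_point in mounts:
        matches = path == mount_point or path.startswith(mount_point + "/")
        if matches and len(mount_point) > len(best):
            best = mount_point
    if not best:
        return ("localhost", path)
    host, remote_root = mounts[best]
    return (host, remote_root + path[len(best):])
```

Because clients only ever use the local path, an administrator could move /home to another server by editing the table, and the path names seen by client programs would stay the same.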
Cache [2] • Four places to store files • Server's disk: slow performance • Server cache in main memory • Cache management issues: how much to cache and which replacement strategy to use • Still slow due to network delay • Used in high-performance web search engine servers • Client cache in main memory • Can be used by diskless workstations • Faster to access than disk • Competes with the virtual memory system for physical memory space • Three options (Fig. 13-10) • Client cache on a local disk • Large files can be cached • Virtual memory management is simpler • A workstation can function even when it is disconnected from the network
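The replacement-strategy question for a client cache in main memory can be illustrated with a least-recently-used (LRU) policy. LRU is chosen here only as one common example; the slides do not name a strategy. The ClientBlockCache class and the fetch_remote callback are hypothetical names standing in for a real client and its request to the file server.

```python
from collections import OrderedDict
from typing import Callable


class ClientBlockCache:
    """LRU cache of file blocks held in client memory (illustrative only)."""

    def __init__(self, capacity: int, fetch_remote: Callable[[str, int], bytes]):
        self.capacity = capacity          # maximum number of cached blocks
        self.fetch_remote = fetch_remote  # hypothetical request to the file server
        self.blocks = OrderedDict()       # (path, block_no) -> data, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, path: str, block_no: int) -> bytes:
        key = (path, block_no)
        if key in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(key)  # mark as most recently used
            return self.blocks[key]
        self.misses += 1
        data = self.fetch_remote(path, block_no)
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return data
```

Every hit avoids a round trip to the server, which is where the reduction in network traffic and server load comes from; the capacity parameter reflects the competition with virtual memory for physical memory.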
Cache • Reduces remote accesses, which reduces network traffic and server load • Total network overhead is lower when data is transferred in big chunks (caching) than as a series of responses to specific requests • Disk access can be optimized better for large requests than for random disk blocks • The cache-consistency problem is the major drawback: if writes are frequent, the overhead of keeping caches consistent is significant • The OS is simpler with remote service, where every request goes to the server and no cache has to be managed
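To make the consistency overhead concrete, here is a minimal sketch of one possible approach, client-initiated validation: before using a cached copy, the client asks the server for the file's current version number. The slides do not prescribe this scheme; the Server and CachingClient classes and the message counter are assumptions made for illustration.

```python
class Server:
    """Toy file server that versions each file (illustrative only)."""

    def __init__(self):
        self.files = {}    # path -> (version, data)
        self.messages = 0  # number of client requests handled

    def write(self, path, data):
        version = self.files.get(path, (0, b""))[0] + 1
        self.files[path] = (version, data)

    def get_version(self, path):
        self.messages += 1
        return self.files[path][0]

    def fetch(self, path):
        self.messages += 1
        return self.files[path]


class CachingClient:
    """Client that validates its cached copy on every read."""

    def __init__(self, server):
        self.server = server
        self.cache = {}  # path -> (version, data)

    def read(self, path):
        cached = self.cache.get(path)
        if cached is not None and cached[0] == self.server.get_version(path):
            return cached[1]  # cached copy is still current
        self.cache[path] = self.server.fetch(path)  # missing or stale: refetch
        return self.cache[path][1]
```

An unchanged file costs one small validation message per read, while each write by another party forces a validation plus a full refetch, which is why frequent writes make the consistency overhead significant.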
NFS • The Network File System (NFS) was developed to allow machines to mount a disk partition on a remote machine as if it were on a local hard drive. This allows for fast, seamless sharing of files across a network.
[Figure: NFS architecture. Client computer: application programs, UNIX system calls, UNIX kernel, virtual file system, local UNIX file system, other file system, NFS client. Server computer: application programs, UNIX kernel, virtual file system, NFS server, UNIX file system. The NFS client and NFS server communicate through the NFS protocol.]
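The layering in the NFS architecture figure, where a virtual file system routes each request either to a local file system or to the NFS client, can be sketched as follows. All class names are hypothetical, and the NFS client is a stub: a real client would send NFS protocol requests over the network instead of reading a dictionary.

```python
class LocalFileSystem:
    """Stands in for the local UNIX file system."""

    def __init__(self, files):
        self.files = files  # path -> contents

    def read(self, path):
        return self.files[path]


class NFSClientStub:
    """Stands in for the NFS client module (no real network traffic)."""

    def __init__(self, server_files):
        self.server_files = server_files  # path on the server -> contents

    def read(self, path):
        # A real NFS client would issue a remote request to the NFS server here.
        return self.server_files[path]


class VirtualFileSystem:
    """Routes each request to the local file system or to a remote mount."""

    def __init__(self, local):
        self.local = local
        self.remote_mounts = {}  # mount point -> NFS client

    def mount(self, mount_point, client):
        self.remote_mounts[mount_point] = client

    def read(self, path):
        for mount_point, client in self.remote_mounts.items():
            if path.startswith(mount_point + "/"):
                return client.read(path[len(mount_point):])
        return self.local.read(path)
```

The application calls read the same way for both kinds of file; only the virtual file system layer knows which ones are remote.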
Google FS [Sanjay G. et al., 2003] • The system is built from many inexpensive commodity components that often fail. • Optimized for big files • Large sequential writes; files are seldom modified afterwards • Multiple clients append to the same file concurrently • High sustained bandwidth is more important than low latency
GFS architecture • A GFS cluster consists of a single master and multiple chunkservers and is accessed by multiple clients. • The master stores all the metadata: the namespace, access control information, the mapping from files to chunks, and the current locations of chunks.
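The master's metadata can be pictured as two tables, one from file names to chunk handles and one from chunk handles to chunkserver locations. The sketch below is a simplification, not the GFS implementation: the Master class, the locate function, and the handle and server names are hypothetical, and the 64 MB chunk size is taken from the GFS paper rather than from these slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the chunk size reported in the GFS paper


@dataclass
class Master:
    # File name -> ordered list of chunk handles.
    file_chunks: Dict[str, List[str]] = field(default_factory=dict)
    # Chunk handle -> chunkservers currently holding a replica.
    chunk_locations: Dict[str, List[str]] = field(default_factory=dict)

    def lookup(self, path: str, chunk_index: int) -> Tuple[str, List[str]]:
        handle = self.file_chunks[path][chunk_index]
        return handle, self.chunk_locations[handle]


def locate(master: Master, path: str, offset: int) -> Tuple[str, List[str], int]:
    """Translate (file, byte offset) into (chunk handle, replicas, offset in chunk)."""
    chunk_index = offset // CHUNK_SIZE
    handle, servers = master.lookup(path, chunk_index)
    return handle, servers, offset % CHUNK_SIZE
```

In this picture the client contacts the master only for the lookup and then reads the data from one of the returned chunkservers, so file contents never pass through the single master.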
GFS architecture • Commodity Linux machines • Chunks are stored as Linux files • Three replicas of each chunk
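Storing a chunk as an ordinary file on three machines can be sketched with plain file operations. This is a toy model under stated assumptions: each chunkserver is represented by a directory, replicas go to the least-loaded servers (a simplification, not the placement policy of GFS), and the function names and the .chunk suffix are hypothetical.

```python
import os
from typing import Dict, List

REPLICAS = 3  # number of copies kept of each chunk


def pick_chunkservers(load: Dict[str, int], n: int = REPLICAS) -> List[str]:
    """Pick the n least-loaded chunkservers (a simplifying assumption)."""
    if len(load) < n:
        raise ValueError("not enough chunkservers for the requested replicas")
    return sorted(load, key=lambda server: (load[server], server))[:n]


def store_chunk(root: str, handle: str, data: bytes, load: Dict[str, int]) -> List[str]:
    """Write one chunk as an ordinary file under each chosen server's directory."""
    paths = []
    for server in pick_chunkservers(load):
        server_dir = os.path.join(root, server)
        os.makedirs(server_dir, exist_ok=True)
        path = os.path.join(server_dir, handle + ".chunk")
        with open(path, "wb") as f:
            f.write(data)
        load[server] += 1  # record that this server now holds one more chunk
        paths.append(path)
    return paths
```

With three copies on different machines, the failure of one commodity machine leaves two readable replicas of the chunk.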
References • [1] Randy Chow, 1997, Distributed Operating Systems & Algorithms • [2] http://www.cse.buffalo.edu/gridforce/fall2004/DistributedFileSystemSept29.ppt • [3] http://www.cis.upenn.edu/~lee/00cse380/lectures/ln14-dfs.ppt • [4] Sanjay G. et al., 2003, The Google File System