Distributed File System Yanjun Zhao
DFS • A network file system in which a single file system is distributed across several physical computers • Allows administrators to group shared folders located on different servers by transparently connecting them to one or more DFS namespaces
Characteristics of a DFS • Network transparency: same access operations as for local files • Location transparency: a file name should not reveal the file's location • Location independence: a file name should not change when the file's physical location changes • User mobility: access to files from anywhere • Fault tolerance • Scalability • File mobility: files can be moved from one place to another in a running system
Files & File Systems • Files are named data objects. Files hold structured data that is used by programs but is not part of the programs themselves. • The file system is responsible for the naming, creation, deletion, retrieval, modification, and protection of the files in the system. • Logical components of a file as seen by users: file name, file attributes, data units
Example: UNIX • Files are streams of characters for application programs and sequences of fixed-size logical blocks for the file system. • Both sequential and direct access methods are supported. Other access methods can be built on top of the flat file structure.
Directory Service • Directories are files that contain names and addresses of other files and subdirectories. • Mapping and locating • Search for a file • Create a file • Delete a file • List a directory • Rename a file • Traverse the file system
Authorization Service • File access must be regulated to ensure security • Types of access • Read • Write • Execute • Append • Delete • List
File Service – Basic Operations • Delete: search the directory; release all file space • Truncate: reset the file to length zero • Open(Fi): search the directory structure; move the content of the directory entry to memory • Close(Fi): move the content in memory to the directory structure on disk • Get/set file attributes • Create: allocate space; make an entry in the directory • Write: search the directory; the write takes place at the location of the write pointer • Read: search the directory; the read takes place at the location of the read pointer • Reposition within file (file seek): set the current file pointer to a given value
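The basic operations above can be sketched as a toy in-memory file service. This is only an illustration under stated assumptions: the class name `FileService` and its method names are hypothetical, the "directory" is a plain dictionary, and a single current file pointer per open handle stands in for the separate read and write pointers.

```python
class FileService:
    """Toy in-memory file service; every name here is illustrative."""

    def __init__(self):
        self.directory = {}    # file name -> bytearray holding the file data
        self.open_files = {}   # handle -> [file name, current file pointer]
        self._next_handle = 0

    def create(self, name):
        # Allocate (empty) space and make an entry in the directory.
        self.directory[name] = bytearray()

    def delete(self, name):
        # Search the directory and release all file space.
        del self.directory[name]

    def truncate(self, name):
        # Reset the file to length zero.
        del self.directory[name][:]

    def open(self, name):
        # Search the directory and keep per-open state in memory.
        if name not in self.directory:
            raise FileNotFoundError(name)
        handle = self._next_handle
        self._next_handle += 1
        self.open_files[handle] = [name, 0]
        return handle

    def close(self, handle):
        # Drop the in-memory state for this open file.
        del self.open_files[handle]

    def seek(self, handle, position):
        # Set the current file pointer to a given value.
        self.open_files[handle][1] = position

    def write(self, handle, data):
        # The write takes place at the current file pointer.
        name, pos = self.open_files[handle]
        self.directory[name][pos:pos + len(data)] = data
        self.open_files[handle][1] = pos + len(data)

    def read(self, handle, count):
        # The read takes place at the current file pointer.
        name, pos = self.open_files[handle]
        data = bytes(self.directory[name][pos:pos + count])
        self.open_files[handle][1] = pos + len(data)
        return data
```

A typical sequence would be `create`, `open`, `write`, `seek` back to an earlier offset, `read`, then `close`.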
System Service • System services are a FS’s interface to the hardware and are transparent to users of FS • Mapping of logical to physical block addresses • Interfacing to services at the device level for file space allocation/de-allocation • Actual read/write file operations • Caching for performance enhancement • Replicating for reliability improvement
File Mounting • Attach a remote named file system to the client’s file system hierarchy at the position pointed to by a path name • A mounting point is usually a leaf of the directory tree that contains only an empty subdirectory • Once files are mounted, they are accessed by using the concatenated logical path names without referencing either the remote hosts or local devices (location transparency) • The linking information (mount table) is kept until the file systems are unmounted
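How a mount table turns a concatenated logical path into a server and a path on that server can be sketched as follows. The function `resolve`, the table layout, and the host and directory names in the usage note are assumptions made for illustration; real mount tables carry more information.

```python
def resolve(path, mount_table):
    """Map a logical path to (server, path on that server).

    mount_table maps a mount point to (server, exported directory).
    A server of None means the path stays on the local file system.
    """
    best = None
    for mount_point in mount_table:
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest matching mount point.
            if best is None or len(mount_point) > len(best):
                best = mount_point
    if best is None:
        return None, path
    server, exported = mount_table[best]
    remainder = path[len(best.rstrip("/")):]
    return server, exported.rstrip("/") + remainder
```

With a table such as `{"/mnt/projects": ("fileserver1", "/export/projects")}`, the client keeps using paths under `/mnt/projects` and never names `fileserver1` itself.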
File Mounting • Different clients may perceive a different FS view • To achieve a global FS view, the system administrator (SA) enforces mounting rules • Export: a file server restricts/allows the mounting of all or parts of its file system to a predefined set of hosts • The information is kept in the server’s export file • File system mounting: • Explicit mounting: clients make explicit mounting system calls whenever one is desired • Boot mounting: a set of file servers is prescribed and all mountings are performed at the client’s boot time • Auto-mounting: mounting of the servers is done implicitly, on demand, when a file is first opened by a client
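The interplay between a server's export list and auto-mounting can be sketched as below. The classes `FileServer` and `Client`, the automount map, and the fact that `open` returns the mount entry instead of file data are all simplifying assumptions for illustration.

```python
class FileServer:
    def __init__(self, name, exports):
        self.name = name
        # Stand-in for the export file: directory -> hosts allowed to mount it.
        self.exports = exports

    def mount(self, client_host, directory):
        if client_host not in self.exports.get(directory, set()):
            raise PermissionError(f"{client_host} may not mount {directory}")
        return (self.name, directory)


class Client:
    def __init__(self, host, automount_map):
        self.host = host
        self.automount_map = automount_map  # mount point -> (server, directory)
        self.mount_table = {}               # filled in lazily, on first use

    def open(self, path):
        for mount_point, (server, directory) in self.automount_map.items():
            if path.startswith(mount_point + "/"):
                if mount_point not in self.mount_table:
                    # Auto-mounting: mount on demand at the first open.
                    self.mount_table[mount_point] = server.mount(self.host, directory)
                return self.mount_table[mount_point]
        raise FileNotFoundError(path)
```

Explicit mounting would call `server.mount` directly, and boot mounting would fill `mount_table` for every entry at start-up instead of lazily.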
Server Registration • The mounting protocol is not transparent: the initial mounting requires knowledge of the location of file servers • Server registration • File servers register their services, and clients consult the registration server before mounting • Clients broadcast mounting requests, and file servers respond to the clients’ requests
Stateful & Stateless File Servers • State information • Opened files and their clients • File descriptors and file handles • Current file position pointers • Mounting information • Lock status • Session keys • Cache or buffer
Stateful & Stateless File Servers • Stateful: a file server internally maintains some of the state information • Stateless: a file server maintains none at all • Stateful file server: the file server maintains state information about clients between requests • Stateless file server: when a client sends a request to a server, the server carries out the request, sends the reply, and then removes from its internal tables all information about the request • Between requests, no client-specific information is kept on the server • Each request must be self-contained: full file name and offset…
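The difference shows up in what a read request has to carry. In the minimal sketch below (the class names, the `FILES` dictionary, and the sample file are assumptions for illustration), the stateful server remembers the file position per handle, while the stateless server needs the full file name and offset in every request.

```python
FILES = {"/docs/report.txt": b"distributed file systems"}


class StatefulServer:
    def __init__(self):
        self.sessions = {}   # handle -> [file name, current position]
        self._next_handle = 0

    def open(self, name):
        handle = self._next_handle
        self._next_handle += 1
        self.sessions[handle] = [name, 0]
        return handle

    def read(self, handle, count):
        # The server looks up the position it kept from earlier requests.
        name, pos = self.sessions[handle]
        data = FILES[name][pos:pos + count]
        self.sessions[handle][1] = pos + len(data)
        return data


class StatelessServer:
    def read(self, name, offset, count):
        # Self-contained request: nothing is remembered afterwards.
        return FILES[name][offset:offset + count]
```

With the stateless server the client tracks its own offset, so a server restart between two reads loses nothing; with the stateful server the `sessions` table would be gone.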
File Sharing • Overlapping access: multiple copies of the same file • Space multiplexing of the file • Cache or replication • Coherency control: managing accesses to the replicas, to provide a coherent view of the shared file • Desirable to guarantee the atomicity of updates (to all copies) • Interleaving access: multiple granularities of data access operations • Time multiplexing of the file • Simple read/write, Transaction, Session • Concurrency control: how to prevent one execution sequence from interfering with the others when they are interleaved and how to avoid inconsistent or erroneous results
Space Multiplexing • Remote access: no file data is kept on the client machine. Each access request is transmitted directly to the remote file server through the underlying network. • Cache access: a small part of the file data is maintained in a local cache. A write operation or cache miss results in a remote access and an update of the cache • Download/upload access: the entire file is downloaded for local accesses. A remote access or upload is performed when updating the remote file
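Cache access as described above can be sketched as a block cache in which a read hit stays local, while a cache miss or a write goes to the server and updates the cache. The names `RemoteServer`, `CachingClient`, and `BLOCK_SIZE` are assumptions for illustration, and cache eviction is left out even though a real cache holds only a small part of the file.

```python
BLOCK_SIZE = 4  # deliberately tiny so the behaviour is easy to follow


class RemoteServer:
    def __init__(self, data):
        self.data = bytearray(data)
        self.requests = 0   # number of remote accesses served

    def read_block(self, index):
        self.requests += 1
        start = index * BLOCK_SIZE
        return bytes(self.data[start:start + BLOCK_SIZE])

    def write_block(self, index, block):
        self.requests += 1
        start = index * BLOCK_SIZE
        self.data[start:start + len(block)] = block


class CachingClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}     # block index -> cached block contents

    def read_block(self, index):
        if index not in self.cache:
            # Cache miss: remote access, then update the cache.
            self.cache[index] = self.server.read_block(index)
        return self.cache[index]

    def write_block(self, index, block):
        # A write always results in a remote access and a cache update.
        self.server.write_block(index, block)
        self.cache[index] = bytes(block)
```

Pure remote access would correspond to a client with no `cache` at all, and download/upload access to fetching every block once and writing the whole file back on update.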
MADFS: The Mobile Agent-based Distributed Network File System
The Disadvantages of Conventional Distributed File Systems • The file storage protocol and cache management mechanism are not suitable for a WAN • Flexibility is poor • Availability is poor
MADFS • Reduces the network transfer and cache management overhead inherent in running a distributed file system over a WAN. • Organizes hosts into a hierarchical structure, and uses mobile agents as the underlying facility for transmission, communication and synchronization. • Uses the Hierarchical and Convergent Cache Coherency Mechanism (HCCM) to minimize the network communication and server overhead of cache management.
WAN & LAN • LAN: high bandwidth and low transfer delay • WAN: low bandwidth and high transfer delay
The Architecture of MADFS • MADFS is divided into a number of domains; the hosts within a domain are connected through a high-speed LAN, and the domains are linked to each other through a low-speed WAN. • Each domain is composed of a number of hosts. • One domain acts as the major domain and is in charge of all the other domains in MADFS. • Every MADFS server runs the mobile agent environment, so MADFS as a whole is one large mobile agent platform.
The Advantages of the Hierarchical Architecture of MADFS • Spreads the communication and cache management load across all DMAs (Domain Manage Agents), so that no single central server becomes the bottleneck of the system. • Communication within a domain can achieve better performance by using a protocol designed for LANs, and the security operations can be reduced accordingly
The Advantages of the Two-Layer Lock Request Mechanism • Reduces the network communication needed for managing file locks and duplicating the file buffer, particularly communication over the WAN • Converging requests effectively reduces the overhead of maintaining the lock state
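The slides do not spell out how the two layers interact, so the following is only one guess at what converging lock requests could look like, not the actual MADFS protocol. It assumes a per-domain manager that arbitrates locks among its own hosts and contacts a main manager over the WAN only when its domain does not yet own the lock. All class and attribute names are hypothetical.

```python
class MainManager:
    """Upper layer (assumed): tracks which domain owns each file lock."""

    def __init__(self):
        self.domain_locks = {}   # file name -> owning domain
        self.wan_messages = 0    # counts requests that crossed the WAN

    def request(self, domain, name):
        self.wan_messages += 1
        owner = self.domain_locks.setdefault(name, domain)
        return owner == domain


class DomainManager:
    """Lower layer (assumed): arbitrates locks among hosts of one domain."""

    def __init__(self, domain, main):
        self.domain = domain
        self.main = main
        self.owned = set()   # files whose lock this domain already owns
        self.holder = {}     # file name -> host currently holding the lock

    def lock(self, host, name):
        if name not in self.owned:
            # Only the first request from this domain goes over the WAN.
            if not self.main.request(self.domain, name):
                return False
            self.owned.add(name)
        # Later requests converge here and are settled inside the LAN.
        return self.holder.setdefault(name, host) == host

    def unlock(self, host, name):
        if self.holder.get(name) == host:
            del self.holder[name]
```

Under these assumptions, several hosts in one domain competing for the same file generate a single WAN message, and the main manager keeps one lock entry per domain instead of one per host.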
Reference • Jun Lu, Bin Du, Yi Zhu, DaiWei Li. "MADFS: The Mobile Agent-Based Distributed Network File System." Intelligent Systems, 2009. GCIS '09. WRI Global Congress on, Volume 1, 19-21 May 2009, pp. 68-74.