Topics
Coulouris, Dollimore and Kindberg, Distributed Systems: Concepts & Design, Edn. 4, Pearson Education 2005
What are the advantages of persistent storage at a few servers? • It reduces the need for local disk storage. • It enables economies in the management and archiving of the persistent data owned by an organization. • Other services (such as the name service, user authentication service and print service) can be implemented more easily.
Introduction
Figure 1. Storage systems and their properties

| Storage system | Sharing | Persistence | Distributed cache/replicas | Consistency maintenance | Example |
|---|---|---|---|---|---|
| Main memory | X | X | X | 1 | RAM |
| File system | X | √ | X | 1 | UNIX file system |
| Distributed file system | √ | √ | √ | √ | Sun NFS |
| Web | √ | √ | √ | X | Web server |
| Distributed shared memory | √ | X | √ | √ | Ivy (Ch. 18) |
| Remote objects (RMI/ORB) | √ | X | X | 1 | CORBA |
| Persistent object store | √ | √ | X | 1 | CORBA Persistent Object Service |
| Peer-to-peer storage system | √ | √ | √ | √ | OceanStore (Ch. 10) |

• Types of consistency between copies: 1 = strict one-copy consistency; √ = approximate consistency; X = no automatic consistency
A persistent object store is a computer storage system that records and retrieves complete objects, or provides the illusion of doing so. Simple examples store the serialized object in binary format. Distributed shared memory (DSM) is a form of memory architecture where the (physically separate) memories can be addressed as one (logically shared) address space. It provides an emulation of a shared memory by the replication of memory pages or segments at each host.
Consistency • Consistency indicates whether mechanisms exist for maintaining consistency between multiple copies of data when updates occur. • Caching was first applied to main memory and to non-distributed file systems, for which consistency is strict (denoted by a '1' in Figure 1) because programs cannot observe any discrepancies between copies after updates. • When distributed replicas are used, strict consistency is harder to achieve. Distributed file systems such as Sun NFS adopt specific consistency mechanisms to maintain an approximation to strict consistency.
Cont. • The consistency between the copies stored at web proxies and client caches and the original server is maintained only by explicit user action. • Clients are not notified when a page stored at the original server is updated; they must perform explicit checks to keep their local copies up to date.
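As a toy illustration of such an explicit check, a client can compare the version stamp of its cached copy against the server's current one before using it (the dictionary layout and field names here are illustrative, not from any real cache protocol):

```python
def validate_cached_copy(cache_entry, server_version):
    """Explicit freshness check: no invalidation is pushed by the server,
    so the client must compare version stamps itself."""
    return cache_entry["fetched_version"] == server_version

# Cached copy was fetched at version 41; the server has since moved to 42.
entry = {"data": b"<html>...</html>", "fetched_version": 41}
print(validate_cached_copy(entry, 42))   # False: copy is stale, refetch needed
print(validate_cached_copy(entry, 41))   # True: copy is still current
```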
Distributed file systems
Figure 3. File attribute record structure
• Updated by the system: File length, Creation timestamp, Read timestamp, Write timestamp, Attribute timestamp, Reference count
• Updated by the owner: Owner, File type, Access control list (e.g. for UNIX: rw-rw-r--)
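As a rough sketch, the attribute record above could be modelled as follows; the field names mirror the figure, but the types and defaults are illustrative assumptions, not taken from any real file system:

```python
from dataclasses import dataclass, field
import time

@dataclass
class FileAttributes:
    # Fields updated by the system:
    file_length: int = 0
    creation_timestamp: float = field(default_factory=time.time)
    read_timestamp: float = 0.0
    write_timestamp: float = 0.0
    attribute_timestamp: float = 0.0
    reference_count: int = 1
    # Fields updated by the owner:
    owner: str = ""
    file_type: str = "regular"
    access_control_list: str = "rw-rw-r--"   # UNIX-style mode string

attrs = FileAttributes(owner="alice")
print(attrs.reference_count)          # 1
print(attrs.access_control_list)      # rw-rw-r--
```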
File systems are designed to store and manage large numbers of files, with facilities for creating, naming and deleting files. The naming of files is supported by the use of directories. • Directory: a file that provides a mapping from text names to internal file identifiers. • The term metadata is often used to refer to all of the extra information stored by a file system that is needed for the management of files. It includes: • File attributes • Directories • All other persistent information used by the file system
Figure 2 shows a typical layered module structure for the implementation of a non-distributed file system in a conventional operating system. • Each layer depends only on the layer below it. • The implementation of a distributed file system requires all of these components, with additional components to deal with client-server communication and with the distributed naming and location of files.
Figure 2. File system modules
Figure 4. UNIX file system operations

| Operation | Description |
|---|---|
| filedes = open(name, mode) | Opens an existing file with the given name. |
| filedes = creat(name, mode) | Creates a new file with the given name. Both operations deliver a file descriptor referencing the open file. The mode is read, write or both. |
| status = close(filedes) | Closes the open file filedes. |
| count = read(filedes, buffer, n) | Transfers n bytes from the file referenced by filedes to buffer. |
| count = write(filedes, buffer, n) | Transfers n bytes to the file referenced by filedes from buffer. Both operations deliver the number of bytes actually transferred and advance the read-write pointer. |
| pos = lseek(filedes, offset, whence) | Moves the read-write pointer to offset (relative or absolute, depending on whence). |
| status = unlink(name) | Removes the file name from the directory structure. If the file has no other names, it is deleted. |
| status = link(name1, name2) | Adds a new name (name2) for a file (name1). |
| status = stat(name, buffer) | Gets the file attributes for file name into buffer. |
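Several of these primitives can be exercised directly through Python's os module, which wraps the same UNIX system calls; this short sketch creates a file, moves the read-write pointer with lseek, reads from the new position, and inspects the attributes with stat:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)  # creat/open: get a descriptor
os.write(fd, b"hello world")                       # write advances the pointer
os.lseek(fd, 6, os.SEEK_SET)                       # move read-write pointer to 6
data = os.read(fd, 5)                              # read 5 bytes from offset 6
os.close(fd)                                       # close the open file

st = os.stat(path)            # stat: fetch the attribute record
print(data, st.st_size)       # b'world' 11
os.unlink(path)               # remove the name; file has no other names, so deleted
```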
Distributed file system requirements
5. Fault tolerance: the central role of the file service in distributed systems makes it essential that the service continue to operate in the face of client and server failures.
6. Consistency
7. Security
8. Efficiency
File Service Architecture
File Service Architecture
Figure 5. File service architecture
The client computer runs application programs and a client module; the server computer provides a directory service and a flat file service.
Figure 6. Flat file service operations

| Operation | Description |
|---|---|
| Read(FileId, i, n) -> Data, throws BadPosition | If 1 ≤ i ≤ Length(File): reads a sequence of up to n items from a file starting at item i and returns it in Data. |
| Write(FileId, i, Data), throws BadPosition | If 1 ≤ i ≤ Length(File)+1: writes a sequence of Data to a file, starting at item i, extending the file if necessary. |
| Create() -> FileId | Creates a new file of length 0 and delivers a UFID for it. |
| Delete(FileId) | Removes the file from the file store. |
| GetAttributes(FileId) -> Attr | Returns the file attributes for the file. |
| SetAttributes(FileId, Attr) | Sets the file attributes (only those attributes that are not shaded in Figure 3). |
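The flat file service interface might be sketched in Python as below. The operation names, 1-based item positions, and BadPosition behaviour follow Figure 6; everything else (the in-memory bytearray store, sequential UFID allocation) is an illustrative assumption:

```python
import itertools

class BadPosition(Exception):
    pass

class FlatFileService:
    """Minimal in-memory sketch of the flat file service of Figure 6."""

    def __init__(self):
        self._files = {}                    # UFID -> bytearray of items (bytes)
        self._next_ufid = itertools.count(1)

    def create(self):
        ufid = next(self._next_ufid)
        self._files[ufid] = bytearray()     # new file of length 0
        return ufid

    def write(self, ufid, i, data):
        f = self._files[ufid]
        if not 1 <= i <= len(f) + 1:        # may append at Length(File)+1
            raise BadPosition(i)
        f[i - 1:i - 1 + len(data)] = data   # overwrite, extending if necessary

    def read(self, ufid, i, n):
        f = self._files[ufid]
        if not 1 <= i <= len(f):
            raise BadPosition(i)
        return bytes(f[i - 1:i - 1 + n])    # up to n items from position i

    def delete(self, ufid):
        del self._files[ufid]

svc = FlatFileService()
ufid = svc.create()
svc.write(ufid, 1, b"hello")
print(svc.read(ufid, 2, 3))   # b'ell'
```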
Comparison with UNIX • Our interface and the UNIX file system primitives are functionally equivalent. • It is a simple matter to construct a client module that emulates the UNIX system calls.
Figure 7. Directory service operations

| Operation | Description |
|---|---|
| Lookup(Dir, Name) -> FileId, throws NotFound | Locates the text name in the directory and returns the relevant UFID. If Name is not in the directory, throws an exception. |
| AddName(Dir, Name, File), throws NameDuplicate | If Name is not in the directory, adds (Name, File) to the directory and updates the file's attribute record. If Name is already in the directory, throws an exception. |
| UnName(Dir, Name), throws NotFound | If Name is in the directory, the entry containing Name is removed from the directory. If Name is not in the directory, throws an exception. |
| GetNames(Dir, Pattern) -> NameSeq | Returns all the text names in the directory that match the regular expression Pattern. |
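Since a directory is simply a mapping from text names to UFIDs, the operations of Figure 7 can be sketched over a plain dictionary; the exception names follow the figure, while the dict-of-dicts store and the make_dir helper are illustrative additions:

```python
import re

class NotFound(Exception):
    pass

class NameDuplicate(Exception):
    pass

class DirectoryService:
    """Sketch of Figure 7: each directory maps text names to UFIDs."""

    def __init__(self):
        self._dirs = {}            # directory id -> {name: ufid}

    def make_dir(self, dir_id):    # helper for the sketch, not in Figure 7
        self._dirs[dir_id] = {}

    def lookup(self, dir_id, name):
        try:
            return self._dirs[dir_id][name]
        except KeyError:
            raise NotFound(name)

    def add_name(self, dir_id, name, ufid):
        d = self._dirs[dir_id]
        if name in d:
            raise NameDuplicate(name)
        d[name] = ufid

    def un_name(self, dir_id, name):
        if name not in self._dirs[dir_id]:
            raise NotFound(name)
        del self._dirs[dir_id][name]

    def get_names(self, dir_id, pattern):
        # Return all names matching the regular expression Pattern.
        return [n for n in self._dirs[dir_id] if re.fullmatch(pattern, n)]

ds = DirectoryService()
ds.make_dir("root")
ds.add_name("root", "notes.txt", 7)
print(ds.lookup("root", "notes.txt"))      # 7
print(ds.get_names("root", r".*\.txt"))    # ['notes.txt']
```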
To construct a globally unique file group identifier, we use some unique attribute of the machine on which it is created, such as its IP address, even though the file group may subsequently move.
File Group ID: IP address (32 bits) + date (16 bits)
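One possible way to pack the two fields is sketched below. The 32-bit IP + 16-bit date layout follows the slide; the day-granularity date encoding and the function name are assumptions for illustration:

```python
import socket
import struct

def make_file_group_id(ip: str, when: float) -> bytes:
    """Pack a 32-bit IPv4 address and a 16-bit date field into a 6-byte ID.

    `when` is a Unix timestamp; encoding it as a truncated day count is an
    assumption, since the slide only fixes the field widths.
    """
    ip_bits = socket.inet_aton(ip)             # 4 bytes, network byte order
    days = int(when // 86400) & 0xFFFF         # 16-bit day count since epoch
    return ip_bits + struct.pack(">H", days)

fgid = make_file_group_id("192.168.1.7", 0)
print(fgid.hex())   # c0a801070000
```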
DFS: Case Studies
Figure 8. NFS architecture
On the client computer, application programs issue UNIX system calls to the UNIX kernel; a virtual file system layer routes operations on local files to the UNIX file system (or another local file system) and operations on remote files to the NFS client. The NFS client communicates with the NFS server on the server computer using the NFS protocol (remote operations), and the server's virtual file system passes requests on to its local UNIX file system.
We use the term server to refer to a machine that provides resources to the network; a client is a machine that accesses resources over the network; a user is a person "logged in" at a client; an application is a program that executes on a client; and a workstation is a client machine that typically supports one user at a time. • Requests referring to files in a remote file system are translated by the client module into NFS protocol operations and then passed to the NFS server module at the computer holding the relevant file system. • The NFS client and server modules communicate using remote procedure calls. • The RPC interface to the NFS server is open: any process can send requests to an NFS server; if the requests are valid and include valid user credentials, they are acted upon.
Virtual file system • It is clear that NFS provides access transparency: user programs can issue file operations for local or remote files without distinction. • Other distributed file systems may be present that support UNIX system calls, and if so, they could be integrated in the same way. • The integration is achieved by a virtual file system (VFS) module, which has been added to the UNIX kernel to distinguish between local and remote files and to translate between the UNIX-independent file identifiers used by NFS and the internal file identifiers normally used in UNIX and other file systems. • In addition, VFS keeps track of the file systems that are currently available both locally and remotely, and it passes each request to the appropriate local system module (the UNIX file system, the NFS client module or the service module for another file system).
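The dispatching role of the VFS layer described above can be sketched as follows; all class, method, and field names here are illustrative, not the actual kernel data structures:

```python
class VNode:
    """One v-node per open file: marks it as local or remote and holds either
    the local i-node number or the remote file handle."""

    def __init__(self, is_local, ref):
        self.is_local = is_local
        self.ref = ref     # i-node number if local, NFS file handle if remote

class VirtualFileSystem:
    """Passes each request to the appropriate module: the local UNIX file
    system or the NFS client module."""

    def __init__(self, local_fs, nfs_client):
        self.local_fs = local_fs
        self.nfs_client = nfs_client

    def read(self, vnode, offset, count):
        if vnode.is_local:
            return self.local_fs.read(vnode.ref, offset, count)
        return self.nfs_client.read(vnode.ref, offset, count)

# Toy stand-ins for the two underlying modules:
class LocalFS:
    def read(self, ref, offset, count):
        return ("local", ref)

class NFSClient:
    def read(self, ref, offset, count):
        return ("remote", ref)

vfs = VirtualFileSystem(LocalFS(), NFSClient())
print(vfs.read(VNode(True, 42), 0, 10))        # ('local', 42)
print(vfs.read(VNode(False, "fh-9"), 0, 10))   # ('remote', 'fh-9')
```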
NFS protocol • The NFS protocol uses the Sun Remote Procedure Call (RPC) mechanism For the same reasons that procedure calls simplify programs, RPC helps simplify the definition, organization, and implementation of remote services. • The NFS protocol is defined in terms of a set of procedures, their arguments and results, and their effects. Remote procedure calls are synchronous, that is, the client application blocks until the server has completed the call and returned the results. This makes RPC very easy to use and understand because it behaves like a local procedure call. • NFS uses a stateless protocol. The parameters to each procedure call contain all of the information necessary to complete the call, and the server does not keep track of any past requests. This makes crash recovery very easy; when a server crashes, the client resends NFS requests until a response is received, and the server does no crash recovery at all. When a client crashes, no recovery is necessary for either the client or the server.
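The crash-recovery property of a stateless protocol can be illustrated with a toy read operation and a client that simply resends after a lost reply; the request tuple layout and the simulated loss are illustrative, not the real NFS wire format:

```python
class StatelessServer:
    """Every request is self-contained; the server keeps no per-client state,
    so it needs no crash recovery of its own."""

    def __init__(self, store):
        self.store = store                     # file handle -> bytes

    def handle(self, request):
        op, fh, offset, count = request        # all parameters in the request
        data = self.store[fh]
        return data[offset:offset + count]

def call_with_retry(server, request, drop_first_reply=True):
    """The client resends until a reply arrives; safe because the read is
    idempotent and the server tracks no past requests."""
    attempts = 0
    while True:
        attempts += 1
        if drop_first_reply and attempts == 1:
            continue                           # simulate a lost reply: retry
        return server.handle(request), attempts

server = StatelessServer({7: b"abcdefgh"})
reply, attempts = call_with_retry(server, ("read", 7, 2, 3))
print(reply, attempts)   # b'cde' 2
```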
The file system identifier field is a unique number that is allocated to each file system when it is created. • The i-node number of a file serves to identify and locate the file within the file system in which it is stored. • The i-node generation number is needed because, in the conventional UNIX file system, i-node numbers are reused after a file is removed. • The virtual file system layer has one VFS structure for each mounted file system and one v-node per open file. • A VFS structure relates a remote file system to the local directory on which it is mounted. • The v-node contains an indicator to show whether a file is local or remote. If the file is local, the v-node contains a reference to the index of the local file (an i-node in a UNIX implementation); if the file is remote, it contains the file handle of the remote file.
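The three file-handle fields, and why the generation number matters, can be sketched like this (field names and values are illustrative, not the real NFS handle layout):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FileHandle:
    filesystem_id: int       # allocated when the file system is created
    inode_number: int        # locates the file within that file system
    inode_generation: int    # distinguishes reuses of the same i-node number

old = FileHandle(filesystem_id=1, inode_number=1234, inode_generation=1)
# The file is deleted and i-node 1234 is reused for a new file:
new = FileHandle(filesystem_id=1, inode_number=1234, inode_generation=2)
print(old == new)   # False: a client holding the stale handle cannot
                    # accidentally address the new file
```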
Client integration • The NFS client module cooperates with the virtual file system in each client machine. • It operates in a similar manner to the conventional UNIX file system, transferring blocks of files to and from the server and caching the blocks in the local memory whenever possible. • It shares the same buffer cache that is used by the local input-output system. • But since several clients in different host machines may simultaneously access the same remote file, a new and significant cache consistency problem arises.
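The basic block-caching behaviour described above can be sketched with a toy cache keyed by block number; the fixed block size and the fetch callback stand in for a real NFS read and are purely illustrative:

```python
class BlockCache:
    """Toy client-side block cache: fetch a block from the server only on a
    miss, then serve repeated reads from local memory."""

    BLOCK_SIZE = 8   # illustrative; real implementations use larger blocks

    def __init__(self, fetch):
        self.fetch = fetch        # fetch(block_no) -> bytes, e.g. an NFS read
        self.cache = {}
        self.misses = 0

    def read_block(self, block_no):
        if block_no not in self.cache:
            self.misses += 1                       # go to the server once
            self.cache[block_no] = self.fetch(block_no)
        return self.cache[block_no]                # subsequent reads are local

remote_file = b"x" * 64
cache = BlockCache(lambda b: remote_file[b * 8:(b + 1) * 8])
cache.read_block(0)
cache.read_block(0)       # served from the cache, no second fetch
print(cache.misses)       # 1
```

Note what this sketch omits: if another client writes the same file through its own cache, nothing here invalidates our copy, which is exactly the consistency problem the text raises.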
Access control and authentication • Unlike the conventional UNIX file system, the NFS server is stateless and does not keep files open on behalf of its clients. So the server must check the user’s identity against the file’s access permission attributes afresh on each request, to see whether the user is permitted to access the file in the manner requested. • The Sun RPC protocol requires clients to send user authentication information with each request and this is checked against the access permission in the file attributes. • An NFS server provides a conventional RPC interface at a well-known port on each host and any process can behave as a client, sending requests to the server to access or update a file. • The client can modify the RPC calls to include the user ID of any user, impersonating the user without their knowledge or permission.
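The per-request check can be sketched as below; the access-list layout (uid to set of permitted operations) is an illustrative stand-in for the real UNIX mode bits and Sun RPC credential format:

```python
class AccessDenied(Exception):
    pass

def check_access(acl, uid, requested_op):
    """Re-check permissions afresh on every request: the stateless server
    holds no open-file state on behalf of clients."""
    if requested_op not in acl.get(uid, set()):
        raise AccessDenied((uid, requested_op))
    return True

acl = {1001: {"read", "write"}, 1002: {"read"}}
print(check_access(acl, 1002, "read"))   # True
```

The weakness noted in the text is visible here too: the check trusts the uid supplied with the request, so a client that forges the uid field impersonates that user.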
Case Study: Sun NFS
• read(fh, offset, count) -> attr, data
• write(fh, offset, count, data) -> attr
• create(dirfh, name, attr) -> newfh, attr
• remove(dirfh, name) -> status
• getattr(fh) -> attr
• setattr(fh, attr) -> attr
• lookup(dirfh, name) -> fh, attr
• rename(dirfh, name, todirfh, toname)
• link(newdirfh, newname, dirfh, name)
• readdir(dirfh, cookie, count) -> entries
• symlink(newdirfh, newname, string) -> status
• readlink(fh) -> string
• mkdir(dirfh, name, attr) -> newfh, attr
• rmdir(dirfh, name) -> status
• statfs(fh) -> fsstats
Figure 9. NFS server operations (NFS Version 3 protocol, simplified)
Mount service • The mounting of subtrees of remote filesystems by clients is supported by a separate mount service process that runs at user level on each NFS server computer. • On each server, there is a file with a well-known name containing the names of local filesystems that are available for remote mounting. • An access list is associated with each filesystem name indicating which hosts are permitted to mount the filesystem.
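A minimal parser for such a well-known exports file might look like this; the one-line-per-filesystem format with a trailing host list resembles /etc/exports but is an assumption of this sketch:

```python
def parse_exports(text):
    """Parse a minimal exports table: each line names a local file system
    followed by the hosts permitted to mount it (the format is illustrative)."""
    exports = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        path, *hosts = line.split()
        exports[path] = set(hosts)             # the access list for this path
    return exports

table = parse_exports(
    "/export/people client1 client2   # staff subtree\n"
    "/nfs/users client2\n"
)
print(table["/nfs/users"])   # {'client2'}
```

At mount time the service would consult this table to decide whether the requesting host may mount the named file system.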
Case Study: Sun NFS Note: The file system mounted at /usr/students in the client is actually the sub-tree located at /export/people in Server 1; the file system mounted at /usr/staff in the client is actually the sub-tree located at /nfs/users in Server 2.
Server caching