Distributed File Systems. Yih-Kuen Tsay Dept. of Information Management National Taiwan University. Purposes of a Distributed File System. Sharing of storage and information across a network Convenience (and efficiency) of a conventional file system
Distributed File Systems Yih-Kuen Tsay Dept. of Information Management National Taiwan University Distributed File Systems [2006/11/06] -- 1
Purposes of a Distributed File System • Sharing of storage and information across a network • Convenience (and efficiency) of a conventional file system • Persistent storage that most other services (e.g., Web servers) need
Properties of Storage Systems (columns: sharing, persistence, distributed cache/replicas, consistency maintenance, example)
• Main memory: consistency 1; example: RAM
• File system: consistency 1; example: UNIX file system
• Distributed file system: example: Sun NFS
• Web: example: Web server
• Distributed shared memory: example: Ivy (DSM, Ch. 18)
• Remote objects (RMI/ORB): consistency 1; example: CORBA
• Persistent object store: consistency 1; example: CORBA Persistent Object Service
• Peer-to-peer storage system: consistency 2; example: OceanStore (Ch. 10)
Types of consistency: 1: strict one-copy; 2: considerably weaker guarantees; an intermediate level gives slightly weaker guarantees than 1.
Other properties include availability, timing guarantees, etc.
Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
Files • Files are an abstraction of permanent storage. • A file is typically defined as a sequence of similar-sized data items along with a set of attributes. • A directory is a file that provides a mapping from text names to internal file identifiers.
File Attributes Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
File Systems • Responsible for the (a) organization, (b) storage, (c) retrieval, (d) naming, (e) sharing, and (f) protection of files. • Provide a set of programming operations that characterize the file abstraction, particularly operations to read and write subsequences of data items beginning at any point of a file.
File System Modules A basic distributed file system implements all of the above plus modules for client-server communication and distributed naming and location of files. Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
UNIX File Operations Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
Distributed File System Requirements • Transparency: access, location, mobility, performance, and scaling transparency. • Concurrency (and Consistency) • Replication/Caching (and Consistency) • Hardware/operating system heterogeneity • Fault-Tolerance • Security (Access Control, Authentication) • Efficiency
A File Service Architecture Note: The modules communicate with one another by remote procedure calls. Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
File Service Components • Flat file service: implementing operations on the contents of files, which are referred to by unique file identifiers (UFIDs) • Directory service: mapping text names of files (including directories) to their UFIDs • Client module: integrating and extending the previous two services under a single application programming interface * Why is this structure more open and configurable?
Flat File Service Operations Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
Difference from UNIX • Immediate access to files using UFIDs (without open or close) • Read or write starts at the position indicated by a parameter • All operations, except create, are repeatable (idempotent) • Allows a stateless implementation
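The points above can be illustrated with a minimal sketch. It assumes a toy in-memory service; the class name `FlatFileService` and the counter used to hand out UFIDs are hypothetical, not taken from the slides or from Coulouris et al. Because every read and write names the file and an absolute position, the server keeps no per-client pointer, and repeating a write leaves the file unchanged.

```python
import itertools


class FlatFileService:
    """Toy in-memory flat file service; files are addressed by UFID only."""

    def __init__(self):
        self._files = {}                 # UFID -> bytearray
        self._next = itertools.count(1)  # hypothetical UFID generator

    def create(self):
        # Not repeatable: each call creates a new file with a new UFID.
        ufid = next(self._next)
        self._files[ufid] = bytearray()
        return ufid

    def read(self, ufid, pos, n):
        # The position is an explicit argument, so the server keeps no
        # per-client read/write pointer and needs no open/close.
        return bytes(self._files[ufid][pos:pos + n])

    def write(self, ufid, pos, data):
        # Overwrites (extending if needed) at an absolute position, so a
        # retransmitted request leaves the file in the same state.
        buf = self._files[ufid]
        if len(buf) < pos:
            buf.extend(b"\0" * (pos - len(buf)))
        buf[pos:pos + len(data)] = data
```

A client that times out and resends `write(ufid, 6, b"WORLD")` does no harm, whereas resending `create()` would yield a second file.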
Access Control • Conventional access rights checks (at open calls) not feasible • Two ‘stateless’ approaches: * Capability (by manipulating the UFID) * User identity sent with every request (adopted in NFS and AFS) • Main problem: forged requests; some authentication mechanism is needed
Capabilities and UFIDs A capability is a binary value that acts as an access key; it can be encoded in the UFID. • Basic construction of a UFID: file group id + file number + random number • Additional field: permissions • Additional field: encryption of the permission field
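A rough sketch of a UFID that doubles as a capability follows. The slide calls for an encrypted permission field; this sketch swaps in a keyed hash (HMAC) over all fields as the tamper-evident seal, which is a different technique chosen for brevity. `SERVER_KEY`, `make_ufid`, and `check_ufid` are hypothetical names.

```python
import hashlib
import hmac
import secrets

SERVER_KEY = secrets.token_bytes(32)  # hypothetical secret known only to the server


def _seal(group_id, file_no, nonce, perms):
    # Keyed hash over every field; stands in for the encrypted permission field.
    msg = f"{group_id}:{file_no}:{nonce}:{perms}".encode()
    return hmac.new(SERVER_KEY, msg, hashlib.sha256).hexdigest()


def make_ufid(group_id, file_no, perms):
    """Build a UFID: group id, file number, random number, permissions, seal."""
    nonce = secrets.randbits(32)
    return (group_id, file_no, nonce, perms, _seal(group_id, file_no, nonce, perms))


def check_ufid(ufid, wanted):
    """Return True if the seal is intact and `wanted` is among the permissions."""
    group_id, file_no, nonce, perms, seal = ufid
    expected = _seal(group_id, file_no, nonce, perms)
    return hmac.compare_digest(seal, expected) and wanted in perms
```

A client that edits the permissions field from "r" to "rw" cannot recompute the seal without the server's key, so the forged UFID is rejected.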
Directory Service Operations Note: Each directory is stored as an ordinary file with a UFID. Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
The Network File System (NFS) • Introduced by Sun Microsystems in 1985, now an Internet standard • Runs on top of RPC (RFC 1831) • Implemented on most operating systems • Version described here: UNIX implementation of NFS Version 3 (RFC 1813, June 1995) • Most recent version: NFS Version 4 (RFC 3010, December 2000)
NFS Architecture Note: Each computer can act as both a client and a server. Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
The Virtual File System Module • Access transparency • File handles (file identifiers): • ‘filesystem identifier’ + ‘i-node number’ + ‘i-node generation number’ • One VFS structure for each mounted filesystem • relates a remote filesystem (identified by its file handle obtained at mount time) to a local directory on which it is mounted • One v-node per open file • indicates whether a file is local (i-node) or remote (file handle)
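The role of the generation number in a file handle can be sketched as follows. I-node numbers are reused after a file is deleted, so the server bumps the generation on each reuse, and a handle issued for the old file no longer matches. `FileHandle` and `InodeTable` are hypothetical names for this toy model, not NFS data structures.

```python
from collections import namedtuple

FileHandle = namedtuple("FileHandle", "fsid inode generation")


class InodeTable:
    """Toy server-side table tracking a generation number per i-node number."""

    def __init__(self, fsid):
        self.fsid = fsid
        self._generation = {}  # i-node number -> current generation
        self._live = set()     # i-node numbers currently in use

    def allocate(self, inode):
        # Reusing an i-node number bumps its generation.
        self._generation[inode] = self._generation.get(inode, 0) + 1
        self._live.add(inode)
        return FileHandle(self.fsid, inode, self._generation[inode])

    def free(self, inode):
        self._live.discard(inode)

    def is_valid(self, fh):
        # A handle is stale if the file was removed or the i-node was reused.
        return (fh.fsid == self.fsid
                and fh.inode in self._live
                and self._generation.get(fh.inode) == fh.generation)
```

A client still holding the old handle after the i-node is reallocated gets a "stale handle" outcome instead of silently reading the wrong file.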
The NFS Client Module in UNIX • Integrated with the kernel • Emulates the UNIX file system primitives • A single client module serves all user-level processes • The encryption key for authentication stored in the kernel • Caches file blocks • There is a consistency problem
Access Control and Authentication • Stateless servers • The user’s identity checked afresh on each request • Authentication information supplied automatically by the RPC system • Security loophole: the client can modify an RPC call to impersonate any user • An encryption option closes this loophole • Securing NFS with Kerberos • Full authentication done when files are mounted • A server retains the current mounts (including user authentication data) at each client
NFS Server Operations Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
NFS Server Operations (cont’d) Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
Remote File Accesses Note 2: a pathname is resolved to an i-node in an iterative manner using lookup. Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
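Iterative pathname resolution can be sketched as one lookup call per path component, each taking the directory's file handle and a single name. The dictionary `SERVER` and the handle strings below are hypothetical stand-ins for server state, and `lookup` is a local function standing in for the lookup RPC.

```python
def lookup(server, dir_fh, name):
    """Stand-in for the NFS lookup RPC: one directory, one name component."""
    return server[dir_fh][name]


def resolve(server, root_fh, path):
    """Resolve a multi-part pathname with one lookup call per component."""
    fh = root_fh
    calls = 0
    for part in path.strip("/").split("/"):
        if part:
            fh = lookup(server, fh, part)
            calls += 1
    return fh, calls


# Hypothetical server state: directory file handle -> {name: file handle}
SERVER = {
    "fh_root": {"usr": "fh_usr"},
    "fh_usr": {"local": "fh_local"},
    "fh_local": {"README": "fh_readme"},
}
```

Resolving a three-component path costs three round trips in this model, which is one reason clients cache lookup results.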
Automatic Mounting • Mount filesystems when they are referenced and unmount them when they are no longer needed • Implementations: automount (later version: autofs) and amd • A simple form of read-only replication can be achieved • Fault tolerance • Load balancing
Sample File System Information in UNIX
    saturn:~ 35 % df -k
    Filesystem                      kbytes   capacity  Mounted on
    /dev/dsk/c0t3d0s0               143903   91%       /
    /dev/dsk/c0t3d0s6               267943   99%       /usr
    /dev/dsk/c0t3d0s3               15383    3%        /tmp
    galaxy:/usr/local.real          4030440  53%       /usr/local
    lucky:/var/mail.real            564648   86%       /var/mail
    cosmos:/home.real/student/xxx   3941760  60%       /home/xxx
    galaxy:/home.real/faculty/yyy   2964512  51%       /home/yyy
* Note: The output of ‘df -k’ has been edited.
Caching – Server Caching • Similar to conventional UNIX’s buffer cache • read-ahead • delayed-write • Extra measures for write operations • delayed-write with the commit operation (default) • write-through (to ensure failure independence)
Caching – Client Caching • Caching results of read, write, getattr, lookup, and readdir • Cache validation based on timestamps • last-validated timestamp and freshness interval • last-modified timestamp • Trade-off between consistency and efficiency • Piggybacking of file attribute values • The bio-daemon processes for implementing read-ahead and delayed-write caching at the client side
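The timestamp-based validation can be sketched as a two-step check: a cached entry is used without contacting the server if it was validated within the freshness interval; otherwise the client obtains the server's last-modified time and compares it with the one recorded for the cached copy. The function and parameter names are hypothetical, and `fetch_tm_server` is a callable standing in for a getattr request.

```python
def cache_entry_valid(now, last_validated, freshness_interval,
                      tm_client, fetch_tm_server):
    """Return True if a cached block may be used without refetching.

    fetch_tm_server is a callable standing in for a getattr request; it is
    only invoked when the freshness interval has expired.
    """
    if now - last_validated < freshness_interval:
        return True                        # recently validated: trust the cache
    return tm_client == fetch_tm_server()  # otherwise compare modification times
```

A longer freshness interval means fewer getattr calls but a wider window in which stale data may be read, which is the consistency/efficiency trade-off noted above.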
Achievements of NFS • Access and location transparency • Mobility transparency (partially) • Read-only file replication: automatic mounting • Fault-tolerance: stateless servers, automatic mounting • Efficiency: caching of disk blocks (main problem: frequent use of getattr) Nonachievements: scalability, concurrency and consistency, security (Kerberos), ...
The Andrew File System (AFS) • Developed at CMU • Current versions: AFS-2, AFS-3 • Compatible with NFS • Main achievement over (older) NFS: better scalability by minimizing client-server communication • Key characteristics: whole-file serving and caching (partial file caching allowed in AFS-3)
Observations on UNIX File Usage • Files are mostly small • Read operations are more common • Sequential accesses are more common • Most files are written by one user • Files are referenced in bursts
AFS Architecture Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
AFS File Name Space Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
System Call Interception in AFS Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
AFS System Calls Implementation Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
Cache Consistency • A callback promise is provided when Vice supplies a copy of a file to a Venus process • The callback promise stored with the cached copy is in either valid or cancelled state • When Venus handles an open, it checks the cache.
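The callback mechanism can be sketched with two toy classes. This goes slightly beyond the slide text by assuming the usual AFS behaviour: when a client stores a new version, the server cancels the promises held by other clients, and a client whose promise is cancelled refetches on the next open. The method names and the `fetches` counter are hypothetical and do not reproduce the Vice service interface.

```python
class Vice:
    """Toy file server that remembers which clients hold callback promises."""

    def __init__(self):
        self.files = {}     # name -> contents
        self.promises = {}  # name -> set of Venus clients holding a promise

    def fetch(self, venus, name):
        # Supplying a copy also records a callback promise for that client.
        self.promises.setdefault(name, set()).add(venus)
        return self.files[name]

    def store(self, writer, name, data):
        self.files[name] = data
        # Cancel the promise of every other client caching this file.
        for venus in self.promises.get(name, set()) - {writer}:
            venus.break_callback(name)
        self.promises[name] = {writer}


class Venus:
    """Toy client cache manager."""

    def __init__(self, vice):
        self.vice = vice
        self.cache = {}   # name -> [contents, promise_valid]
        self.fetches = 0  # number of times a copy was fetched from Vice

    def open(self, name):
        entry = self.cache.get(name)
        if entry is None or not entry[1]:
            # No cached copy, or its promise was cancelled: fetch afresh.
            self.fetches += 1
            entry = [self.vice.fetch(self, name), True]
            self.cache[name] = entry
        return entry[0]

    def close(self, name, data):
        self.cache[name] = [data, True]
        self.vice.store(self, name, data)

    def break_callback(self, name):
        if name in self.cache:
            self.cache[name][1] = False
```

Repeated opens of an unchanged file are served from the cache with no server traffic, which is where AFS gains scalability over frequent attribute checks.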
The Vice Service Interface Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
Enhancements to NFS and AFS • Spritely NFS • add open and close, use callbacks • NQNFS (Not Quite NFS) • use callbacks and leases • WebNFS • allow browsers and other applications to interact with an NFS server directly • NFS Version 4 (RFC 3010, December 2000) • incorporating all of the above and more • DCE/DFS (based on AFS) • use callbacks and write tokens (with a lifetime)
New Features of NFS Version 4 • Adoption of the RPCSEC_GSS (RFC 2203) security protocol • Multiple operations in one request • Better migration and replication abilities • A client may query the location(s) of a file system. • Introduction of open and close operations • Lease-based file locking • Callback-based delegation of files
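The idea behind lease-based locking can be sketched as a lock that carries an expiry time: the holder must renew it periodically, and if the holder crashes, the lock lapses by itself so other clients are not blocked forever. This is a generic illustration of leases, not the NFS Version 4 protocol; `LeaseLockTable` and its methods are hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
class LeaseLockTable:
    """Toy lock table in which every lock expires unless renewed."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self._locks = {}  # file id -> (owner, expiry time)

    def acquire(self, file_id, owner, now):
        holder = self._locks.get(file_id)
        if holder and holder[0] != owner and holder[1] > now:
            return False  # another client holds an unexpired lease
        self._locks[file_id] = (owner, now + self.lease_seconds)
        return True

    def renew(self, file_id, owner, now):
        holder = self._locks.get(file_id)
        if holder and holder[0] == owner and holder[1] > now:
            self._locks[file_id] = (owner, now + self.lease_seconds)
            return True
        return False      # lease already lapsed or held by someone else
```

The server needs no crash-detection machinery for clients: silence for longer than the lease period is enough to free the lock.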
New Design Approaches • Background • high-performance storage technology (e.g., RAID) • log-structured file systems (e.g., Sprite, BSD LFS) • high-performance switched networks (e.g., ATM, high-speed Ethernet) • Goals: high scalability and fault-tolerance • Main ideas: distribute file data among many nodes, separate responsibilities, … • Constraints: high level of trust
More Recent File System Designs • xFS • Serverless: all data, metadata, and control can be located anywhere in the system; any machine can take over the responsibilities of a failed one • Frangipani • Two-layer structure • the Petal distributed virtual disk system • the Frangipani server module Both designs utilize RAID-style striping, log-structured file storage, etc.
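RAID-style striping of a log segment can be sketched as splitting the segment into equal fragments, one per storage node, plus a parity fragment computed by XOR, so that any single lost fragment can be rebuilt from the rest. This is a generic single-parity illustration, not the actual xFS or Petal layout; the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized byte strings."""
    return bytes(reduce(lambda x, y: x ^ y, col) for col in zip(*blocks))


def stripe(segment, n_data):
    """Split a log segment into n_data equal fragments plus one parity fragment."""
    size = -(-len(segment) // n_data)  # ceiling division
    padded = segment.ljust(size * n_data, b"\0")
    frags = [padded[i * size:(i + 1) * size] for i in range(n_data)]
    return frags + [xor_blocks(frags)]


def rebuild(frags, lost):
    """Recover the fragment at index `lost` from the surviving ones."""
    return xor_blocks([f for i, f in enumerate(frags) if i != lost])
```

Writing whole log segments at once lets parity be computed over the full stripe in one pass, avoiding the read-modify-write cycle that small in-place updates would need.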
Log-based Striping in xFS Source: T.E. Anderson et al., Serverless Network File Systems, ACM TOCS 1996
An xFS Configuration Source: T.E. Anderson et al., Serverless Network File Systems, ACM TOCS 1996
A Frangipani Configuration Source: C.A. Thekkath et al., Frangipani, A Scalable Distributed File System, ACM SOSP 1997
Storage Systems Source: G.A. Gibson and R. van Meter, Network Attached Storage Architecture, CACM, November 2000.
NAS and SAN Note: the difference is disappearing. Source: G.A. Gibson and R. van Meter, Network Attached Storage Architecture, CACM, November 2000.
Bandwidth for Disk Access Source: E. Riedel, Storage Systems, Queue, June 2003.
Increasing the Bandwidth Source: E. Riedel, Storage Systems, Queue, June 2003.
Virtualization in SAN Source: E. Riedel, Storage Systems, Queue, June 2003.
Requirements for Storage Systems • Basic requirements: resource consolidation, rapid deployment, central management, convenient backup, high availability, data sharing. • Geographic separation • Security against an increasing risk of unauthorized access • Performance scalable with capacity (accesses per second or megabytes per second)