Distributed File Systems CSS434 DFS
DFS Desirable Features
• Transparency:
  • Access transparency: a single set of operations
  • Location transparency: uniform file name space
  • Mobility transparency: file mobility
  • Performance transparency: comparable to a centralized file system
• Concurrency and synchronization: should complete concurrent access requests consistently.
  • Forward/backward validation
• File caching and replication:
  • Caching: at client/server for scalability
  • Replication: at multiple servers for availability
• Heterogeneity: should allow a variety of nodes to share files in different storage media and OS
  • Similarity between Unix and NTFS: stream-oriented files, a tree-structured system
  • Difference between Unix and NTFS: CR char included in NTFS, file naming
• Fault tolerance: at-most-once or at-least-once semantics
• Consistency: Unix one-copy update semantics, session semantics, etc.
• Security: should protect files from network intruders.
Consistency Maintenance in Various Storage Systems
Columns: sharing, persistence, distributed cache/replicas, consistency maintenance, example.
• Main memory: example RAM; consistency maintenance 1
• File system: example UNIX file system; consistency maintenance 1
• Distributed file system: example Sun NFS
• Web: example Web server
• Distributed shared memory: example Ivy (Ch. 16)
• Remote objects (RMI/ORB): example CORBA; consistency maintenance 1
• Persistent object store: example CORBA Persistent Object Service; consistency maintenance 1
• Persistent distributed object store: example PerDiS, Khazana
File Service Architecture
[Figure: on the client computer, application programs run on top of a client module (file caching); on the server computer, a directory service and a flat file service (file caching/replication) serve them; consistency maintenance spans the client and server copies.]
DFS Services
• Flat file service
  • File-accessing mechanism: deciding where to manage remote files and the unit of data transfer (at server or client? file, block, or byte?)
  • File-sharing semantics: providing file update semantics similar to, but weaker than, Unix semantics
  • File-caching mechanism: improving performance/scalability
  • File-replication mechanism: improving performance/availability
• Directory service
  • Mapping between text file names and references to files (i.e., file IDs)
Flat File Service Operations
• Read(FileId, i, n) -> Data: if 1 ≤ i ≤ Length(File), reads a sequence of up to n items from a file starting at item i and returns it in Data; otherwise throws BadPosition.
• Write(FileId, i, Data): if 1 ≤ i ≤ Length(File)+1, writes a sequence of Data to a file, starting at item i, extending the file if necessary; otherwise throws BadPosition.
• Create() -> FileId: creates a new file of length 0 and delivers a UFID for it.
• Delete(FileId): removes the file from the file store.
• GetAttributes(FileId) -> Attr: returns the file attributes for the file.
• SetAttributes(FileId, Attr): sets the file attributes (only those attributes that are not shaded in ).
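The operations above can be sketched as a small in-memory class. This is only an illustration under stated assumptions: the class name FlatFileService, the integer UFIDs, and the use of bytes as "items" are hypothetical choices, not part of the slides.

```python
class BadPosition(Exception):
    """Raised when a read or write starts outside the valid item range."""


class FlatFileService:
    """Hypothetical in-memory flat file service; item positions are 1-based."""

    def __init__(self):
        self._files = {}   # UFID -> bytearray holding the file's items
        self._attrs = {}   # UFID -> dict of file attributes
        self._next_id = 1  # assumption: UFIDs are small integers

    def create(self):
        ufid = self._next_id
        self._next_id += 1
        self._files[ufid] = bytearray()  # new file of length 0
        self._attrs[ufid] = {}
        return ufid

    def read(self, ufid, i, n):
        data = self._files[ufid]
        if not 1 <= i <= len(data):
            raise BadPosition(i)
        return bytes(data[i - 1:i - 1 + n])  # up to n items from item i

    def write(self, ufid, i, new_data):
        data = self._files[ufid]
        if not 1 <= i <= len(data) + 1:
            raise BadPosition(i)
        # Slice assignment overwrites in place and extends the file if needed.
        data[i - 1:i - 1 + len(new_data)] = new_data

    def delete(self, ufid):
        del self._files[ufid]
        del self._attrs[ufid]

    def get_attributes(self, ufid):
        return dict(self._attrs[ufid])

    def set_attributes(self, ufid, attrs):
        self._attrs[ufid].update(attrs)


fs = FlatFileService()
f = fs.create()
fs.write(f, 1, b"abc")   # i = Length+1 on an empty file, so this appends
fs.write(f, 3, b"XYZ")   # overwrites item 3 and extends the file
```

Because every request carries its own position i, the server does not need to remember a per-client read/write pointer between calls.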
Directory Service Operations
• Lookup(Dir, Name) -> FileId: locates the text name in the directory and returns the relevant UFID. If Name is not in the directory, throws NotFound.
• AddName(Dir, Name, File): if Name is not in the directory, adds (Name, File) to the directory and updates the file’s attribute record. If Name is already in the directory, throws NameDuplicate.
• UnName(Dir, Name): if Name is in the directory, the entry containing Name is removed from the directory. If Name is not in the directory, throws NotFound.
• GetNames(Dir, Pattern) -> NameSeq: returns all the text names in the directory that match the regular expression Pattern.
[Figure: directories on host1, host2, and host3 hold Name1, Name2, and Name3, each added with AddName(Dir, Name, file) and all referring to the same file, whose reference count is 3; if ref_count = 0, the file is deleted.]
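A minimal sketch of these directory operations with reference counting might look as follows. The class name DirectoryService, the make_dir helper, and the choice to match Pattern against the whole name are assumptions made for illustration.

```python
import re


class NotFound(Exception):
    pass


class NameDuplicate(Exception):
    pass


class DirectoryService:
    """Hypothetical directory service mapping text names to UFIDs."""

    def __init__(self):
        self._dirs = {}      # directory id -> {name: ufid}
        self.ref_count = {}  # ufid -> number of directory entries naming it

    def make_dir(self, dir_id):
        # Helper that is not on the slide; it only creates an empty directory.
        self._dirs[dir_id] = {}

    def lookup(self, dir_id, name):
        try:
            return self._dirs[dir_id][name]
        except KeyError:
            raise NotFound(name) from None

    def add_name(self, dir_id, name, ufid):
        entries = self._dirs[dir_id]
        if name in entries:
            raise NameDuplicate(name)
        entries[name] = ufid
        self.ref_count[ufid] = self.ref_count.get(ufid, 0) + 1

    def un_name(self, dir_id, name):
        entries = self._dirs[dir_id]
        if name not in entries:
            raise NotFound(name)
        ufid = entries.pop(name)
        self.ref_count[ufid] -= 1
        # True tells the caller the file itself may now be deleted.
        return self.ref_count[ufid] == 0

    def get_names(self, dir_id, pattern):
        # Assumption: the pattern must match the entire name.
        return [n for n in sorted(self._dirs[dir_id]) if re.fullmatch(pattern, n)]


ds = DirectoryService()
for host in ("host1", "host2", "host3"):
    ds.make_dir(host)
ds.add_name("host1", "Name1", 42)
ds.add_name("host2", "Name2", 42)
ds.add_name("host3", "Name3", 42)   # three names, one file: ref count is 3
```

Returning a flag from un_name keeps the directory service independent of the flat file service: the caller decides whether to issue Delete(FileId) when the count reaches zero.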
NFS File-Accessing Models
• Accessing Remote Files
  • Remote service model. File access: at a server. Merits: a simple implementation. Demerits: communication overhead.
  • Data caching model. File access: at a client that cached a file copy. Merits: reducing network traffic. Demerits: cache consistency problem.
• Unit of Data Transfer
File-Sharing Semantics
• Define when modifications of the file data made by a user are observable by other users
  • Unix semantics
  • Session semantics
  • Immutable shared-files semantics
  • Transaction-like semantics
File-Sharing Semantics: Unix Semantics (One-copy Update Semantics)
• Absolute ordering: operations are seen by all clients as if only a single copy existed and were updated immediately.
• Network delays make a weaker semantics inevitable.
[Figure: timeline t1 to t6 with Append(c), Append(d), and Append(e) issued by Clients A and B; delayed messages cause reads at the two clients to return different file contents.]
File-Sharing Semantics: Session Semantics
[Figure: Clients A, B, and C each Open(file) to obtain a copy from the server, apply Append operations (c, d, e, x, y, z, m) to their own copies, and send the result back on Close(file).]
• File writes may overwrite previous updates.
• A file lock is needed to prevent such overwrites.
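The lost-update problem under session semantics can be reproduced with a few lines. This is a toy model, assuming whole-file download on open and whole-file upload on close; the names Server and Session are hypothetical.

```python
class Server:
    """Holds the single authoritative copy of one file."""

    def __init__(self, contents=""):
        self.contents = contents


class Session:
    """One client's open-to-close session on the server's file."""

    def __init__(self, server):
        self.server = server
        self.copy = server.contents   # Open(file): take a private copy

    def append(self, text):
        self.copy += text             # visible only inside this session

    def close(self):
        self.server.contents = self.copy   # Close(file): upload the whole file


server = Server("ab")
a = Session(server)       # Client A opens and sees "ab"
b = Session(server)       # Client B opens and also sees "ab"
a.append("c")
b.append("x")
a.close()                 # server now holds "abc"
after_a = server.contents
b.close()                 # server now holds "abx": A's append is lost
after_b = server.contents
```

The last close wins, which is why the slide asks for a file lock when concurrent writers must not overwrite each other.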
File-Sharing Semantics: Session Semantics with File Lock
[Figure: Client A opens the file, tests the lock with lockt, and performs Append(c); Client B opens the same file, tests the lock with lockt, performs Append(x), and closes; the figure also shows file, file2, and file3.]
• On opening a file that is already locked, the user needs to choose: quit, steal, or proceed.
• On saving with ^x^s, the user needs to choose: quit, save anyway, or type ^x^w.
File-Sharing Semantics: Transaction-Like Semantics (Concurrency Control)
• Backward validation: compare reads with former writes.
• Forward validation: compare writes with later reads.
• On a conflict: abort itself or the conflicting active transactions.
• Which validation is better?
[Figure: Clients A to D run transactions such as R1 R2 W3 R4 W5, R1 R2 W6 R4 W7, R1 R2 W9 R4 W8, and R1 R2 R6 R8 W8 between Trans_start and Trans_end; validation precedes commitment, and a transaction that fails validation goes through Trans_abort and Trans_restart.]
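The two validation rules can be written as set comparisons. The sketch below assumes each transaction is summarized by the set of items it read and the set it wrote; the Transaction class and both function names are hypothetical, and the example data is not a transcription of the figure.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    name: str
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)


def backward_validate(tv, overlapping_committed):
    """Compare tv's reads with the writes of transactions that committed
    while tv was running. True means tv may commit."""
    return all(tv.read_set.isdisjoint(t.write_set) for t in overlapping_committed)


def forward_validate(tv, active):
    """Compare tv's writes with the reads (so far) of transactions that are
    still active. True means tv may commit."""
    return all(tv.write_set.isdisjoint(t.read_set) for t in active)


committed = Transaction("A", read_set={1, 2, 4}, write_set={3, 5})
reader = Transaction("C", read_set={1, 2, 6, 8}, write_set={8})
stale = Transaction("D", read_set={1, 2, 3}, write_set={9})

ok_backward = backward_validate(reader, [committed])    # no read of 3 or 5
bad_backward = backward_validate(stale, [committed])    # read item 3, which A wrote

writer = Transaction("B", write_set={8})
bad_forward = forward_validate(writer, [reader])        # C has already read item 8
```

Backward validation inspects read sets, which tend to be large when reads dominate, while forward validation inspects the usually smaller write set but has to look at transactions that are still changing.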
File-Sharing Semantics: Immutable Shared-Files Semantics
[Figure: the server holds Version 1.0; Clients A and B each create a tentative version based on 1.0. One becomes Version 1.1; the other then hits a version conflict, which is handled by abort, by merge (Version 1.2), or by ignoring the conflict (Version 1.2).]
• How a conflict is handled depends on each file system.
• Abort is simple: later, Client A can decide to overwrite the file with its tentative version based on 1.0 by changing the corresponding directory.
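File-Caching Schemes: Cache Location
[Figure: across the node boundary between client and server, the file on the server's disk can be cached as a copy in the server's main memory, in the client's main memory, or on the client's disk.]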
File-Caching Schemes: Modification Propagation
• Write-through scheme
  • Pros: Unix-like semantics and high reliability
  • Cons: poor write performance
• Delayed-write scheme
  • Write on cache displacement
  • Periodic write
  • Write on close
  • Pros:
    • Write accesses complete quickly.
    • Some writes may be omitted because later writes supersede them.
    • Gathering all writes mitigates network overhead.
  • Cons:
    • Delaying write propagation results in fuzzier file-sharing semantics.
[Figure: with write-through, a write (W) at Client 1 updates its main-memory copy and is immediately written to the file on disk; with delayed write, Client 1 holds a new copy while the file on disk and Client 2's copy stay unchanged until the delayed write.]
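The trade-off between the two propagation schemes shows up when counting how many writes reach the server. The following is a minimal sketch, assuming a block-granularity cache and write-on-close as the delayed-write variant; all class names are hypothetical.

```python
class FileServer:
    """Stores blocks and counts how many writes arrive over the network."""

    def __init__(self):
        self.blocks = {}
        self.writes_received = 0

    def write(self, block_no, data):
        self.blocks[block_no] = data
        self.writes_received += 1


class WriteThroughCache:
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def write(self, block_no, data):
        self.cache[block_no] = data
        self.server.write(block_no, data)   # propagate immediately


class WriteOnCloseCache:
    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.dirty = set()                  # blocks modified since the last flush

    def write(self, block_no, data):
        self.cache[block_no] = data         # completes without network traffic
        self.dirty.add(block_no)

    def close(self):
        for block_no in sorted(self.dirty):
            self.server.write(block_no, self.cache[block_no])
        self.dirty.clear()


wt_server, wc_server = FileServer(), FileServer()
wt, wc = WriteThroughCache(wt_server), WriteOnCloseCache(wc_server)
for data in ("v1", "v2", "v3"):             # three updates to the same block
    wt.write(0, data)
    wc.write(0, data)
before_close = wc_server.writes_received    # nothing propagated yet
wc.close()
```

With write-through the server sees every intermediate value; with write-on-close it sees only the last one, and until close( ) other clients cannot observe the update at all.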
File-Caching Schemes: Cache Validation Schemes, Client-Initiated Approach
• Checking before every access (Unix-like semantics but too slow)
• Checking periodically (better performance but fuzzy file-sharing semantics)
• Checking on file open (simple, suitable for session semantics)
• Problem: high network traffic
[Figure: two pairings between Client 1 and Client 2. Write-through combined with check before every access (delayed write?), and write-on-close combined with check-on-open (check-on-close?).]
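Check-on-open can be sketched with a version number that the client compares against the server before trusting its cached copy. The version counter and the class names VersionedServer and CheckOnOpenClient are assumptions for illustration; a real system might compare modification timestamps instead.

```python
class VersionedServer:
    """Keeps (version, data) per file; the version grows on every store."""

    def __init__(self):
        self.files = {}

    def store(self, name, data):
        version = self.files.get(name, (0, None))[0] + 1
        self.files[name] = (version, data)

    def get_version(self, name):
        return self.files[name][0]      # cheap validation request

    def fetch(self, name):
        return self.files[name]         # full transfer: (version, data)


class CheckOnOpenClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}                 # name -> (version, data)
        self.fetches = 0

    def open(self, name):
        cached = self.cache.get(name)
        # The client initiates validation on every open; the server keeps no
        # record of which clients hold a copy.
        if cached is None or cached[0] != self.server.get_version(name):
            self.cache[name] = self.server.fetch(name)
            self.fetches += 1
        return self.cache[name][1]


server = VersionedServer()
server.store("f", "v1")
client = CheckOnOpenClient(server)
first = client.open("f")        # cache miss: fetch
second = client.open("f")       # cache still valid: version check only
server.store("f", "v2")         # another client's update reaches the server
third = client.open("f")        # stale copy detected on open: fetch again
```

Each open still costs one round trip for the version check, which is the network-traffic problem noted above.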
File-Caching Schemes: Cache Validation Schemes, Server-Initiated Approach
• Keeping track of clients having a copy
• Denying a new request, queuing it, and disabling caching
• Notifying all clients of any update on the original file
• Problem:
  • Violating the client-server model
  • Stateful servers
  • Check-on-open still needed for the 2nd file opening
[Figure: Clients 1 to 4 hold copies in main memory; when one of them writes (W), the server notifies (invalidates) the other copies or denies a new open; write through or delayed write?]
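Server-initiated invalidation needs the server to remember who caches what and to call the clients back. A minimal sketch, assuming write-through propagation and direct method calls in place of RPC callbacks; CallbackServer and CachingClient are hypothetical names.

```python
class CallbackServer:
    def __init__(self):
        self.data = {}
        self.holders = {}   # file name -> set of clients caching it (server state)

    def fetch(self, client, name):
        self.holders.setdefault(name, set()).add(client)
        return self.data[name]

    def write(self, writer, name, data):
        self.data[name] = data
        # The server, not the client, initiates the invalidation.
        for client in self.holders.get(name, set()) - {writer}:
            client.invalidate(name)
        self.holders[name] = {writer}


class CachingClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, name):
        if name not in self.cache:
            self.cache[name] = self.server.fetch(self, name)
        return self.cache[name]

    def write(self, name, data):
        self.cache[name] = data
        self.server.write(self, name, data)   # write-through in this sketch

    def invalidate(self, name):
        self.cache.pop(name, None)            # callback from the server


server = CallbackServer()
server.data["f"] = "old"
c1, c2 = CachingClient(server), CachingClient(server)
c1.read("f")
c2.read("f")
c1.write("f", "new")                 # server invalidates c2's copy
c2_was_invalidated = "f" not in c2.cache
```

The holders table is exactly the state that makes the server stateful, and the call from server to client is what departs from the plain client-server model.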
Homework Assignment 4
• Session semantics
• Client-side/server-side caching
• Server-initiated invalidation
[Figure: Client 1 and Client 2 each run emacs on files cached under /tmp (file1, file2) with permissions chmod 600 and chmod 400; the operations exchanged with the server are download( ), upload( ), invalidate( ), and writeback( ).]
File Access Improvements
• Data sieving for a single client
  • Read a larger contiguous file portion
  • Extract actual file portions from it
• Collective I/O for multiple clients
  • Read contiguous space, thereafter distribute subspaces to each client
  • Disk-directed I/O
  • Server-directed I/O
  • Two-phase I/O (client-directed)
Data Sieving
• User’s request for non-contiguous file portions
• Read a larger contiguous block into memory
• Copy requested portions into user’s buffer
(from R. Thakur’s Data Sieving and Collective I/O in ROMIO, 1998)
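The three steps of data sieving map directly onto a short function. This is a sketch under simple assumptions: requests are (offset, length) pairs in bytes, and the whole covering block fits in memory; sieve_read is a hypothetical name and not ROMIO's interface.

```python
import io


def sieve_read(f, requests):
    """Serve several non-contiguous (offset, length) requests with one read."""
    if not requests:
        return []
    start = min(offset for offset, _ in requests)
    end = max(offset + length for offset, length in requests)
    f.seek(start)
    block = f.read(end - start)          # one larger contiguous read
    # Copy only the requested portions out of the block.
    return [block[offset - start:offset - start + length]
            for offset, length in requests]


f = io.BytesIO(b"0123456789abcdef")
pieces = sieve_read(f, [(2, 3), (8, 2), (13, 2)])
```

The single read covers bytes 2 to 14 here, so some unwanted data is transferred; that extra transfer is traded for fewer I/O requests.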
Two-Phase I/O
[Figure: processes P0 to P3 each read a contiguous portion of the file, then redistribute the data among themselves.]
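The read-then-redistribute pattern can be simulated in a single process. The sketch assumes a cyclic distribution (process p wants every element whose index is congruent to p modulo the number of processes), which is an illustrative choice and not stated on the slide; two_phase_read is a hypothetical name.

```python
def two_phase_read(data, nprocs):
    """Simulate two-phase I/O for a cyclic distribution over nprocs processes."""
    chunk = -(-len(data) // nprocs)      # ceiling division
    # Phase 1: process p reads one contiguous chunk of the file.
    contiguous = [data[p * chunk:(p + 1) * chunk] for p in range(nprocs)]
    # Phase 2: every element is forwarded to the process that wants it.
    wanted = [[] for _ in range(nprocs)]
    for p, part in enumerate(contiguous):
        for k, item in enumerate(part):
            global_index = p * chunk + k
            wanted[global_index % nprocs].append(item)
    return contiguous, wanted


contiguous, wanted = two_phase_read(list(range(8)), 4)
```

Without the first phase, each process would issue many small strided reads; with it, the file system sees a few large contiguous reads and the shuffling happens over the interconnect.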
File Stripes Transfer in a Hierarchy (from Fukuda/Miyauchi, Journal of Supercomputing)
[Figure: a GUI commander (Id: 0) reads files and passes key/value pairs down a tree of sentinels: the root sentinel (Id: 2), sentinels Id: 8 and Id: 9, sentinels Id: 32 to 39 and Id: 128 to 132, and sentinel Id: 528. Keys such as 32_inputFile1_0, 32_inputFile2_0, 128_inputFile1_1, 528_inputFile1_7, and 528_inputFile2_7 carry a sentinel Id prefix (32, 128, 528), and each value holds the stripe's contents.]
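DFS Example: Sun NFS
[Figure: the server exports part of its directory tree (export, shared), and Clients A and B mount it into their own trees (/, usr, bin, opt, org). On each client, a user process calls the VFS layer, which dispatches to the local FS or to the NFS client; the NFS client reaches the NFS server through RPC stubs, and on the server the NFS server sits under VFS next to the local FS.]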
Sun NFS: Installation
• Server:
  • Check if NFS is running: rpcinfo -p
  • Start NFS: /etc/rc.d/init.d/nfs start
  • Edit /etc/exports file: /dir/to/export client1(permissions), client2(…
  • Export dirs in /etc/exports: exportfs -a
  • Check exported directories: showmount -e
• Client:
  • Import a server’s directory: mount -o options server_name:/dir /my_dir
    • bg: keeps retrying the import in the background after a failure
    • intr: a process can be interrupted while its I/O request to the server dir is pending
    • soft: allows a client to time out the connection after a number of retries
    • rw/ro: normal r/w or read only
• Underlying Connections:
[Figure: the client asks the portmapper for the port of the mount service, obtains permission from mountd, and then talks to nfs on port 2049 over RPC.]
Sun NFS: Overview
• Communication
  • RPC: a compound procedure
    • Lookup, Open, and Read
• Server status
  • Stateless: simple implementation in ver 3.
  • Stateful: allowing clients to cache files in ver 4.
    • RPC callback from a server to invalidate a client’s cache
• Synchronization
  • Session semantics
  • File locking in ver 4: lock, lockt, locku, and renew
    • Ex. Emacs: tests with lockt when modifying a buffer, locks the file with lock, and unlocks it with locku after writing the buffer contents to the file.
  • Share reservation: specify how to share a file (with ro, wo, or r/w)
Sun NFS: Overview (Cont’d)
• Caching
  • In client’s memory
  • Session semantics
  • Revalidation of client’s cache upon re-opening the same file
  • Open delegation:
    • A server delegates an open decision to a writing client, which can then handle open requests from other clients on the same machine.
    • A server calls back the client when receiving an open request from another machine.
• Fault Tolerance
  • RPC failure: use a duplicate-request cache
  • File locking failure: provide a grace period during which a client can reclaim locks previously granted and the server rebuilds its previous state.
Sun NFS: Duplicate Request Cache
[Figure: three client/server exchanges for a request with XID = 1234. (1) A retransmission arrives while the original request is still being processed: too soon, ignore. (2) The transaction completed and a retransmission arrives right after the reply was sent: just replied, ignore. (3) The transaction completed but the reply did not reach the client: the retransmission is answered with the cached reply.]
• Then, when does the server delete this cached result?
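The three cases in the figure can be expressed as a small state table keyed by XID. This is a sketch under stated assumptions: the quiet_period threshold that separates "just replied" from "reply probably lost" is a made-up parameter, time is passed in explicitly to keep the example deterministic, and DuplicateRequestCache is a hypothetical name.

```python
IN_PROGRESS = "in progress"
DONE = "done"


class DuplicateRequestCache:
    def __init__(self, quiet_period=1.0):
        self.quiet_period = quiet_period   # assumed threshold, in seconds
        self.entries = {}                  # XID -> (state, cached reply, time replied)

    def on_request(self, xid, now):
        """Return what the server should do with an arriving request."""
        entry = self.entries.get(xid)
        if entry is None:
            self.entries[xid] = (IN_PROGRESS, None, None)
            return "execute"
        state, _, replied_at = entry
        if state == IN_PROGRESS:
            return "ignore"                # too soon: original still executing
        if now - replied_at < self.quiet_period:
            return "ignore"                # just replied: reply is likely in flight
        return "resend"                    # reply was probably lost

    def on_complete(self, xid, reply, now):
        self.entries[xid] = (DONE, reply, now)

    def cached_reply(self, xid):
        return self.entries[xid][1]


drc = DuplicateRequestCache(quiet_period=1.0)
r1 = drc.on_request(1234, now=0.0)     # first arrival
r2 = drc.on_request(1234, now=0.1)     # retransmission during processing
drc.on_complete(1234, "ok", now=0.2)
r3 = drc.on_request(1234, now=0.3)     # retransmission right after the reply
r4 = drc.on_request(1234, now=5.0)     # retransmission long after the reply
```

The sketch never evicts entries, so it leaves the slide's closing question open; a bounded table or an expiry time would be one way to answer it.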
DFS Example: Andrew File System
AFS: File Name Space
[Figure: the client's name space (/, usr, tmp, bin) is divided into a local part and a shared part, with symbolic links from local names into the shared space. On the client, user processes and the Venus process run on the Unix kernel (Unix FS) with a cache; on the server, the Vice process runs on the Unix kernel (Unix FS).]
AFS: System Call Interception
AFS: Implementation of file system calls
DFS Example: xFS
• Write path: (1) write requests; (2) log them in a segment; (3) fragment the segment and send the fragments to a stripe group of storage servers.
• Read path: (1) read request; (2) query a manager; (3) collaborative caching (read data from another client if possible).
[Figure: clients, metadata managers, and storage servers connected by a LAN.]
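DFS Example: Plan 9
[Figure: a client builds its name space under / as a union directory by importing directories that File server 1 and File server 2 export (entries such as a, b, c, d, x, y across d1, d2, d3) and by importing net from a computation server; the computation server's network interface provides network access to the Internet, and the computation server also offers remote execution.]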
Paper Review by Students
• Sun NFS
• Andrew File System
• XFS
• Plan 9
• LFS
• Discussions
  • What file-sharing semantics is each system based on?
  • Which systems use server-side caching?
  • Which systems use client-side caching?
  • Which systems use the client-initiated validation?
  • Which systems use the server-initiated validation?
Non-Turn-In Exercises
Q1. In transaction-like semantics, a.k.a. concurrency control, compare the pros and cons of backward and forward validation. In particular, consider the case where each transaction includes more read than write operations.
  Backward validation. Pros: Cons:
  Forward validation. Pros:
Q2. Answer the following five questions about file caching. When you are asked to show which systems use a given caching scheme, choose all applicable systems from NFS, AFS, xFS, and Plan9.
Q2-1. Why can file caching contribute to performance improvement? Give two reasons.
  Reason 1:
  Reason 2:
Q2-2. State one merit of using server-side caching. Which system uses server-side caching?
  Merit:
  System: Plan9 (Answer)
Q2-3. Client-side caching allows multiple clients to cache the same file. There are two schemes to validate the contents of a locally cached file (or invalidate the contents of the same file cached at remote clients): client-initiated and server-initiated validation. Does client-initiated validation require a file server to be stateful? Justify your answer. Also show which systems use client-initiated validation.
  Stateless or stateful?
  Reason:
  Systems: NFS, Plan9 (Answer)
Q2-4. Does server-initiated validation require a file server to be stateful? Justify your answer. Also show which systems use server-initiated validation.
  Stateless or stateful?
  Reason:
  Systems: AFS, xFS (Answer)