AFS (vs. NFS)
• Client-side caching
• Stateful vs. stateless server
• Consistency (write protocol)
• Server/client crash
Client-side Caching

NFS:
• Block-level caching
• Caches only in memory
• Read a block: if the block is not in memory, read it from the server
• Disadvantages
  • Can only hold as much NFS data as fits in the client's memory
  • Bad for wide-area networks (must fetch the file from the server whenever it is not in memory)
  • Bad for disconnected clients (if part of the file is not in memory, the client cannot fetch the rest from the server)

AFS:
• Whole-file caching
• Also caches on disk
• Each client has an "afs partition" on its disk, used for caching files
• Open(fileA): all of fileA's content is cached in the client's memory and disk
• If the file is large, AFS only sends 64 KB chunks at a time
• Advantages
  • Can set up a big local AFS disk partition to store more files
  • Good for wide-area networks (fetch from the local disk rather than from the server across the network)
  • Good for disconnected clients (if the file exists on the local disk, the client can read it without a connection)
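Below is a minimal sketch of the two caching strategies. The class and method names (FakeServer, fetch_block, fetch_file, and so on) are illustrative assumptions, not the real NFS/AFS client interfaces.

```python
from pathlib import Path

class FakeServer:
    """Stand-in for the file server (illustrative only)."""
    def __init__(self, files):
        self.files = files                       # path -> bytes

    def fetch_block(self, path, block_no, bs=4096):
        return self.files[path][block_no * bs:(block_no + 1) * bs]

    def fetch_file(self, path):
        return self.files[path]

class NFSClient:
    """NFS style: block-level caching, in memory only."""
    def __init__(self, server):
        self.server = server
        self.block_cache = {}                    # (path, block_no) -> bytes; lost on reboot

    def read_block(self, path, block_no):
        key = (path, block_no)
        if key not in self.block_cache:          # miss: must cross the network
            self.block_cache[key] = self.server.fetch_block(path, block_no)
        return self.block_cache[key]

class AFSClient:
    """AFS style: whole-file caching on a local disk partition."""
    def __init__(self, server, afs_partition: Path):
        self.server = server
        self.afs_partition = afs_partition       # survives reboots; can be large

    def open(self, path):
        local = self.afs_partition / path.strip("/").replace("/", "_")
        if not local.exists():                   # first open: fetch the whole file
            local.write_bytes(self.server.fetch_file(path))
        return local.open("rb")                  # later reads never touch the server
```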
AFS: Stateful Server

Why a stateful server?
• Implication of whole-file caching on local disk: a client needs to know when to invalidate its local cache

Two bad options:
• Polling
  • The client periodically polls the server, asking whether the file has been updated
  • Why is polling bad? It is hard to define how often a client should bug the server
    • Too frequent: too much network load (especially with hundreds of clients and thousands of files)
    • Too rare: weak consistency (clients do not see the latest updates)
• Check on open() (sketched below)
  • On open(), ask the server whether the file is outdated; if so, read from the server rather than from the local cache
  • Problem: many files/operations are read-only
    • Ex: path traversal: cd /mnt/afs/dir1/dir2/dir3
    • In AFS, dir1-3 can be cached on the local disk; we do not want to contact the server every time we open a file or traverse a path
    • Again, imagine hundreds of clients reading thousands of files; the server can easily become the bottleneck
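A sketch of the check-on-open option, to make the per-open() round trip concrete. `get_version` is a hypothetical RPC; real AFS avoids this cost with callbacks (next slide).

```python
class CheckOnOpenClient:
    """'Bad option 2': validate the cached copy on every open()."""
    def __init__(self, server):
        self.server = server
        self.cache = {}                          # path -> (version, data)

    def open(self, path):
        cached = self.cache.get(path)
        current = self.server.get_version(path)      # one RPC per open(), even for
        if cached is None or cached[0] != current:   # read-only paths like dir1-3
            self.cache[path] = (current, self.server.fetch_file(path))
        return self.cache[path][1]
```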
AFS Callbacks

Stateful server: AFS callbacks
• The AFS server records which clients have a copy of each file
• The server informs clients when a file has been modified by another client (via callbacks)

Big advantage:
• Network load is reduced!
• A client can open and read a file with no server interaction
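A minimal sketch of the callback idea under the same toy model. The data structures and method names are assumptions, not the real AFS RPC interface.

```python
class CallbackServer:
    """Records which clients cache which files; breaks callbacks on writes."""
    def __init__(self):
        self.files = {}                          # path -> bytes
        self.callbacks = {}                      # path -> set of caching clients

    def fetch_file(self, path, client):
        self.callbacks.setdefault(path, set()).add(client)  # promise a callback
        return self.files[path]

    def store_file(self, path, data, writer):
        self.files[path] = data
        for client in self.callbacks.pop(path, set()):
            if client is not writer:
                client.break_callback(path)      # tell others their copy is stale

class CallbackClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}                          # valid as long as callback is held

    def open(self, path):
        if path not in self.cache:               # callback held: no server traffic
            self.cache[path] = self.server.fetch_file(path, self)
        return self.cache[path]

    def break_callback(self, path):
        self.cache.pop(path, None)               # next open() refetches from server
```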
AFS Write Policy

Question: how often should a client write back to the server?

NFS:
• (Block-level caching)
• Hence, uses a periodic timer to flush only the dirty blocks

AFS:
• Writes back the file's data on file close()
• Close(A): all of A's content is sent to the server
• Implication: updates are only visible on close()
  • E.g., if you open the file, write to it, but never close it for an hour, the server will not see your update
• Advantage:
  • Temporary files are not stored to the server
  • Ex: gcc creates lots of temporary files when compiling
  • When these files are deleted right after close(), the modifications are not sent to the server
  • Optimization: wait a while after close() to see whether the file will be deleted (sketched below)
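A sketch of write-back-on-close plus the delayed-flush optimization. The FLUSH_DELAY value and the `store_file` call are assumptions for illustration.

```python
import threading

class WriteOnCloseClient:
    """AFS-style write-back on close(), delayed so that temporary files
    deleted right after close() are never shipped to the server."""
    FLUSH_DELAY = 5.0                            # seconds; illustrative value

    def __init__(self, server):
        self.server = server
        self.cache = {}                          # path -> bytearray (dirty data)
        self.pending = {}                        # path -> Timer for delayed flush

    def write(self, path, data):
        self.cache[path] = bytearray(data)       # stays local until after close()

    def close(self, path):
        timer = threading.Timer(self.FLUSH_DELAY, self._flush, args=(path,))
        self.pending[path] = timer
        timer.start()                            # server sees nothing until this fires

    def delete(self, path):
        timer = self.pending.pop(path, None)
        if timer:
            timer.cancel()                       # deleted before flush: never sent
        self.cache.pop(path, None)

    def _flush(self, path):
        if path in self.cache:
            self.server.store_file(path, bytes(self.cache[path]))
```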
Multiple Writers

Example:
• The server has a file with these blocks: B1 B2 B3 B4
• Clients X and Y open the file at roughly the same time
• Client X modifies B1' and B3', then close()s
• Client Y modifies B2" and B4", then close()s

NFS: last block writer wins
• Final content at the server: B1' B2" B3' B4"

AFS: last close() (i.e., last writer) wins
• Final content at the server: B1 B2" B3 B4"
• When X close()s, [B1' B2 B3' B4] is sent to the server. The server sends an invalidation to Y, but Y has already opened the file! Hence, the server cannot preempt Y to release/re-fetch the file.
• When Y close()s, Y is the last writer. When Y opened the file, it cached B1-B4 locally but only updated B2 and B4. Upon close(), all of the file's content in Y's cache, [B1 B2" B3 B4"], is sent to the server. Hence, the last writer wins!
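A toy simulation of this example, assuming block-level flushes for NFS and whole-file flushes for AFS (illustrative, not the real wire protocols):

```python
def nfs_outcome(server_blocks, x_writes, y_writes):
    """NFS: each client flushes only its dirty blocks."""
    final = list(server_blocks)
    for i, b in x_writes + y_writes:             # per-block writes interleave
        final[i] = b
    return final

def afs_outcome(server_blocks, x_writes, y_writes):
    """AFS: each close() ships the client's whole cached copy;
    the last close() overwrites the file at the server."""
    x_copy, y_copy = list(server_blocks), list(server_blocks)  # both opened early
    for i, b in x_writes:
        x_copy[i] = b
    for i, b in y_writes:
        y_copy[i] = b
    server_blocks = x_copy                       # X close()s first
    server_blocks = y_copy                       # Y close()s last: last writer wins
    return server_blocks

blocks = ["B1", "B2", "B3", "B4"]
x = [(0, "B1'"), (2, "B3'")]                     # X dirties B1 and B3
y = [(1, 'B2"'), (3, 'B4"')]                     # Y dirties B2 and B4
print(nfs_outcome(blocks, x, y))                 # B1' B2" B3' B4"
print(afs_outcome(blocks, x, y))                 # B1  B2" B3  B4"
```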
Close-to-Open Consistency

AFS's consistency semantics are also called the close-to-open consistency model
• i.e., if a client X opens a file after another client Y closes the modified file, AFS guarantees that X will see the latest update

Example:
• The server has a file with these blocks: B1 B2 B3 B4
• Client X open()s, modifies B1' and B3', and close()s
• The server now has [B1' B2 B3' B4]
• Then, client Y open()s, modifies B2" and B4", and close()s
  • When X closed the file, Y got an invalidation from the server
  • Hence, when client Y opens the file, it gets X's latest update, i.e., [B1' B2 B3' B4]
  • When Y closes the file, it sends [B1' B2" B3' B4"] to the server
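Continuing the toy model above: when Y open()s only after X close()s, X's update is already at the server, so nothing is lost (illustrative sketch).

```python
def close_to_open(server_blocks, x_writes, y_writes):
    """Y open()s after X close()s: Y starts from X's flushed copy."""
    x_copy = list(server_blocks)
    for i, b in x_writes:
        x_copy[i] = b
    server_blocks = x_copy                       # X close(): server = B1' B2 B3' B4
    y_copy = list(server_blocks)                 # Y open() sees B1' and B3'
    for i, b in y_writes:
        y_copy[i] = b
    return y_copy                                # Y close(): B1' B2" B3' B4"

blocks = ["B1", "B2", "B3", "B4"]
print(close_to_open(blocks, [(0, "B1'"), (2, "B3'")],
                    [(1, 'B2"'), (3, 'B4"')]))   # B1' B2" B3' B4"
```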
Multiple Writers/Readers

Revisit the readers/writers synchronization problem:
• Fully synchronized solution
• If a writer exists, no other writers/readers can exist, because the lock is held by the single writer

Why is a full-consistency model bad in a DFS?
• A file can be locked indefinitely (see the sketch below)
  • A malicious client: open(file, WRITE_MODE)
  • The server grants the lock to this client
  • The client never closes the file (the lock is never released)
  • Other clients cannot read the file until the lock is released. Bad!
• Plus, distributed locking is hard!
• Other considerations:
  • Most files are updated by one user
  • Most workloads are read-only
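A strawman sketch of the indefinite-lock problem. A single mutex per file stands in for a full readers/writers lock; this is purely illustrative.

```python
import threading

class LockingFileServer:
    """Strawman 'full consistency' server: one lock per file."""
    def __init__(self):
        self.locks = {}                          # path -> Lock

    def open_write(self, path):
        self.locks.setdefault(path, threading.Lock()).acquire()

    def close(self, path):
        self.locks[path].release()

    def open_read(self, path):
        lock = self.locks.setdefault(path, threading.Lock())
        lock.acquire()                           # blocks while a writer holds it
        lock.release()

server = LockingFileServer()
server.open_write("/shared/file")                # malicious client never close()s
# server.open_read("/shared/file")               # any other client hangs forever here
```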
Crash in a Stateful Server

A stateful server has to deal with client/server crashes:

Server crash:
• The server maintains a list of callbacks (not necessarily stored on disk)
• If the server crashes, the list is gone
• Upon reboot, the server needs to rebuild the list
• Hence, the AFS server needs to communicate with all clients to find out which files each client has cached
• (NFS is a stateless server; hence, upon a crash, the NFS server does not need to do anything)

Client crash:
• A client X is caching a file A, then reboots
• During the reboot, A is modified at the server; the server sends a revocation message to client X, but client X is down. Hence, the revocation message is lost.
• After X reboots, X does not know that its cached copy of A is outdated
• Solution: invalidate the client's cache after a client reboot
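A sketch of both recovery rules under the toy callback model above (`cached_paths` is an assumed query, not a real AFS call):

```python
class RecoveringServer:
    """On reboot the in-memory callback table is gone; rebuild it by
    asking every known client what it has cached (illustrative)."""
    def __init__(self, clients):
        self.clients = clients
        self.callbacks = {}                      # path -> set of caching clients

    def reboot(self):
        self.callbacks = {}                      # state lost in the crash
        for client in self.clients:              # rebuild by polling clients
            for path in client.cached_paths():
                self.callbacks.setdefault(path, set()).add(client)

class RecoveringClient:
    """On reboot, assume every revocation was missed: drop the cache."""
    def __init__(self):
        self.cache = {}                          # path -> data

    def cached_paths(self):
        return list(self.cache)

    def reboot(self):
        self.cache.clear()                       # treat all cached files as stale
```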