The Andrew File System (AFS)

Understand the design and evolution of the Andrew File System (AFS) for improved cache consistency, scalability, and performance with crash recovery solutions. Learn about AFS versions, cache strategies, crash scenarios, scalability limits, and performance comparisons.


Presentation Transcript


  1. The Andrew File System (AFS)
     Bojun Seo (bojun@aces.snu.ac.kr)
     School of Computer Science and Engineering, Seoul National University

  2. Contents
     • Motivation & Goal
     • AFS version 1
     • AFS version 2
     • Cache consistency
     • Crash recovery
     • Scalability and performance
     • Summary

  3. Motivation & Goal
     • NFS is not designed for scalability
       • Clients always read and write over the network
         • Except for data in the same block
       • Clients always check whether the cached contents have changed
         • These requests use a lot of server CPU time
     • Main goal of AFS
       • Scalability
         • To serve as many clients as possible in a distributed system

  4. AFS version 1
     • Disk caching at file granularity
       • Spatial locality: data in the same file can be read locally
       • Temporal locality: unmodified files can be re-read locally
     • Operating mechanism
       • File open scenario
       • File close scenario
     [Figure: client/server flowcharts. File open: traverse the path, check whether the file exists, is cached, and is modified; then copy to client, use cached data, or create on client. File close: if modified, update the change; otherwise done.]
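The open and close scenarios of version 1 can be sketched in a few lines of Python. This is a minimal in-memory model, not AFS code: the names `ServerV1` and `ClientV1`, the version counter used to detect modification, and the `validation_requests` counter are all assumptions made for illustration.

```python
class ServerV1:
    """Hypothetical in-memory stand-in for an AFS version 1 server."""

    def __init__(self):
        self.files = {}                # path -> (version, data)
        self.validation_requests = 0   # how often clients asked "modified?"

    def exists(self, path):
        return path in self.files

    def current_version(self, path):
        self.validation_requests += 1
        return self.files[path][0]

    def fetch(self, path):
        return self.files[path]        # the whole file, not a single block

    def store(self, path, data):
        new_version = self.files.get(path, (0, b""))[0] + 1
        self.files[path] = (new_version, data)
        return new_version


class ClientV1:
    """Hypothetical client that caches whole files."""

    def __init__(self, server):
        self.server = server
        self.cache = {}                # path -> [version, data, dirty]

    def open(self, path):
        if not self.server.exists(path):
            # "Create on client": the file reaches the server at close time.
            entry = self.cache.setdefault(path, [0, b"", True])
            return entry[1]
        entry = self.cache.get(path)
        # Version 1 asks the server on every open of a cached file.
        if entry is not None and entry[0] == self.server.current_version(path):
            return entry[1]            # "Use cached data"
        version, data = self.server.fetch(path)   # "Copy to client"
        self.cache[path] = [version, data, False]
        return data

    def write(self, path, data):
        entry = self.cache[path]
        entry[1], entry[2] = data, True  # local only until close

    def close(self, path):
        entry = self.cache[path]
        if entry[2]:                   # "Update the change" only if modified
            entry[0] = self.server.store(path, entry[1])
            entry[2] = False


server = ServerV1()
server.store("/home/a.txt", b"v1")
client = ClientV1(server)
for _ in range(3):
    client.open("/home/a.txt")
```

In this model the first open fetches the file, and each later open still costs the server one validation request. That per-open traffic is the cost that version 2 sets out to remove.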

  5. AFS version 2
     • Pitfalls and solutions of AFS version 1 (1/2)
       • Traversing the path uses a lot of server CPU time
         • Introduce the file identifier (FID)
           • Each file has its own FID
           • The client traverses the path and sends the FID to the server
       • Clients always check whether the file exists, is cached, and is modified
         • Introduce the callback mechanism
           • The server sends invalidation messages for modified files to the clients that cache them
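A rough sketch of the callback idea follows. It is a toy model under stated assumptions: `ServerV2`, `ClientV2`, `break_callback`, and the `fetches` counter are hypothetical names, and the FID is modeled as an opaque tuple rather than the real AFS identifier layout.

```python
class ServerV2:
    """Hypothetical server that remembers which clients cache which FID."""

    def __init__(self):
        self.files = {}       # fid -> data
        self.callbacks = {}   # fid -> set of clients holding a cached copy
        self.fetches = 0

    def fetch(self, fid, client):
        self.fetches += 1
        # Promise to notify this client if the file changes.
        self.callbacks.setdefault(fid, set()).add(client)
        return self.files[fid]

    def store(self, fid, data, writer):
        self.files[fid] = data
        # Invalidate every other cached copy, then keep only the writer.
        for client in self.callbacks.pop(fid, set()):
            if client is not writer:
                client.break_callback(fid)
        self.callbacks[fid] = {writer}


class ClientV2:
    """Hypothetical client: a cache entry is trusted until a callback breaks it."""

    def __init__(self, server):
        self.server = server
        self.cache = {}       # fid -> data

    def open(self, fid):
        if fid not in self.cache:          # no valid copy: ask the server
            self.cache[fid] = self.server.fetch(fid, self)
        return self.cache[fid]             # otherwise no server traffic

    def close(self, fid, new_data=None):
        if new_data is not None:           # modified: write back on close
            self.cache[fid] = new_data
            self.server.store(fid, new_data, self)

    def break_callback(self, fid):
        self.cache.pop(fid, None)


server = ServerV2()
fid = (1, 42)                 # hypothetical (volume, file number) pair
server.files[fid] = b"v1"
a, b = ClientV2(server), ClientV2(server)
a.open(fid)
a.open(fid)                   # served from a's cache, no server request
b.open(fid)
b.close(fid, b"v2")           # server breaks a's callback
latest = a.open(fid)          # a refetches and sees b's update
```

The design choice illustrated here is that the server takes responsibility for telling clients when a copy is stale, so an unchanged file costs the server nothing on repeated opens.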

  6. AFS version 2
     • Pitfalls and solutions of AFS version 1 (2/2)
       • Load is not balanced across the servers
         • Introduce volumes
           • A volume is a tree of files
           • Volumes can be moved across servers to balance the load
           • Volumes can be replicated as read-only cloned copies
       • Context-switching overhead is too high
         • Use threads instead of processes
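To illustrate how moving a volume rebalances load, here is a small sketch. Everything in it is hypothetical: the `Cell` class, the use of file counts as a load measure, and the volume names are assumptions for illustration, not the AFS administration interface.

```python
class Cell:
    """Hypothetical set of servers with a volume -> server location table."""

    def __init__(self, servers):
        self.load = {name: 0 for name in servers}  # server -> files hosted
        self.location = {}                         # volume -> server
        self.size = {}                             # volume -> number of files

    def add_volume(self, volume, n_files, server):
        self.location[volume] = server
        self.size[volume] = n_files
        self.load[server] += n_files

    def move(self, volume, target):
        # The whole subtree moves as one unit; only the table entry changes
        # from the client's point of view.
        source = self.location[volume]
        self.load[source] -= self.size[volume]
        self.load[target] += self.size[volume]
        self.location[volume] = target

    def server_for(self, fid):
        volume, _file_number = fid   # assumed: the FID names its volume
        return self.location[volume]


cell = Cell(["s1", "s2"])
cell.add_volume("user.alice", 30, "s1")
cell.add_volume("user.bob", 20, "s1")
cell.add_volume("proj", 10, "s2")
cell.move("user.bob", "s2")          # s1 was hosting 50 files, s2 only 10
```

Because clients look up the server through the volume part of the identifier, the files in a moved volume keep the same names and identifiers.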

  7. AFS version 2
     • Operating mechanism
       • File open scenario of version 1
       • File open scenario of version 2
     [Figure: client/server flowcharts of the two file open scenarios. Version 1: traverse the path, check Exist, Cached, and Modified; then copy to client, use local data, or create on client. Version 2: check Valid and Exist, traverse the path; then create on client, copy to client, or use local data.]

  8. AFS version 2
     • Operating mechanism
       • File close scenario of version 1
       • File close scenario of version 2
     [Figure: client/server flowcharts of the two file close scenarios. In both versions: if modified, update the change; otherwise done. Version 2 adds a callback.]

  9. Cache consistency
     • When do other clients see a modification?
       • Processes on the same machine
         • Visible after the write operation
       • Processes on different machines
         • Visible after the file-close operation
     • What if different processes write at the same time?
       • Last-writer-wins approach
       • Across different machines, it is in fact last closer wins
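The "last closer wins" behavior across machines can be shown with a short simulation. This is a sketch with hypothetical names (`Server`, `OpenFile`); each `OpenFile` stands for one client machine's private whole-file copy between open and close.

```python
class Server:
    """Hypothetical server holding the authoritative copy of each file."""

    def __init__(self):
        self.files = {}

    def fetch(self, name):
        return self.files.get(name, b"")

    def store(self, name, data):
        self.files[name] = data   # the whole file is replaced at once


class OpenFile:
    """One client machine's local copy of a file between open and close."""

    def __init__(self, server, name):
        self.server, self.name = server, name
        self.data = server.fetch(name)    # whole file copied at open

    def write(self, data):
        self.data = data                  # visible only on this machine

    def close(self):
        self.server.store(self.name, self.data)   # visible to others now


server = Server()
c1 = OpenFile(server, "f")
c2 = OpenFile(server, "f")
c2.write(b"from client 2")
c1.write(b"from client 1")    # the later write ...
c1.close()
after_first_close = server.fetch("f")
c2.close()                    # ... but client 2 closes last
final = server.fetch("f")
```

Client 1 wrote last, yet the server ends up with client 2's file because client 2 closed last. The result is always one client's complete file, never a mix of the two.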

  10. Cache consistency Example

  11. Crash recovery
     • Two sorts of crashes
       • Client crash (or reboot)
       • Server crash (or reboot)
     • Callback messages can be lost during a crash
     • In case of a client crash
       • Treat cached contents as suspect
       • Recovery scenario
         • Check whether the contents have been modified
         • [if modified] copy the contents from the server
         • [else] use the cached contents
     • In case of a server crash
       • Send messages to all clients telling them to treat cached data as suspect
       • Every client follows the recovery scenario
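The recovery scenario can be sketched as follows. This is an illustrative model only: the class and method names, the per-file version number used for validation, and the assumption that the server still knows its client list after a restart are all hypothetical.

```python
class Server:
    """Hypothetical server; callbacks are omitted to model lost messages."""

    def __init__(self):
        self.files = {}     # name -> (version, data)
        self.clients = []   # assumed to be known again after a restart
        self.fetches = 0

    def put(self, name, data):
        version = self.files.get(name, (0, None))[0] + 1
        self.files[name] = (version, data)

    def version_of(self, name):
        return self.files[name][0]

    def fetch(self, name):
        self.fetches += 1
        return self.files[name]

    def restart(self):
        # Callback state is gone, so every client must revalidate.
        for client in self.clients:
            client.mark_all_suspect()


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}     # name -> {"version", "data", "suspect"}
        server.clients.append(self)

    def mark_all_suspect(self):
        for entry in self.cache.values():
            entry["suspect"] = True

    def reboot(self):
        # The disk cache survives, but callbacks may have been missed.
        self.mark_all_suspect()

    def open(self, name):
        entry = self.cache.get(name)
        if entry is not None:
            if not entry["suspect"]:
                return entry["data"]
            # Recovery scenario: ask the server whether the file changed.
            if entry["version"] == self.server.version_of(name):
                entry["suspect"] = False
                return entry["data"]           # [else] use cached contents
        version, data = self.server.fetch(name)  # [if modified] copy again
        self.cache[name] = {"version": version, "data": data, "suspect": False}
        return data


server = Server()
server.put("a", "a1")
server.put("b", "b1")
client = Client(server)
client.open("a")
client.open("b")
server.put("a", "a2")      # client is down and misses the callback
client.reboot()
a_after = client.open("a")  # modified: fetched again
b_after = client.open("b")  # unmodified: cached copy revalidated
```

Only the file that actually changed is transferred again; the unchanged file costs a version check but no data transfer.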

  12. Scalability and performance
     • AFS can support about 50 clients per server
     • Performance comparison: AFS vs NFS
       • N: size of file for small, medium, and large
         • where large means the file cannot be cached in the client's memory
       • L: latency of network, disk, and memory
         • where

  13. Summary
     • AFS is designed for scalability
     • Main idea
       • Cache data at file granularity
       • Determine cache validity on the client side
     • Consistency model
       • Inside a single machine
         • Synchronize at every write operation
       • Between different machines
         • Synchronize at every file-close operation
     • Crash recovery
       • Treat cached contents as suspect
     • Scalability and performance
       • Can support about 50 clients per server
       • Good for clients with small memory
