AFS • Named for Andrew Carnegie & Andrew Mellon • Carnegie Mellon University • Presented by Christopher Tran & Binh Nguyen
AFS: Andrew File System • Abstraction of DFS from users • Accessing a file is similar to using a local file • Scalability with region distribution • Permissions Control with Access Control Lists • University Environment (Large number of users) • Weak Consistency by Design
AFS: Primary Features • Implemented in UNIX at the system call level • Work Unit is the entire file • Applications and users are unaware of distributed system • Kerberos Authentication for use over insecure networks • Access Control Lists (ACLs) control permissions • File Consistency through stateful servers
AFS: Implementation Overview • An application opens a file stored on an AFS server • The system call is intercepted by a hook in the workstation kernel and handed to the cache manager (Venus) • The cache manager checks for a local copy of the file • The cache manager checks the callback status • If needed, the cache manager forwards the request to the file server • If needed, the cache manager receives the file and stores it on the local machine • A file descriptor for the local copy is returned to the application (sketched below)
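The open path above can be pictured with a small user-space sketch in C. This is purely illustrative, not Venus code: cache_lookup(), callback_valid(), and fetch_from_server() are hypothetical helpers standing in for the cache manager's internals, and the paths and five-minute callback lifetime are invented.

#include <stdio.h>
#include <string.h>
#include <time.h>

struct cache_entry {
    char path[256];         /* file as named by the application       */
    char local_copy[256];   /* where the whole file is cached locally */
    time_t callback_expiry; /* server's promise is valid until then   */
    int has_callback;       /* server has promised to notify us       */
};

/* Pretend cache with a single slot, for illustration only. */
static struct cache_entry cache;

static struct cache_entry *cache_lookup(const char *path) {
    if (cache.path[0] && strcmp(cache.path, path) == 0)
        return &cache;
    return NULL;
}

static int callback_valid(const struct cache_entry *e) {
    return e->has_callback && time(NULL) < e->callback_expiry;
}

/* Stand-in for the RPC that fetches the entire file from the file server. */
static struct cache_entry *fetch_from_server(const char *path) {
    snprintf(cache.path, sizeof cache.path, "%s", path);
    snprintf(cache.local_copy, sizeof cache.local_copy, "/var/cache/afs/copy");
    cache.has_callback = 1;
    cache.callback_expiry = time(NULL) + 300;  /* e.g. a 5-minute promise */
    printf("fetched %s from server, cached at %s\n", path, cache.local_copy);
    return &cache;
}

/* Mirrors the slide: consult the cache, trust a valid callback, otherwise
 * go to the server, then hand the application an ordinary local file. */
static const char *afs_open(const char *path) {
    struct cache_entry *e = cache_lookup(path);
    if (e && callback_valid(e)) {
        printf("cache hit with valid callback for %s\n", path);
        return e->local_copy;
    }
    return fetch_from_server(path)->local_copy;
}

int main(void) {
    afs_open("/afs/example.edu/home/user/notes.txt");  /* first open: fetch      */
    afs_open("/afs/example.edu/home/user/notes.txt");  /* second open: cache hit */
    return 0;
}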
AFS: Version 1 • Clients constantly checked with the server for consistency, polling at regular message intervals • Every message included authentication information, so the server had to authenticate each source • Messages included the full path to the file, so the server had to traverse directories • Approximately 20 clients per server (in 1988)
AFS: Version 1 Problems • Servers spent too much time communicating with clients • Clients constantly checking whether a file is consistent increased network traffic • The server constantly authenticating messages consumed CPU time • The server traversing directories on every read, write, and file check consumed CPU time
AFS: Version 2 • Callback mechanism: the server promises to inform clients of file changes ("I'll let you know if something changes") • Stateful server • 50 clients per server (in 1988) • Clients request files by FID • Volumes can exist on any server
AFS: Callback • The server keeps track of clients using threads; each client is managed by a separate thread • Client and server communicate via RPC to their respective daemons: the server runs the Vice daemon, the client runs the Venus daemon • Each file a client opens also gets an AFSCallback structure • The AFSCallback contains an expiration for how long the callback is valid and how the server will communicate with the client • Clients assume the file is consistent until a server callback is received or the expiration time lapses (see the sketch below)
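A rough sketch of what the per-file callback bookkeeping might look like; the type and field names are invented for illustration and are not the actual OpenAFS definitions.

#include <stdio.h>
#include <time.h>

/* How (or whether) the server will reach the client if the file changes. */
enum callback_type {
    CB_EXCLUSIVE,  /* a single client is caching the file     */
    CB_SHARED,     /* several clients hold the same callback  */
    CB_DROPPED     /* callback has been broken or has expired */
};

struct afs_callback {
    time_t expires;           /* the callback promise is valid until this time */
    enum callback_type type;
};

/* A client treats its cached copy as consistent while the callback holds. */
static int cached_copy_usable(const struct afs_callback *cb, time_t now) {
    return cb->type != CB_DROPPED && now < cb->expires;
}

int main(void) {
    struct afs_callback cb = { time(NULL) + 300, CB_SHARED };
    printf("cached copy usable now? %d\n", cached_copy_usable(&cb, time(NULL)));
    return 0;
}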
AFS: Callback Invalidation (diagram) • 1. A client sends store(412) to the Vice daemon • 2. The server writes FID 412 • 3. The server sends invalidate(412) to the other clients caching FID 412; a client caching only FID 492 is not notified
AFS: Callback Issues • The callback carries no description of why it was issued (portions modified, data appended, saved with no data changed, file moved, etc.) • The client has to re-download the entire file when reading; there is no support for differential updates • If an application reads more data, the file is re-downloaded, but the updates may not be reflected in the running application • If a user is reading past the changed portion of a file, the application is unaware of the change
AFS: Volumes • A collection of files • Not tied to a directory path; mounted to a directory • Venus on the client maps the pathname to a FID • Vice on the server retrieves the file by FID • Less directory traversal on the server (see the FID sketch below)
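The volume/FID indirection can be sketched as follows. An AFS file identifier is commonly described as a (volume, vnode, uniquifier) triple; the hard-coded lookup table, path, and numbers below are illustrative only.

#include <stdio.h>
#include <string.h>

struct afs_fid {
    unsigned int volume;  /* which volume the file lives in        */
    unsigned int vnode;   /* index of the file within that volume  */
    unsigned int unique;  /* distinguishes reuse of the same vnode */
};

/* Venus (client side): map a pathname to a FID once, then talk to the
 * server in terms of FIDs so Vice never has to walk the directory path. */
static int path_to_fid(const char *path, struct afs_fid *out) {
    if (strcmp(path, "/afs/example.edu/home/user/notes.txt") == 0) {
        out->volume = 536870913;
        out->vnode = 42;
        out->unique = 1;
        return 0;
    }
    return -1;  /* unknown path: would require directory lookups */
}

int main(void) {
    struct afs_fid fid;
    if (path_to_fid("/afs/example.edu/home/user/notes.txt", &fid) == 0)
        printf("FID = %u.%u.%u\n", fid.volume, fid.vnode, fid.unique);
    return 0;
}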
AFS: Server Scalability • Server replication: multiple servers act as a single logical server • The server keeps track of clients in system memory using threads • Clients send a heartbeat to the server to make sure it is alive • Volumes can be located on any server and moved to any other server • Read-only volume clones distribute load across physical servers • All servers share the same common name space (/afs/……..) • The local server name space can be unique where volumes are mounted (/afs/server2, /afs/home/server3) • AFS servers hold links to other AFS servers for volume locations, so servers know which server has the volume containing a given file
AFS: Fault Handling • Client crash (worst case): upon boot to the OS, the client checks its local cache against the server for consistency, since its cached state may be wrong • Server crash: start fresh; clients detect the crash from missed heartbeats • Upon reconnection, clients re-establish communication, the server rebuilds its client list, and clients re-check file consistency (see the recovery sketch below)
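A minimal sketch of the recovery behaviour above, assuming a hypothetical validate_with_server() RPC: after either kind of crash the client forgets its callback promises and re-checks every cached file with the server before trusting it.

#include <stdio.h>

#define CACHE_SLOTS 3

struct cached_file {
    const char *path;
    int callback_valid;  /* promise from the server, if any */
};

static struct cached_file cache[CACHE_SLOTS] = {
    { "/afs/example.edu/home/user/a.txt", 1 },
    { "/afs/example.edu/home/user/b.txt", 1 },
    { NULL, 0 },
};

/* Hypothetical RPC: ask the server whether our cached copy is still good. */
static int validate_with_server(const char *path) {
    printf("re-validating %s with the server\n", path);
    return 1;  /* pretend the cached copy is still current */
}

/* Run after a client reboot, or after missed heartbeats reveal a server restart. */
static void recover(void) {
    for (int i = 0; i < CACHE_SLOTS && cache[i].path; i++) {
        cache[i].callback_valid = 0;  /* forget any old promises */
        cache[i].callback_valid = validate_with_server(cache[i].path);
    }
}

int main(void) {
    recover();
    return 0;
}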
AFS: Weak Consistency • Condition: two or more clients have the same file open, modify it, and close it so it is written back • Result: the client whose store() is received by the server LAST wins; its copy becomes the current file (a toy illustration follows)
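A toy illustration of the last-writer-wins outcome: two clients edit their cached copies independently, and whichever store() the server receives last silently replaces the other's changes. The store() function here is a stand-in, not the real Vice interface.

#include <stdio.h>

static char server_copy[64] = "original contents";

/* Server side: a store() simply replaces the whole file. */
static void store(const char *client, const char *new_contents) {
    snprintf(server_copy, sizeof server_copy, "%s", new_contents);
    printf("store() from %s accepted; file is now \"%s\"\n", client, server_copy);
}

int main(void) {
    /* Both clients modified their cached copies of the same file. */
    store("client A", "A's edits");  /* arrives first           */
    store("client B", "B's edits");  /* arrives last and "wins" */
    printf("final file: \"%s\" (A's changes are lost)\n", server_copy);
    return 0;
}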
AFS: Why Weak Consistency • The majority of all DFS accesses are reads • In a university, users rarely modify the same files simultaneously; users work out of home directories • Simplicity of implementation allows multi-platform implementations • Does not add complexity to crash recovery: no need to resume from a crash point
AFS: Access Control Lists • Standard Unix/Linux permissions are based on owner/group/other • ACLs allow refined control per user or group • Example: you want to share a directory with only one other person so they can read files • Linux/Unix: make a group, give the group read access, add the user to the group • ACLs: add the user with read permission • Months later, you want to give someone else read/write access • Linux/Unix: with a single group per directory you cannot do this without giving "other" read access, so everyone gains read access • ACLs: add that user with read/write permissions (a minimal sketch follows)
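A minimal sketch of per-directory ACL entries contrasted with the single owner/group/other triple; the structure and rights bits are invented for illustration and are not OpenAFS's actual format (in OpenAFS this kind of control is exercised with the fs setacl and fs listacl commands).

#include <stdio.h>
#include <string.h>

enum rights { R_READ = 1, R_WRITE = 2, R_ADMIN = 4 };

struct acl_entry {
    char name[32];  /* user or group the entry applies to */
    int rights;     /* bitmask of rights granted          */
};

/* A directory can carry many independent entries, one per user or group. */
static struct acl_entry dir_acl[8];
static int n_entries;

static void acl_add(const char *name, int rights) {
    snprintf(dir_acl[n_entries].name, sizeof dir_acl[n_entries].name, "%s", name);
    dir_acl[n_entries].rights = rights;
    n_entries++;
}

static int acl_check(const char *name, int wanted) {
    for (int i = 0; i < n_entries; i++)
        if (strcmp(dir_acl[i].name, name) == 0)
            return (dir_acl[i].rights & wanted) == wanted;
    return 0;
}

int main(void) {
    acl_add("alice", R_READ);          /* share read-only with one person    */
    acl_add("bob", R_READ | R_WRITE);  /* later: read/write for someone else */
    printf("alice may write? %d  bob may write? %d\n",
           acl_check("alice", R_WRITE), acl_check("bob", R_WRITE));
    return 0;
}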
AFS: Advantages • First-read performance is similar to other DFSs • Second-read performance is improved in almost all cases, and read requests far outnumber write requests • Creating new files performs similarly to other DFSs • Uses ACLs instead of default file system permissions • For read-heavy scenarios, supports a larger client-to-server ratio • Volumes can be migrated to other AFS servers without interruption • Kerberos authentication allows access over insecure networks • Built into the kernel, so user login provides authentication and UNIX/Linux applications can use AFS without modification
AFS: Disadvantages • The entire file must be downloaded before it can be used, causing noticeable latency when accessing a file for the first time • Modifications require the entire file to be uploaded to the server • Short reads in large files are much slower than in other DFSs • No simultaneous editing of files
AFS: Contributions • AFS heavily influenced NFSv4 • Basis of the Open Software Foundation's Distributed Computing Environment • Framework for distributed computing in the early 1990s • Current implementations: OpenAFS, Arla, Transarc (IBM), Linux kernel (since v2.6.10)
AFS: Suggested Ideas • Automatic download of the file when the server sends a consistency invalidation • Smart invalidation: determine whether a client actually needs to re-download (if the user is reading past the changed portion of the file, there is no need to re-download the whole thing) • Support differential updates: send only information about what changed (a hypothetical sketch follows)
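The differential-update suggestion could look something like the sketch below: the server describes only the changed byte range and the client patches its cached copy in place. This is entirely hypothetical; AFS as described in these slides does not support it.

#include <stdio.h>
#include <string.h>

/* A single change to a file, as the server might describe it. */
struct delta {
    size_t offset;      /* where the change begins         */
    size_t length;      /* how many bytes were changed     */
    const char *bytes;  /* the new contents for that range */
};

/* Apply a delta to the locally cached copy instead of re-fetching the file. */
static void apply_delta(char *cached, size_t cached_len, const struct delta *d) {
    if (d->offset + d->length <= cached_len)
        memcpy(cached + d->offset, d->bytes, d->length);
}

int main(void) {
    char cached[] = "hello world";
    struct delta d = { 6, 5, "there" };  /* server: bytes 6..10 changed */
    apply_delta(cached, sizeof cached - 1, &d);
    printf("%s\n", cached);              /* prints "hello there"        */
    return 0;
}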
AFS Performance • Andrew Benchmark (still sometimes used today): a multi-stage simulation of a typical user performing file access, file writes, and program compilation • Measures response time for creating files of various sizes inside and outside AFS servers: how long until the file is available to be used? • AFS performance was roughly half that of a file stored locally on a hard drive
AFS Performance Varying the count of small files
AFS Performance Varying the size of one file
AFS Performance • The largest impact comes when creating very many small files or very large files • The extra overhead is directly proportional to the total number of bytes in the file(s) • Each individual file has its own additional overhead, but it is not easy to detect until the number of files gets very large
AFS Performance • AFS: server-initiated invalidation • NFS: client-initiated invalidation • Server-initiated invalidation performs better than client-initiated invalidation
AFS Performance Network Traffic Comparison
Bibliography
"AFS and Performance." University of Michigan. Web. Accessed 16 May 2014. <http://csg.sph.umich.edu/docs/unix/afs/>
"Andrew File System." Wikipedia. Wikimedia Foundation, 05 July 2014. Web. Accessed 16 May 2014. <http://en.wikipedia.org/wiki/Andrew_File_System>
"The Andrew File System." University of Wisconsin. Web. Accessed 16 May 2014. <http://pages.cs.wisc.edu/~remzi/OSTEP/dist-afs.pdf>
Coulouris, George F. Distributed Systems: Concepts and Design. 5th ed. Boston: Addison-Wesley, 2012. Print.
Howard, John H. "An Overview of the Andrew File System." Winter 1988 USENIX Conference Proceedings. USENIX, 1988.
Kazar, M. L. "Synchronization and Caching Issues in the Andrew File System." Proceedings of the USENIX Winter Technical Conference. USENIX, 1988.