
Scale and Performance in a Distributed File System


Presentation Transcript


  1. Scale and Performance in a Distributed File System John H. Howard et al. ACM Transactions on Computer Systems, 1988 Presented by Gangwon Jo, Sangkuk Kim

  2. Andrew File System • Andrew • Distributed computing environment for Carnegie Mellon University • 5,000 – 10,000 Andrew workstations in CMU • Andrew File System • Distributed file system for Andrew • Files are distributed across multiple servers • Presents a homogeneous file name space to all the client workstations

  3. Andrew File System (contd.) [Diagram: several Vice servers, each running on the Unix kernel with attached disks, connected over the network to client workstations, each running user programs and Venus on the Unix kernel with a local disk]

  4. Andrew File System (contd.) • Design goal: Scalability • As much work as possible is performed by Venus • Solution: Caching • Venus caches files from Vice • Venus contacts Vice only when a file is opened or closed • Reading and writing are performed directly on the cached copy [Diagram: a Vice server with its disks connected over the network to a client running a user program and Venus with a local disk]

  5.–15. Andrew File System (contd.) [Animation frames repeating the slide-4 bullets; the diagrams step through the caching protocol: the user program issues open(A), Venus fetches A from Vice into the local disk cache, reads and writes go directly to the cached copy, close(A) ships the modified copy A’ back to Vice, and a subsequent open(A) finds A’ both on the server and in the local cache]
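To make the open/close flow in slides 4–15 concrete, here is a minimal sketch in plain Python. Vice and Venus are in-memory stand-ins invented for illustration; none of the names or interfaces below come from the actual AFS code.

```python
# Toy model of whole-file caching: Venus talks to Vice only on open and close;
# reads and writes touch only the locally cached copy.

class Vice:
    """Server stand-in: stores whole files keyed by name."""
    def __init__(self):
        self.files = {"A": b"original contents"}

    def fetch(self, name):            # called by Venus when a file is opened
        return self.files[name]

    def store(self, name, data):      # called by Venus when a dirty file is closed
        self.files[name] = data


class Venus:
    """Client cache manager stand-in."""
    def __init__(self, vice):
        self.vice = vice
        self.cache = {}               # models the local disk cache: name -> bytes
        self.dirty = set()

    def open(self, name):
        if name not in self.cache:    # cache miss: fetch the entire file
            self.cache[name] = self.vice.fetch(name)
        return name                   # the "descriptor" is just the name here

    def read(self, name):
        return self.cache[name]       # no server traffic

    def write(self, name, data):
        self.cache[name] = data       # modify the cached copy only
        self.dirty.add(name)

    def close(self, name):
        if name in self.dirty:        # ship the modified copy back to the server
            self.vice.store(name, self.cache[name])
            self.dirty.discard(name)


vice = Vice()
venus = Venus(vice)
f = venus.open("A")                   # fetch A
venus.write(f, b"updated contents")   # becomes A' in the local cache
venus.close(f)                        # A' is written back to Vice
assert vice.files["A"] == b"updated contents"
```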

  16.–17. Outline • Building a prototype • Qualitative Observation • Performance Evaluation • Changes for performance • Performance Evaluation • Comparison with a Remote-Open File System • Change for operability • Conclusion

  18. The Prototype • Preserve directory hierarchy • Each server contained a directory hierarchy mirroring the structure of the Vice files [Diagram: the Vice name space (a/, b/, c/, ...) as mirrored on a server disk, with entries for subtrees held on Server 2 and Server 3; the client disk holds Venus's file cache and status cache]

  19. The Prototype (contd.) • Preserve directory hierarchy • Each server contained a directory hierarchy mirroring the structure of the Vice files • .admin directories contain Vice file status information • Stub directories represent portions of the name space located on other servers

  20. The Prototype (contd.) • Preserve directory hierarchy • The Vice-Venus interface names files by their full pathname [Diagram: Venus requests the file a/a1 from Vice by pathname]
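A rough sketch of the prototype's path-based lookup on a server, assuming an invented directory layout. Stub entries model the portions of the hierarchy stored on other servers; the .admin status directories are omitted for brevity.

```python
# Each server mirrors the Vice hierarchy; a lookup either reaches file data
# or hits a stub entry that tells Venus which server to contact instead.

SERVER_TREE = {
    "a": {"a1": "contents of a1", "a2": "contents of a2"},
    "b": ("stub", "Server 2"),                      # subtree b/ lives elsewhere
    "c": {"c1": ("stub", "Server 3"), "c2": "contents of c2"},
}

def lookup(tree, pathname):
    """Resolve a full pathname, the way the prototype's interface names files."""
    node = tree
    for component in pathname.strip("/").split("/"):
        if isinstance(node, tuple) and node[0] == "stub":
            return ("forward to", node[1])          # Venus must retry at that server
        node = node[component]
    if isinstance(node, tuple) and node[0] == "stub":
        return ("forward to", node[1])
    return ("data", node)

print(lookup(SERVER_TREE, "a/a1"))      # ('data', 'contents of a1')
print(lookup(SERVER_TREE, "c/c1/c11"))  # ('forward to', 'Server 3')
```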

  21. The Prototype (contd.) • Dedicated processes • One server process for each client

  22. The Prototype (contd.) • Use two caches • One for files, and the other for status information about files

  23. The Prototype (contd.) • Verify the cached timestamp on each open • Before using a cached file, Venus verifies its timestamp with that on the server [Diagram: Venus asks Vice whether its cached copy of a/a1 with timestamp 5 is still current; Vice answers OK]
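A minimal sketch of the verify-on-open check, assuming a simple integer timestamp as each file's version; the fetch helper and the numbers are invented.

```python
# Even on a cache hit, every open costs a validation round trip to the server.

server_timestamps = {"a/a1": 5}                         # authoritative versions on Vice
cache = {"a/a1": {"timestamp": 5, "data": b"cached copy of a/a1"}}

def fetch(path):
    """Stand-in for fetching the whole file plus its current timestamp."""
    return b"fresh copy from Vice", server_timestamps[path]

def open_with_validation(path):
    entry = cache.get(path)
    if entry is not None and entry["timestamp"] == server_timestamps[path]:
        return entry["data"]                            # valid: use the cached copy
    data, ts = fetch(path)                              # stale or missing: refetch
    cache[path] = {"timestamp": ts, "data": data}
    return data

print(open_with_validation("a/a1"))                     # cached copy is still valid
server_timestamps["a/a1"] = 6                           # someone updated the file
print(open_with_validation("a/a1"))                     # now refetched from Vice
```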

  24. Qualitative Observation • stat primitive • Testing the presence of files, obtaining status information, ... • Programs using stat run much slower than the authors expected • Each stat involves a cache validity check • Dedicated processes • Excessive context-switching overhead • High virtual memory paging demands • File location • Difficult to move users’ directories between servers

  25. Performance Evaluation • Experience: the prototype was used at CMU • The authors + 400 other users • 100 workstations and 6 servers • Benchmark • A command script operating on source files • MakeDir → Copy → ScanDir → ReadAll → Make • Multiple clients (load units) run the benchmark simultaneously
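A sketch of a benchmark driver in the spirit of these five phases. The directory names and the make invocation are placeholders, not the paper's actual command script.

```python
# Times each phase of a MakeDir -> Copy -> ScanDir -> ReadAll -> Make run.
import os, shutil, subprocess, time

SRC, DST = "andrew-src", "andrew-run"      # hypothetical source tree and target dir

def timed(name, fn):
    start = time.time()
    fn()
    print(f"{name}: {time.time() - start:.2f}s")

timed("MakeDir", lambda: os.makedirs(DST, exist_ok=True))
timed("Copy",    lambda: shutil.copytree(SRC, os.path.join(DST, "src")))
timed("ScanDir", lambda: [os.stat(os.path.join(d, f))
                          for d, _, files in os.walk(DST) for f in files])
timed("ReadAll", lambda: [open(os.path.join(d, f), "rb").read()
                          for d, _, files in os.walk(DST) for f in files])
timed("Make",    lambda: subprocess.run(["make", "-C", os.path.join(DST, "src")],
                                        check=False))
```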

  26. Performance Evaluation (contd.) • Cache hit ratio • File cache: 81% • Status cache: 82%

  27. Performance Evaluation (contd.) • Distribution of Vice calls in the prototype, on average

  28. Performance Evaluation (contd.) • Server usage • CPU utilizations are up to 40% • Disk utilizations are less than 15% • Server loads are imbalanced

  29. Performance Evaluation (contd.) • Benchmark performance • Time for TestAuth rises rapidly beyond a load of 5

  30. Performance Evaluation (contd.) • Caches work well! • We need to • Reduce the frequency of cache validity checks • Reduce the number of server processes • Require workstations rather than the servers to do pathname traversals • Balance server usage by reassigning users

  31. Outline • Building a prototype • Qualitative Observation • Performance Evaluation • Changes for performance • Performance Evaluation • Comparison with a Remote-Open File System • Change for operability • Conclusion

  32. Changes for Performance • Cache management: use callbacks • Vice notifies Venus if a cached file or directory is modified by another workstation • Cache entries are valid unless otherwise notified • Verification is not needed • Vice and Venus each maintain callback state information
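A minimal single-threaded sketch of callback-based invalidation. The classes and method names are invented for illustration and are not the real Vice/Venus RPC interface.

```python
# A fetch registers a callback promise; a store breaks the callbacks of every
# other client caching that file, so caches are valid unless notified.

class ViceWithCallbacks:
    def __init__(self):
        self.files = {}
        self.callbacks = {}                       # path -> set of client objects

    def fetch(self, client, path):
        self.callbacks.setdefault(path, set()).add(client)   # promise to notify
        return self.files[path]

    def store(self, client, path, data):
        self.files[path] = data
        for other in self.callbacks.get(path, set()) - {client}:
            other.break_callback(path)            # tell other caches they are stale
        self.callbacks[path] = {client}


class VenusWithCallbacks:
    def __init__(self, vice):
        self.vice, self.cache = vice, {}

    def open(self, path):
        if path not in self.cache:                # valid-unless-notified: no check RPC
            self.cache[path] = self.vice.fetch(self, path)
        return self.cache[path]

    def close_after_write(self, path, data):
        self.cache[path] = data
        self.vice.store(self, path, data)

    def break_callback(self, path):               # invoked by Vice
        self.cache.pop(path, None)


vice = ViceWithCallbacks()
vice.files["doc"] = b"v1"
a, b = VenusWithCallbacks(vice), VenusWithCallbacks(vice)
a.open("doc"); b.open("doc")
a.close_after_write("doc", b"v2")                 # b's cached copy is invalidated
assert "doc" not in b.cache and b.open("doc") == b"v2"
```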

  33. Changes for Performance (contd.) • Name resolution and storage representation • CPU overhead is caused by the namei routine • Maps a pathname to an inode • Identify files by fids instead of pathnames • A volume is a collection of files located on one server • A volume contains multiple vnodes, which identify files within the volume • The uniquifier allows reuse of vnode numbers • Fid layout: volume number (32 bits) | vnode number (32 bits) | uniquifier (32 bits)
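The 96-bit fid from slide 33 can be illustrated with a small pack/unpack helper; the byte order is chosen arbitrarily for the sketch.

```python
import struct

def pack_fid(volume, vnode, uniquifier):
    # three unsigned 32-bit fields: volume number, vnode number, uniquifier
    return struct.pack(">III", volume, vnode, uniquifier)

def unpack_fid(fid_bytes):
    return struct.unpack(">III", fid_bytes)

fid = pack_fid(volume=7, vnode=42, uniquifier=3)
assert len(fid) == 12 and unpack_fid(fid) == (7, 42, 3)
```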

  34.–36. Changes for Performance (contd.) • Name resolution and storage representation [Diagram, built up over three slides: clients name files by fid (volume number, vnode number, uniquifier); a volume location database maps the volume number to the server holding that volume; on the server, a vnode lookup table maps the vnode number to the inode holding the file]

  37. Changes for Performance (contd.) • Name resolution and storage representation • Identify files by fids instead of pathnames • Each entry in a directory maps a component of a pathname to a fid • Venus performs the logical equivalent of a namei operation
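A sketch of this client-side resolution: cached directories, keyed by fid, map each name component to the next fid, so Venus can walk the path without contacting a server. The directory contents below are made up.

```python
ROOT_FID = (1, 1, 1)                    # (volume, vnode, uniquifier)

cached_dirs = {                         # fid of a directory -> {component: fid}
    ROOT_FID:  {"usr": (1, 7, 1)},
    (1, 7, 1): {"alice": (9, 2, 1)},    # crossing into volume 9
    (9, 2, 1): {"paper.tex": (9, 5, 2)},
}

def resolve(pathname, root=ROOT_FID):
    """Logical equivalent of namei, performed by the client."""
    fid = root
    for component in pathname.strip("/").split("/"):
        fid = cached_dirs[fid][component]   # no server traffic on a warm cache
    return fid

assert resolve("/usr/alice/paper.tex") == (9, 5, 2)
```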

  38. Changes for Performance (contd.) • Server process structure • Use lightweight processes (LWPs) instead of dedicated per-client processes • LWPs are not dedicated to a single client
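A sketch of the idea using a thread pool as a stand-in for LWPs: a fixed set of workers serves requests from any client, instead of one dedicated process per client. The request tuples are invented.

```python
from concurrent.futures import ThreadPoolExecutor

def handle(request):
    client, op, path = request
    return f"{op}({path}) served for {client}"

requests = [("client-1", "Fetch", "/a/a1"),
            ("client-2", "Fetch", "/c/c2"),
            ("client-1", "Store", "/a/a1")]

# Five workers can serve any number of clients; no per-client context to page in.
with ThreadPoolExecutor(max_workers=5) as pool:
    for result in pool.map(handle, requests):
        print(result)
```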

  39. Performance Evaluation • Scalability

  40. Performance Evaluation (contd.) • Server utilization during the benchmark

  41. Outline • Building a prototype • Qualitative Observation • Performance Evaluation • Changes for performance • Performance Evaluation • Comparison with a Remote-Open File System • Change for operability • Conclusion

  42. Comparison with A Remote-Open File System • Caching in the Andrew File System • Locality makes caching attractive • The whole-file transfer approach contacts servers only on opens and closes • Most files in a 4.2BSD environment are read in their entirety • Disk caches retain their entries across reboots • Caching of entire files simplifies cache management

  43. Comparison with A Remote-Open File System • The caching of the Andrew File System – drawbacks • Requires local disks • Files larger than the local disk cache cannot be handled • Strict emulation of 4.2BSD concurrent read/write semantics is impossible

  44. Comparison with A Remote-Open File System • Remote open • The data in a file are not fetched en masse • Instead, the remote site potentially participates in each individual read and write operation • The file is actually opened on the remote site rather than the local site • Example: NFS
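A toy contrast of the two read paths, counting server interactions; the block count and class names are invented purely to show where the traffic goes.

```python
class RemoteOpenClient:
    """Remote open: the server (potentially) participates in every read."""
    def __init__(self):
        self.server_requests = 0
    def read_block(self, path, block):
        self.server_requests += 1
        return b"block data"

class WholeFileClient:
    """Whole-file caching: one fetch on open, then purely local reads."""
    def __init__(self):
        self.server_requests = 0
        self.cache = {}
    def open(self, path):
        self.server_requests += 1
        self.cache[path] = [b"block data"] * 100
    def read_block(self, path, block):
        return self.cache[path][block]

ro, wf = RemoteOpenClient(), WholeFileClient()
wf.open("big-file")
for i in range(100):
    ro.read_block("big-file", i)
    wf.read_block("big-file", i)
print(ro.server_requests, wf.server_requests)   # 100 vs 1
```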

  45. Comparison with A Remote-Open File System

  46. Comparison with A Remote-Open File System • Serious functional problems with NFS at high loads [Figure: network traffic for Andrew and NFS]

  47. Comparison with A Remote-Open File System

  48. Comparison with A Remote-Open File System • Advantage of a remote-open file system: low latency [Figure: latency of Andrew and NFS]

  49. Outline • Building a prototype • Qualitative Observation • Performance Evaluation • Changes for performance • Performance Evaluation • Comparison with a Remote-Open File System • Change for operability • Conclusion

  50. Change for Operability • Volume • A collection of files forming a partial subtree of the Vice name space • Volumes are glued together at mount points • Operational transparency [Diagram: volumes residing on different servers, with one volume mounted within another to form a single name space]
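A sketch of how a pathname can cross mount points between volumes. The volume numbers, paths, and the volume location database below are invented; operational transparency comes from the fact that moving a volume only requires updating that database.

```python
volume_location_db = {1: "server-A", 2: "server-B", 3: "server-A"}   # volume -> server
mount_points = {(1, "/usr/project"): 2, (1, "/usr/alice"): 3}        # (volume, path) -> volume

def volume_for(path, root_volume=1):
    """Walk the path; crossing a mount point silently switches volumes."""
    volume, prefix = root_volume, ""
    for component in path.strip("/").split("/"):
        prefix += "/" + component
        volume = mount_points.get((volume, prefix), volume)
    return volume, volume_location_db[volume]

print(volume_for("/usr/project/src/main.c"))   # (2, 'server-B')
```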
