
Distributed File Systems



Presentation Transcript


  1. Distributed File Systems

  2. Agenda • Motivation • Distributed file system basics • Case studies • Summary and outlook

  3. Agenda • Motivation • Distributed file system basics • Case studies • Summary and outlook

  4. Motivation • ICT allows for distributed work • Users work separated in time and space • They need access to common data collections → Provided by distributed file systems (DFS) • Distributed work leads to new business models • 24/7 customer service • Analysis of worldwide financial information (stock prices etc.) → Economic relevance! • Different DFSs were developed in the past → Structured discussion necessary

  5. Agenda • Motivation • Distributed file system basics • Case studies • Summary and outlook

  6. Basics – Storage fundamentals • "Storage": Fundamental abstraction in computing • Data encapsulated in objects • Explicit creation and deletion • Unaffected by system failures • "File system": Refinement of abstraction • Three different usage dimensions • Single user vs. multiple users • Single-thread vs. multi-thread OS • Single site vs. multiple sites

  7. Basics – Requirements for DFS (1/2) • Transparency • User must be unaware of internal separation of components • Access, performance, location, scaling transparency • Availability • System should be fault tolerant • Concurrent updates • Simultaneous access to a single resource • Replication • File may be present at different locations • Shares load between servers, enhances fault tolerance

  8. Basics – Requirements for DFS (2/2) • Hardware and software heterogeneity • Support for various platforms • Consistency • Data integrity has to be maintained • Security • Access control, user authentication, confidentiality • Efficiency • Performance should be comparable to local file systems

  9. Basics – Abstract file service model (1/3) Source: [CDK01], p. 318

  10. Basics – Abstract file service model (2/3) Source: [CDK01], p. 319-322

  11. Basics – Abstract file service model (3/3) • Access control • Server-side user authorisation • Access rights checked upon directory lookup or every request • Hierarchical file structure • Realised within the client module • Directories may store references to other directories • File groups • Set of files that can be moved between servers • Similar to a file system
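The split described on this slide can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the model's definition from [CDK01]: the class and method names (`FlatFileService`, `DirectoryService`, `ClientModule`, `resolve`) and the simple per-file user set are assumptions made for the example. It shows the server checking access rights on every request, and the client module building the hierarchy by issuing one directory lookup per path component.

```python
class FlatFileService:
    """Server side: file contents keyed by unique file ID (UFID); no names, no hierarchy."""

    def __init__(self):
        self._data = {}      # ufid -> bytearray
        self._allowed = {}   # ufid -> set of user IDs with access (hypothetical ACL)
        self._next_ufid = 1

    def create(self, owner):
        ufid = self._next_ufid
        self._next_ufid += 1
        self._data[ufid] = bytearray()
        self._allowed[ufid] = {owner}
        return ufid

    def write(self, user, ufid, offset, payload):
        self._check(user, ufid)
        self._data[ufid][offset:offset + len(payload)] = payload

    def read(self, user, ufid, offset, count):
        self._check(user, ufid)
        return bytes(self._data[ufid][offset:offset + count])

    def _check(self, user, ufid):
        # Access rights are checked on the server for every request.
        if user not in self._allowed[ufid]:
            raise PermissionError(f"user {user!r} may not access file {ufid}")


class DirectoryService:
    """Server side: flat directories mapping a name to a file or directory reference."""

    ROOT = 0

    def __init__(self):
        self._dirs = {self.ROOT: {}}
        self._next_dir = 1

    def make_dir(self, parent, name):
        dir_id = self._next_dir
        self._next_dir += 1
        self._dirs[dir_id] = {}
        self._dirs[parent][name] = ("dir", dir_id)   # a directory referencing a directory
        return dir_id

    def add_name(self, dir_id, name, ufid):
        self._dirs[dir_id][name] = ("file", ufid)

    def lookup(self, dir_id, name):
        return self._dirs[dir_id][name]


class ClientModule:
    """Client side: the hierarchy exists only here, as one lookup per path component."""

    def __init__(self, user, files, dirs):
        self.user, self.files, self.dirs = user, files, dirs

    def resolve(self, path):
        kind, ref = "dir", DirectoryService.ROOT
        for part in path.strip("/").split("/"):
            if kind != "dir":
                raise NotADirectoryError(path)
            kind, ref = self.dirs.lookup(ref, part)
        return kind, ref

    def read_file(self, path, offset, count):
        kind, ufid = self.resolve(path)
        if kind != "file":
            raise IsADirectoryError(path)
        return self.files.read(self.user, ufid, offset, count)


files, dirs = FlatFileService(), DirectoryService()
home = dirs.make_dir(DirectoryService.ROOT, "home")
notes = files.create(owner="alice")
files.write("alice", notes, 0, b"hello DFS")
dirs.add_name(home, "notes.txt", notes)

alice = ClientModule("alice", files, dirs)
print(alice.read_file("/home/notes.txt", 0, 5))
```

Neither server-side class knows about path names; a different client module could impose a different naming scheme on the same two services.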

  12. Agenda • Motivation • Distributed file system basics • Case studies • Network File System (NFS) • Andrew File System (AFS) • Lustre • Summary and outlook

  13. NFS – History • 198?: NFSv1 • Developed at Sun Microsystems, unreleased • 1984: NFSv2 • Developed at Sun Microsystems • First released version, widely accepted • Supports files < 4GB, synchronous writes • 1992: NFSv3 • Developed by a group of researchers • Overcomes drawbacks (file size, asynchronous writes) • 2002: NFSv4 • Enhanced security, user authentication • Better Windows support

  14. NFS – General description (1/2) Source: [CDK01], p. 324

  15. NFS – General description (2/2) • Stateless protocol • Server does not maintain client states • Client requests are blocking (Exception: asynchronous write) • User authentication • Default: UNIX user ID (insecure!) • Optional: Kerberos, DES • Caching • Read cache: Yes • Write cache: No! • Server file system • Not restricted, should support unique file IDs
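A stateless read can be illustrated as follows. This is a sketch under stated assumptions, not the NFS wire protocol: `ReadRequest`, `StatelessServer`, and `Client` are hypothetical names, and the "disk" is a plain dictionary. The point is that every request carries the file handle, offset, count, and user ID, so the server needs no memory of earlier calls and a restarted server can answer the next request as if nothing had happened.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadRequest:
    """Everything the server needs travels in the request itself."""
    file_handle: int   # identifies the file on the server
    offset: int        # where to start reading
    count: int         # how many bytes to read
    uid: int           # UNIX user ID, taken at face value (the insecure default)


class StatelessServer:
    def __init__(self, files):
        self._files = files   # file handle -> bytes; stored data, not client state

    def read(self, req):
        data = self._files[req.file_handle]
        return data[req.offset:req.offset + req.count]


class Client:
    """The read position is kept on the client, never on the server."""

    def __init__(self, server, file_handle, uid):
        self.server = server
        self.file_handle = file_handle
        self.uid = uid
        self.position = 0

    def read(self, count):
        req = ReadRequest(self.file_handle, self.position, count, self.uid)
        data = self.server.read(req)   # blocks until the reply arrives
        self.position += len(data)
        return data


disk = {42: b"stateless protocol"}
client = Client(StatelessServer(disk), file_handle=42, uid=1000)
first = client.read(9)

# Simulate a server crash and restart: a fresh server object, same disk contents.
client.server = StatelessServer(disk)
second = client.read(9)
print(first, second)
```

Because a request names its own offset, sending the same read twice returns the same bytes, which is what makes simple retransmission after a timeout safe.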

  16. NFS – Abstract model (1/2) vs.

  17. NFS – Abstract model (2/2) • Operations • Similar to UNIX file system calls • All abstract operations can be represented • Access control • Checked upon every request • Hierarchical file system • Realised within the client module • File groups • Not supported, only manual movement of files

  18. NFS – Requirements • Transparency • Availability • Concurrent updates • Replication • Heterogeneity • Consistency • Security • Efficiency [per-requirement rating symbols lost in extraction]

  19. Agenda • Motivation • Distributed file system basics • Case studies • Network File System (NFS) • Andrew File System (AFS) • Lustre • Summary and outlook

  20. AFS – History • 1982: Initial version • Developed at Carnegie Mellon University (CMU), Pittsburgh • Part of the Andrew distributed computing environment • Provides support for teaching and research • 1989: Spin-off • Development outsourced to Transarc Inc. • 1994: Transarc acquired by IBM • All rights owned by IBM • 2000: Open-source • Code was released under an open source license • Since then: continuous development

  21. AFS – General description (1/3)

  22. AFS – Name spaces

  23. AFS – General description (2/3) Cached? No!

  24. AFS – General description (3/3) • Caching • "Callback promises" • Workstations are notified when cached files change • Stateful protocol • Server maintains client states • Problematic when client fails • User authentication • Kerberos • Server file system • Not restricted, should support unique file IDs
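The callback mechanism can be sketched as a small simulation. The names `FileServer`, `Workstation`, and `break_callback` are assumptions for illustration, not the AFS interface, and the sketch simplifies by writing a file back to the server when it is closed. It shows why the protocol is stateful: the server has to remember which workstations hold a cached copy so that it can notify them when the file changes.

```python
class FileServer:
    """Stateful server: remembers which workstations cache which file."""

    def __init__(self, files):
        self._files = dict(files)
        self._promises = {}   # file name -> set of workstations holding a promise

    def fetch(self, workstation, name):
        # Handing out a copy includes a promise to report later changes.
        self._promises.setdefault(name, set()).add(workstation)
        return self._files[name]

    def store(self, writer, name, data):
        self._files[name] = data
        # Break the promise for every other workstation caching this file.
        for workstation in self._promises.get(name, set()) - {writer}:
            workstation.break_callback(name)
        self._promises[name] = {writer}


class Workstation:
    def __init__(self, server):
        self.server = server
        self._cache = {}      # file name -> cached contents
        self._valid = set()   # names whose callback promise is still intact
        self.fetches = 0      # how often the server had to be contacted for data

    def open(self, name):
        if name not in self._valid:
            self._cache[name] = self.server.fetch(self, name)
            self._valid.add(name)
            self.fetches += 1
        return self._cache[name]

    def close(self, name, data):
        # Modified contents go back to the server when the file is closed.
        self._cache[name] = data
        self._valid.add(name)
        self.server.store(self, name, data)

    def break_callback(self, name):
        self._valid.discard(name)


server = FileServer({"paper.txt": b"draft 1"})
a, b = Workstation(server), Workstation(server)

a.open("paper.txt")
a.open("paper.txt")                # served from the local cache, no server contact
b.open("paper.txt")
b.close("paper.txt", b"draft 2")   # the server breaks a's callback promise
latest = a.open("paper.txt")       # a's copy is invalid, so it is fetched again
print(latest, a.fetches)
```

If a workstation crashes, the entries in `_promises` that point to it become stale, which is the failure case the slide calls problematic.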

  25. AFS – Abstract model (1/2) vs.

  26. AFS – Abstract model (2/2) • Operations • Differ from abstract model • Some operations combined, callback promises added • Access control • Rights checked upon every request • Extended access lists per directory • Hierarchical file system • Realised within the client module • File groups • File identifier contains link to file group • Location database maps file groups to servers
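The file-group indirection can be shown with a small lookup table. The identifier layout and the names `FileId`, `LocationDatabase`, `place`, and `server_for` are assumptions for this sketch, as are the group number and server names. It illustrates that a file identifier names its file group rather than a server, so moving a group only changes one entry in the location database.

```python
from typing import NamedTuple


class FileId(NamedTuple):
    group: int    # file group the file belongs to
    number: int   # file number within that group


class LocationDatabase:
    def __init__(self):
        self._server_of = {}   # file group -> server name

    def place(self, group, server):
        """Record (or change) which server holds a file group."""
        self._server_of[group] = server

    def server_for(self, fid):
        return self._server_of[fid.group]


locations = LocationDatabase()
locations.place(7, "server-a")
report = FileId(group=7, number=311)
before = locations.server_for(report)

locations.place(7, "server-b")   # the whole group moves; file IDs stay the same
after = locations.server_for(report)
print(before, after)
```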

  27. AFS – Requirements • Transparency • Availability • Concurrent updates • Replication • Heterogeneity • Consistency • Security • Efficiency [per-requirement rating symbols lost in extraction]

  28. Agenda • Motivation • Distributed file system basics • Case studies • Network File System (NFS) • Andrew File System (AFS) • Lustre • Summary and outlook

  29. Lustre (1/3) • "Lustre": Linux Cluster • File system especially suited for clusters • Easily handles thousands of clients and servers • Uses object-based storage • Objects offer methods for data access, attributes, policies • High-level abstraction • Lower performance than block-based storage • Three system roles • Object Storage Targets (OST) • Metadata Servers (MDS) • Clients

  30. Lustre (2/3) [Figure: interaction between Clients, Metadata Servers (MDS) and Object Storage Targets (OST); connection labels: "File operations, locking", "Directory metadata", "Recovery, file status"] Source: [BS02], p. 51

  31. Lustre (3/3) • Lustre partly follows abstract model • Separation of directory and flat file service • File attributes managed by OSTs • Hierarchical file systems • Realised within the client module • High availability • Heavy use of redundancy • Caching of metadata
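The division of labour between the three roles can be sketched as follows. This is a toy model, not Lustre's interface: the class names, the object IDs, and the round-robin placement of file chunks across OSTs are assumptions made for the example. It shows the metadata server answering only the question "which objects make up this file", while the data and the size attribute come from the object storage targets.

```python
class ObjectStorageTarget:
    """Holds object data and the attributes derived from it."""

    def __init__(self):
        self._objects = {}   # object id -> bytes

    def write(self, oid, data):
        self._objects[oid] = data

    def read(self, oid):
        return self._objects[oid]

    def size(self, oid):
        return len(self._objects[oid])


class MetadataServer:
    """Holds the name space: which objects on which OSTs make up a file."""

    def __init__(self):
        self._layouts = {}   # path -> list of (OST index, object id)

    def create(self, path, layout):
        self._layouts[path] = list(layout)

    def lookup(self, path):
        return self._layouts[path]


class LustreClient:
    def __init__(self, mds, osts):
        self.mds, self.osts = mds, osts

    def write(self, path, chunks):
        layout = []
        for i, chunk in enumerate(chunks):
            ost_index = i % len(self.osts)   # assumed round-robin placement
            oid = f"{path}#{i}"
            self.osts[ost_index].write(oid, chunk)
            layout.append((ost_index, oid))
        self.mds.create(path, layout)

    def read(self, path):
        # One metadata lookup, then data comes straight from the OSTs.
        return b"".join(self.osts[i].read(oid) for i, oid in self.mds.lookup(path))

    def size(self, path):
        # The size attribute is assembled from the OSTs, not stored on the MDS.
        return sum(self.osts[i].size(oid) for i, oid in self.mds.lookup(path))


mds = MetadataServer()
osts = [ObjectStorageTarget(), ObjectStorageTarget()]
client = LustreClient(mds, osts)
client.write("/results/run1", [b"abc", b"defg", b"h"])
print(client.read("/results/run1"), client.size("/results/run1"))
```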

  32. Agenda • Motivation • Distributed file system basics • Case studies • Summary and outlook

  33. Summary and outlook • Abstract file service model • Developed to meet many requirements for DFSs • Different implementations • NFS: Stateless, concurrency control • AFS: Stateful, heavy use of caching, better performance • Other approach: Lustre • Modularised approach, especially suited for clusters • Future developments • Large-scale environments • Cloud computing • Issues: Data security, privacy

  34. Thank you for your attention! Any questions?

  35. Literature • [BS02] Peter J. Braam, Philip Schwan: Lustre: The intergalactic file system, Proceedings of the 2002 Ottawa Linux Symposium, pp. 50–54, 2002. • [CDK01] George Coulouris, Jean Dollimore, Tim Kindberg: Distributed Systems: Concepts and Design, 3rd ed., Addison-Wesley, 2001. • [Kir06] Olaf Kirch: Why NFS Sucks, Proceedings of the Linux Symposium, 2nd ed., pp. 51–63, 2006. • [MSC+86] James H. Morris, Mahadev Satyanarayanan, Michael H. Conner, John H. Howard, David S. H. Rosenthal, F. Donelson Smith: Andrew: A distributed personal computing environment, Communications of the ACM, 29(3), pp. 184–201, Association for Computing Machinery, 1986. • [PJS+94] Brian Pawlowski, Chet Juszczak, Peter Staubach, Carl Smith, Diane Lebel, David Hitz: NFS Version 3: Design and Implementation, Proceedings of the Summer 1994 USENIX Technical Conference, pp. 137–151, 1994. • [Sat89] Mahadev Satyanarayanan: Distributed file systems, Distributed systems, S. Mullender (ed.), pp. 149–188, ACM Press, 1989. • [Sch03] Philip Schwan: Lustre: Building a file system for 1000-node clusters, Proceedings of the 2003 Ottawa Linux Symposium, pp. 380–386, 2003. • [Tan03] Andrew S. Tanenbaum: Moderne Betriebssysteme, 2nd ed., Prentice Hall, 2003. • [Tv07] Andrew S. Tanenbaum, Maarten van Steen: Distributed Systems: Principles and Paradigms, 2nd ed., Prentice Hall, 2007.
