Distributed File Systems
Agenda • Motivation • Distributed file system basics • Case studies • Summary and outlook
Motivation • ICT allows for distributed work • Users work separated in time and space • They need access to common data collections → Provided by distributed file systems (DFS) • Distributed work leads to new business models • 24/7 customer service • Analysis of worldwide financial information (stock prices etc.) → Economic relevance! • Different DFSs were developed in the past → Structured discussion necessary
Basics – Storage fundamentals • "Storage": Fundamental abstraction in computing • Data encapsulated in objects • Explicit creation and deletion • Unaffected by system failures • "File system": Refinement of abstraction • Three different usage dimensions • Single user vs. multiple users • Single-thread vs. multi-thread OS • Single site vs. multiple sites
Basics – Requirements for DFS (1/2) • Transparency • User must be unaware of internal separation of components • Access, performance, location, scaling transparency • Availability • System should be fault tolerant • Concurrent updates • Simultaneous access to a single resource • Replication • File may be present at different locations • Shares load between servers, enhances fault tolerance
Basics – Requirements for DFS (2/2) • Hardware and software heterogeneity • Support for various platforms • Consistency • Data integrity has to be maintained • Security • Access control, user authentication, confidentiality • Efficiency • Performance should be comparable to local file systems
Basics – Abstract file service model (1/3) Source: [CDK01], p. 318
Basics – Abstract file service model (2/3) Source: [CDK01], p. 319-322
Basics – Abstract file service model (3/3) • Access control • Server-side user authorisation • Access rights checked upon directory lookup or every request • Hierarchical file structure • Realised within the client module • Directories may store references to other directories • File groups • Set of files that can be moved between servers • Similar to a file system
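The split described above, with a hierarchical structure realised within the client module on top of flat services, can be illustrated with a minimal sketch. All class and method names (`FlatFileService`, `DirectoryService`, `ClientModule`, `resolve`) are hypothetical and chosen for illustration only; access control and file groups are left out.

```python
from itertools import count


class FlatFileService:
    """Stores file contents under unique file IDs; knows nothing about names."""

    def __init__(self):
        self._files = {}
        self._next_id = count(1)

    def create(self, data=b""):
        ufid = next(self._next_id)
        self._files[ufid] = data
        return ufid

    def read(self, ufid):
        return self._files[ufid]


class DirectoryService:
    """Maps names to IDs, one flat directory at a time."""

    def __init__(self):
        self._dirs = {}  # directory ID -> {name: ID of a file or directory}
        self._next_id = count(1)

    def make_directory(self):
        dir_id = next(self._next_id)
        self._dirs[dir_id] = {}
        return dir_id

    def add_name(self, dir_id, name, target_id):
        self._dirs[dir_id][name] = target_id

    def lookup(self, dir_id, name):
        return self._dirs[dir_id][name]


class ClientModule:
    """Builds the hierarchy by chaining single-directory lookups."""

    def __init__(self, files, dirs, root_dir):
        self.files = files
        self.dirs = dirs
        self.root_dir = root_dir

    def resolve(self, path):
        current = self.root_dir
        # One lookup per path component; directories reference other directories.
        for component in path.strip("/").split("/"):
            current = self.dirs.lookup(current, component)
        return current

    def read(self, path):
        return self.files.read(self.resolve(path))


files, dirs = FlatFileService(), DirectoryService()
root = dirs.make_directory()
home = dirs.make_directory()
dirs.add_name(root, "home", home)
dirs.add_name(home, "notes.txt", files.create(b"hello"))
client = ClientModule(files, dirs, root)
print(client.read("/home/notes.txt"))  # b'hello'
```

Neither service sees a full pathname in this sketch; only the client module does, which is the point of the separation.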
Agenda Motivation Distributed file system basics Case studies Network File System (NFS) Andrew File System (AFS) Lustre Summary and outlook
NFS – History • 198?: NFSv1 • Developed at Sun Microsystems, unreleased • 1984: NFSv2 • Developed at Sun Microsystems • First released version, widely accepted • Supports files < 4GB, synchronous writes • 1992: NFSv3 • Developed by a group of researchers • Overcomes drawbacks (file size, asynchronous writes) • 2002: NFSv4 • Enhanced security, user authentication • Better Windows support
NFS – General description (1/2) Source: [CDK01], p. 324
NFS – General description (2/2) • Stateless protocol • Server does not maintain client states • Client requests are blocking (Exception: asynchronous write) • User authentication • Default: UNIX user ID (insecure!) • Optional: Kerberos, DES • Caching • Read cache: Yes • Write cache: No! • Server file system • Not restricted, should support unique file IDs
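What "stateless" means for the server can be shown with a toy sketch: every request carries everything the server needs (file handle, offset, byte count), and the read position lives on the client. The names `StatelessServer` and `Client` and the use of plain strings as file handles are assumptions for illustration; this is not the NFS wire protocol.

```python
class StatelessServer:
    """Keeps file data only; no open-file table, no per-client record."""

    def __init__(self, files):
        self.files = files  # file handle -> bytes

    def read(self, handle, offset, nbytes):
        # The request is self-describing, so any server instance can answer it.
        return self.files[handle][offset:offset + nbytes]


class Client:
    """Tracks its own read position and sends it with every request."""

    def __init__(self, server):
        self.server = server
        self.offsets = {}  # file handle -> next offset to read

    def read_next(self, handle, nbytes):
        offset = self.offsets.get(handle, 0)
        data = self.server.read(handle, offset, nbytes)
        self.offsets[handle] = offset + len(data)
        return data


storage = {"fh-1": b"hello world"}
client = Client(StatelessServer(storage))
first = client.read_next("fh-1", 5)

# Simulate a server restart: a fresh instance over the same storage.
client.server = StatelessServer(storage)
second = client.read_next("fh-1", 6)
print(first, second)  # b'hello' b' world'
```

Because the server holds no client state, the simulated restart in the usage lines is invisible to the client, which simply continues at its own offset.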
NFS – Abstract model (2/2) • Operations • Similar to UNIX file system calls • All abstract operations can be represented • Access control • Checked upon every request • Hierarchical file system • Realised within the client module • File groups • Not supported, only manual movement of files
NFS – Requirements • Transparency • Availability • Concurrent updates • Replication • Heterogeneity • Consistency • Security • Efficiency
AFS – History • 1982: Initial version • Developed at Carnegie Mellon University (CMU), Pittsburgh • Part of the Andrew distributed computing environment • Provides support for teaching and research • 1989: Spin-off • Development outsourced to Transarc Inc. • 1994: Transarc acquired by IBM • All rights owned by IBM • 2000: Open-source • Code was released under an open source license • Since then: continuous development
AFS – General description (2/3) Cached? No!
AFS – General description (3/3) • Caching • "Callback promises" • Workstations are notified when cached files change • Stateful protocol • Server maintains client states • Problematic when client fails • User authentication • Kerberos • Server file system • Not restricted, should support unique file IDs
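The callback-promise idea can be sketched as follows: the server remembers which workstations cache a file and notifies them when it changes. This is a simplified illustration, not the AFS implementation; `CallbackServer`, `Workstation`, `break_callback` and whole-file caching in a dictionary are assumptions.

```python
class CallbackServer:
    """Stateful server: remembers who holds a callback promise per file."""

    def __init__(self):
        self.files = {}     # file name -> bytes
        self.promises = {}  # file name -> set of workstations caching it

    def fetch(self, client, name):
        # Handing out a copy comes with a promise to report later changes.
        self.promises.setdefault(name, set()).add(client)
        return self.files[name]

    def store(self, writer, name, data):
        self.files[name] = data
        # Notify every other holder that its cached copy is stale.
        for client in self.promises.pop(name, set()):
            if client is not writer:
                client.break_callback(name)
        self.promises.setdefault(name, set()).add(writer)


class Workstation:
    def __init__(self, server):
        self.server = server
        self.cache = {}  # file name -> bytes, valid while the promise holds

    def open(self, name):
        if name not in self.cache:  # contact the server only on a cache miss
            self.cache[name] = self.server.fetch(self, name)
        return self.cache[name]

    def close(self, name, data):
        self.cache[name] = data
        self.server.store(self, name, data)

    def break_callback(self, name):
        self.cache.pop(name, None)


server = CallbackServer()
server.files["report.txt"] = b"v1"
ws1, ws2 = Workstation(server), Workstation(server)
ws1.open("report.txt")
ws2.open("report.txt")
ws1.close("report.txt", b"v2")
print(ws2.open("report.txt"))  # b'v2'
```

The `promises` table is exactly the client state that makes the protocol stateful: if a workstation fails, the server still holds entries for it, which is the problem the slide points to.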
AFS – Abstract model (2/2) • Operations • Differ from abstract model • Some operations combined, callback promises added • Access control • Rights checked upon every request • Extended access lists per directory • Hierarchical file system • Realised within the client module • File groups • File identifier contains link to file group • Location database maps file groups to servers
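How a file identifier that names its file group interacts with a location database can be shown in a few lines. The field names, the server names and the helper functions below are hypothetical; the sketch only shows why moving a group does not invalidate file identifiers.

```python
from typing import NamedTuple


class FileId(NamedTuple):
    group: int   # file group the file belongs to
    number: int  # file within that group


# Location database: file group -> server currently holding the group.
location_db = {1: "server-a", 2: "server-b"}


def server_for(file_id, db):
    """Find the responsible server via the group part of the identifier."""
    return db[file_id.group]


def move_group(group, new_server, db):
    """Moving a group changes one database entry; file identifiers stay valid."""
    db[group] = new_server


fid = FileId(group=1, number=42)
print(server_for(fid, location_db))  # server-a
move_group(1, "server-c", location_db)
print(server_for(fid, location_db))  # server-c
```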
AFS – Requirements • Transparency • Availability • Concurrent updates • Replication • Heterogeneity • Consistency • Security • Efficiency
Lustre (1/3) • "Lustre": Linux Cluster • File system especially suited for clusters • Easily handles thousands of clients and servers • Uses object-based storage • Objects offer methods for data access, attributes, policies • High-level abstraction • Lower performance than block-based storage • Three system roles • Object Storage Targets (OST) • Metadata Servers (MDS) • Clients
Lustre (2/3) [Figure: the three system roles Metadata Servers (MDS), Clients and Object Storage Targets (OST), with connections labelled "File operations, locking", "Directory metadata" and "Recovery, file status"] Source: [BS02], p. 51
Lustre (3/3) • Lustre partly follows abstract model • Separation of directory and flat file service • File attributes managed by OSTs • Hierarchical file systems • Realised within the client module • High availability • Heavy use of redundancy • Caching of metadata
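The division of labour between the three roles can be sketched roughly as follows: a metadata server resolves names to objects, object storage targets hold the data and its attributes, and the client talks to both. This is a toy model under stated assumptions, not Lustre's actual interfaces; all class names, the `open` method and the simple placement rule are invented for illustration.

```python
class ObjectStorageTarget:
    """Holds objects: data plus the attributes that belong to it."""

    def __init__(self):
        self.objects = {}  # object ID -> {"data": bytes, "attrs": dict}

    def create(self, object_id):
        self.objects[object_id] = {"data": b"", "attrs": {"size": 0}}

    def write(self, object_id, data):
        obj = self.objects[object_id]
        obj["data"] = data
        obj["attrs"]["size"] = len(data)  # attributes are kept next to the data

    def read(self, object_id):
        return self.objects[object_id]["data"]

    def getattr(self, object_id):
        return dict(self.objects[object_id]["attrs"])


class MetadataServer:
    """Directory side: maps path names to (OST index, object ID)."""

    def __init__(self, osts):
        self.osts = osts
        self.namespace = {}

    def open(self, path, create=False):
        if path not in self.namespace:
            if not create:
                raise FileNotFoundError(path)
            object_id = len(self.namespace) + 1
            ost_index = object_id % len(self.osts)  # assumed placement rule
            self.osts[ost_index].create(object_id)
            self.namespace[path] = (ost_index, object_id)
        return self.namespace[path]


class Client:
    def __init__(self, mds, osts):
        self.mds = mds
        self.osts = osts

    def write(self, path, data):
        ost_index, object_id = self.mds.open(path, create=True)
        self.osts[ost_index].write(object_id, data)  # data goes to the OST directly

    def read(self, path):
        ost_index, object_id = self.mds.open(path)
        return self.osts[ost_index].read(object_id)

    def size(self, path):
        ost_index, object_id = self.mds.open(path)
        return self.osts[ost_index].getattr(object_id)["size"]


osts = [ObjectStorageTarget(), ObjectStorageTarget()]
client = Client(MetadataServer(osts), osts)
client.write("/results/run1", b"abc")
print(client.read("/results/run1"), client.size("/results/run1"))  # b'abc' 3
```

In this sketch the metadata server never handles file contents, which mirrors the separation of directory and flat file service named on the slide.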
Summary and outlook • Abstract file service model • Developed to meet many requirements for DFSs • Different implementations • NFS: Stateless, concurrency control • AFS: Stateful, heavy use of caching, better performance • Other approach: Lustre • Modularised approach, especially suited for clusters • Future developments • Large-scale environments • Cloud computing • Issues: Data security, privacy
Thank you for your attention! Any questions?
Literature • [BS02] Peter J. Braam, Philip Schwan: Lustre: The intergalactic file system, Proceedings of the 2002 Ottawa Linux Symposium, pp. 50–54, 2002. • [CDK01] George Coulouris, Jean Dollimore, Tim Kindberg: Distributed Systems: Concepts and Design, 3rd ed., Addison-Wesley, 2001. • [Kir06] Olaf Kirch: Why NFS Sucks, Proceedings of the Linux Symposium, 2nd ed., pp. 51–63, 2006. • [MSC+86] James H. Morris, Mahadev Satyanarayanan, Michael H. Conner, John H. Howard, David S. H. Rosenthal, F. Donelson Smith: Andrew: A distributed personal computing environment, Communications of the ACM, 29(3), pp. 184–201, Association for Computing Machinery, 1986. • [PJS+94] Brian Pawlowski, Chet Juszczak, Peter Staubach, Carl Smith, Diane Lebel, David Hitz: NFS Version 3: Design and Implementation, Proceedings of the Summer 1994 USENIX Technical Conference, pp. 137–151, 1994. • [Sat89] Mahadev Satyanarayanan: Distributed file systems, Distributed systems, S. Mullender (ed.), pp. 149–188, ACM Press, 1989. • [Sch03] Philip Schwan: Lustre: Building a file system for 1000-node clusters, Proceedings of the 2003 Ottawa Linux Symposium, pp. 380–386, 2003. • [Tan03] Andrew S. Tanenbaum: Moderne Betriebssysteme, 2nd ed., Prentice Hall, 2003. • [Tv07] Andrew S. Tanenbaum, Maarten van Steen: Distributed Systems: Principles and Paradigms, 2nd ed., Prentice Hall, 2007.