Explore Andrew File System (AFS) functionalities, stateless vs. stateful servers, file replication, clusters, client-side caching, secure authentication, and file operations in this advanced operating systems lecture.
Advanced Operating Systems - Spring 2009
Lecture 21 – Monday, April 6th, 2009
• Dan C. Marinescu
• Email: dcm@cs.ucf.edu
• Office: HEC 439 B
• Office hours: M, Wd 3:00 – 4:30 PM
• TA: Chen Yu
• Email: yuchen@cs.ucf.edu
• Office: HEC 354
• Office hours: M, Wd 1:00 – 3:00 PM
Last, Current, Next Lecture
• Last time: Distributed File Systems
• Today:
  • Andrew File System
  • Network and Distributed Operating Systems
  • Multiple Access Networks
• Next time: Interconnection Networks
Stateless vs. Stateful Servers
• Stateless service:
  • longer request messages
  • slower request processing
  • additional constraints imposed on DFS design
• Some environments require stateful service:
  • A server employing server-initiated cache validation cannot provide stateless service, since it must maintain a record of which files are cached by which clients.
  • UNIX's use of file descriptors and implicit offsets is inherently stateful; servers must maintain tables mapping file descriptors to inodes and must store the current offset within each file.
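The fd-to-inode mapping described above can be sketched as follows. This is a minimal illustrative model, not code from any real DFS; the class and method names are assumptions:

```python
# Hypothetical sketch: the per-client state a stateful file server keeps.
# Each open file gets a descriptor; the server remembers the inode and
# the implicit offset, which advances on every read.

class StatefulFileServer:
    def __init__(self):
        self.next_fd = 0
        self.fd_table = {}  # fd -> [inode, current offset]

    def open_file(self, inode):
        fd = self.next_fd
        self.next_fd += 1
        self.fd_table[fd] = [inode, 0]  # state the server must keep
        return fd

    def read_file(self, fd, nbytes):
        inode, offset = self.fd_table[fd]
        # ... fetch nbytes of the file starting at offset ...
        self.fd_table[fd][1] = offset + nbytes  # implicit offset advances
        return offset  # starting offset of this read, for illustration
```

If the server crashes, this table is lost, which is exactly why stateful service complicates recovery; a stateless server would instead require the client to send the offset in every request.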
File Replication
• Replicas reside on failure-independent machines.
• Replication improves availability and can shorten service time.
• The naming scheme maps a replicated file name to a particular replica:
  • The existence of replicas should be invisible to higher levels.
  • Replicas must be distinguished from one another by different lower-level names.
• Updates: an update to any replica must be reflected on all replicas.
• Demand replication: reading a nonlocal replica causes it to be cached locally, thereby generating a new nonprimary replica.
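Demand replication can be sketched in a few lines: a read that misses locally pulls the file from the primary and leaves behind a nonprimary replica. All names here are illustrative assumptions:

```python
# Hypothetical sketch of demand replication: a read of a nonlocal file
# caches it locally, creating a new nonprimary replica on this node.

class Node:
    def __init__(self, name):
        self.name = name
        self.local_files = {}  # file name -> contents (local replicas)

    def read(self, fname, primary):
        if fname not in self.local_files:
            # cache miss: fetch from the primary replica on demand
            self.local_files[fname] = primary.local_files[fname]
        return self.local_files[fname]
```

Note the consistency burden this creates: once the local copy exists, an update at the primary must somehow be propagated or the cached replica invalidated.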
Andrew File System (AFS)
• AFS tries to address issues such as:
  • a uniform name space
  • location-independent file sharing
  • client-side caching (with cache consistency)
  • secure authentication (via Kerberos)
  • server-side caching (via replicas)
  • high availability
  • scalability (can span 5,000 workstations)
• History:
  • A distributed computing environment developed at CMU starting in 1983.
  • Purchased by IBM and released as Transarc DFS.
  • Now open source as OpenAFS.
AFS (cont’d)
• Clusters:
  • Clients and servers form clusters interconnected by a backbone LAN.
  • A cluster consists of several workstations and a cluster server connected to the backbone by a router.
• Clients see a partitioned space of file names:
  • a local name space and
  • a shared name space
  • The local name space is the root file system of a workstation, from which the shared name space descends.
• The servers are collectively responsible for the storage and management of the shared name space.
• Opening a file causes it to be cached, in its entirety, on the local disk.
AFS (cont’d)
• Vice: dedicated servers that present the shared name space to the clients as a homogeneous, identical, and location-transparent file hierarchy.
• Workstations:
  • run the Virtue protocol to communicate with Vice
  • are required to have local disks where they store their local name space
• Volumes: small units associated with the files of a single client.
• fid (96 bits): identifies a Vice file or directory; three components:
  • volume number
  • vnode number – index into an array containing the inodes of files in a single volume
  • uniquifier – allows reuse of vnode numbers, thereby keeping certain data structures compact
• Fids are location transparent; therefore, file movements from server to server do not invalidate cached directory contents.
• Location information:
  • kept on a volume basis
  • replicated on each server
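The three-part fid above can be sketched as a 96-bit value. Packing each component into 32 bits is an assumption made here for illustration; the slide only states the total width and the three fields:

```python
# Hypothetical sketch of an AFS-style 96-bit fid with three 32-bit
# fields: volume number, vnode number, uniquifier. The equal field
# widths are an illustrative assumption.

MASK32 = 0xFFFFFFFF

def pack_fid(volume, vnode, uniquifier):
    return (volume << 64) | (vnode << 32) | uniquifier

def unpack_fid(fid):
    return ((fid >> 64) & MASK32,   # volume number
            (fid >> 32) & MASK32,   # vnode number (index into inode array)
            fid & MASK32)           # uniquifier (permits vnode reuse)
```

Because no field encodes a server address, a fid stays valid when a volume moves between servers, which is why cached directory contents survive file migration.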
File Operations in AFS
• Andrew caches entire files from servers.
• A client workstation interacts with Vice servers only during the opening and closing of files.
• Venus:
  • caches files from Vice when they are opened, and stores modified copies of files back when they are closed
  • caches the contents of directories and symbolic links for path-name translation
• Venus manages two separate caches:
  • Status cache: kept in virtual memory to allow rapid servicing of stat (file-status) system calls.
  • Data cache: kept on the local disk; the UNIX I/O buffering mechanism does some caching that is transparent to Venus.
  • An LRU algorithm keeps each cache bounded in size.
• Exception to the caching policy: modifications to directories are made directly on the server responsible for that directory.
• Reading and writing a file are done by the kernel on the cached copy, without Venus intervention.
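The bounded LRU policy mentioned above can be sketched with a standard ordered map. This is a generic LRU cache, not Venus's actual implementation; the class name and API are assumptions:

```python
from collections import OrderedDict

# Hypothetical sketch of the bounded LRU policy a cache manager like
# Venus could apply to its status and data caches.

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
```

One such cache could hold stat results (status cache) and another could map file names to locally cached copies (data cache), each bounded independently.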
AFS Implementation
• Client processes are interfaced to a UNIX kernel with the usual set of system calls.
• Venus carries out path-name translation component by component.
• The UNIX file system is used as a low-level storage system for both servers and clients.
• The client cache is a local directory on the workstation’s disk.
• Both Venus and the server processes access UNIX files directly by their inodes, to avoid the expensive path-name-to-inode translation routine.
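Component-by-component translation means resolving a path one directory lookup at a time, consulting the cached directory contents at each step. A minimal sketch, with the directory tree modeled as nested dicts (an illustrative assumption):

```python
# Hypothetical sketch of component-by-component path-name translation.
# Directories are nested dicts; leaves are whatever a lookup resolves
# to (here, an inode label).

def translate(root, path):
    node = root
    for component in path.strip("/").split("/"):
        if not isinstance(node, dict) or component not in node:
            raise FileNotFoundError(component)
        node = node[component]  # exactly one lookup per path component
    return node
```

Caching directory contents makes each of these per-component lookups local, which is why Venus caches directories and symbolic links in the first place.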
Distributed Systems
• Distributed system: a collection of heterogeneous systems (different processor architectures, operating systems, libraries, applications) linked to each other by an interconnection network.
• Communication: message passing.
• Advantages of distributed systems:
  • Resource sharing – better utilization of resources.
  • Fault tolerance – systems fail independently; redundancy increases.
  • Scalability – the system can grow over time.
  • Support for collaborative environments in enterprise computing, engineering (e.g., CAD systems), science (e.g., GRID), etc.
• Problems:
  • Resource management is more difficult.
  • Autonomous systems are hard to manage.
  • New services become necessary, e.g., resource discovery.
  • Security.
  • Distributed applications are harder to construct.
Distributed System Architectures
• Service-oriented architectures: a set of services provided by autonomous service providers. Based upon:
  • the client-server paradigm and
  • request-response communication.
  • Examples: GRID, the semantic Web.
• User-Coordinator-Executor architecture: multiple sites provide computing resources; the coordinator acts as an agent of the user, starts applications at participating sites, and then monitors their execution. Potential use in high-performance computing.
• Peer-to-peer architectures: the systems function simultaneously as clients and servers.
Autonomous vs. Non-Autonomous Systems
• Autonomous systems: the resources of individual systems are controlled by the local operating systems.
  • Often in distinct administrative domains.
  • Open system: new resources are added or removed continually.
  • Scalable; no one tries to maintain common state.
• Non-autonomous systems: resources are controlled by either a network operating system or a distributed operating system.
  • Network operating system: users are aware of the multiplicity of machines. Access to the resources of the various systems is done explicitly by:
    • remote login to the appropriate remote machine (telnet, ssh)
    • Remote Desktop (Microsoft Windows)
    • transferring data from remote machines to local machines via the File Transfer Protocol (FTP)
  • Distributed operating system: users are not aware of the multiplicity of machines.
    • Access to remote resources is similar to access to local resources.
    • Common state is maintained.
    • Data migration – transfer the data.
    • Computation migration – transfer the computation, rather than the data.
Process Migration
• Possible only with a homogeneous architecture.
• Enables load balancing.
• Requires synchronization across the network.
• Requires coordination.