Explore Andrew File System (AFS) functionalities, stateless vs. stateful servers, file replication, clusters, client-side caching, secure authentication, and file operations in this advanced operating systems lecture.
Advanced Operating Systems - Spring 2009
Lecture 21 – Monday, April 6th, 2009
• Dan C. Marinescu
• Email: dcm@cs.ucf.edu
• Office: HEC 439 B
• Office hours: M, Wd 3:00 – 4:30 PM
• TA: Chen Yu
• Email: yuchen@cs.ucf.edu
• Office: HEC 354
• Office hours: M, Wd 1:00 – 3:00 PM
Last, Current, Next Lecture
• Last time: Distributed File Systems
• Today:
  • Andrew File System
  • Network and Distributed Operating Systems
  • Multiple Access Networks
• Next time: Interconnection Networks
Stateless vs. Stateful Servers
• Stateless service:
  • longer request messages
  • slower request processing
  • additional constraints imposed on DFS design
• Some environments require stateful service:
  • A server employing server-initiated cache validation cannot provide stateless service, since it must maintain a record of which files are cached by which clients.
  • UNIX's use of file descriptors and implicit offsets is inherently stateful; servers must maintain tables mapping file descriptors to inodes and must store the current offset within each file.
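The fd-to-inode mapping described above can be sketched as follows. This is a minimal illustrative model, not code from any real DFS; the class and method names are assumptions:

```python
# Hypothetical sketch: the per-client state a stateful file server keeps.
# Each open file gets a descriptor; the server remembers the inode and
# the implicit offset, which advances on every read.

class StatefulFileServer:
    def __init__(self):
        self.next_fd = 0
        self.fd_table = {}  # fd -> [inode, current offset]

    def open_file(self, inode):
        fd = self.next_fd
        self.next_fd += 1
        self.fd_table[fd] = [inode, 0]  # state the server must keep
        return fd

    def read_file(self, fd, nbytes):
        inode, offset = self.fd_table[fd]
        # ... fetch nbytes of the file starting at offset ...
        self.fd_table[fd][1] = offset + nbytes  # implicit offset advances
        return offset  # starting offset of this read, for illustration
```

If the server crashes, this table is lost, which is exactly why stateful service complicates recovery; a stateless server would instead require the client to send the offset in every request.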
File Replication
• Replicas reside on failure-independent machines.
• Replication improves availability and can shorten service time.
• The naming scheme maps a replicated file name to a particular replica:
  • The existence of replicas should be invisible to higher levels.
  • Replicas must be distinguished from one another by different lower-level names.
• Updates: an update to any replica must be reflected on all replicas.
• Demand replication: reading a nonlocal replica causes it to be cached locally, thereby generating a new nonprimary replica.
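Demand replication can be sketched in a few lines: a read that misses locally pulls the file from the primary and leaves behind a nonprimary replica. All names here are illustrative assumptions:

```python
# Hypothetical sketch of demand replication: a read of a nonlocal file
# caches it locally, creating a new nonprimary replica on this node.

class Node:
    def __init__(self, name):
        self.name = name
        self.local_files = {}  # file name -> contents (local replicas)

    def read(self, fname, primary):
        if fname not in self.local_files:
            # cache miss: fetch from the primary replica on demand
            self.local_files[fname] = primary.local_files[fname]
        return self.local_files[fname]
```

Note the consistency burden this creates: once the local copy exists, an update at the primary must somehow be propagated or the cached replica invalidated.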
Andrew File System (AFS)
• AFS tries to address issues such as:
  • a uniform name space
  • location-independent file sharing
  • client-side caching (with cache consistency)
  • secure authentication (via Kerberos)
  • server-side caching (via replicas)
  • high availability
  • scalability (can span 5,000 workstations)
• History:
  • A distributed computing environment developed at CMU starting in 1983.
  • Purchased by IBM and released as Transarc DFS.
  • Now open source as OpenAFS.
AFS (cont’d)
• Clusters:
  • Clients and servers form clusters interconnected by a backbone LAN.
  • A cluster consists of several workstations and a cluster server connected to the backbone by a router.
• Clients see a partitioned space of file names:
  • a local name space and
  • a shared name space
  • The local name space is the root file system of a workstation, from which the shared name space descends.
• The servers are collectively responsible for the storage and management of the shared name space.
• Opening a file causes it to be cached, in its entirety, on the local disk.
AFS (cont’d)
• Vice: dedicated servers that present the shared name space to the clients as a homogeneous, identical, and location-transparent file hierarchy.
• Workstations:
  • run the Virtue protocol to communicate with Vice
  • are required to have local disks where they store their local name space
• Volumes: small units associated with the files of a single client.
• fid (96 bits): identifies a Vice file or directory; three components:
  • volume number
  • vnode number – index into an array containing the inodes of files in a single volume
  • uniquifier – allows reuse of vnode numbers, thereby keeping certain data structures compact
• Fids are location transparent; therefore, file movements from server to server do not invalidate cached directory contents.
• Location information:
  • kept on a volume basis
  • replicated on each server
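The three-part fid above can be sketched as a 96-bit value. Packing each component into 32 bits is an assumption made here for illustration; the slide only states the total width and the three fields:

```python
# Hypothetical sketch of an AFS-style 96-bit fid with three 32-bit
# fields: volume number, vnode number, uniquifier. The equal field
# widths are an illustrative assumption.

MASK32 = 0xFFFFFFFF

def pack_fid(volume, vnode, uniquifier):
    return (volume << 64) | (vnode << 32) | uniquifier

def unpack_fid(fid):
    return ((fid >> 64) & MASK32,   # volume number
            (fid >> 32) & MASK32,   # vnode number (index into inode array)
            fid & MASK32)           # uniquifier (permits vnode reuse)
```

Because no field encodes a server address, a fid stays valid when a volume moves between servers, which is why cached directory contents survive file migration.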
File Operations in AFS
• Andrew caches entire files from servers.
• A client workstation interacts with Vice servers only during the opening and closing of files.
• Venus:
  • caches files from Vice when they are opened, and stores modified copies of files back when they are closed
  • caches the contents of directories and symbolic links for path-name translation
• Venus manages two separate caches:
  • Status cache: kept in virtual memory to allow rapid servicing of stat (file-status) system calls.
  • Data cache: kept on the local disk; the UNIX I/O buffering mechanism does some caching that is transparent to Venus.
  • An LRU algorithm keeps each cache bounded in size.
• Exception to the caching policy: modifications to directories are made directly on the server responsible for that directory.
• Reading and writing a file are done by the kernel on the cached copy, without Venus intervention.
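The bounded LRU policy mentioned above can be sketched with a standard ordered map. This is a generic LRU cache, not Venus's actual implementation; the class name and API are assumptions:

```python
from collections import OrderedDict

# Hypothetical sketch of the bounded LRU policy a cache manager like
# Venus could apply to its status and data caches.

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
```

One such cache could hold stat results (status cache) and another could map file names to locally cached copies (data cache), each bounded independently.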
AFS Implementation
• Client processes are interfaced to a UNIX kernel with the usual set of system calls.
• Venus carries out path-name translation component by component.
• The UNIX file system is used as a low-level storage system for both servers and clients.
• The client cache is a local directory on the workstation’s disk.
• Both Venus and the server processes access UNIX files directly by their inodes, to avoid the expensive path-name-to-inode translation routine.
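Component-by-component translation means resolving a path one directory lookup at a time, consulting the cached directory contents at each step. A minimal sketch, with the directory tree modeled as nested dicts (an illustrative assumption):

```python
# Hypothetical sketch of component-by-component path-name translation.
# Directories are nested dicts; leaves are whatever a lookup resolves
# to (here, an inode label).

def translate(root, path):
    node = root
    for component in path.strip("/").split("/"):
        if not isinstance(node, dict) or component not in node:
            raise FileNotFoundError(component)
        node = node[component]  # exactly one lookup per path component
    return node
```

Caching directory contents makes each of these per-component lookups local, which is why Venus caches directories and symbolic links in the first place.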
Distributed Systems
• Distributed system: a collection of heterogeneous systems (different processor architectures, operating systems, libraries, applications) linked to each other by an interconnection network.
• Communication: message passing.
• Advantages of distributed systems:
  • Resource sharing – better utilization of resources.
  • Fault tolerance – systems fail independently; redundancy increases.
  • Scalability – the system can grow over time.
  • Support for collaborative environments in enterprise computing, engineering (e.g., CAD systems), science (e.g., GRID), etc.
• Problems:
  • Resource management is more difficult.
  • Autonomous systems are hard to manage.
  • New services become necessary, e.g., resource discovery.
  • Security.
  • Distributed applications are harder to construct.
Distributed System Architectures
• Service-oriented architectures: a set of services provided by autonomous service providers. Based upon:
  • the client-server paradigm and
  • request-response communication.
  • Examples: GRID, the semantic Web.
• User-Coordinator-Executor architecture: multiple sites provide computing resources; the coordinator acts as an agent of the user, starts applications at participating sites, and then monitors their execution. Potential use in high-performance computing.
• Peer-to-peer architectures: the systems function simultaneously as clients and servers.
Autonomous vs. Non-Autonomous Systems
• Autonomous systems: the resources of individual systems are controlled by the local operating systems.
  • Often in distinct administrative domains.
  • Open system: new resources are added or removed continually.
  • Scalable; no one tries to maintain common state.
• Non-autonomous systems: resources are controlled by either a network operating system or a distributed operating system.
  • Network operating system: users are aware of the multiplicity of machines. Access to the resources of the various systems is done explicitly by:
    • remote login to the appropriate remote machine (telnet, ssh)
    • Remote Desktop (Microsoft Windows)
    • transferring data from remote machines to local machines via the File Transfer Protocol (FTP)
  • Distributed operating system: users are not aware of the multiplicity of machines.
    • Access to remote resources is similar to access to local resources.
    • Common state is maintained.
    • Data migration – transfer the data.
    • Computation migration – transfer the computation, rather than the data.
Process Migration
• Possible only with a homogeneous architecture.
• Enables load balancing.
• Requires synchronization across the network.
• Requires coordination.