The Coda File System


  1. The Coda File System Jeff Chheng Jun Du

  2. Overview of Coda • Distributed file system • Designed for scalability, security, and high availability • Descendant of version 2 of Andrew File System (AFS), so follows same organization • Virtue → Venus → Vice

  3. Overview of Coda • Virtue (client) → Venus (process) → Vice (file server)

  4. Communication • Remote procedure calls with RPC2 • Provides reliable RPC on top of unreliable UDP • Supports “side effects” – an application-specific protocol • Multicasting for invalidation: invalidations sent in parallel rather than in series

  5. Processes • Clients represented by Venus processes • Servers represented by Vice processes • Both processes are organized as a collection of concurrent threads • Threads are non-preemptive

  6. Naming • Files are grouped into volumes • A volume is like a Unix disk partition, but with smaller granularity • Volumes can be mounted • Naming inherited from server’s name space

  7. File Identification • Files are copied and moved across multiple servers • ID is needed to track file to physical location • Replicated Volume Identifier (RVID) for logical volumes • Volume Identifier (VID) for physical volumes

  8. Synchronization • Many DFS (e.g., AFS) support session semantics • Coda attempts to support transactional semantics • Attempts to solve the problem in large DFS where some or all file servers are temporarily unavailable

  9. Sharing Files • When client opens file, file is transferred to client’s machine • When file is opened for writing, no other clients may open file • When file is opened for reading, others can open for reading or writing
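The open rules on the Sharing Files slide can be sketched as a small state check. This is a minimal illustration in Python, not Coda's implementation; the class `FileOpens` and its method names are hypothetical.

```python
class FileOpens:
    """Tracks who has one file open (hypothetical helper, not Coda code)."""

    def __init__(self):
        self.readers = set()  # clients holding the file open for reading
        self.writer = None    # client holding the file open for writing

    def open(self, client, mode):
        if mode not in ("r", "w"):
            raise ValueError("mode must be 'r' or 'w'")
        # Once the file is open for writing, no other client may open it.
        if self.writer is not None:
            return False
        if mode == "w":
            # Existing readers do not block a writer.
            self.writer = client
        else:
            self.readers.add(client)
        return True
```

With a reader already present, a second client can still open the file for writing; after that, further opens are refused.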

  10. Client Caching • Clients always cache entire files • Cache coherence maintained with callbacks • Servers record a callback promise for clients • Updating a file breaks the promise for other clients • Use promise to determine if cache needs updating
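The callback mechanism on the Client Caching slide can be illustrated with a toy model. This is a sketch under simplifying assumptions: the names `CallbackServer` and `CachingClient` are hypothetical, and the client asks the server whether its promise still holds, whereas a real server would notify the client when it breaks the promise.

```python
class CallbackServer:
    """Toy server that records callback promises per file."""

    def __init__(self):
        self.files = {}     # file name -> contents
        self.promises = {}  # file name -> set of client ids holding a promise

    def fetch(self, client_id, name):
        # Whole-file transfer; the server records a callback promise.
        self.promises.setdefault(name, set()).add(client_id)
        return self.files[name]

    def store(self, client_id, name, data):
        # An update breaks the promise for every other client.
        self.files[name] = data
        self.promises[name] = {client_id}

    def has_promise(self, client_id, name):
        return client_id in self.promises.get(name, set())


class CachingClient:
    """Toy client that caches entire files."""

    def __init__(self, client_id, server):
        self.client_id = client_id
        self.server = server
        self.cache = {}  # file name -> cached contents

    def open(self, name):
        # Reuse the cached copy only while the promise is intact.
        if name in self.cache and self.server.has_promise(self.client_id, name):
            return self.cache[name]
        self.cache[name] = self.server.fetch(self.client_id, name)
        return self.cache[name]

    def close(self, name, data):
        # Write the whole file back to the server on close.
        self.cache[name] = data
        self.server.store(self.client_id, name, data)
```

If clients A and B both cache a file and B closes a modified copy, A's promise is broken and A's next open fetches the new contents.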

  11. Server Replication • Volume Storage Group (VSG): collection of servers with copy of volume • Accessible VSG (AVSG): servers in a VSG a client can contact • If AVSG is empty, client is considered disconnected • Client receives from one member in AVSG, updates in parallel to all members
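The read-one, write-all behavior on the Server Replication slide can be sketched as follows. The `Replica` class and the `avsg`, `read`, and `write` helpers are hypothetical names for illustration, and a sequential loop stands in for the parallel update.

```python
class Replica:
    """One server in a Volume Storage Group (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.reachable = True
        self.files = {}  # path -> contents


def avsg(vsg):
    """Members of the VSG that the client can currently contact."""
    return [r for r in vsg if r.reachable]


def read(vsg, path):
    members = avsg(vsg)
    if not members:
        raise ConnectionError("AVSG is empty: client is disconnected")
    # Read from a single member of the AVSG.
    return members[0].files[path]


def write(vsg, path, data):
    members = avsg(vsg)
    if not members:
        raise ConnectionError("AVSG is empty: client is disconnected")
    # Update every reachable member (sequential here, parallel in the slide).
    for r in members:
        r.files[path] = data
    return [r.name for r in members]
```

A replica that is unreachable during a write keeps its old copy, which is the situation the next slide addresses.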

  12. Server Replication Problem • What happens when two clients access two different AVSGs for the same file? • Use optimistic strategy for replication • Inconsistency detected and resolved with Coda version vector • Conflict resolution can be automated, but might require user intervention
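Detecting an inconsistency with version vectors comes down to an element-wise comparison. The sketch below assumes each replica keeps one counter per server in the VSG; the function name `compare` and its return labels are hypothetical.

```python
def compare(v1, v2):
    """Compare two version vectors of equal length.

    Returns 'equal', 'v1_dominates', 'v2_dominates', or 'conflict'.
    """
    if len(v1) != len(v2):
        raise ValueError("version vectors must have the same length")
    ge = all(a >= b for a, b in zip(v1, v2))
    le = all(a <= b for a, b in zip(v1, v2))
    if ge and le:
        return "equal"
    if ge:
        return "v1_dominates"  # the replica holding v2 is merely stale
    if le:
        return "v2_dominates"  # the replica holding v1 is merely stale
    return "conflict"          # concurrent updates in different AVSGs
```

For example, if one client updates a file through servers S1 and S2 while another updates it through S3 only, the vectors might be `[2, 2, 1]` and `[1, 1, 2]`. Neither dominates the other, so the result is a conflict that needs resolution.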

  13. Working While Disconnected • Unlike NFS, client will simply use local copy when disconnected • Closing a file while disconnected always succeeds • Modifications are transferred to server when connection is reestablished • Reintegration is mostly automatic, but may need user intervention • In practice, write-sharing is rare
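Disconnected operation can be sketched as a client that logs its modifications and replays them on reconnection. This is a simplified model, not Coda's actual reintegration protocol: `Server`, `DisconnectedClient`, and the single integer version per file are assumptions made for illustration.

```python
class Server:
    def __init__(self):
        self.data = {}  # path -> (version, contents)


class DisconnectedClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}  # path -> (version seen at fetch, contents)
        self.log = []    # paths modified while disconnected, in order
        self.connected = True

    def fetch(self, path):
        self.cache[path] = self.server.data[path]

    def close(self, path, contents):
        # Closing always succeeds locally, connected or not.
        version, _ = self.cache[path]
        self.cache[path] = (version, contents)
        if self.connected:
            self._store(path)
        else:
            self.log.append(path)

    def _store(self, path):
        base, contents = self.cache[path]
        current, _ = self.server.data[path]
        if current != base:
            return False  # someone else updated the file meanwhile
        self.server.data[path] = (base + 1, contents)
        self.cache[path] = (base + 1, contents)
        return True

    def reintegrate(self):
        # Replay logged modifications; return paths needing intervention.
        self.connected = True
        paths = list(dict.fromkeys(self.log))  # de-duplicate, keep order
        self.log.clear()
        return [p for p in paths if not self._store(p)]
```

Files that nobody else touched are reintegrated automatically. A file changed on the server during the disconnection is reported back as a conflict instead of being overwritten.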

  14. Security • Mutual authentication with secret-key cryptosystem • Setting up a secure channel requires a secret token • Access is granted to disconnected clients

  15. Access Control • Access control associated with directories (but not subdirectories) • Operations under access control: read, write, lookup, insert, delete, and administer • Execution never happens server-side, only client-side, so no permissions for it • Coda maintains info on users and groups • Negative rights possible
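The effect of negative rights can be shown with a small set computation. The sketch assumes a directory's access control list is split into positive and negative entries keyed by user or group name; `effective_rights` and the example principals are hypothetical.

```python
# The six operations named on the slide.
RIGHTS = {"read", "write", "lookup", "insert", "delete", "administer"}


def effective_rights(user, groups, positive, negative):
    """Rights a user holds on a directory: granted minus denied."""
    principals = {user} | set(groups)
    granted, denied = set(), set()
    for p in principals:
        granted |= positive.get(p, set())
        denied |= negative.get(p, set())
    # A negative right overrides any matching positive right.
    return (granted & RIGHTS) - denied


positive = {"staff": {"read", "lookup", "write", "insert"}}
negative = {"mallory": {"write", "insert"}}

alice_rights = effective_rights("alice", ["staff"], positive, negative)
mallory_rights = effective_rights("mallory", ["staff"], positive, negative)
```

Here both users belong to `staff`, but the negative entry strips `write` and `insert` from `mallory` without changing the group's rights.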

  16. Drawbacks • Client-side (Venus) resource exhaustion • File cache full of modified files • RVM space becomes full • Not entirely scalable yet beyond 20-30 users and a few servers • Limited stability with systems containing terabytes of data

  17. Improvements • Increase cache and RVM size or allow them to be stored in removable media • Compress file cache and RVM contents • Allow users to selectively back out of updates

  18. Questions?
