Self-Stabilizing Distributed File System

This paper discusses the design and implementation of a self-stabilizing distributed file system, focusing on performance, fault tolerance, and placing files closer to users.

Presentation Transcript


  1. Self-Stabilizing Distributed File System • Shlomi Dolev and Ronen I. Kat • Department of Computer Science, Ben-Gurion University • Research sponsored by IBM

  2. DFS Motivation • Performance • Fault tolerance • Placing files closer to users

  3. Related Work • File systems • NFS – Network File System protocol • AFS – Andrew File System – CMU (1988) • Coda – CMU (1998) • InterMezzo – Peter J. Braam, CMU • Peer to peer (2000) • Global storage: OceanStore – Berkeley • Serverless: Microsoft Farsite

  4. Talk Overview • Self-stabilization • Design • Algorithms • File system implementation • Future work

  5. Self Stabilization • Self healing • Adaptiveness • Automatic recovery • Autonomic computing • Self stabilization: Dijkstra, 1974

  6. Self Stabilization • A self-stabilizing system is a system that can automatically recover following the occurrence of (transient) faults. • The idea is to design a system that can be started in an arbitrary state and still converge to a desired behaviour. • See, e.g., Self-Stabilization, S. Dolev.

  7. Self Stabilization Motivation • The combination and type of faults cannot be fully anticipated in on-going systems • Any on-going system must be self-stabilizing (or manually monitored) • A self-stabilizing algorithm can recover from any arbitrary state reached due to the occurrence of faults

  8. Design

  9. Design • Replication servers are joined into a spanning tree • A spanning tree is constructed • File updates are propagated using a self-stabilizing synchronizer

  10. Design (cont.) • Clients join the replication tree and form a caching tree • File leases • Global locking

  11. Algorithms – Self Stabilizing • Electing a leader (leader election) • Collecting connectivity information • Optimising communication costs • Synchronizer for file consistency

  12. Leader Election • A single leader coordinates construction • If none exists, a server becomes the leader • If more than one exists, one survives • Messages are periodically broadcast

  13. Leader Election Algorithm
      Every T1 do:
          If (p = leader) then
              send-multicast(‘I’m a leader’)
              Leader-exists = true
      Every T1+Td do:
          If (not leader-exists) then leader = p
          Leader-exists = false
      Upon arrival of message do:
          If (p.volume = volume) then
              If (p = leader) then leader = min(leader, sender)
              Else leader = sender
              Leader-exists = true
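The three rules on the slide can be exercised with a small simulation. The following is only a sketch under stated assumptions: rounds are synchronous (every announcement reaches every other node before the T1+Td check fires), and the names `Node`, `run_round` and `simulate` are hypothetical, not taken from bgRFS. The nesting of `Leader-exists = true` inside the conditionals is one reading of the slide.

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_id: int
    volume: str
    leader: int                  # may start with an arbitrary (corrupted) value
    leader_exists: bool = False  # may also start with an arbitrary value

    def heartbeat(self):
        """'Every T1': a node that believes it is the leader announces itself."""
        if self.leader == self.node_id:
            self.leader_exists = True
            return (self.node_id, self.volume)
        return None

    def receive(self, sender, volume):
        """'Upon arrival of message': adopt the sender, or keep the smaller ID."""
        if volume != self.volume:
            return
        if self.leader == self.node_id:
            self.leader = min(self.leader, sender)
        else:
            self.leader = sender
        self.leader_exists = True

    def timeout(self):
        """'Every T1+Td': claim leadership if no leader was heard from."""
        if not self.leader_exists:
            self.leader = self.node_id
        self.leader_exists = False


def run_round(nodes):
    """One synchronous round (assumption): announce, deliver, then time out."""
    announcements = [m for m in (n.heartbeat() for n in nodes) if m is not None]
    for sender, volume in announcements:
        for node in nodes:
            if node.node_id != sender:
                node.receive(sender, volume)
    for node in nodes:
        node.timeout()


def simulate(nodes, rounds):
    """Run several rounds and return each node's current view of the leader."""
    for _ in range(rounds):
        run_round(nodes)
    return [n.leader for n in nodes]
```

Starting every node with a leader ID that does not exist, or with every node believing it is the leader, both end with all nodes agreeing on one live node in this simplified model. Real deployments are asynchronous, so this illustrates the rules rather than proving convergence.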

  14. Algorithms – Self Stabilizing • Electing a leader (leader election) • Collecting connectivity information • Optimising communication costs • Synchronizer for file consistency

  15. Induced Graph Example

  16. Update Algorithm • Collect routing tables from all neighbours in the induced graph • Elect a manager (local leader) for the tree, a server with the minimal ID • Build a distributed BFS spanning tree • The algorithm converges
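As an illustration of the structure the update algorithm converges to, the sketch below computes a BFS spanning tree rooted at the minimal-ID server. It is a centralized stand-in, not the distributed algorithm from the talk: it assumes the whole induced graph is available as an adjacency dictionary, and the function name `bfs_spanning_tree` is hypothetical.

```python
from collections import deque


def bfs_spanning_tree(graph):
    """Return (manager, parent map) for a connected, undirected graph.

    graph maps each server ID to an iterable of neighbouring server IDs.
    The manager is the server with the minimal ID, as on the slide.
    """
    manager = min(graph)
    parent = {manager: None}
    queue = deque([manager])
    while queue:
        server = queue.popleft()
        # Visiting neighbours in sorted order makes the result deterministic.
        for neighbour in sorted(graph[server]):
            if neighbour not in parent:
                parent[neighbour] = server
                queue.append(neighbour)
    return manager, parent
```

Each server's entry in the parent map is its next hop towards the manager, which is the information a tree-based propagation scheme needs.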

  17. Algorithms – Self Stabilizing • Electing a leader (leader election) • Collecting connectivity information • Optimising communication costs • Synchronizer for file consistency

  18. Optimising Communication Costs • Goal: find the minimal radius that keeps connectivity • Increase the radius by a factor of 2 • Run a 2nd instance of update with a smaller radius • Search for the minimal radius using binary search
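The doubling and binary-search steps can be sketched as follows. This is an illustration under assumptions that are not in the slides: communication costs are positive integers known for every pair of servers, there are at least two servers, and connectivity is checked centrally rather than by running a second instance of the update algorithm. The names `is_connected` and `minimal_radius` are hypothetical.

```python
def is_connected(cost, radius):
    """True if links with cost <= radius connect every server."""
    servers = list(cost)
    seen = {servers[0]}
    stack = [servers[0]]
    while stack:
        current = stack.pop()
        for neighbour, c in cost[current].items():
            if c <= radius and neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(servers)


def minimal_radius(cost, start=1):
    """Double the radius until connected, then binary-search the minimum."""
    largest = max(c for row in cost.values() for c in row.values())
    high = start
    while not is_connected(cost, high):
        if high >= largest:
            raise ValueError("servers are not connected at any radius")
        high *= 2
    low = 0  # assumed too small: costs are positive and there are >= 2 servers
    while high - low > 1:
        mid = (low + high) // 2
        if is_connected(cost, mid):
            high = mid
        else:
            low = mid
    return high
```

Doubling bounds the search range with a logarithmic number of probes, and the binary search then narrows it to the smallest radius that still keeps the servers connected.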

  19. Tree Structure

  20. Caching Tree • Extends the replication tree • The update algorithm constructs both • Servers execute two instances • Caches execute one instance

  21. Combined Spanning Tree

  22. Algorithms – Self Stabilizing • Electing a leader (leader election) • Collecting connectivity information • Optimising communication costs • Synchronizer for file consistency

  23. Synchronization Mechanism • Provide reliable command and timing • Propagate commands between servers • Collect and distribute information

  24. Replication Consistency • Verify signatures • Multiple signatures indicate a conflict • Conflict resolution • Broadcast the resolved signature
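A minimal sketch of the conflict check: more than one distinct signature for the same file means the replicas disagree. The slides do not say how signatures are computed or how conflicts are resolved, so both choices here are placeholders: SHA-256 of the file content as the signature, and "most common signature wins, ties broken by the smaller string" as the resolution rule. The function names are hypothetical.

```python
import hashlib
from collections import Counter


def signature(content: bytes) -> str:
    """Placeholder signature: SHA-256 digest of the file content."""
    return hashlib.sha256(content).hexdigest()


def check_replicas(reported):
    """reported maps server ID -> signature for one file.

    Returns (conflict, resolved): conflict is True when replicas report
    more than one distinct signature; resolved is the signature that would
    be broadcast under the placeholder majority rule.
    """
    counts = Counter(reported.values())
    conflict = len(counts) > 1
    resolved = min(counts, key=lambda sig: (-counts[sig], sig))
    return conflict, resolved
```

Any deterministic rule would do for the sketch, as long as every server applies the same one before the resolved signature is broadcast.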

  25. Locking Table • A (unified) global lock table • Locks are requested • The leader resolves multiple lock requests • Locks are removed by cancelling the lock request
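The lock table can be pictured as a map from file path to the set of outstanding requests. The sketch below is an assumption-laden illustration: the class `LockTable` and its methods are hypothetical, and the rule the leader uses to pick a winner is not given in the slides, so the smallest requester ID is used as a stand-in.

```python
class LockTable:
    """Global lock table: path -> set of servers currently requesting a lock."""

    def __init__(self):
        self.requests = {}

    def request(self, path, requester):
        """Record a lock request; several servers may request the same path."""
        self.requests.setdefault(path, set()).add(requester)

    def cancel(self, path, requester):
        """Remove a lock by cancelling the request that created it."""
        holders = self.requests.get(path)
        if holders is None:
            return
        holders.discard(requester)
        if not holders:
            del self.requests[path]

    def holder(self, path):
        """Leader's decision (placeholder rule): the smallest requester ID wins."""
        holders = self.requests.get(path)
        return min(holders) if holders else None
```

Because a lock exists only while its request does, cancelling the winning request hands the lock to the next requester without a separate release step.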

  26. File System Implementation

  27. Accessing a File (flowchart) • Decisions: Update? (Yes/No), Cached? (Yes/No) • Actions: Lock file, Get signature, Get a copy, Use local copy

  28. Closing a File (flowchart) • Decision: Update? (Yes/No) • Actions: Send new signature, Confirm signature

  29. Meta Access • Globally processed • Blocked until a lock is obtained • Steps (diagram): Lock file, Execute command, Wait for confirmation

  30. Linux Based bgRFS (architecture diagram) • Application • System calls and up calls • User level: new implementation of Linux system calls (open, close, lstat, mkdir, etc.) • SyncDaemon: cache manager and server • Network communication

  31. Future Work • Kernel VFS module • Communication improvements: reducing update messages; using timers with the synchronizer • Performance enhancements • Integrating disconnected operations • Conflict resolution algorithms

  32. Credits • Faculty: Prof. Shlomi Dolev (dolev@cs.bgu.ac.il) • Graduate students: Ronen I. Kat (kat@cs.bgu.ac.il) • System engineer: Albina Budker (albinabu@cs.bgu.ac.il) • Undergraduate students: Amir Livneh (livneha@cs.bgu.ac.il), Itay Granik (granik@cs.bgu.ac.il), Boris Lansky (lanskyb@cs.bgu.ac.il), Naama Shmuel (shmueln@cs.bgu.ac.il), Moshe Shish (shishm@cs.bgu.ac.il), Guy Erlich (erlichg@cs.bgu.ac.il), Avital Chohen (avitalco@cs.bgu.ac.il), Yael Biran (birany@cs.bgu.ac.il), Tamir Fridman (tamirf@cs.bgu.ac.il), Shiraz Bernard (shirazb@cs.bgu.ac.il), Zvika Ferents (ferents@cs.bgu.ac.il), Roy Feintuch (feintuch@cs.bgu.ac.il), Chen Shalev (shalevc@cs.bgu.ac.il), Shay Kraim (kraim@cs.bgu.ac.il), Alex Hayuit

  33. Visit us at • www.cs.bgu.ac.il/~bgrfs
