This paper discusses the design and implementation of a self-stabilizing distributed file system, focusing on performance, fault tolerance, and placing files closer to users.
Self Stabilizing Distributed File System Shlomi Dolev and Ronen I. Kat Department of Computer Science, Ben-Gurion University Research Sponsored by IBM
DFS Motivation • Performance • Fault tolerance • Placing files closer to users
Related Work • File systems • NFS – Network File System protocol • AFS – Andrew File System – CMU (1988) • Coda – CMU (1998) • InterMezzo – Peter J. Braam, CMU • Peer to peer (2000) • Global storage: OceanStore – Berkeley • Serverless: Microsoft Farsite
Talk Overview • Self-stabilization • Design • Algorithms • File system implementation • Future work
Self Stabilization • Self healing • Adaptiveness • Automatic recovery • Autonomic computing Self Stabilization Dijkstra 1974
Self Stabilization A self-stabilizing system is a system that can automatically recover following the occurrence of (transient) faults. The idea is to design a system that can be started in an arbitrary state and still converge to a desired behaviour. See, e.g., Self-Stabilization, S. Dolev.
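To make "started in an arbitrary state and still converge" concrete, here is a minimal sketch of Dijkstra's K-state token ring from the 1974 paper cited on the previous slide. It is not part of the file system described in this talk; the ring size, the value of K, the random scheduler and the number of moves are illustrative assumptions.

```python
import random

def privileged(x, i):
    """Return True if process i holds a privilege in ring state x."""
    if i == 0:
        return x[0] == x[-1]       # process 0 compares with the last process
    return x[i] != x[i - 1]        # others compare with their predecessor

def move(x, i, k):
    """Let privileged process i take its step."""
    if i == 0:
        x[0] = (x[0] + 1) % k
    else:
        x[i] = x[i - 1]

def run(x, k, moves, rng):
    """Central scheduler: repeatedly pick one privileged process at random."""
    for _ in range(moves):
        enabled = [i for i in range(len(x)) if privileged(x, i)]
        move(x, rng.choice(enabled), k)
    return x

rng = random.Random(0)
n, k = 5, 6                                   # assumed sizes; K must be at least n
ring = [rng.randrange(k) for _ in range(n)]   # arbitrary, possibly corrupted start
run(ring, k, moves=200, rng=rng)
privileges = sum(privileged(ring, i) for i in range(n))
```

Whatever the initial values, the ring ends up with exactly one privilege circulating, which is the legal behaviour of the system.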
Self Stabilization Motivation • The combination and type of faults cannot be totally anticipated in on-going systems • Any on-going system must be self-stabilizing (or manually monitored) • A self-stabilizing algorithm can recover from an arbitrary state reached due to the occurrence of faults
Design • Replication servers joined to a spanning tree • A spanning tree is constructed • File updates are propagated using self-stabilizing -synchronizer
Design (Cont’) • Clients join the replication tree and form a caching tree • File leases • Global locking
Algorithms – Self Stabilizing • Electing a leader (leader election) • Collecting connectivity information • Optimising communication costs • -Synchronizer for file consistency
Leader Election • A single leader coordinates construction • If none exists, a server becomes a leader • If more than one exists, one survives • Messages are periodically broadcast
Leader Election Algorithm
• Every T1 do:
    • If (p = leader) then send-multicast(‘I’m a leader’)
    • Leader-exists = true
• Every T1+Td do:
    • If (not leader-exists) then leader = p
    • Leader-exists = false
• Upon arrival of message do:
    • If (p.volume = volume) then
        • If (p = leader) then leader = min(leader, sender)
        • Else leader = sender
        • Leader-exists = true
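The rules on this slide can be exercised with a small round-based simulation. This is a sketch under simplifying assumptions, not the bgRFS implementation: timers T1 and T1+Td are collapsed into one synchronous round, multicast is modelled as a loop over all servers, and the `Server` class, `round_step` function, volume name and the corrupted initial value 99 are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Server:
    id: int
    volume: str
    leader: int          # may hold garbage after a transient fault
    leader_exists: bool

def round_step(servers):
    # "Every T1": servers that believe they are the leader announce it.
    senders = [s for s in servers if s.leader == s.id]
    for sender in senders:
        sender.leader_exists = True
        for p in servers:
            if p is sender or p.volume != sender.volume:
                continue
            # "Upon arrival of message"
            if p.leader == p.id:
                p.leader = min(p.leader, sender.id)
            else:
                p.leader = sender.id
            p.leader_exists = True
    # "Every T1+Td": a server that heard no leader elects itself.
    for p in servers:
        if not p.leader_exists:
            p.leader = p.id
        p.leader_exists = False

# Arbitrary start: every server points at a leader (99) that does not exist.
servers = [Server(i, "vol0", leader=99, leader_exists=True) for i in (3, 1, 4, 2)]
for _ in range(6):
    round_step(servers)
leaders = {s.leader for s in servers}
```

In this run nobody announces itself at first, every server then times out and elects itself, and the competing announcements collapse to the server with the minimal ID within a few rounds.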
Update Algorithm • Collect routing tables from all neighbours in the induced graph • Elect a manager (local leader) for the tree, a server with the minimal ID • Build a distributed BFS spanning tree • The algorithm converges
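One common way to obtain a BFS spanning tree rooted at the minimal ID from an arbitrary state is for every node to recompute a (root, distance, parent) triple from its neighbours, with a bound on the distance to flush out stale roots. The sketch below uses that textbook technique in synchronous rounds; it is not necessarily the update algorithm of this talk, and the graph, the function name `bfs_round` and the corrupted initial state are assumptions.

```python
def bfs_round(graph, state):
    """One synchronous round: each node recomputes (root, dist, parent)."""
    n = len(graph)
    new_state = {}
    for v, neighbours in graph.items():
        best = (v, 0, None)                 # a node can always propose itself
        for u in sorted(neighbours):        # sorted: ties go to the smaller ID
            root, dist, _ = state[u]
            # The distance bound discards roots that no longer exist.
            if dist + 1 < n and (root, dist + 1) < best[:2]:
                best = (root, dist + 1, u)
        new_state[v] = best
    return new_state

graph = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
# Arbitrary start: every node believes in a non-existent root 0.
state = {v: (0, 1, None) for v in graph}
for _ in range(3 * len(graph)):
    state = bfs_round(graph, state)
```

The phantom root's distance grows every round until it exceeds the bound, after which the server with the minimal ID (1 here) becomes the root and the distances settle to BFS distances.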
Optimising Communication Costs • Goal: find the minimal radius that keeps connectivity • Increase the radius by a factor of 2 • Run a second instance of update with a smaller radius • Search for the minimal radius using binary search
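The doubling-then-binary-search idea can be sketched as follows. The centralized `connected` check stands in for running an instance of the update algorithm, and the integer link costs, node names and function names are hypothetical; the slide does not specify them.

```python
def connected(nodes, costs, radius):
    """True if links with cost <= radius connect all nodes (iterative DFS)."""
    adj = {v: [] for v in nodes}
    for (a, b), c in costs.items():
        if c <= radius:
            adj[a].append(b)
            adj[b].append(a)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for u in adj[stack.pop()]:
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return len(seen) == len(nodes)

def minimal_radius(nodes, costs):
    """Smallest integer radius keeping connectivity (costs are positive ints)."""
    if not connected(nodes, costs, max(costs.values())):
        raise ValueError("graph is disconnected at every radius")
    r = 1
    while not connected(nodes, costs, r):   # phase 1: double the radius
        r *= 2
    lo, hi = r // 2, r    # invariant: hi keeps connectivity, lo does not
    while hi - lo > 1:    # phase 2: binary search between the last two radii
        mid = (lo + hi) // 2
        if connected(nodes, costs, mid):
            hi = mid
        else:
            lo = mid
    return hi

nodes = [0, 1, 2, 3]
costs = {(0, 1): 3, (1, 2): 7, (2, 3): 2, (0, 3): 20, (0, 2): 9}
radius = minimal_radius(nodes, costs)
```

Here doubling stops at radius 8, and the binary search between 4 and 8 narrows the answer to 7, the cheapest radius at which all four nodes remain connected.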
Caching Tree • Extends the replication tree • The update algorithm constructs both • Servers execute two instances • Caches execute one instance
Synchronization Mechanism • Provide reliable command and timing • Propagate commands between servers • Collect and distribute information
Replication Consistency • Verifies signatures • Multiple signatures indicate a conflict • Conflict resolution • Broadcast the resolved signature
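As an illustration of signature-based conflict detection, the sketch below hashes each replica's content and flags a conflict when more than one distinct signature is seen. The use of SHA-256, the replica names and the resolution policy (most common signature wins, ties broken by the smallest hex string) are assumptions made for the example; the slides do not state how signatures are computed or how conflicts are resolved.

```python
import hashlib
from collections import Counter

def signature(content: bytes) -> str:
    """Assumed signature: SHA-256 digest of the file content."""
    return hashlib.sha256(content).hexdigest()

def resolve(signatures):
    """Return (resolved_signature, conflict) for a {server: signature} map."""
    counts = Counter(signatures.values())
    conflict = len(counts) > 1              # multiple signatures: a conflict
    # Hypothetical policy: most common signature, ties by smallest hex value.
    resolved = min(counts, key=lambda s: (-counts[s], s))
    return resolved, conflict

replicas = {"s1": b"v2", "s2": b"v2", "s3": b"v1"}   # hypothetical replicas
sigs = {name: signature(data) for name, data in replicas.items()}
resolved, conflict = resolve(sigs)
```

The resolved signature is what would then be broadcast so that replicas holding a different version can fetch the agreed copy.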
Locking Table • A (unified) global lock table • Locks are requested • Leader resolves multiple locks • Locks are removed by cancelling the lock request
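A minimal sketch of such a lock table might look as follows. The class and method names are hypothetical, and the tie-breaking rule (the requester with the smallest ID holds the lock) is an assumption; the slides only say that the leader resolves multiple locks.

```python
class LockTable:
    """Global table of outstanding lock requests, keyed by file path."""

    def __init__(self):
        self.requests = {}                  # path -> set of requester IDs

    def request(self, path, server_id):
        self.requests.setdefault(path, set()).add(server_id)

    def cancel(self, path, server_id):
        """A lock is removed by cancelling the request for it."""
        holders = self.requests.get(path, set())
        holders.discard(server_id)
        if not holders:
            self.requests.pop(path, None)

    def holder(self, path):
        """Leader's resolution of competing requests (assumed: smallest ID)."""
        holders = self.requests.get(path)
        return min(holders) if holders else None

table = LockTable()
table.request("/vol0/a.txt", 5)
table.request("/vol0/a.txt", 2)
first = table.holder("/vol0/a.txt")
table.cancel("/vol0/a.txt", 2)
second = table.holder("/vol0/a.txt")
```

Because the holder is derived from the set of outstanding requests rather than stored separately, a cancelled request immediately passes the lock to the next requester.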
Accessing a File (flowchart) • Decisions: Update? (Yes/No), Cached? (Yes/No) • Steps: Lock file, Get signature, Get a copy, Use local copy
Closing a File (flowchart) • Decision: Update? (Yes/No) • Steps: Send new signature, Confirm signature
Meta Access • Globally processed • Blocked until a lock is obtained • Diagram steps: Lock file, Execute command, Wait confirmation
Linux Based bgRFS (architecture diagram) • User Level • Application • New implementation: open, close, lstat, mkdir, etc … • SyncDaemon: Cache manager & Server • Network Communication • Linux system calls • Up calls • System Calls
Future Work • Kernel VFS module. • Communication improvements: • Reducing update messages • Using timers with -synchronizer • Performance enhancements • Integrating disconnected operations • Conflict resolution algorithms
Credits Faculty Prof Shlomi Dolev dolev@cs.bgu.ac.il Graduate Students Ronen I. Kat kat@cs.bgu.ac.il System Engineer Albina Budker albinabu@cs.bgu.ac.il Undergraduate Students: Amir Livneh livneha@cs.bgu.ac.il Itay Granik granik@cs.bgu.ac.il Boris Lansky lanskyb@cs.bgu.ac.il Naama Shmuel shmueln@cs.bgu.ac.il Moshe Shish shishm@cs.bgu.ac.il Guy Erlich erlichg@cs.bgu.ac.il Avital Chohen avitalco@cs.bgu.ac.il Yael Biran birany@cs.bgu.ac.il Tamir Fridman tamirf@cs.bgu.ac.il Shiraz Bernard shirazb@cs.bgu.ac.il Zvika Ferents ferents@cs.bgu.ac.il Roy Feintuch feintuch@cs.bgu.ac.il Chen Shalev shalevc@cs.bgu.ac.il Shay Kraim kraim@cs.bgu.ac.il Alex Hayuit
Visit us at • www.cs.bgu.ac.il/~bgrfs