
Pastis: a peer-to-peer file system for persistent large-scale storage Jean-Michel Busca

Pastis: a peer-to-peer file system for persistent large-scale storage. Jean-Michel Busca, Fabio Picconi, Pierre Sens. LIP6, Université Paris 6 – CNRS, Paris, France; INRIA, Rocquencourt, France. Outline: DHT-based File Systems; Pastis; Performance evaluation. Distributed file systems.





Presentation Transcript


  1. Pastis: a peer-to-peer file system for persistent large-scale storage
     Jean-Michel Busca, Fabio Picconi, Pierre Sens
     LIP6, Université Paris 6 – CNRS, Paris, France
     INRIA, Rocquencourt, France
     JTE HPC/FS

  2. Outline
     • DHT-based File Systems
     • Pastis
     • Performance evaluation

  3. Distributed file systems
     [Figure: architecture versus scalability (number of nodes); the individual systems are not recoverable from the transcript]
     * uses a Distributed Hash Table (DHT) to store data

  4. Distributed Hash Tables
     [Figure: DHT nodes with identifiers 18, 24, 32, 40, 52, 66, 75, 83, 91]

  5. DHTs
     [Figure: overlay network. Nodes 18, 24, 32, 40, 52, 66, 75, 83, 91 of the logical address space are spread over Asia, Europe, North America, South America and Australia]
     • high latency, low bandwidth between logical neighbors

  6. Insertion of blocks in DHT
     put(8959, block)
     [Figure: address space with nodes 04F2, 3A79, 5230, 834B, 8909, 8954, 8957, 895D, 8BB2, AC78, C52A, E25A. Labels: k = 8958, k = 8959, root of key 8959, block, replica]

  7. PAST: Storage System
     • PAST: cooperative, archival file storage and distribution
     • Layered on top of Pastry
     • Goals:
       • Strong persistence of the data
       • High availability
       • Scalability of the system
       • Reduced cost (no backup)
       • Efficient use of pooled resources

  8. Insertion of blocks in DHT
     put(8959, block)
     [Figure: address space with nodes 04F2, 3A79, 5230, 834B, 8909, 8954, 8957, 895D, 8BB2, AC78, C52A, E25A. Labels: k = 8958, k = 8959, root of key 8959, block, replica, replica]

  9. Insertion of blocks in DHT
     get(8959, block)
     [Figure: address space with nodes 04F2, 3A79, 5230, 834B, 8909, 8954, 8957, 895D, 8BB2, AC78, C52A, E25A. Labels: k = 8958, k = 8959, block, replica, replica]
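The put and get slides above can be illustrated with a minimal Python sketch. It assumes, as in Pastry and PAST, that the root of a key is the live node whose identifier is numerically closest to the key and that replicas go to the next-closest nodes. The names `ToyDHT`, `replica_set` and `ring_distance`, the 16-bit identifier space and the replication factor of 3 are assumptions made for this illustration; they are not taken from the slides. Hop-by-hop routing is not modeled.

```python
# Sketch only: a flat, in-memory stand-in for a DHT. Every node is known
# up front, so the root is computed directly instead of being routed to.

ID_SPACE = 1 << 16  # assumption: 16-bit IDs, matching the 4-hex-digit labels

# Node identifiers as they appear on the slides.
NODE_IDS = [0x04F2, 0x3A79, 0x5230, 0x834B, 0x8909, 0x8954,
            0x8957, 0x895D, 0x8BB2, 0xAC78, 0xC52A, 0xE25A]


def ring_distance(a, b):
    """Distance between two identifiers on the circular address space."""
    d = abs(a - b)
    return min(d, ID_SPACE - d)


def replica_set(key, node_ids, k=3):
    """Return the k nodes closest to key; the first one is the root."""
    return sorted(node_ids, key=lambda n: ring_distance(n, key))[:k]


class ToyDHT:
    def __init__(self, node_ids, k=3):
        self.k = k
        self.stores = {n: {} for n in node_ids}  # one block store per node

    def put(self, key, block):
        # Store the block on the root and on the other replica nodes.
        for node in replica_set(key, list(self.stores), self.k):
            self.stores[node][key] = block

    def get(self, key):
        # Ask the root first, then fall back to the other replicas.
        for node in replica_set(key, list(self.stores), self.k):
            if key in self.stores[node]:
                return self.stores[node][key]
        raise KeyError(key)


dht = ToyDHT(NODE_IDS)
dht.put(0x8959, b"block")
print([format(n, "04X") for n in replica_set(0x8959, NODE_IDS)])
print(dht.get(0x8959))
```

With the node identifiers from the slides, key 8959 maps to nodes 8957, 895D and 8954, in that order of distance.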

  10. P2P File systems architecture
      FS layer (Ivy / Pastis): open(), read(), write(), close(), etc.
      • files and directories
      • read-write access semantics
      • security and access control
      DHT layer (DHash / Past): block = get(key), put(key, block)
      • block store (DHT): scalability, fault-tolerance, self-organization
      • message routing

  11. DHT-based file systems
      • Ivy [OSDI’02]
        • log-based, one log per user
        • fast writes, slow reads
        • limited to small number of users
      • OceanStore [FAST’03]
        • updates serialized by primary replicas
        • partially centralized system
        • BFT agreement protocol requires well-connected primary replicas
      [Figure: User A's, User B's and User C's logs, each stored as a DHT object; primary replicas and secondary replicas]

  12. Pastis

  13. Pastis design
      Design goals
      • simple
      • completely decentralized
      • scalable (network size and number of users)
      [Figure: layers, top to bottom: Pastis (FS); Past (storage, DHT), accessed through put(key, block) and block = get(key); Pastry (routing)]

  14. Pastis data structures
      Data structures similar to the Unix file system
      • inodes are stored in modifiable DHT blocks (UCBs)
      • file contents are stored in immutable DHT blocks (CHBs)
      [Figure: the inode key addresses a UCB holding the file inode (metadata, block addresses); the block addresses point to CHB1 and CHB2, which hold the file contents. The UCB, CHB1 and CHB2 each have a replica set in the DHT address space]

  15. Pastis data structures (cont.)
      • directories contain <file name, inode key> entries
      • use indirect blocks for large files
      [Figure: a directory inode (UCB: metadata, block addresses) points to a CHB holding the entries (file1, key1), (file2, key2), …; the file1 inode (UCB: metadata, block addresses) points to CHBs holding file contents, directly or through an indirect block. Some CHBs are labeled "old contents", others "file contents"]
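The two data-structure slides above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Pastis implementation: the DHT is a plain dictionary, inodes and directories are serialized as JSON, inode keys are arbitrary strings (hypothetical `"key1"`, `"dirkey"`), the block size is 4 bytes so that a short file spans several blocks, and indirect blocks are left out. The function names are invented for this example.

```python
import hashlib
import json

dht = {}  # hypothetical stand-in for the DHT: key -> block

BLOCK_SIZE = 4  # tiny on purpose, so the example spans several blocks


def put_chb(data):
    """Store an immutable block under the hash of its contents."""
    key = hashlib.sha1(data).hexdigest()
    dht[key] = data
    return key


def write_file(inode_key, data):
    """Split data into content blocks and store an inode listing their addresses."""
    addresses = [put_chb(data[i:i + BLOCK_SIZE])
                 for i in range(0, len(data), BLOCK_SIZE)]
    inode = {"metadata": {"size": len(data)}, "block_addresses": addresses}
    # The inode block is modifiable: it is overwritten under the same key.
    dht[inode_key] = json.dumps(inode).encode()


def read_file(inode_key):
    """Fetch the inode, then fetch and concatenate the blocks it lists."""
    inode = json.loads(dht[inode_key])
    return b"".join(dht[a] for a in inode["block_addresses"])


def lookup(dir_inode_key, name):
    """Resolve a file name to an inode key through a directory."""
    entries = json.loads(read_file(dir_inode_key))
    return entries[name]


# A file, and a directory whose contents are <file name, inode key> entries.
write_file("key1", b"hello pastis")
write_file("dirkey", json.dumps({"file1": "key1"}).encode())
print(read_file(lookup("dirkey", "file1")))

# Updating the file writes new content blocks and a new inode version;
# blocks holding the old contents are left in place.
write_file("key1", b"hello world!")
print(read_file("key1"))
```

In this sketch an update never touches existing content blocks: only the inode block changes, which mirrors the split between immutable and modifiable blocks on the slides.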

  16. Content Hash Block (CHB)
      block key = Hash(block contents)
      • block has to be immutable
      Solution to check and prevent modification
      • block contents determine block key
      • can detect if block is modified
      [Figure: data block holding the block contents]
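The check described on the CHB slide can be sketched in a few lines of Python. The slide only says "Hash"; the choice of SHA-1 and the names `chb_key` and `verify_chb` are assumptions made for this example.

```python
import hashlib


def chb_key(contents):
    """Block key = Hash(block contents). SHA-1 is an assumption here."""
    return hashlib.sha1(contents).hexdigest()


def verify_chb(key, contents):
    """A reader recomputes the hash and compares it with the requested key."""
    return chb_key(contents) == key


block = b"file contents"
key = chb_key(block)
print(verify_chb(key, block))             # True: block is intact
print(verify_chb(key, b"file c0ntents"))  # False: modification detected
```

Because the key is derived from the contents, a modified block cannot be served under the original key without the reader noticing.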

  17. User Certificate Blocks (UCBs)
      block key = Hash(KBpub)
      UCBs are modifiable by the block owner.
      Question: how to check that the file is modified only by the owner?
      Protocol
      • a key pair (KBpub, KBpriv) is associated with each block
      • the owner builds a signature of the block using KBpriv
      Authentication
      • verify the signature of the UCB using KBpub
      [Figure: UCB holding the inode contents, a timestamp and sign(KBpriv)]
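The UCB protocol above can be illustrated with a toy Python sketch. To stay within the standard library it uses textbook RSA over two small, publicly known primes, which is insecure and only stands in for a real signature scheme. The slides do not name the signature or hash algorithms, so RSA, SHA-256 and SHA-1, the block layout and every function name here are assumptions. The verifier is handed KBpub explicitly; how it obtains the key is not modeled.

```python
import hashlib

# Toy RSA key pair from two known Mersenne primes. INSECURE: illustration only.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)

KB_PUB = (N, E)
KB_PRIV = (N, D)


def digest(contents, timestamp):
    """Hash the signed fields (timestamp and contents) down to an integer < N."""
    data = timestamp.to_bytes(8, "big") + contents
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def ucb_key(pub):
    """Block key = Hash(KBpub). The encoding of the key is an assumption."""
    n, e = pub
    return hashlib.sha1(f"{n}:{e}".encode()).hexdigest()


def make_ucb(contents, timestamp, priv):
    """The owner signs the block contents and timestamp with KBpriv."""
    n, d = priv
    return {"contents": contents, "timestamp": timestamp,
            "signature": pow(digest(contents, timestamp), d, n)}


def verify_ucb(key, ucb, pub):
    """Check that pub matches the block key, then check the signature."""
    n, e = pub
    if ucb_key(pub) != key:
        return False
    return pow(ucb["signature"], e, n) == digest(ucb["contents"], ucb["timestamp"])


key = ucb_key(KB_PUB)
ucb = make_ucb(b"inode v1", 1, KB_PRIV)
print(verify_ucb(key, ucb, KB_PUB))     # signature made by the owner

forged = dict(ucb, contents=b"inode v2")  # contents changed without KBpriv
print(verify_ucb(key, forged, KB_PUB))
```

Unlike a CHB, the block key here does not depend on the contents, so the owner can publish a new version under the same key; the signature is what ties each version to the holder of KBpriv.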
