Peer-to-Peer Filesystems Tom Roeder CS414 2005sp
Nature of P2P Systems • We discussed this a little in 415 on Friday • P2P: communicating peers in the system • normally an overlay in the network • In some sense, P2P is older than the name • many protocols used symmetric interactions • not everything is client-server • What’s the real definition? • no one has a good one yet • depends on what you want to fit in the class
Nature of P2P Systems • Standard definition • symmetric interactions between peers • no distinguished server • Minimally: is the Web a P2P system? • We don’t want to say that it is • but it is, under this definition • I can always run a server if I want: no asymmetry • There must be more structure than this • Let’s try again
Nature of P2P Systems • Recent definition • No distinguished initial state • Each server has the same code • servers cooperate to handle requests • clients don’t matter: servers are the P2P system • Try again: is the Web P2P? • No, not under this def: servers don’t interact • Is the Google server farm P2P? • Depends on how it’s set up? Probably not.
Overlays • Recall: two types of overlays • Unstructured • No infrastructure set up for routing • Random walks, flood search • Structured • Small World Phenomenon: Kleinberg • Set up enough structure to get fast routing • We will see O(log n) • For special tasks, can get O(1)
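To make the contrast concrete, here is a minimal sketch of the two unstructured search styles named above: flooding with a TTL versus a random walk. The toy topology, the `holder` set, and all names are hypothetical, for illustration only.

```python
import random

# Toy unstructured overlay: node id -> set of neighbor ids.
overlay = {
    0: {1, 2}, 1: {0, 3}, 2: {0, 3, 4},
    3: {1, 2, 5}, 4: {2, 5}, 5: {3, 4},
}
holder = {4}  # nodes that hold the object we are searching for

def flood_search(start, ttl):
    """Flood the query to all unseen neighbors, decrementing a TTL."""
    frontier, seen, msgs = {start}, {start}, 0
    while frontier and ttl > 0:
        nxt = set()
        for node in frontier:
            for nb in overlay[node] - seen:
                msgs += 1
                seen.add(nb)
                nxt.add(nb)
        if seen & holder:
            return True, msgs
        frontier, ttl = nxt, ttl - 1
    return bool(seen & holder), msgs

def random_walk(start, max_steps):
    """Forward the query to one random neighbor per step."""
    node, msgs = start, 0
    for _ in range(max_steps):
        if node in holder:
            return True, msgs
        node = random.choice(list(overlay[node]))
        msgs += 1
    return node in holder, msgs

print(flood_search(0, ttl=3))  # finds the object fast, but many messages
print(random_walk(0, 20))      # one message per step, but no guarantee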
Overlays: Unstructured • From Gribble’s measurement study • a common unstructured overlay • look at its connectivity • more structure than it seems at first
Overlays: Unstructured • Gossip: state synchronization technique • Instead of forced flooding, share state • Do so infrequently with one neighbor at a time • Original insight from epidemic theory • Convergence of state is reasonably fast • with high probability for almost all nodes • good probabilistic guarantees • Trivial to implement • Saves bandwidth and energy consumption
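A minimal sketch of the anti-entropy style of gossip described above, assuming versioned key-value state; `GossipNode` and the push-pull merge rule are illustrative, not any particular system’s API.

```python
import random

class GossipNode:
    """Gossip sketch: each round a node picks one random peer and they
    merge states, keeping the higher version number for each key."""
    def __init__(self, name):
        self.name = name
        self.state = {}  # key -> (version, value)

    def put(self, key, value):
        ver = self.state.get(key, (0, None))[0] + 1
        self.state[key] = (ver, value)

    def merge(self, other):
        for key, (ver, val) in other.state.items():
            if ver > self.state.get(key, (0, None))[0]:
                self.state[key] = (ver, val)

def gossip_round(nodes):
    # Each node exchanges state with one random peer (push-pull),
    # rather than flooding everyone.
    for node in nodes:
        peer = random.choice([n for n in nodes if n is not node])
        node.merge(peer)
        peer.merge(node)

nodes = [GossipNode(i) for i in range(8)]
nodes[0].put("x", 42)
rounds = 0
while any("x" not in n.state for n in nodes):
    gossip_round(nodes)
    rounds += 1
print(f"converged after {rounds} rounds")  # typically O(log n) rounds
```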
Overlays: Structured • Need to build up long-distance pointers • think of routing within levels of a namespace • e.g. the namespace is 10-digit numbers, base 4 • 0112032101 • then you can hop levels to find other nodes • This is the most common structure imposed
Distributed Hash Tables • One way to do this structured routing • Assign each node an id from the space • e.g. 128 bits: SHA-1 salted hash of IP address • build up a ring: circular hashing • assign nodes into this space • Value • diversity of neighbors • even coverage of the space • less chance of attack?
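A sketch of the id assignment just described: a salted SHA-1 hash of the IP address, truncated to the id space. The salt string here is a placeholder.

```python
import hashlib

def node_id(ip: str, bits: int = 128) -> int:
    """Derive a DHT id by hashing the node's (salted) IP address."""
    digest = hashlib.sha1(("salt:" + ip).encode()).digest()
    return int.from_bytes(digest, "big") % (1 << bits)

# Nearby IPs land far apart on the ring: neighbors in id space are
# diverse in the real network, which spreads both load and risk.
for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3"):
    print(ip, hex(node_id(ip)))
```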
Distributed Hash Tables • Why “hash tables”? • Store named objects by hash code • Route the object to the nearest node in the id space • key idea: nodes and objects share an id space • How do you find an object without its name? • Close names don’t help because of hashing • Cost of churn? • In most P2P apps, many joins and leaves • Cost of freeloaders?
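To illustrate the shared id space, here is a sketch of one common placement rule: hash both node names and object names onto the ring, and store each object at the key’s successor. This is the Chord-style rule; Pastry instead uses the numerically closest id, a small variation.

```python
import hashlib
from bisect import bisect_left

RING = 1 << 128  # size of the id space

def hash_id(name: str) -> int:
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % RING

def responsible_node(node_ids, key):
    """Circular hashing: the object lives at the first node id
    clockwise from the key (its successor on the ring)."""
    ids = sorted(node_ids)
    i = bisect_left(ids, key)
    return ids[i % len(ids)]  # wrap around past the largest id

nodes = [hash_id(f"node-{i}") for i in range(10)]
key = hash_id("my-file.txt")
print(hex(responsible_node(nodes, key)))
```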
Distributed Hash Tables • Dangers • Sybil attacks: one node becomes many • id attacks: can place your node wherever • Solutions hard to come by • crypto puzzles / money for IDs? • Certification of routing and storage? • Many routing frameworks in this spirit • Very popular in the late 90s and early 00s • Pastry, Tapestry, CAN, Chord, Kademlia
Applications of DHTs • Almost anything that involves routing • illegal file sharing: obvious application • backup/storage • filesystems • P2P DNS • Good properties • O(log N) hops to find an id • Non-fate-sharing id neighbors • Random distribution of objects to nodes
Pastry: Node Joins • Find another geographically nearby node • Hash IP address to get Pastry id • Try to route a join message to this id • get routing tables from each hop and the destination • select the neighborhood set from the nearby node • get the leaf set from the destination • Give info back to nodes so they can add you • Assuming the Pastry ring is well set up, this procedure will give good parameters
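A much-simplified sketch of what the join buys the new node, assuming the whole ring is visible as one sorted list: insert the new id and derive a leaf set of the l/2 numerically closest ids on each side. This stands in for the state a real Pastry node copies from the hops and destination of its join message; the helper names are hypothetical.

```python
from bisect import insort

def join(ring, new_id, l=4):
    """Insert new_id into a sorted ring of ids and derive its leaf set:
    the l/2 ids on each side, with wraparound at the ends."""
    insort(ring, new_id)
    i = ring.index(new_id)
    n = len(ring)
    left  = [ring[(i - k) % n] for k in range(1, l // 2 + 1)]
    right = [ring[(i + k) % n] for k in range(1, l // 2 + 1)]
    return left, right

ring = sorted([17, 42, 99, 150, 203])
print(join(ring, 120))  # leaf set around the newly joined node 120
```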
Pastry: Node Joins • Consider what happens from node 0 • bootstraps itself • next node to come adds itself and adds this node • Neighborhood information will be bad for a while • need a good way to discover network proximity • This is a current research problem • On node leaves, do the reverse • If a node leaves suddenly, this must be detected • the detecting node removes it from the tables
Pastry: Routing • The key idea: grow common prefix • given an object id, try to send to a node with at least one more digit in common • if not possible, send to a node that is closer numerically • if not possible, then you are the destination • Gives O(log N) hops • Each step gets closer to destination • Guaranteed to converge
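A sketch of this prefix-routing rule over the 10-digit, base-4 namespace from earlier. Real Pastry keeps a routing table indexed by (prefix length, next digit); here a flat scan over known ids stands in for it, and numeric distance ignores ring wraparound for brevity.

```python
def digits(node_id, base=4, length=10):
    """Write an id as a fixed-length string of base-4 digits."""
    ds = []
    for _ in range(length):
        ds.append(node_id % base)
        node_id //= base
    return ds[::-1]

def common_prefix_len(a, b):
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(me, known, target, base=4, length=10):
    t = digits(target, base, length)
    my_len = common_prefix_len(digits(me, base, length), t)
    # Rule 1: a node sharing a strictly longer prefix with the target.
    for n in known:
        if common_prefix_len(digits(n, base, length), t) > my_len:
            return n
    # Rule 2: an equally long prefix, but numerically closer.
    closer = [n for n in known
              if common_prefix_len(digits(n, base, length), t) >= my_len
              and abs(n - target) < abs(me - target)]
    return min(closer, key=lambda n: abs(n - target)) if closer else me

def route(start, known, target):
    """Hop greedily until no rule makes progress: that node is the
    destination for this id."""
    hops, cur = [start], start
    while True:
        nxt = next_hop(cur, known, target)
        if nxt == cur:
            return hops
        hops.append(nxt)
        cur = nxt

import random
random.seed(1)
ids = random.sample(range(4 ** 10), 50)
print(len(route(ids[0], ids, ids[-1])) - 1, "hops")
```

Each hop either lengthens the shared prefix or shrinks the numeric distance at the same prefix length, which is why the walk converges.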
PAST: Pastry Filesystem • Now a simple filesystem follows: • to get a file, hash its name and look up in Pastry • to store a file, store it in Pastry • Punt on metadata/discovery • Can implement directories as files • Then just need to know the name of the root • Shown to give reasonable utilization of storage space
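A toy version of that store/fetch path, with a `lookup()` call standing in for Pastry routing; `TinyPAST` and all its names are illustrative only.

```python
import hashlib

RING = 1 << 128

def file_key(name: str) -> int:
    """PAST-style naming: a file's key is the hash of its name."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % RING

class TinyPAST:
    """Toy whole-file store over a set of node ids."""
    def __init__(self, node_ids):
        self.nodes = {nid: {} for nid in node_ids}

    def lookup(self, key):
        # Numerically closest node id (Pastry's placement rule),
        # ignoring ring wraparound for brevity.
        return min(self.nodes, key=lambda nid: abs(nid - key))

    def store(self, name, data):
        self.nodes[self.lookup(file_key(name))][name] = data

    def fetch(self, name):
        return self.nodes[self.lookup(file_key(name))].get(name)

past = TinyPAST([file_key(f"node-{i}") for i in range(16)])
past.store("/root/readme.txt", b"hello")
print(past.fetch("/root/readme.txt"))
```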
PAST: File Replication • Since any one node might fail, replicate • Uses the neighbor set for k-way storage • Keeps the same file at each neighbor • Diversity of neighbors avoids fate-sharing • Certification • Each node signs a certificate • Says that it stored the file • Client will retry storage if not enough certificates • OK guarantees
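A sketch of replicated storage with certificates, under two loud simplifications: the k replicas are just the numerically closest node ids, and an HMAC stands in for the signature (a real deployment would use public-key signatures so anyone can verify a certificate).

```python
import hmac, hashlib

def certificate(node_secret: bytes, key: bytes) -> bytes:
    # Stand-in for a signed storage receipt.
    return hmac.new(node_secret, key, hashlib.sha256).digest()

def store_replicated(nodes, key, data, k=3):
    """Store at the k nodes closest to the key and collect one
    certificate per successful store; the client checks the count
    and retries (possibly at farther nodes) if it falls short."""
    targets = sorted(nodes, key=lambda n: abs(n["id"] - key))[:k]
    certs = []
    for node in targets:
        if node["up"]:  # simulate the store RPC succeeding
            node["disk"][key] = data
            certs.append(certificate(node["secret"], repr(key).encode()))
    if len(certs) < k:
        raise RuntimeError(f"only {len(certs)}/{k} replicas certified; retry")
    return certs

ring = [{"id": i * 100, "up": i != 2, "secret": bytes([i]), "disk": {}}
        for i in range(8)]
print(len(store_replicated(ring, key=510, data=b"file-bytes", k=3)))
```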
PAST: Tradeoffs • No explicit FS structure: • Could build any sort of system by storing files • Basically variable-sized block storage mechanism • This buys simplicity at the cost of optimization • Speed vs. storage • See Beehive for this tradeoff • Makes it an explicit formula; can be tuned • Ease of use vs. security • Hashes make file discovery non-transparent
Rationale and Validation • Backing up on other systems • no fate sharing • automatic backup by storing the file • But • Cost much higher than a regular filesystem • Incentives: why should I store your files? • How is this better than tape backup? • How is this affected by churn/freeloaders? • Will anyone ever use it?
PAST: comparison to CFS • CFS: a filesystem built on Chord/DHash • Pastry is MSR, Chord/DHash is MIT • Very similar routing and storage
PAST: comparison to CFS • PAST stores files, CFS blocks • Thus CFS can use space in a more fine-grained way • lookup could be much longer • get each block: must go through routing for each • CFS claims ftp-like speed • Could imagine much faster: get blocks in parallel (see the sketch below) • thus routing is slowing them down • Remember: hops here are overlay hops, not Internet hops • Load balancing in CFS • predictable storage requirements per file per node
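A sketch of why parallel block fetch should help a block store like CFS: independent lookups can overlap, so the per-block routing latency is paid concurrently rather than serially. `fetch_block` and the timings are illustrative only.

```python
from concurrent.futures import ThreadPoolExecutor
import hashlib, time

def fetch_block(block_id):
    """Stand-in for one CFS block lookup: each fetch pays a full
    O(log N) routing cost, simulated here as a fixed delay."""
    time.sleep(0.05)  # simulated routing latency
    return hashlib.sha1(str(block_id).encode()).digest()[:4]

block_ids = list(range(32))

start = time.time()
with ThreadPoolExecutor(max_workers=8) as pool:
    blocks = list(pool.map(fetch_block, block_ids))
print(f"fetched {len(blocks)} blocks in {time.time() - start:.2f}s")
# Serial: ~32 * 0.05s = 1.6s; with 8 fetches in flight: ~0.2s.
```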
References • A. Rowstron and P. Druschel. Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems. IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), Heidelberg, Germany, pages 329-350, November 2001. • A. Rowstron and P. Druschel. Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility. ACM Symposium on Operating Systems Principles (SOSP'01), Banff, Canada, October 2001. • Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan. Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications. ACM SIGCOMM 2001, San Diego, CA, August 2001, pp. 149-160.
References • Frank Dabek, M. Frans Kaashoek, David Karger, Robert Morris, and Ion Stoica. Wide-area cooperative storage with CFS. ACM SOSP 2001, Banff, Canada, October 2001. • Stefan Saroiu, P. Krishna Gummadi, and Steven D. Gribble. A Measurement Study of Peer-to-Peer File Sharing Systems. Proceedings of Multimedia Computing and Networking 2002 (MMCN'02), San Jose, CA, January 2002. • J. Kleinberg. The Small-World Phenomenon: An Algorithmic Perspective. Proceedings of the 32nd ACM Symposium on Theory of Computing (STOC), May 2000. • C. G. Plaxton, R. Rajaraman, and A. W. Richa. Accessing nearby copies of replicated objects in a distributed environment. Proceedings of the 9th Annual ACM Symposium on Parallel Algorithms and Architectures, Newport, Rhode Island, pages 311-320, June 1997.
Conclusions • Tradeoffs are critical • Why are you using it? • What sort of security/anonymity guarantees? • DHT applications • Think of a good one and become famous • PAST • caches whole files • Saves some routing overhead • Harder to implement a true filesystem