Ongoing Work on Peer-to-Peer Networks. July 30, 2014. Prof. Ben Y. Zhao, ravenben@cs.ucsb.edu
Large-scale Network Applications
• Context
  • trend: applications increasing in scale (clients, distribution)
  • examples: wide-area real-time multicast (PayPerView), data dissemination (PointCast), large distributed file systems
  • mutual sharing of data, inter-node communication
• Challenges as networks scale
  • how does node A find a particular piece of data?
    • start with simple names (complex queries later)
  • how does node A send a message to node B?
    • IP addresses are too static; need application-level, location-independent names
  • how do we make communication reliable?
Adding Name-based Structure to Networks
• No network structure
  • any node can connect to any node: flexible
  • scalability issue: too many routing table entries
  • need more flexible naming, IP addresses too static
• Trade off flexibility for scalability
  • name all nodes with application-level nodeIDs
  • route incrementally to destination
  • use nodeID as measure of routing progress
Outline
• Motivation
• Protocols
  • routing: Chord, Tapestry/Pastry, etc.
  • dynamic algorithms
• Application Interfaces
• Ongoing Work
• Wrap-up
Structured Peer-to-Peer Overlays
• Assign random nodeIDs and keys from secure hash
• incrementally route towards destination ID
• each node has small set of outgoing routes, e.g. prefix routing
[Figure: prefix routing example; labels: ID: ABCE, ABC0, AB5F, A930, To: ABCD]
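A minimal sketch of incremental prefix routing, assuming each hop must share a strictly longer ID prefix with the destination than the current node does. The node IDs are borrowed from the slide's figure; the neighbor lists in `TABLES` and the helper names are hypothetical and serve only as illustration.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits that two IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def next_hop(current: str, neighbors: list, dest: str):
    """Return the neighbor that extends the shared prefix the most, or None."""
    have = shared_prefix_len(current, dest)
    best = max(neighbors, key=lambda n: shared_prefix_len(n, dest), default=None)
    if best is None or shared_prefix_len(best, dest) <= have:
        return None  # no neighbor makes progress toward dest
    return best


def route(source: str, dest: str, tables: dict) -> list:
    """Follow next_hop until the destination is reached or progress stops."""
    path = [source]
    while path[-1] != dest:
        hop = next_hop(path[-1], tables.get(path[-1], []), dest)
        if hop is None:
            break
        path.append(hop)
    return path


# Hypothetical outgoing routes, one short list per node.
TABLES = {"A930": ["AB5F"], "AB5F": ["ABC0"], "ABC0": ["ABCE"], "ABCE": []}

print(route("A930", "ABCE", TABLES))  # ['A930', 'AB5F', 'ABC0', 'ABCE']
```

Each hop resolves at least one more digit of the destination ID, which is why the hop count is bounded by the number of digits in an ID.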
What’s in a Protocol?
• Definition of name-proximity
  • each hop gets you “closer” to destination ID
  • prefix routing, numerical closeness, Hamming distance
• Size of routing table
  • amount of state kept by each node as f(N), N = network size
• Number of overlay routing hops
  • worst-case routing performance (in overlay hops, not IP hops)
• Network locality
  • does choice of neighbor consider network distance?
  • impact on “actual” performance of P2P routing
• Application Interface
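The three notions of name-proximity listed above can be written down in a few lines. This is an illustrative sketch only; the function names and the choice of integer versus string IDs are assumptions, not part of any particular protocol.

```python
def prefix_closeness(a: str, b: str) -> int:
    """Prefix routing: more shared leading digits means closer."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def ring_distance(a: int, b: int, bits: int) -> int:
    """Numerical closeness on a circular namespace of 2**bits IDs."""
    size = 2 ** bits
    return min((a - b) % size, (b - a) % size)


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two IDs differ."""
    return bin(a ^ b).count("1")
```

Whichever metric a protocol picks, routing works the same way: every hop must reduce the distance (or increase the closeness) to the destination ID.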
Chord
• NodeIDs are numbers on a ring
• Closeness defined by numerical proximity
• Finger table
  • keep routes to the next node $2^i$ away in the namespace
  • routing table size: $\log_2 n$
  • n = total number of nodes
• Routing
  • iterative hops from source
  • at most $\log_2 n$ hops
[Figure: Chord ring with nodes at 0 (= 1024), 128, 256, 384, 512, 640, 768, 896]
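A small sketch of Chord-style finger tables and lookup, using the eight node positions from the slide's ring figure (namespace of size 1024). It assumes global knowledge of the node list to build the tables, which a real deployment would not have; the function names are hypothetical.

```python
from bisect import bisect_left

M = 10                      # ID bits, so the ring has 2**10 = 1024 positions
RING = 2 ** M
NODES = sorted([0, 128, 256, 384, 512, 640, 768, 896])


def successor(ident: int) -> int:
    """First node at or after ident, wrapping around the ring."""
    i = bisect_left(NODES, ident % RING)
    return NODES[i % len(NODES)]


def finger_table(n: int) -> list:
    """Finger i points at the successor of n + 2**i."""
    return [successor(n + 2 ** i) for i in range(M)]


def dist(a: int, b: int) -> int:
    """Clockwise distance from a to b."""
    return (b - a) % RING


def lookup(start: int, key: int) -> list:
    """Return the nodes visited while resolving key from start."""
    node, path = start, [start]
    if dist(node, key) == 0:
        return path                      # start node is itself responsible
    while True:
        succ = successor(node + 1)
        if 0 < dist(node, key) <= dist(node, succ):
            path.append(succ)            # key falls between node and its successor
            return path
        # Jump to the finger that gets closest to the key without passing it.
        preceding = [f for f in finger_table(node)
                     if 0 < dist(node, f) < dist(node, key)]
        node = max(preceding, key=lambda f: dist(node, f))
        path.append(node)


print(finger_table(0))
print(lookup(0, 700))
```

Because each finger jump covers at least half of the remaining clockwise distance, the number of jumps grows logarithmically with the size of the ring.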
Chord II
• Pros
  • simplicity
• Cons
  • limited flexibility in routing
  • neighbor choices unrelated to network proximity (but can be optimized over time)
• Application Interface
  • distributed hash table (DHash)
Tapestry / Pastry
• incremental prefix routing
  • 1111 → 0XXX → 00XX → 000X → 0000
• routing table
  • keep nodes matching at least i digits to destination
  • table size: $b \cdot \log_b n$
• routing
  • recursive routing from source
  • at most $\log_b n$ hops
[Figure: ring with nodes at 0 (= 1024), 128, 256, 384, 512, 640, 768, 896]
Routing in Detail
Example: octal digits, $2^{12}$ namespace, routing from 2175 to 0157
Path: 2175 → 0880 → 0123 → 0154 → 0157
Neighbor Map For “2175” (Octal), one column per routing level:
0xxx  20xx  210x  2170
1xxx  ----  211x  2171
----  22xx  212x  2172
3xxx  23xx  213x  2173
4xxx  24xx  214x  2174
5xxx  25xx  215x  ----
6xxx  26xx  216x  2176
7xxx  27xx  ----  2177
Routing Levels: 4 3 2 1
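The neighbor map for node 2175 can be generated mechanically. The sketch below is an illustration with a hypothetical function name: row `i` of the result holds the patterns that share `i` leading digits with the node, and the slot matching the node's own next digit is left empty (`----`).

```python
def neighbor_map(node_id: str, base: int = 8) -> list:
    """Build the prefix patterns for each routing level (base <= 10 assumed)."""
    digits = len(node_id)
    table = []
    for level in range(digits):
        prefix = node_id[:level]          # digits already matched at this level
        row = []
        for d in range(base):
            if str(d) == node_id[level]:
                row.append("-" * digits)  # own digit: no neighbor needed
            else:
                row.append(prefix + str(d) + "x" * (digits - level - 1))
        table.append(row)
    return table


for row in neighbor_map("2175"):
    print(" ".join(row))
```

With base 8 and four digits the map has 8 × 4 = 32 slots, which matches the b · log_b n table size for a namespace of 8^4 = 2^12 IDs.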
Tapestry / Pastry II
• Pros
  • large flexibility in neighbor choice: choose nodes closest in physical distance
  • can tune routing table size and routing hops using parameter b
• Cons
  • more complex than Chord to implement
• Application Interface
  • Tapestry: decentralized object location
  • Pastry: distributed hash table
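To make the effect of parameter b concrete, the following sketch evaluates the b · log_b n table size and log_b n hop bound from the earlier slide for a few bases. The network size of 2^20 IDs is an arbitrary assumption for illustration, and the function names are hypothetical.

```python
def digits_needed(n: int, b: int) -> int:
    """Smallest d with b**d >= n, i.e. ceil(log_b n) in exact integer arithmetic."""
    d, capacity = 0, 1
    while capacity < n:
        capacity *= b
        d += 1
    return d


def table_size_and_hops(n: int, b: int) -> tuple:
    """Return (routing table entries, worst-case overlay hops) for base b."""
    hops = digits_needed(n, b)
    return b * hops, hops


for b in (2, 4, 16):
    size, hops = table_size_and_hops(2 ** 20, b)
    print(f"b={b:2d}  table entries={size:3d}  max hops={hops}")
```

A larger base buys fewer hops at the price of more routing state per node.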
Lots and Lots of Protocols
• Other designs: Kademlia, Coral, etc.
• For network locality
  • SkipNet / Skip graphs
  • LAND
• For theory
  • Viceroy (dynamic adaptation of butterfly network)
  • Symphony
  • Ulysses
• For performance
  • One-hop / Two-hop Routing (static hierarchy)
  • Brocade (two-layered approach for WAN routing)
  • XRing / ZRing
Questions?
Outline
• Motivation
• Protocols
• Application Interfaces
  • distributed hash tables
  • decentralized object location
  • multi-tiered interfaces
• Ongoing Work
• Wrap-up
How Do We Use Them?
• Key-based Routing (KBR) layer
  • Large sparse ID space $N$ (160 bits: 0 to $2^{160}$, represented in base b)
  • Nodes in overlay network have nodeIDs $\in N$
  • Given $k \in N$, overlay deterministically maps k to its root node (a live node in the network)
• Main routing call
  • route (key, msg)
  • Route message to node currently responsible for key
• Leveraging KBR
  • storage: pick a node by name to store data
  • routing: route messages between nodes by nodeID
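A toy sketch of the `route(key, msg)` call. It assumes SHA-1 for the 160-bit IDs and a successor-style rule for choosing the root (the first live nodeID at or after the key, wrapping around); real overlays define the root differently, and the `Overlay` class and its methods are hypothetical.

```python
import hashlib
from bisect import bisect_left


def make_id(name: str) -> int:
    """Hash a name into the 160-bit ID space."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class Overlay:
    def __init__(self, names):
        self.live = {make_id(n): n for n in names}   # nodeID -> node name
        self.inbox = {n: [] for n in names}          # delivered messages

    def root(self, key: int) -> str:
        """Deterministically map a key to a live node (assumed successor rule)."""
        ids = sorted(self.live)
        return self.live[ids[bisect_left(ids, key) % len(ids)]]

    def route(self, key: int, msg: str) -> str:
        """Deliver msg to the node currently responsible for key."""
        node = self.root(key)
        self.inbox[node].append(msg)
        return node

    def leave(self, name: str) -> None:
        """Remove a node; its keys fall to another live node."""
        del self.live[make_id(name)]
```

Calling `route` twice with the same key reaches the same node until membership changes, after which the key is remapped to another live node.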
Storage: Distributed Hash Tables
• P2P layer = a hash table
  • key is object ID
• write (key, data)
  • store data at k nodes closest in name to key
  • k is system parameter (replication factor)
• data = read (key)
  • read data from any of k nodes close to key
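The write/read interface above can be sketched as follows. This toy version keeps every node's store in one process and assumes that "closest in name" means smallest absolute numeric difference between IDs; the `DHT` class, its methods, and the use of SHA-1 are assumptions for illustration.

```python
import hashlib


def make_id(name: str) -> int:
    """Hash a name into a 160-bit ID."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class DHT:
    def __init__(self, node_names, k: int = 3):
        self.k = k                                         # replication factor
        self.stores = {make_id(n): {} for n in node_names}  # nodeID -> local store

    def _closest(self, key: int) -> list:
        """IDs of the k live nodes numerically closest to key."""
        return sorted(self.stores, key=lambda nid: abs(nid - key))[: self.k]

    def write(self, key: int, data) -> None:
        for nid in self._closest(key):
            self.stores[nid][key] = data

    def read(self, key: int):
        """Return data from any replica holder that still has it."""
        for nid in self._closest(key):
            if key in self.stores[nid]:
                return self.stores[nid][key]
        return None

    def fail(self, node_id: int) -> None:
        """Simulate a node leaving along with its stored data."""
        del self.stores[node_id]
```

With k = 3, a read still succeeds after the closest replica holder fails, because two other copies remain among the nodes nearest the key.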
DHT II
• Pros
  • simplicity, just store and forget
  • rely on storage layer to keep data available across changes in node membership
  • servers generally distant in network and fault-independent
• Cons
  • P2P layer controls parameters: where data is stored, how many replicas
    • one size rarely fits all
  • lack of network locality
Storage: Decentralized Object Location
• let application choose where to store data
• P2P layer provides directory service to locate objects
• redirect data traffic using log(n) in-network redirection pointers
• average # of pointers/machine: log(n) * avg files/machine
[Figure labels: backbone, publish(k), routeobj(k), k]
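A sketch of the directory idea behind `publish(k)` and `routeobj(k)`: the server holding an object leaves a pointer at every overlay hop on its path to the key's root, and a client's request is redirected at the first hop that holds such a pointer. The single per-key `next_hop` table, the node names, and the `DOLR` class are hypothetical simplifications.

```python
class DOLR:
    def __init__(self, next_hop: dict):
        # next_hop[node] is the next node toward the key's root; the root maps to None.
        self.next_hop = next_hop
        self.pointers = {node: {} for node in next_hop}

    def path_to_root(self, start: str) -> list:
        path = [start]
        while self.next_hop[path[-1]] is not None:
            path.append(self.next_hop[path[-1]])
        return path

    def publish(self, key: str, server: str) -> None:
        """Leave a redirection pointer at each hop from server to root."""
        for node in self.path_to_root(server):
            self.pointers[node][key] = server

    def routeobj(self, key: str, client: str):
        """Route toward the root; stop at the first pointer found."""
        hops = []
        for node in self.path_to_root(client):
            hops.append(node)
            if key in self.pointers[node]:
                return self.pointers[node][key], hops
        return None, hops


# Hypothetical topology: S stores the object, C is a client, R is the root.
dolr = DOLR({"R": None, "B": "R", "S": "B", "D": "B", "C": "D"})
dolr.publish("k", "S")
print(dolr.routeobj("k", "C"))  # ('S', ['C', 'D', 'B'])
```

The request is redirected at B and never reaches the root, which is how in-network pointers keep lookups for nearby objects local.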
DOLR II
• Pros
  • application has control over data placement: can optimize location, replication factor for performance
• Cons
  • directory pointers require state in network
  • additional complexity in managing data
Outline
• Motivation
• Protocols
• Application Interfaces
• Ongoing Work
• Wrap-up
Many Structured P2P Applications
• Data storage
  • file systems, FS backup, steganographic FS
  • Web caches, CDNs, DNS services
• Search on P2P
  • service discovery, P2P databases
• Routing
  • application-level multicast, pub/sub
  • resilient routing tunnels
• Others
  • spam-filtering, collaborative network measurement, machine fault diagnosis
But Few (if any) Are Deployed
• Deployment outside of research networks
  • still file-sharing based (Kenosis/BitTorrent, E-Donkey)
• Why? Is P2P only good for file-sharing?
• Consider some factors
  • killer app with real user demand
  • usability (software engineering)
  • incentives vs. per-user cost
  • security
• What do we do about this?
Addressing P2P Security
• The problem
  • lots of users, spread over wide area, multiple network domains
  • no uniform security policy or management capability
  • result: expect compromised nodes in normal operation
• Existing work
  • secure routing: trade off efficiency for improved resilience against collusion
  • prognosis is bleak
  • one against many (colluding attackers)
• An alternative
  • balance the scale: many against many
  • form trusted groups to collaboratively stave off attackers
  • incrementally build trust groups with anonymous verification
P2P Security cont.
• Mechanisms
  • highly dynamic collaborative reputation system
    • todo
  • anonymous communication in P2P
    • under development: Cashmere
• Policies / algorithms
  • how to perform anonymous verification
  • how to derive / adapt online reputations
• A related topic
  • what are the weaknesses of current P2P systems?
  • what are the main methods of attack?
  • how do we perform / protect against these attacks?
Finding the Right Incentive/Cost Model
• Deployment
  • current focus on infrastructure services
  • need useful, lightweight apps for home users
• Design and implement Quartz
  • lightweight P2P data sharing system
  • store your most critical files (<100MB) online
  • use simple application-specific handlers to provide fast data synchronization (a la CVS)
  • synchronize your HTML bookmarks across machines
  • synchronize your papers, homework files, financial records
  • end-to-end encryption
Other Directions
• Understanding unstructured P2P systems
  • understanding content-based centralization and its implications
  • edge-based measurements of Freenet
  • studies of Maze, an academic P2P system from China
• More P2P applications
  • reliable and efficient event propagation (large-scale distributed gaming)
  • P2P eBay (secure online commerce)
Other Directions cont.
• Applying decentralized algorithms elsewhere
  • routing and data management in sensor and ad-hoc networks
  • reduce routing state and flooding traffic
  • energy-efficient data aggregation in sensor nets
  • spectrum allocation and MAC-layer device coordination
Outline
• Motivation
• Protocols
• Application Interfaces
• Ongoing Work
• Wrap-up
Finally…
• Structured Peer-to-Peer Networks are useful (& fun)
  • hold promise for self-maintaining decentralized networks at Internet scales
  • relevance to numerous areas in CS
    • sensor networks, ad-hoc routing, security, theory
• For more information …
  • see the webpage for my Winter 290: http://www.cs.ucsb.edu/~ravenben/classes/290F
  • see papers from IPTPS:
    • http://iptps05.cs.cornell.edu/
    • http://iptps04.cs.ucsd.edu
    • http://iptps03.cs.berkeley.edu