180 likes | 338 Views
Plethora: Infrastructure and System Design. Introduction. Peer-to-Peer (P2P) networks: Self-organizing distributed systems Nodes receive and provide services cooperatively No predetermined client/server roles Key features: Scalable Adaptive and reconfigurable
E N D
Introduction • Peer-to-Peer (P2P) networks: • Self-organizing distributed systems • Nodes receive and provide services cooperatively • No predetermined client/server roles • Key features: • Scalable • Adaptive and reconfigurable • Leverage technology trends (network/processor/memory) • Key problems: • Locating and routing objects efficiently • Consistency management • Fault-Tolerance
Location and Routing - DHT • Apply structure to the network: • Inputs hashed to a key • Each node responsible for a subset of keys • Nodes maintain small routing tables • Queries routed to neighboring nodes that ensure progress towards the ultimate destination.
Location and Routing - DHT 0XXX 1XXX 2XXX 3XXX 0112 2321 START 0112 routes a message to key 2000. 2032 First hop fixes first digit (2) 2001 Second hop fixes second digit (20) END 2001 closest live node to 2000.
Motivation • Virtualization destroys locality. • Query responses do not contain locality information. • Recent studies show that queries for multiple keys in P2P networks follow a Zipf-like distribution. • Exploit geographic locality. • Build highly-distributed collaborative environments and applications: • information lifecycle • distributed file systems • software distribution • archival storage and disaster recover
IP Addresses as Virtual IDs • Incorporate locality into overlay networks: • Explore addressing scheme of the underlying network. • In most cases, nodes with IP addresses that are numerically close are also physically close. • Organization of the Internet in ASs. By correcting a few bits in each hop, the last hops would be inside an AS. • Issues: • IP space is not uniformly populated by peers. • Load imbalance at the peers. • The upper bound of O(log n) can no longer be guaranteed.
IP Addresses as Virtual IDs • 2,420 nodes. 20 keys per node.
Plethora • Two-level overlay • One global overlay • Several local overlays • Global overlay is the main repository of data. • Global overlay helps nodes organize themselves into local overlays. • Local overlays exploit the organization of the Internet in ASs. • Size of the local overlay is controlled by an overlay leader. • Uses efficient distributed algorithms for merging and splitting local overlays.
Simulation Setup • Internet topology generated using GT-ITM topology generator. • 10,000 overlay nodes selected randomly from the hosts. • NLANR web proxy trace with 500,254 objects. • Zipf distribution parameters: {0.75, 0.80, 0.85, 0.90, 0.95} • Local cache size: 5MB (LRU replacement policy).
Summary • IP addresses as virtual IDs: • Overlays with good locality properties. • Non-uniform realworld distribution: • severe load imbalance • no bounded latency • Plethora Routing Core: • Two-level overlay architecture. • Local overlays are created to cluster nodes that are close in the underlying network. • Significant performance gains • Low maintenance overhead
Latency Hiding • For large-scale collaborative and distributed applications: • latency effects are still an issue. • need resliency in the presence of network failures. • Record updates using a transactional versioning system: • Aggregate updates • Distributed conflict resolution
Versioning and Transaction Model T1 Local Home Global home T2 T3 commit Tk Tk+1 Tk+2
Open Issues • Implementation of versioning trees • Efficient update and commit protocols • Dealing with failures (node, network) • Object structure of the repository to exploit versioning semantics • What guarantees can the system provide on object access and consistency of updates?