Free Riding on Gnutella
Eytan Adar, Bernardo A. Huberman
Network          Users
eDonkey2K        4,123,688
FastTrack        2,521,887
Gnutella         1,516,762
Overnet          1,146,880
DirectConnect    294,255
MP2P             251,137
(www.slyck.com, 06/24/’05)

Introduction
• Gnutella history:
  • 3/14/00: release by AOL, almost immediately withdrawn
  • too late: 1,859,340 users on Gnutella on August 25, 2am
  • many iterations to fix poor initial design
• High impact:
  • Versions implemented
  • Different designs
  • Lots of research papers/ideas
Gnutella search mechanism
• Steps:
  • Node 2 initiates search for file A
  • Sends message to all neighbors
  • Neighbors forward message
  • Nodes that have file A initiate a reply message
  • Query reply message is back-propagated
  • File A is downloaded directly from a replying node
[Figure: overlay of nodes 1 to 7; the query for A spreads from node 2, and replies A:5 and A:7 travel back to it]
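
The search steps above can be sketched in a few lines. This is a minimal illustration, not the Gnutella wire protocol: the overlay edges in `OVERLAY` are hypothetical (the slides show nodes 1 to 7 but the links are not recoverable), and `flood_search` is an illustrative name. Each node remembers which neighbor the query first arrived from, so a reply can be back-propagated along the reverse path.

```python
from collections import deque

def flood_search(overlay, holders, source):
    """Flood a query from `source`; return {holder: reply path back to source}."""
    parent = {source: None}   # node -> neighbor the query first arrived from
    queue = deque([source])
    replies = {}
    while queue:
        node = queue.popleft()
        if node in holders and node != source:
            # Back-propagate the reply along the reverse of the query path.
            path, hop = [], node
            while hop is not None:
                path.append(hop)
                hop = parent[hop]
            replies[node] = path
        for neighbor in overlay[node]:
            if neighbor not in parent:   # only the first copy of the query is kept
                parent[neighbor] = node
                queue.append(neighbor)
    return replies

# Hypothetical 7-node overlay (assumed edges); nodes 5 and 7 hold file A.
OVERLAY = {
    1: [2, 4],
    2: [1, 3, 6],
    3: [2, 5],
    4: [1, 7],
    5: [3],
    6: [2, 7],
    7: [4, 6],
}
replies = flood_search(OVERLAY, holders={5, 7}, source=2)
```

Once node 2 has the replies, the download itself bypasses the overlay and goes directly to one of the holders.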
Gnutella Search: Flooding
• Simple and robust
  • No state maintenance needed
  • High tolerance to node failures
• Effective, with low latency
  • Always finds the shortest / fastest routing paths
[Figure: Pure Flooding in P2P Overlay, with nodes labeled HOPS = 0 through HOPS = 6]
Problems of Flooding
• Loops in Gnutella networks
  • Caused by redundant links
  • Result in endless message routing
• Current solutions by Gnutella
  • Detect and discard redundant messages
  • Limit TTL (time-to-live) of messages
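
To make the two remedies concrete, here is a small sketch that counts messages on a three-node loop. It is an illustration under assumptions: `flood`, `TRIANGLE`, and the `dedupe` switch are hypothetical names, and a real client would track message IDs rather than a single shared `seen` set. With duplicate detection the flood stops after one pass; without it, only the TTL ends the circulation, so traffic grows with the TTL.

```python
from collections import deque

def flood(overlay, source, ttl, dedupe=True):
    """Return (nodes reached, messages sent) for a TTL-limited flood."""
    seen = {source}                        # nodes that already handled this query
    queue = deque([(source, None, ttl)])   # (node, sender, remaining TTL)
    messages = 0
    while queue:
        node, sender, remaining = queue.popleft()
        if remaining == 0:
            continue                       # TTL expired: do not forward
        for neighbor in overlay[node]:
            if neighbor == sender:
                continue                   # never echo back to the sender
            messages += 1
            if dedupe and neighbor in seen:
                continue                   # duplicate: received, then discarded
            seen.add(neighbor)
            queue.append((neighbor, node, remaining - 1))
    return seen, messages

# Smallest overlay with a redundant link: a triangle.
TRIANGLE = {1: [2, 3], 2: [1, 3], 3: [1, 2]}

reached, with_dedupe = flood(TRIANGLE, source=1, ttl=4)
_, ttl_only_4 = flood(TRIANGLE, source=1, ttl=4, dedupe=False)
_, ttl_only_8 = flood(TRIANGLE, source=1, ttl=8, dedupe=False)
```

In this toy case duplicate detection caps the cost at 4 messages, while relying on the TTL alone costs 8 messages at TTL 4 and 16 at TTL 8.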
Traffic Minimization: Spanning Tree
• Reduce traffic without changing P2P overlay
• How much bandwidth can we save?
  • Average degree of Gnutella nodes: about 3 to 5
  • N-node spanning tree
    • N-1 links
    • N-1 messages for a broadcast
  • Estimated traffic reduction: about 67% to 80%
• Bandwidth efficiency is not the only objective
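
The 67% to 80% estimate can be reproduced with a short calculation. The slides do not state the flooding cost model, so this sketch assumes that a flood costs roughly N × d messages (each of N nodes sends one copy to each of its d neighbors), which is the model that matches the quoted figures; `traffic_reduction` is an illustrative name.

```python
def traffic_reduction(num_nodes, avg_degree):
    """Fraction of broadcast messages saved by a spanning tree."""
    # Assumption: flooding sends one copy per neighbor per node.
    flooding_messages = num_nodes * avg_degree
    # From the slide: an N-node spanning tree needs N-1 messages.
    tree_messages = num_nodes - 1
    return 1 - tree_messages / flooding_messages

for degree in (3, 5):
    saved = traffic_reduction(num_nodes=10_000, avg_degree=degree)
    print(f"average degree {degree}: about {saved:.0%} fewer messages")
```

For large N the saving approaches 1 - 1/d, which gives about 67% for d = 3 and 80% for d = 5.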
Problems of Spanning Tree
• Long latency for flooding
  • More than 30 hops to cover 95% of nodes
  • Only 7 hops to cover 95% of nodes by Gnutella flooding
• Weak reliability due to node failures
  • A node failure can disconnect a large portion of the network
[Figure: Flooding in Spanning Tree, with nodes labeled HOPS = 0 through HOPS = 11]
Gnutella: Heterogeneity
All Peers Equal? (1)
[Figure: peers with differing access links: 56 kbps modem, 1.5 Mbps DSL, 10 Mbps LAN]
Gnutella: Free Riding
All Peers Equal? (2)
• More than 25% of Gnutella clients share no files; 75% share 100 files or fewer
• Conclusion: Gnutella has a high percentage of free riders
Adar and Huberman (Aug ’00)
Gnutella Summary
• Search by flooding
• Self-configuring
• Phenomena:
  • Not all peers equal
  • Free riding
• Problems:
  • Duplicates due to flooding
GIA: Making Gnutella-like P2P Systems Scalable
Yatin Chawathe, Sylvia Ratnasamy, Scott Shenker, Nick Lanham, Lee Breslau
Introduction
• Scalable Gnutella-like P2P system
• Design principles:
  • Explicitly account for node heterogeneity
  • Query load proportional to node capacity
• Results:
  • GIA outperforms Gnutella by 3–5 orders of magnitude in system capacity
GIA: 10,000-foot view
• Unstructured, but takes node capacity into account
• High-capacity nodes have room for more queries, so send most queries to them
• Will work only if high-capacity nodes:
  • Have correspondingly more answers, and
  • Are easily reachable from other nodes
GIA Design
• Make high-capacity nodes easily reachable
  • Dynamic topology adaptation
• Make high-capacity nodes have more answers
  • One-hop replication
• Search efficiently
  • Biased random walks
• Prevent overloaded nodes
  • Active flow control
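
Two of these components, one-hop replication and biased walks, can be sketched together. This is a simplified illustration, not the GIA implementation: the bias here is fully greedy (always the highest-capacity unvisited neighbor) rather than randomized, flow control is ignored, and the overlay, capacities, and names such as `biased_walk` are hypothetical.

```python
def one_hop_index(node, overlay, files):
    """Files answerable at `node`: its own plus pointers to its neighbors' files."""
    index = {name: node for name in files[node]}
    for neighbor in overlay[node]:
        for name in files[neighbor]:
            index.setdefault(name, neighbor)
    return index

def biased_walk(overlay, capacity, files, start, wanted, max_hops=10):
    """Walk toward high-capacity nodes; return (holder, path) or (None, path)."""
    node, path = start, [start]
    while True:
        index = one_hop_index(node, overlay, files)
        if wanted in index:
            return index[wanted], path
        candidates = [n for n in overlay[node] if n not in path]
        if not candidates or len(path) - 1 >= max_hops:
            return None, path
        # Bias: prefer the unvisited neighbor with the highest capacity.
        node = max(candidates, key=lambda n: capacity[n])
        path.append(node)

# Hypothetical overlay: "c" is the high-capacity hub, "e" holds the file.
OVERLAY = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "e"], "d": ["a"], "e": ["c"]}
CAPACITY = {"a": 1, "b": 10, "c": 100, "d": 1, "e": 1}
FILES = {"a": set(), "b": set(), "c": set(), "d": set(), "e": {"song"}}

holder, path = biased_walk(OVERLAY, CAPACITY, FILES, start="a", wanted="song")
```

The query never visits "e": the hub "c" answers on its behalf because it indexes its neighbors' files, which is the point of combining the two mechanisms.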
Dynamic Topology Adaptation
• Make high-capacity nodes have high degree (i.e., more neighbors)
• Per-node level of satisfaction, S:
  • S = 0: no neighbors; S = 1: enough neighbors
  • Function of:
    • Node’s capacity
    • Neighbors’ capacities
    • Neighbors’ degrees
    • Their age
• When S << 1, look for neighbors aggressively
Dynamic Topology Adaptation
• The goal of the topology adaptation algorithm is to ensure that high-capacity nodes are indeed the ones with high degree and that low-capacity nodes are within short reach of higher-capacity ones.
• We do not drop already poorly-connected neighbors (which could get disconnected).
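
The slides do not give the formula for S, so the following is only one plausible form, stated as an assumption: each neighbor contributes its capacity divided by its degree, the sum is normalized by the node's own capacity, and the result is capped at 1. Neighbor age is ignored here. The back-off rule in `adaptation_interval` and its parameters (`max_interval`, `aggressiveness`) are likewise hypothetical, chosen only to show how a low S can translate into more frequent neighbor searches.

```python
def satisfaction(own_capacity, neighbors):
    """Assumed form of S. `neighbors` is a list of (capacity, degree) pairs."""
    # A neighbor's capacity is shared among all of its own neighbors.
    total = sum(capacity / degree for capacity, degree in neighbors)
    return min(1.0, total / own_capacity)

def adaptation_interval(s, max_interval=10.0, aggressiveness=256):
    """Seconds to wait before the next search for neighbors (hypothetical rule)."""
    # S = 1 waits the full interval; S near 0 retries almost immediately.
    return max_interval * aggressiveness ** -(1 - s)

isolated = satisfaction(own_capacity=10, neighbors=[])
under_served = satisfaction(own_capacity=100, neighbors=[(10, 2), (20, 4)])
well_served = satisfaction(own_capacity=10, neighbors=[(100, 2)])
```

Under this form an isolated node gets S = 0, and a high-capacity node attached only to weak or heavily shared neighbors stays well below 1, so it keeps looking for better ones.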
Active Flow Control
• Actively allocate “tokens” to neighbors.
• Send a query to a neighbor only if we have received a token from it.
• High-capacity neighbors get more tokens.
• We use a token assignment algorithm based on Start-time Fair Queuing (SFQ).
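
The token rule can be illustrated with a simplified sketch. Note that this is not Start-time Fair Queuing: it substitutes a plain proportional split of a fixed token budget, which is enough to show the two behaviors on the slide (no token, no query; more capacity, more tokens). The names `allocate_tokens` and `TokenLedger` are hypothetical.

```python
def allocate_tokens(budget, neighbor_capacity):
    """Split `budget` tokens among neighbors in proportion to their capacity."""
    total = sum(neighbor_capacity.values())
    # Integer division: any remainder simply stays unallocated in this sketch.
    return {n: budget * c // total for n, c in neighbor_capacity.items()}

class TokenLedger:
    """Tokens a node has received from each neighbor it may send queries to."""

    def __init__(self):
        self.tokens = {}

    def receive_tokens(self, neighbor, count):
        self.tokens[neighbor] = self.tokens.get(neighbor, 0) + count

    def try_send(self, neighbor):
        """Spend one token to send a query; refuse if none are left."""
        if self.tokens.get(neighbor, 0) == 0:
            return False
        self.tokens[neighbor] -= 1
        return True

# Node X can accept 10 queries this round and has two neighbors.
grants = allocate_tokens(10, {"fast": 30, "slow": 10})

# The "slow" neighbor's view: it may query X only while it holds X's tokens.
ledger = TokenLedger()
ledger.receive_tokens("X", grants["slow"])
attempts = [ledger.try_send("X") for _ in range(3)]
```

Here the high-capacity neighbor is granted 7 tokens and the low-capacity one 2, so the third query from the slow neighbor is held back until X issues more tokens.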
Summary
• GIA: scalable Gnutella
  • 3–5 orders of magnitude improvement in system capacity
• Unstructured approach is good enough!