Stefan Saroiu P. Krishna Gummadi Steven Gribble University of Washington

Characteristics of Current P2P File-Sharing Systems (with a brief excursion into network measurement tools)


Presentation Transcript


  1. Characteristics of Current P2P File-Sharing Systems (with a brief excursion into network measurement tools) Stefan Saroiu P. Krishna Gummadi Steven Gribble University of Washington

  2. Peer-to-Peer Frenzy • Both research and industrial excitement • CAN, Chord, Past, Tapestry, JXTA, Farsite, Publius, Morpheus, AudioGalaxy • Basic Premise • wide-area, distributed system • voluntary, ad-hoc, dynamic home-user peers exchange information (mostly large files) • Many proposals, yet nobody knows the participating peers’ characteristics and behavior

  3. Napster & Gnutella • [Figure: Napster (napster.com) and Gnutella topologies. Legend: P = peer, S = server, Q = query, R = response, D = file download]

  4. Methodology 2 stages: • periodically crawl Gnutella/Napster • discover peers and their metadata • feed output from crawl into measurement tools: • bottleneck bandwidth – SProbe • latency – SProbe • peer availability – LF • degree of content sharing – Napster crawler

  5. Network Bandwidth Scenarios • Network measurements • Dynamic server/peer selection • P2P overlay formation • or application-level multicast • Placement of content replicas

  6. Network Bandwidth • Throughput: • number of bytes transferred during a fixed interval of time • Available bandwidth: • the maximum attainable throughput of a newly started flow • Bottleneck bandwidth: • maximum throughput ideally obtained across the slowest link • Hard to measure: • throughput, available bandwidth • Easier to measure: • bottleneck bandwidth

  7. One-Packet Model • [Figure: traversal time of a probing packet versus packet size; slope = 1 / bottleneck bandwidth]
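
The relation on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' tool: it assumes traversal time grows linearly with packet size (time = fixed latency + size / bandwidth) and recovers the bandwidth from the slope of a least-squares line. The function name and the synthetic samples are hypothetical.

```python
def one_packet_bandwidth(samples):
    """Fit time = latency + size / bandwidth by ordinary least squares.

    samples: list of (packet_size_bytes, traversal_time_seconds) pairs.
    Returns (bandwidth_bytes_per_second, latency_seconds).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_size = sum(size for size, _ in samples) / n
    mean_time = sum(t for _, t in samples) / n
    sxx = sum((size - mean_size) ** 2 for size, _ in samples)
    sxy = sum((size - mean_size) * (t - mean_time) for size, t in samples)
    if sxx == 0 or sxy <= 0:
        raise ValueError("samples do not show time growing with packet size")
    slope = sxy / sxx                      # seconds per byte = 1 / bandwidth
    latency = mean_time - slope * mean_size
    return 1.0 / slope, latency


# Synthetic, noise-free samples: 1,000,000 bytes/s link with 20 ms fixed latency.
samples = [(size, 0.020 + size / 1_000_000) for size in (100, 500, 1000, 1500)]
bandwidth, latency = one_packet_bandwidth(samples)
print(f"bandwidth ~ {bandwidth:.0f} bytes/s, latency ~ {latency * 1000:.1f} ms")
```

Real measurements are noisy, so a practical tool would need many probes per packet size and some filtering before the fit.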

  8. Packet-Pair Model • Δt = packet size / bottleneck bandwidth • time dispersion (Δt) is inversely proportional to bottleneck bandwidth
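
As a worked example of the packet-pair relation, the sketch below inverts it to get bandwidth = packet size / Δt. Taking the median over several pairs is an assumption added here to dampen outliers caused by cross traffic; it is not a description of how SProbe filters its samples. Function names are hypothetical.

```python
import statistics


def packet_pair_bandwidth(packet_size_bytes, dispersion_s):
    """Bottleneck bandwidth (bytes/s) implied by one packet pair."""
    if dispersion_s <= 0:
        raise ValueError("dispersion must be positive")
    return packet_size_bytes / dispersion_s


def median_estimate(packet_size_bytes, dispersions_s):
    """Median of the per-pair estimates (an assumed, simple outlier filter)."""
    estimates = [packet_pair_bandwidth(packet_size_bytes, d) for d in dispersions_s]
    return statistics.median(estimates)


# Two clean pairs of 1500-byte packets spaced 1.2 ms apart, plus one pair
# stretched to 3.0 ms by cross traffic queued between the two packets.
estimate = median_estimate(1500, [0.0012, 0.0012, 0.0030])
print(f"{estimate * 8 / 1e6:.1f} Mbit/s")
```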

  9. Vital Properties of an Ideal Tool • Accurate • Fast: • 1 min/measurement too slow • Scalable: • flooding the network will not work • Works in Uncooperative Environments • can’t deploy software at both endpoints

  10. Properties of an Ideal Tool • Active: • existing traffic might not be suitable • TCP/UDP based: • ICMP heavily filtered • Cross-traffic resilient: • should detect and give up in the face of cross traffic • Works on Asymmetric Paths • Flexible to Bandwidth Changes • Controlled Evaluations

  11. Current Tools

  12. SProbe Uses TCP Tricks • From local host to remote host • No cooperation needed • [Figure: exchange between Local and Remote hosts, showing SYN packets and RST packets]

  24. SProbe Uses TCP Tricks • From remote host to local host • Involuntary cooperation of the application layer • [Figure: exchange between Local host and Remote (Web) host, showing an HTTP GET request, data packets, and an ACK (last data packet)]

  25. SProbe’s Accuracy

  26. SProbe’s Accuracy

  27. More SProbe • Bottleneck Bandwidth • Latency • Availability (LF): • send a SYN packet • receive: • SYN/ACK – host active • RST – host inactive, but online • nothing – host offline
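
The three-way availability rule above can be written as a small classifier. This is a sketch only: it takes the TCP flags byte of whatever reply came back to the SYN (or `None` if nothing arrived before a timeout) and does not send any packets itself. The flag bit values are the standard TCP header bits; the function name and the "unknown" fallback are assumptions for illustration.

```python
from typing import Optional

# Standard TCP header flag bits.
TCP_SYN = 0x02
TCP_RST = 0x04
TCP_ACK = 0x10


def classify_probe(reply_flags: Optional[int]) -> str:
    """Map the reply to a SYN probe onto the slide's three host states."""
    if reply_flags is None:
        return "offline"                   # nothing came back
    if reply_flags & TCP_RST:
        return "inactive, but online"      # host is up, port is closed
    if (reply_flags & TCP_SYN) and (reply_flags & TCP_ACK):
        return "active"                    # application is listening
    return "unknown"                       # unexpected reply (assumed fallback)


for flags in (TCP_SYN | TCP_ACK, TCP_RST | TCP_ACK, None):
    print(flags, "->", classify_probe(flags))
```

A lost probe looks the same as an offline host, so a real prober would likely retry before declaring a host offline.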

  28. P2P Characteristics • How many peers are “server-like”? • Who are the free-riders? • Do peers tend to lie? • How robust is the Gnutella overlay?

  30. Higher Downstream Bandwidths

  31. Most Peers have Cable Modem-like Bandwidths

  32. Yes, Lots of Cable Modems

  33. Closest 20% are 4X closer than furthest 20%

  34. Two horizontal bands – East Coast and Transoceanic Links

  35. Availability • Periodic probes yield data like: • [Figure: timeline of session segments, each with a start and an end]

  36. Availability • Periodic probes yield data like: • [Figure: timeline of session segments, each with a start and an end; label: 12 hours] • Divide into two periods • Keep segments that: • start in 1st period • end in 1st or 2nd periods • draw conclusions only on segments no longer than the 2nd period
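
The filtering rule on this slide can be sketched as follows. The code assumes each session is a (start, end) pair in hours, with `end` set to `None` when the peer was still up at the end of the trace, and that the trace is split into two equal halves; these representation details and the function name are assumptions, not taken from the slides.

```python
def usable_session_lengths(sessions, trace_start, trace_end):
    """Return session lengths that can be measured without bias.

    Keeps sessions that start in the first half of the trace and end before
    the trace does, and reports only lengths no longer than one half.
    """
    half = (trace_end - trace_start) / 2
    midpoint = trace_start + half
    lengths = []
    for start, end in sessions:
        if end is None or end > trace_end:
            continue                       # end was never observed
        if not (trace_start <= start < midpoint):
            continue                       # did not start in the 1st period
        length = end - start
        if length <= half:
            lengths.append(length)         # short enough to always be seen whole
    return lengths


# A 24-hour trace split into two 12-hour periods.
sessions = [(1, 2), (3, 20), (13, 14), (10, None), (11, 23)]
print(usable_session_lengths(sessions, 0, 24))
```

The point of the rule is that any session no longer than one period that starts in the first period is guaranteed to end inside the trace, so short and long sessions within that range are sampled equally.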

  37. Median Session is about one hour (same for both systems)

  38. Gnutella/Napster Uptime

  39. P2P Characteristics • How many peers are “server-like”? • Who are the free-riders? • Do peers tend to lie? • How robust is the Gnutella overlay?

  40. Who Has the Files?

  41. Who Has the Files?

  42. Correlation of Free-Riding with B/W

  43. P2P Characteristics • How many peers are “server-like”? • Who are the free-riders? • Do peers tend to lie? • How robust is the Gnutella overlay?

  44. It’s all about incentive!

  45. Lack of Knowledge is Universal

  46. P2P Characteristics • How many peers are “server-like”? • Who are the free-riders? • Do peers tend to lie? • How robust is the Gnutella overlay?

  47. Power-Law Networks are here to Stay • Barabasi and Albert showed that networks which… • grow by continuous addition of new nodes • exhibit preferential attachment (likelihood of connecting to a node depends on the node’s degree) • …power-law distribution of vertex degree • Internet, WWW, Gnutella
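
The two growth rules above can be simulated directly. The sketch below is a minimal preferential-attachment generator in the spirit of the model described on this slide, not a reproduction of any measured Gnutella topology; the seed clique, the parameter names `n` and `m`, and the function name are choices made here for illustration.

```python
import random


def preferential_attachment_graph(n, m, seed=0):
    """Grow an n-node graph; each new node links to m existing nodes
    chosen with probability proportional to their current degree."""
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Seed with a clique of m + 1 nodes so every node starts with degree m.
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            adj[u].add(v)
            adj[v].add(u)
    # Each node appears here once per incident edge, so a uniform pick
    # from this list is a degree-proportional pick of a node.
    endpoints = [v for v in range(m + 1) for _ in range(m)]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints.extend((new, t))
    return adj


graph = preferential_attachment_graph(1000, 2)
degrees = sorted(len(neighbors) for neighbors in graph.values())
print("median degree:", degrees[len(degrees) // 2], "max degree:", degrees[-1])
```

Most nodes end up with a degree close to `m` while a few early nodes become hubs, which is the heavy-tailed degree distribution the slide refers to.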

  48. Resilience to Failures • Power-law networks (Cohen et al.): • very resilient in face of random node failures • a giant spanning cluster still exists • fairly resilient in face of cascading failures • very vulnerable in face of orchestrated attacks (towards high-degree nodes)
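
The contrast between random failures and orchestrated attacks can be illustrated on a toy graph. The sketch below uses a hub-and-spoke graph as a deliberately extreme stand-in for a network with a few high-degree nodes; it is not a power-law graph and not the measured Gnutella overlay, and all function names are hypothetical.

```python
import random
from collections import deque


def largest_component_size(adj, removed=frozenset()):
    """Size of the largest connected component after deleting `removed`."""
    seen = set(removed)
    best = 0
    for source in adj:
        if source in seen:
            continue
        seen.add(source)
        queue = deque([source])
        size = 0
        while queue:                       # breadth-first search
            node = queue.popleft()
            size += 1
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best


def random_failures(adj, fraction, seed=0):
    """Pick a random subset of nodes to fail."""
    rng = random.Random(seed)
    return set(rng.sample(sorted(adj), int(fraction * len(adj))))


def targeted_attack(adj, count):
    """Pick the `count` highest-degree nodes."""
    by_degree = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    return set(by_degree[:count])


# Toy graph: node 0 is a hub connected to 20 leaves.
star = {0: set(range(1, 21)), **{leaf: {0} for leaf in range(1, 21)}}
print("intact:", largest_component_size(star))
print("30% random failures:", largest_component_size(star, random_failures(star, 0.30)))
print("hub attacked:", largest_component_size(star, targeted_attack(star, 1)))
```

Removing the single hub leaves only isolated nodes, whereas a random failure set usually misses the hub and leaves most of the graph connected.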

  49. Popular sites: • 212.239.171.174 • adams-00-305a.Stanford.EDU • 0.0.0.0 • [Figure: Gnutella, 1771 hosts, Fri Feb 16 05:21:52-05:23:22 PST]

  50. 30% random failures • [Figure: 1771 – 471 – 294 hosts, Fri Feb 16 05:21:52-05:23:22 PST]
