Characteristics of Current P2P File-Sharing Systems (with a brief excursion into network measurement tools)
Stefan Saroiu, P. Krishna Gummadi, Steven Gribble
University of Washington
Peer-to-Peer Frenzy
• Both research and industrial excitement
  • CAN, Chord, Past, Tapestry, JXTA, Farsite, Publius, Morpheus, AudioGalaxy
• Basic premise
  • wide-area, distributed system
  • voluntary, ad-hoc, dynamic home-user peers exchange information (mostly large files)
• Many proposals, yet nobody knows the participating peers’ characteristics and behavior
Napster & Gnutella
[Figure: two topologies side by side. Napster: peers around central servers at napster.com. Gnutella: peers connected only to other peers. Legend: P = peer, S = server, Q = query, R = response, D = file download]
Methodology
2 stages:
• periodically crawl Gnutella/Napster
  • discover peers and their metadata
• feed output from crawl into measurement tools:
  • bottleneck bandwidth – SProbe
  • latency – SProbe
  • peer availability – LF
  • degree of content sharing – Napster crawler
Network Bandwidth Scenarios
• Network measurements
• Dynamic server/peer selection
• P2P overlay formation
  • or application-level multicast
• Placement of content replicas
Network Bandwidth
• Throughput:
  • number of bytes transferred during a fixed interval of time
• Available bandwidth:
  • the maximum attainable throughput of a newly started flow
• Bottleneck bandwidth:
  • maximum throughput ideally obtained across the slowest link
• Hard to measure:
  • throughput, available bandwidth
• Easier to measure:
  • bottleneck bandwidth
One-Packet Model
[Figure: traversal time of a probing packet plotted against packet size; slope = 1 / bottleneck bandwidth]
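The one-packet relationship can be illustrated with a small sketch: if traversal time grows linearly with packet size, a least-squares line through (size, time) samples has a slope whose reciprocal estimates the bottleneck bandwidth. The function name `fit_one_packet` and the synthetic samples are assumptions made for illustration; they are not taken from the talk or from any measurement tool.

```python
def fit_one_packet(sizes, times):
    """Least-squares fit of time = latency + size / bandwidth.

    sizes are in bytes, times in seconds.
    Returns (bandwidth in bytes per second, fixed latency in seconds).
    """
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, times))
    slope = sxy / sxx                     # seconds per byte
    intercept = mean_y - slope * mean_x   # size-independent delay
    return 1.0 / slope, intercept


# Synthetic, noise-free samples (assumed values): 10 ms fixed delay
# and a 1,000,000 bytes/s bottleneck.
sizes = [200, 400, 600, 800, 1000, 1200, 1400]
times = [0.010 + s / 1_000_000 for s in sizes]

bandwidth, latency = fit_one_packet(sizes, times)
print(f"bandwidth ~ {bandwidth:.0f} B/s, latency ~ {latency * 1000:.1f} ms")
```

Real measurements are noisy, so a practical tool would need many probes per size and some filtering before fitting.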
Packet-Pair Model
bottleneck bandwidth = packet size / Δt
[Figure: two back-to-back packets are spread apart by the bottleneck link; their time dispersion Δt is inversely proportional to the bottleneck bandwidth]
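As a worked example of the packet-pair formula, the sketch below divides the packet size by the arrival-time dispersion of each pair. Taking the median over several pairs is an assumption added here to damp pairs disturbed by cross traffic; it is not described on the slide. All names and sample values are hypothetical.

```python
from statistics import median


def packet_pair_bandwidth(packet_size_bytes, arrival_first, arrival_second):
    """Bottleneck bandwidth in bytes per second from one packet pair."""
    dispersion = arrival_second - arrival_first
    if dispersion <= 0:
        raise ValueError("second packet must arrive after the first")
    return packet_size_bytes / dispersion


def median_estimate(packet_size_bytes, arrival_pairs):
    """Median of the per-pair estimates (an assumed smoothing choice)."""
    return median(
        packet_pair_bandwidth(packet_size_bytes, first, second)
        for first, second in arrival_pairs
    )


# Assumed arrival times in seconds for three pairs of 1500-byte packets.
# The third pair is stretched, as if cross traffic got between the packets.
pairs = [(0.000, 0.012), (1.000, 1.012), (2.000, 2.030)]
estimate = median_estimate(1500, pairs)
print(f"bottleneck ~ {estimate * 8 / 1e6:.2f} Mbit/s")
```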
Vital Properties of an Ideal Tool
• Accurate
• Fast:
  • 1 min per measurement is too slow
• Scalable:
  • flooding the network will not work
• Works in uncooperative environments
  • can’t deploy software at both endpoints
Properties of an Ideal Tool
• Active:
  • existing traffic might not be suitable
• TCP/UDP based:
  • ICMP heavily filtered
• Cross-traffic resilient:
  • should detect and give up in the face of cross traffic
• Works on asymmetric paths
• Flexible to bandwidth changes
• Controlled evaluations
SProbe Uses TCP Tricks
• From local host to remote host
• No cooperation needed
[Figure: Local and Remote hosts; SYN packets go from Local to Remote and RST packets come back]
SProbe Uses TCP Tricks
• From remote host to local host
• Involuntary cooperation of application layer
[Figure: Local and Remote (Web) hosts; labels: HTTP GET request, data packet, ACK (last data packet)]
More SProbe
• Bottleneck bandwidth
• Latency
• Availability (LF):
  • send a SYN packet
  • receive:
    • SYN/ACK – host active
    • RST – host inactive, but online
    • nothing – host offline
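The availability rule above maps each possible reply to a host state. The sketch below encodes that mapping and, as an approximation, drives it with an ordinary TCP connect rather than a hand-built SYN packet: a completed connect implies a SYN/ACK, a refused connect implies an RST, and a timeout or other error is treated as no reply. The function names are hypothetical and this is not the LF tool itself.

```python
import socket


def classify_reply(reply):
    """Map the reply to a SYN probe onto the three host states."""
    if reply == "SYN/ACK":
        return "active"
    if reply == "RST":
        return "inactive but online"
    if reply is None:
        return "offline"
    raise ValueError(f"unexpected reply: {reply!r}")


def probe_host(host, port, timeout=3.0):
    """Approximate a SYN probe with a full TCP connect (an assumption)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        reply = "SYN/ACK"        # handshake completed
    except ConnectionRefusedError:
        reply = "RST"            # host answered, nothing listening
    except OSError:
        reply = None             # timeout, unreachable, lookup failure
    finally:
        sock.close()
    return classify_reply(reply)


for reply in ("SYN/ACK", "RST", None):
    print(reply, "->", classify_reply(reply))
```

A call such as `probe_host("192.0.2.10", 6346)` (a documentation-range address, used here only as a placeholder) would return one of the three states. Firewalls that silently drop packets make an online host look offline, so results from such a probe are a lower bound on availability.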
P2P Characteristics
• How many peers are “server-like”?
• Who are the free-riders?
• Do peers tend to lie?
• How robust is the Gnutella overlay?
Availability
• Periodic probes yield data like:
  [Figure: timeline of per-host up-time segments, each with a start and an end; a 12-hour span is marked]
• Divide the trace into two periods
• Keep segments that:
  • start in the 1st period
  • end in the 1st or 2nd period
• Draw conclusions only on segments no longer than the 2nd period
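The segment-selection rule on this slide can be sketched as a filter over observed up-time segments. A segment that starts in the first period and is no longer than the second period is guaranteed to end before the trace does, which is presumably why the rule avoids a bias against long sessions. The function name, the use of `None` for an unobserved start or end, and the sample values (in hours) are assumptions for illustration.

```python
def usable_segments(segments, trace_start, split, trace_end):
    """Keep segments that start in the 1st period, end within the trace,
    and are no longer than the 2nd period.

    segments is a list of (start, end) pairs; None marks an endpoint
    that was not observed.
    """
    max_length = trace_end - split   # length of the 2nd period
    kept = []
    for start, end in segments:
        if start is None or end is None:
            continue                 # true length is unknown
        if not (trace_start <= start < split):
            continue                 # must start in the 1st period
        if end > trace_end:
            continue                 # must end in the 1st or 2nd period
        if end - start > max_length:
            continue                 # too long to draw conclusions on
        kept.append((start, end))
    return kept


# Assumed 24-hour trace split into two 12-hour periods.
segments = [(1, 5), (2, 20), (13, 15), (10, 21), (3, None), (None, 4)]
print(usable_segments(segments, trace_start=0, split=12, trace_end=24))
```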
Power-Law Networks Are Here to Stay
• Barabasi and Albert showed that networks which…
  • grow by continuous addition of new nodes
  • exhibit preferential attachment (likelihood of connecting to a node depends on the node’s degree)
• …have a power-law distribution of vertex degree
• Examples: Internet, WWW, Gnutella
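The two growth conditions above can be simulated in a few lines. In this minimal sketch each new node attaches `m` edges to existing nodes chosen with probability proportional to their current degree, implemented by sampling from a list that holds every node once per incident edge. The parameter values, the seed, and the function name are assumptions; the sketch illustrates the mechanism and is not the analysis used in the talk.

```python
import random
from collections import Counter


def preferential_attachment(n, m, seed=0):
    """Grow an n-node graph; each new node links to m existing nodes
    picked in proportion to their degree. Returns the edge list."""
    rng = random.Random(seed)
    # Seed graph: a small clique on m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Each node appears here once per incident edge, so a uniform
    # choice from this list is a degree-proportional choice of node.
    endpoints = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for target in sorted(targets):
            edges.append((new, target))
            endpoints.extend((new, target))
    return edges


edges = preferential_attachment(n=1000, m=2)
degree = Counter(v for edge in edges for v in edge)
histogram = Counter(degree.values())
for d in sorted(histogram)[:8]:
    print(f"degree {d}: {histogram[d]} nodes")
print("largest degree:", max(degree.values()))
```

Most nodes end up with a degree close to `m` while a few early nodes collect many links, which is the heavy-tailed shape the slide refers to.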
Resilience to Failures
• Power-law networks (Cohen et al.):
  • very resilient in face of random node failures
    • a giant spanning cluster still exists
  • fairly resilient in face of cascading failures
  • very vulnerable in face of orchestrated attacks (towards high-degree nodes)
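The contrast between random failures and attacks on high-degree nodes can be explored with a small simulation. The sketch below grows a synthetic preferential-attachment graph (not the measured Gnutella overlay), removes 30% of the nodes either at random or in order of decreasing degree, and reports the size of the largest connected component that remains. Graph size, `m`, the seed, and all names are assumptions chosen for illustration.

```python
import random
from collections import defaultdict, deque


def build_graph(n, m, rng):
    """Preferential-attachment graph as an adjacency dict of sets."""
    adj = defaultdict(set)
    endpoints = []  # each node listed once per incident edge

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)
        endpoints.extend((a, b))

    for i in range(m + 1):          # seed clique
        for j in range(i):
            link(i, j)
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for target in sorted(targets):
            link(new, target)
    return adj


def giant_component(adj, removed):
    """Size of the largest connected component among surviving nodes."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            v = queue.popleft()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best


rng = random.Random(1)
adj = build_graph(1000, 2, rng)
nodes = list(adj)
k = int(0.3 * len(nodes))

random_removed = set(rng.sample(nodes, k))
by_degree = sorted(nodes, key=lambda v: len(adj[v]), reverse=True)
targeted_removed = set(by_degree[:k])

intact = giant_component(adj, set())
after_random = giant_component(adj, random_removed)
after_targeted = giant_component(adj, targeted_removed)
print("intact:", intact)
print("after 30% random failures:", after_random)
print("after removing the 30% highest-degree nodes:", after_targeted)
```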
[Figure: Gnutella overlay, 1771 hosts, Fri Feb 16 05:21:52-05:23:22 PST. Popular sites: 212.239.171.174, adams-00-305a.Stanford.EDU, 0.0.0.0]
30% random failures 1771 – 471 – 294 hosts Fri Feb 16 05:21:52-05:23:22 PST