
Analyzing Peer-to-Peer Traffic Across Large Networks

Explore methodologies, measurements, and data cleaning strategies in analyzing P2P traffic, with insights on host distribution, connectivity, traffic volume, and more. Understand the dynamics, limitations, and impact of P2P protocols on network traffic.

Presentation Transcript


  1. Analyzing Peer-to-Peer Traffic Across Large Networks. Jia Wang, joint work with Subhabrata Sen. AT&T Labs - Research

  2. P2P applications • Distributed file sharing • Napster, Gnutella, FastTrack, eDonkey, DirectConnect… • Searching vs. data-fetching phases • All communication occurs over default ports • SuperNodes and Hubs • Why is this interesting? • Large and growing traffic volume

  3. Outline • Methodology • Data collection • Characterization metrics • Analysis results • Traffic volume and overlay topology • System dynamics • Traffic characterization • P2P vs Web

  4. Methodology • Challenges • Decentralized system • Transient peer membership • Some popular protocols are closed and proprietary • Large-scale passive measurement • Flow-level data from routers across a large tier-1 ISP backbone • Analyze both signaling and data-fetching traffic • 3 levels of granularity: IP, prefix, AS • P2P protocols (default ports) • FastTrack: 1214 (including Morpheus) • Gnutella: 6346/6347 • DirectConnect: 411/412
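The port-based identification described on this slide can be sketched as below. This is a minimal illustration, not the authors' tooling: the function name `classify_flow` is hypothetical, and matching on either the source or the destination port is an assumption, since the slide only lists the default ports.

```python
from typing import Optional

# Default ports exactly as listed on the slide.
P2P_PORTS = {
    1214: "FastTrack",
    6346: "Gnutella",
    6347: "Gnutella",
    411: "DirectConnect",
    412: "DirectConnect",
}


def classify_flow(src_port: int, dst_port: int) -> Optional[str]:
    """Return the P2P system name if either endpoint uses a known default port.

    Assumption: a flow counts as P2P when the source OR destination port
    matches; the slide does not say which side is checked.
    """
    for port in (src_port, dst_port):
        name = P2P_PORTS.get(port)
        if name is not None:
            return name
    return None
```

A flow record with ports (1214, 40000) would be labeled FastTrack, while (80, 50000) would be left unclassified.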

  5. Methodology Discussion • Advantages • Requires minimal knowledge of P2P protocols: port numbers only • Large-scale, non-intrusive measurement • More complete view of P2P traffic • Allows localized analysis • Limitations • Flow-level data: no application-level details • Incomplete traffic flows • Other issues • DHCP, NAT, proxy • Host ≠ IP • Asymmetric IP routing

  6. Measurements • Characterization • Overlay network topology • Traffic distribution • Dynamic behavior • Metrics • Host distribution • Host connectivity • Traffic volume • Mean bandwidth usage • Traffic pattern over time • Connection duration and on-time

  7. Data cleaning • Invalid IPs • 10.0.0.0-10.255.255.255 • 172.16.0.0-172.31.255.255 • 192.168.0.0-192.168.255.255 • No matched prefixes in routing tables • Invalid AS numbers • > 64512 • Removed 4% of flows
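The cleaning rules on this slide could be expressed roughly as follows. This is a sketch under stated assumptions: the flow record layout (`src_ip`, `dst_ip`, `src_as`, `dst_as`) and the helper names are hypothetical, and an AS value of `None` is used here to stand for "no matched prefix in the routing tables".

```python
import ipaddress

# The three private ranges listed on the slide, in CIDR form.
PRIVATE_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),      # 10.0.0.0-10.255.255.255
    ipaddress.ip_network("172.16.0.0/12"),   # 172.16.0.0-172.31.255.255
    ipaddress.ip_network("192.168.0.0/16"),  # 192.168.0.0-192.168.255.255
]

# Threshold as stated on the slide: AS numbers above this are discarded.
AS_LIMIT = 64512


def is_valid_ip(addr: str) -> bool:
    """True if the address lies outside the private ranges above."""
    ip = ipaddress.ip_address(addr)
    return not any(ip in net for net in PRIVATE_NETS)


def keep_flow(flow: dict) -> bool:
    """Apply the slide's filters to one (hypothetical) flow record."""
    for side in ("src", "dst"):
        if not is_valid_ip(flow[side + "_ip"]):
            return False
        asn = flow[side + "_as"]
        # None is this sketch's marker for "no matched prefix".
        if asn is None or asn > AS_LIMIT:
            return False
    return True
```

Filtering a list of records is then `[f for f in flows if keep_flow(f)]`.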

  8. Overview of P2P traffic • 800 million flow records in total • FastTrack is the most popular system

  9. Host distribution

  10. Host connectivity, FastTrack (9/14/2001) • Connectivity is very small for most hosts and very high for a few hosts • Distribution is less skewed at the prefix and AS levels
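One way to compute a connectivity figure from flow records is sketched below. The slide does not define the metric, so this assumes connectivity means the number of distinct peers a host exchanges flows with; the function name and the `(src, dst)` input shape are hypothetical.

```python
from collections import defaultdict


def host_connectivity(flows):
    """Count distinct peers per host.

    flows: iterable of (src, dst) pairs. The identifiers can be IPs,
    prefixes, or AS numbers, matching the three granularities.
    Assumption: a peer counts regardless of flow direction.
    """
    peers = defaultdict(set)
    for src, dst in flows:
        peers[src].add(dst)
        peers[dst].add(src)
    return {host: len(seen) for host, seen in peers.items()}
```

Mapping each IP to its prefix or AS before calling the function gives the coarser-grained distributions the slide compares.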

  11. Traffic volume distribution, FastTrack (9/14/2001) • Significant skews in traffic volume across granularities • A few entities source most of the traffic • A few entities receive most of the traffic
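The skew described here can be quantified as the share of total volume contributed by the top fraction of entities. The following is a minimal sketch; `top_share` is a hypothetical helper, and rounding the entity count up with a floor of one entity is a choice made for this example.

```python
import math


def top_share(volumes, fraction):
    """Share of total volume contributed by the top `fraction` of entities.

    volumes: per-entity traffic volumes (e.g., bytes per IP, prefix, or AS).
    fraction: e.g., 0.01 for the top 1%.
    """
    if not volumes:
        return 0.0
    ranked = sorted(volumes, reverse=True)
    # Round up so that at least one entity is always included.
    k = max(1, math.ceil(fraction * len(ranked)))
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0
```

For ten entities where one sends 90 units and the rest send 1 unit each, `top_share(volumes, 0.1)` is 90/99, i.e., the single largest entity carries about 91% of the volume.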

  12. Mean bandwidth usage, FastTrack (9/14/2001) • Upstream usage < downstream usage. Possible causes: • Asymmetric available bandwidth, e.g., DSL, cable • Users/ISPs rate-limiting upstream data transfers

  13. Time-of-day effect, FastTrack (9/14/2001 GMT) • Traffic volume exhibits a very strong time-of-day effect • Milder time-of-day variation in the number of hosts in the system
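A time-of-day profile like the one discussed on this slide can be built by binning flows by hour in GMT. The sketch below assumes each flow is a `(unix_timestamp, host, bytes)` tuple, which is a hypothetical layout rather than the format used in the study.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hourly_profile(flows):
    """Return (bytes_per_hour, distinct_hosts_per_hour) as two 24-entry lists.

    flows: iterable of (unix_ts, host, nbytes); hours are taken in UTC (GMT).
    """
    volume = [0] * 24
    hosts = defaultdict(set)
    for ts, host, nbytes in flows:
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).hour
        volume[hour] += nbytes
        hosts[hour].add(host)
    host_counts = [len(hosts[h]) for h in range(24)]
    return volume, host_counts
```

Comparing the spread of the two lists shows whether volume varies more strongly over the day than the host count does.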

  14. Host connection duration & on-time, FastTrack (9/14/2001), thd=30min • Substantial transience: most hosts stay in the system for a short time • Distribution less skewed at the prefix and AS levels • Using per-cluster or per-AS indexing/caching nodes may help
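The slide gives thd=30min without defining it. The sketch below assumes it is an idle-gap threshold: a host is treated as continuously present while consecutive flows are separated by no more than the threshold, and its on-time is the total length of the merged sessions. The function name and the interval format are hypothetical.

```python
def on_time(intervals, threshold=30 * 60):
    """Total on-time in seconds for one host.

    intervals: list of (start, end) flow times in seconds.
    threshold: maximum idle gap (seconds) that still joins two flows
               into one session. Assumed meaning of the slide's "thd".
    """
    if not intervals:
        return 0
    ordered = sorted(intervals)
    total = 0
    cur_start, cur_end = ordered[0]
    for start, end in ordered[1:]:
        if start - cur_end <= threshold:
            # Gap is small enough: extend the current session.
            cur_end = max(cur_end, end)
        else:
            # Gap too large: close the session and start a new one.
            total += cur_end - cur_start
            cur_start, cur_end = start, end
    total += cur_end - cur_start
    return total
```

With flows at (0, 600), (1200, 1800), and (10000, 10600) seconds, the first two merge into one 1800-second session and the third adds 600 seconds, for 2400 seconds in total.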

  15. Traffic characterization • The power law • May not be a suitable model for P2P traffic • Relationship between metrics • Traffic volume • Number of IPs • On-time • Mean bandwidth usage

  16. Traffic volume vs. on-time, FastTrack (9/14/2001): top 1% of hosts (73% of volume) • Volume heavy hitters tend to have long on-times • Hosts with short on-times contribute small traffic volumes

  17. Connectivity vs. on-time, FastTrack (9/14/2001): top 1% of hosts (73% of volume) • Hosts with high connectivity have long on-times • Hosts with short on-times communicate with few other hosts

  18. P2P vs Web • Observations • 97% of prefixes contributing P2P traffic also contribute Web traffic • Heavy-hitter prefixes for P2P traffic tend to be heavy hitters for Web traffic • Prefix stability: the daily traffic volume (in %) from a prefix does not change over days • Experiments: the top 0.01%, 0.1%, 1%, 10% heavy hitters correspond to 10%, 30%, 50%, 90% of the traffic volume

  19. Traffic stability, March 2002 (figure panels: top 0.01% prefixes, top 1% prefixes) • P2P traffic contributed by the top heavy-hitter prefixes is more stable than either Web or total traffic

  20. Summary • Measure and characterize P2P traffic across a large network • Three popular P2P systems • Significant increase in both number of users and traffic volume • Traffic distributions are highly skewed • High level system dynamics • P2P is a significant but stable component of Internet traffic

  21. Acknowledgement • AT&T Labs • Matt Grossglauser, Carsten Lund, Jennifer Rexford, Matt Roughan, Fred True • External • Steve Gribble
