This paper explores the correlation between the topology and path characteristics of overlay networks and the Internet, with a focus on the BitTorrent file-sharing network. The study aims to understand the behavior and performance of P2P file-sharing networks through measurements and analysis.
Correlating Topology and Path Characteristics of Overlay Networks and the Internet A. Iosup, P. Garbacki, J. Pouwelse, D.H.J. Epema PDS Group, ST/EWI, TU Delft GP2PC’06, in conjunction with IEEE CCGrid2006
Outline • Motivation, Goals, and Statistics • Background: The BitTorrent File-Sharing Network • The MultiProbe Framework • The Measurements Setup • The Results • Using Our Results • Conclusion
P2P File-Sharing is Growing at a Fast Pace… • P2P file-sharing • 85M users daily [Pouwelse, ICT Kenniscongres’06] • From 10% to ~70% of Internet traffic in 5 years: P2P file-sharing is the largest Internet application today [Parker, CacheLogic’04 & ’06]
… we Need to Understand Behavior and Performance … • Measuring Underlay/Overlay Networks • How to build a large-scale infrastructure for measuring P2P and Internet characteristics at the same time? • How to measure a representative part of a P2P network? • Characterizing Overlay Networks and Their Users • Where are the overlay network users located? • What is the geographical distribution of traffic? • What is the connectivity amongst users? • What is the application throughput? • Correlating Underlay/Overlay Measurements • How do P2P file-sharing networks map to their Internet underlay?
… through Measurements of the Largest P2P File-Sharing Network: BitTorrent (arguably) • “BitTorrent traffic amounts to 20% of Tier 1 and 2 ISPs traffic!” • “BitTorrent traffic amounts to 50%+ P2P File-Sharing traffic” [Parker, IEEE WCW2005]
The BitTorrent P2P File-Sharing Network • Data as torrents (file, chunks, .torrent) • Peer, Tracker, and Web-site levels • Tit-for-tat: use all available bandwidth • Mostly fresh files (excellent support for spikes in interest – flashcrowds)
The MultiProbe Framework • 1. SiteStats • Select the largest BitTorrent web site (sort by no. of torrents/users) • Select the largest torrents (sort by no. of users) • Active-start measurements: 2. GetPeers, 3. PeerPing • Probes initiate contact with other peers • Get bandwidth information • Passive-start measurements: 4. ListenPeers, 5. TrackPeers • Probes wait to be contacted by other peers • Multi-source traceroute • 6. Post-processing • Automated tools to process 10s of GB of data
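The SiteStats selection step (sort torrents by number of users, keep the largest) can be sketched in a few lines of Python. This is only an illustration, not the MultiProbe implementation: the `Torrent` record, its fields, the `min_users` threshold, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Torrent:
    # Hypothetical record; MultiProbe's real data model is not shown in the slides.
    name: str
    users: int  # number of peers reported for this torrent


def select_largest_torrents(torrents, top_n, min_users=0):
    """Return up to top_n torrents with strictly more than min_users users,
    ordered from largest to smallest swarm."""
    eligible = [t for t in torrents if t.users > min_users]
    return sorted(eligible, key=lambda t: t.users, reverse=True)[:top_n]


# Made-up sample data, for illustration only.
sample = [
    Torrent("a", 120),
    Torrent("b", 950),
    Torrent("c", 40),
    Torrent("d", 300),
]
largest = select_largest_torrents(sample, top_n=2)
```

Sorting by swarm size before truncating keeps the measurement focused on the torrents that carry most of the users, which is the intent stated on the slide.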
The Measurements Setup • Largest Site: The Pirate Bay • Active-start vs. Passive-start Torrents
The Results: Geographical Distribution • BitTorrent is now globally represented, EU dominant
The Results: Application-Level Bandwidth • Average bandwidth is 500 Kbps, double the value observed 2 years earlier [Pouwelse et al., IPTPS’05] • Two groups with similar bandwidth characteristics: • Europe, North America, and Asia • South America, Oceania, and Africa
The Results: Summary • 100 nodes DAS (The Dutch Grid), 50/300 nodes PlanetLab • Shared data (files) traffic 50 GB/day • 450,000 unique peers, 20M IP addresses • 2000/2000 torrents active-start • 695/750 torrents passive-start (Top150, +95% Top700) • ~40 M recorded events, ~10GB uncompressed data/day • Correlated Internet and overlay network characteristics • Geographical distribution of BitTorrent users • Average bandwidth 500Kbps (doubled in 2 years) • Over 75% of BitTorrent traffic is hidden (full range of TCP ports), while 50% users/25% traffic still on standard ports • Distribution of inter-peer IP path hop count, AS traversals, intra-AS hop count, latency, …
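The split between traffic on standard ports and traffic spread over the full TCP port range can be computed from per-connection records roughly as follows. This is a sketch under stated assumptions: the slides do not say which ports count as standard, so the conventional BitTorrent default range 6881-6889 is assumed here, and the `(port, bytes)` record layout and sample values are made up for illustration.

```python
# Assumption: "standard ports" means the conventional BitTorrent default
# range 6881-6889; the slides do not list the ports used in the study.
STANDARD_PORTS = range(6881, 6890)


def port_shares(flows):
    """flows: iterable of (tcp_port, bytes_transferred) pairs.

    Returns (share of flows on standard ports, share of bytes on standard ports).
    """
    flows = list(flows)
    if not flows:
        return 0.0, 0.0
    total_bytes = sum(nbytes for _, nbytes in flows)
    standard = [(port, nbytes) for port, nbytes in flows if port in STANDARD_PORTS]
    flow_share = len(standard) / len(flows)
    byte_share = sum(nbytes for _, nbytes in standard) / total_bytes if total_bytes else 0.0
    return flow_share, byte_share


# Made-up records, chosen only to show that the two shares can differ.
example_flows = [(6881, 100), (6882, 150), (51413, 400), (40000, 350)]
flow_share, byte_share = port_shares(example_flows)
```

Reporting both shares separately matters because, as the slide notes, the fraction of users on standard ports and the fraction of traffic on standard ports are not the same number.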
Using these results (1): Compare Previous (IPTPS’05) and Current Measurements • Previously: cover most recent 100 files • Problem: Potentially biased geographical distribution of peers • Conclusion: Internet Provider caching has potential • Solution: cover ALL files with more than 100 users • Confirmed continental distribution (Europe dominant) • Strong location bias in previous measurements (Germany is not dominant) • For some countries, ISP traversals = 0, so local caching can and should be used to improve user experience • New insights regarding average bandwidth, TCP port mapping, AS/ISP coverage
Using these results (2): From BW to Collaborative Downloads • Majority of users connected through asymmetric links • Asymmetric links already at a disadvantage in BitTorrent (upload bandwidth limit, because of tit-for-tat) • Avg. bandwidth of 500 Kbps > upload capacity of asymmetric links (128-256 Kbps), so peers with asymmetric links are even more at a disadvantage (BitTorrent tit-for-tat favors better connections) • Exploit: Collaborative Downloading • With download helpers, download speed-up of up to 6x • P. Garbacki, A. Iosup, D.H.J. Epema, M. van Steen, 2Fast: Collaborative Downloads in Peer-To-Peer Networks, (submitted).
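The size of the gap between asymmetric upload capacity and the observed average can be made explicit with a short calculation. The numbers are the ones quoted on the slide (500 Kbps average, 128-256 Kbps upload); the function and variable names are illustrative only.

```python
AVG_BANDWIDTH_KBPS = 500        # average application-level bandwidth from the slides
UPLOAD_CAPS_KBPS = (128, 256)   # typical asymmetric-link upload capacities from the slides


def upload_to_average_ratio(upload_kbps, average_kbps=AVG_BANDWIDTH_KBPS):
    """Fraction of the network-wide average that a peer's upload link can sustain."""
    return upload_kbps / average_kbps


# Ratio for each quoted upload capacity.
ratios = {cap: upload_to_average_ratio(cap) for cap in UPLOAD_CAPS_KBPS}
```

Under these figures, a peer on an asymmetric link can upload at roughly a quarter to a half of the average rate, which is why tit-for-tat puts such peers at a disadvantage.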
Using these results (3): BitTorrent Now Works at Global Scale • Observed global distribution of users • World-wide geographical location • Greatly increased community size, favoring social interaction • Increased technical knowledge for heavy users (port numbers) • Exploit: Tribler, a new social paradigm in P2P • http://www.tribler.org • 4 months, 4500 users, 60000 downloads • J. Pouwelse, P. Garbacki, J. Wang, A. Bakker, J. Yang, A. Iosup, D.H.J. Epema, M. Reinders, M. van Steen, H. Sips, Tribler: A Social-Based Peer-to-Peer System, In IPTPS'06, 27-28 February, 2006, Santa Barbara, CA, USA
Using these results (4): How can you leverage these results? • (Anonymized) Data is available for download • 50 files, 40GB uncompressed • http://multiprobe.ewi.tudelft.nl/ • Test/improve your P2P algorithms with realistic workloads • Realistic file sizes • Realistic community sizes • Realistic user geographical location • Realistic user Internet location (hops, latency, bandwidth)
Conclusions and ongoing work • MultiProbe framework for large-scale P2P file sharing measurements • Experience: BitTorrent, 450,000+ unique peers, observed 50+ TB/day • Correlated Internet and overlay network characteristics • Currently building a P2P Traces Archive for the benefit of the whole community!
Thank you! Questions? Remarks? Observations? • Help build our community’s P2P Traces Archive! • MultiProbe: http://multiprobe.ewi.tudelft.nl/ • p2p@ewi.tudelft.nl • http://www.pds.ewi.tudelft.nl/~iosup/index.html [google: “iosup”] • Many thanks to Neil Spring (Scriptroute), Paulo Anita (website).