Measurements, Analysis, and Modeling of BitTorrent-like Systems. Lei Guo (1), Songqing Chen (2), Zhen Xiao (3), Enhua Tan (1), Xiaoning Ding (1), and Xiaodong Zhang (1). (1) College of William and Mary, (2) George Mason University, (3) AT&T Labs - Research.
Basic Model of P2P Systems • Peers sharing different files self-organize into a P2P network • Peers exchange the files they desire • Examples: Gnutella, KaZaa, eDonkey/eMule/Overnet • Limitations • Free riding • Large file downloading
BitTorrent: Fast Delivery with Incentive • A large file is divided into chunks • Peers interested in the same file self-organize into a torrent • Peers exchange file chunks with each other • Incentive is established by tit for tat • Very simple and effective; scales fairly well during flash crowds • [Diagram: Torrent of Bits]
BitTorrent Traffic • Online users • 6.8 million in August 2004, 9.6 million in August 2005 (BigChampagne) • Traffic volume • 53% of all P2P traffic on the Internet in June 2004 (CacheLogic) • [Chart: P2P traffic 60-80%, other traffic 20-30%. Source: CacheLogic, 2004]
Limited Understanding of BitTorrent • Existing studies on BitTorrent systems (INFOCOM04, SIGCOMM04) • Unrealistic assumptions in the system model: no evolution considered • Single-torrent based, although more than 85% of BT users join multiple torrents • What remains unclear about BitTorrent systems • Service availability • Service stability • Service fairness • Objectives of this work • Evolution of the single-torrent system, and limitations of BT • Multi-torrent model for inter-torrent relation and collaboration during the entire lifetime
Outline • BitTorrent mechanism and our methodology • Modeling and characterization of single-torrent system • Modeling and characterization of multi-torrent system • Inter-torrent collaboration • Conclusion
How BitTorrent Works: Publishing • The publisher • Create a meta file • Publish on a Web site • Start the tracker site • Start a BT client as the initial seed • Meta file fields (foo.torrent) • announce: tracker URL for bootstrap • creation date: epoch time of file creation • length: file size • name: file name • piece length: chunk size • pieces: SHA1 hash key of each chunk • [Diagram: the seed announces "I am here!" to the tracker site, which keeps the peer list; foo.torrent sits on the Web site]
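The meta file fields listed on this slide can be illustrated with a small sketch. It splits file content into fixed-size chunks and computes one SHA1 digest per chunk. The function names, the flat dictionary layout, and the example tracker URL are assumptions for illustration; a real .torrent file is bencoded and structured differently from this flat dictionary.

```python
import hashlib


def piece_hashes(data, piece_length):
    """Return one 20-byte SHA1 digest per chunk of `data`."""
    if piece_length <= 0:
        raise ValueError("piece_length must be positive")
    return [
        hashlib.sha1(data[offset:offset + piece_length]).digest()
        for offset in range(0, len(data), piece_length)
    ]


def build_meta(name, data, piece_length, announce, creation_date):
    """Assemble the fields named on the slide into a plain dict.

    Hypothetical helper: the flat layout mirrors the slide, not the
    on-disk format of a real meta file.
    """
    return {
        "announce": announce,            # tracker URL for bootstrap
        "creation date": creation_date,  # epoch time of file creation
        "length": len(data),             # file size in bytes
        "name": name,                    # file name
        "piece length": piece_length,    # chunk size in bytes
        "pieces": piece_hashes(data, piece_length),
    }


if __name__ == "__main__":
    meta = build_meta("foo", b"a" * 10, 4, "http://tracker.example/announce", 0)
    print(len(meta["pieces"]), "pieces")
```

A downloader can verify each received chunk by recomputing its SHA1 digest and comparing it with the corresponding entry in `pieces`.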
How BitTorrent Works: Downloading • The downloader • Download the meta file • Start a BT client, connect to the tracker site • Get peer list from tracker • Get first chunk from other peers (seeds) • Exchange file chunks with other peers • Download complete: become a new seed • Initial seed leaves • Future performance depends on the arrival and departure of new downloaders and seeds • [Diagram: downloaders fetch foo.torrent from the Web site, announce "I am here!" to the tracker site, and receive the peer list]
Our Methodology of this Study • Measurement • BitTorrent traffic pattern • Meta file downloading and tracker statistics • Analysis • BitTorrent user behavior and performance limitations • Curve fitting, parameter estimation and validation of mathematical models • Modeling • Torrent evolution and inter-torrent relation • Fluid model, probability model, and graph model
Meta File Downloading • The first HTTP packets of .torrent file downloading • Cable network: 3,000+ downloads, 1,000+ torrent meta files • Server farm: 50 tracker sites hosting hundreds of torrents • Gigascope: fast Internet traffic monitoring tool by AT&T • What information does it contain? • Torrent birth time • Peer arrival time to the torrent (packet capture time of downloading) • Duration: about 10 days • Meta file fields (foo.torrent): announce: tracker URL; creation date: epoch time of file creation; length: file size; name: file name; piece length: chunk size; pieces: SHA1 hash key of each chunk
Torrent Statistics on Trackers • Professional/dedicated tracker sites • Each may host thousands of torrents at the same time • http://www.alluvion.org/ and http://www.crapness.com/, collected by University of Massachusetts, Amherst • Example: alluvion hosts 1,500 torrents, of which 550 are fully traced • What information does it contain? • Torrents: torrent birth time, file size, number of peers/seeds • Peers: request time, downloading/uploading bytes, downloading/uploading bandwidth • Sampled every 0.5 hour for 48 days
Outline • BitTorrent mechanism and our methodology • Modeling and characterization of single-torrent system • The evolution of torrent over time • Limitations of current BitTorrent systems • Modeling and characterization of multi-torrent system • Inter-torrent collaboration • Conclusion
Torrent Popularity • Peer arrivals decrease exponentially with time after torrent birth • Peer arrival rate: derivative of the CCDF of peer arrival • [Figure panels: meta file workload, tracker site workload, individual torrents. Plots: CCDF of peer arrival vs. time after torrent birth (day), raw data and linear fit; relative deviation (%) for torrents ranked by population (non-ascending order), 6% on average]
Torrent Death • Quantities defined on the slide [formulas not recoverable]: peer arrival rate and inter-arrival time; seed leaving rate and seed service time; downloading rate and downloading time • Peer n arrives at time t_n, peer n+1 at time t_{n+1} • When t_n [condition not recoverable], what will happen? • Inter-arrival time > seed service time → torrent dead
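The death condition can be made concrete with a short sketch. All symbols here are assumptions, since the slide's formulas did not survive: the arrival rate is taken as lambda0 * exp(-t / tau), following the exponentially decreasing arrivals reported in the measurements, and the mean seed service time is taken as 1 / gamma for a seed leaving rate gamma. The torrent is treated as dead once the mean inter-arrival time 1 / lambda(t) exceeds 1 / gamma.

```python
import math


def torrent_lifespan(lambda0, tau, gamma):
    """Time at which 1/lambda(t) first exceeds 1/gamma.

    Assumes lambda(t) = lambda0 * exp(-t / tau). Solving
    lambda(t) = gamma gives t = tau * ln(lambda0 / gamma).
    """
    if lambda0 <= gamma:
        return 0.0  # arrivals are already too sparse at birth
    return tau * math.log(lambda0 / gamma)


def late_arrival_fraction(lambda0, tau, gamma):
    """Fraction of all peers that arrive after the lifespan ends.

    The integral of lambda(t) from T to infinity, divided by the
    integral from 0 to infinity, equals exp(-T / tau).
    """
    return math.exp(-torrent_lifespan(lambda0, tau, gamma) / tau)


if __name__ == "__main__":
    # Hypothetical parameters: 10 peers/hour at birth, tau = 80 hours,
    # seeds leave at a rate of 1 per hour.
    print(torrent_lifespan(10.0, 80.0, 1.0))
    print(late_arrival_fraction(10.0, 80.0, 1.0))
```

Under these assumptions the late-arrival fraction reduces to gamma / lambda0, which is one way to read a downloading failure ratio: making seeds stay ten times longer lowers it by a factor of ten.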
Torrent Population and Lifespan • Most torrents are small (avg 102) • Most torrents are short-lived (avg 8 days) • [Figures: torrent population vs. rank of torrents; torrent lifespan (hour) vs. torrents; trace and model, log-scale axes]
Downloading Failure Ratio • Define: [formula not recoverable] • Average downloading failure ratio: about 10% • Small population → large R_fail • Reminder: most torrents have a small population! • Different evolution patterns • Altruistic peers make torrents long-lived • [Figure: torrent population and downloading failure ratio, torrents ranked in non-ascending order of downloading failure ratio]
Torrent Evolution: Fluid Model • Existing model (SIGCOMM 04) • Constant arrival rate • Torrent reaches equilibrium • The correct model • Exponentially decreasing arrival rate • Torrent eventually dies • Verified by our measurements • Two completely different pictures
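The contrast between the two arrival assumptions can be sketched numerically. The code below integrates a generic two-population fluid system (downloaders x, seeds y) with forward Euler. The equations, parameter names, and values are assumptions chosen for illustration and are not the equation set from the slides: downloads complete at rate min(c*x, mu*(eta*x + y)) and seeds leave at rate gamma*y.

```python
import math


def simulate(arrival_rate, hours=300.0, dt=0.01,
             c=1.0, mu=0.5, eta=1.0, gamma=0.2):
    """Integrate the assumed fluid model; return a list of (t, x, y)."""
    x, y = 0.0, 1.0  # no downloaders yet, one initial seed
    history = []
    for step in range(round(hours / dt)):
        t = step * dt
        # Completion rate: limited by download or by upload capacity.
        served = min(c * x, mu * (eta * x + y))
        new_x = x + dt * (arrival_rate(t) - served)
        new_y = y + dt * (served - gamma * y)
        x, y = max(new_x, 0.0), max(new_y, 0.0)
        history.append((t, x, y))
    return history


def constant_arrivals(t):
    return 10.0  # hypothetical: 10 peers per hour forever


def decaying_arrivals(t):
    return 10.0 * math.exp(-t / 20.0)  # hypothetical tau = 20 hours


if __name__ == "__main__":
    for label, rate in (("constant", constant_arrivals),
                        ("decaying", decaying_arrivals)):
        run = simulate(rate)
        peak = max(x for _, x, _ in run)
        print(label, "peak downloaders:", peak, "final:", run[-1][1])
```

With constant arrivals the populations settle at a nonzero equilibrium; with decaying arrivals they rise to a peak and then drain toward zero, which is the qualitative difference the slide describes.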
Torrent Evolution: Modeling Results • Flash crowd: # of downloaders and # of seeds increase exponentially • Peak time: a very short duration (constant arrival model: flat peak) • Attenuation, a long tail: # of downloaders and # of seeds decrease exponentially • Torrent death • Constant arrival model is far from reality: no attenuation • [Figures: # of downloaders and # of seeds vs. time (hour); trace, model, and constant arrival model]
Performance Stability • Evolution over time: performance is only stable when the torrent is large, and fluctuates significantly after peak time • Snapshot of torrents at time t: larger torrents have higher and more stable performance • [Figures: # of peers (downloaders, seeds) and avg download speed (byte/sec) vs. time (hour), model and trace; avg download speed (byte/sec) across torrents]
Service Unfairness • Contribution ratio = uploaded bytes / downloaded bytes • Unfairness: download speed, uploading contribution • Seeds serve high speed downloaders first • Peers not willing to serve after downloading • Not due to new file downloading: selfish • [Figures: peer contribution ratio vs. ranked peers, plotted with # of torrents and with download speed (byte/sec)]
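The contribution ratio can be computed per peer and used to rank peers, as in the plots. This is a minimal sketch; the record layout (a peer id with uploaded and downloaded byte counts) and the handling of zero downloads are assumptions, not details from the trace format.

```python
def contribution_ratio(uploaded_bytes, downloaded_bytes):
    """Uploaded bytes divided by downloaded bytes."""
    if downloaded_bytes == 0:
        # Assumption: a peer that only uploads counts as infinitely
        # generous, and a peer with no traffic at all counts as 0.
        return float("inf") if uploaded_bytes > 0 else 0.0
    return uploaded_bytes / downloaded_bytes


def rank_by_contribution(peers):
    """Sort (peer_id, uploaded, downloaded) records, most generous first."""
    return sorted(
        peers,
        key=lambda rec: contribution_ratio(rec[1], rec[2]),
        reverse=True,
    )


if __name__ == "__main__":
    sample = [("a", 50, 100), ("b", 300, 100), ("c", 0, 100)]
    for peer_id, up, down in rank_by_contribution(sample):
        print(peer_id, contribution_ratio(up, down))
```

A ratio below 1 marks a peer that took more than it gave; a heavily skewed ranking is the unfairness the slide points to.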
Single-torrent Model : Summary • Torrent evolution over time • Exponentially decreasing arrival rate • Flash crowd – short peak – long tailed attenuation • BitTorrent Limitations • Content availability: torrent death • Performance stability • Service fairness
Outline • BitTorrent mechanism and our methodology • Modeling and characterization of single-torrent system • Modeling and characterization of multi-torrent system • Traffic pattern and user behavior • Graph based model of inter-torrent relation • Inter-torrent collaboration • Conclusion
Multi-torrent Environment Dynamics • Considering peers and torrents on the Internet as an open system • Torrent birth rate, torrent request rate, and peer birth rate are constant • Implication: avg # of torrents a peer requests = torrent request rate / peer birth rate = constant • The lifecycle of a BT peer: downloading, seeding, sleeping, …, dead • [Figures: CDF of torrents (torrent birth), CDF of requests (request arrival), and CDF of peers (peer birth) vs. time (hour); raw data with linear or asymptotic fits]
Peer Request Pattern: Request Rate • Peer request rate: requests by a peer to different torrents per unit time • Assume: [formula not recoverable] • Peer request process: seems Poisson-like • Request a new torrent with probability p (participation probability) • Dead with probability 1-p • [Figure: r (day) and # of torrents per peer, peers ranked 0 to 4000; annotation: "77 years!"]
Peer Request Pattern: Participation Probability • Probability model: peers request at least m torrents [formula not recoverable] • p = 0.8551 • Another estimation of p [formula not recoverable] • Probability model confirmed • [Figure: number of torrents (m) vs. peer rank (log i); raw data and linear fit]
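One way to estimate p from a linear fit of m against log rank is sketched below. It assumes a geometric model in which the number of peers requesting at least m torrents is N * p^(m-1), so the peer of rank i (ranked by non-ascending torrent count) has m = const + ln(i) / ln(p). The exact formula on the slide was lost, so this model and the function name are assumptions.

```python
import math


def estimate_participation_probability(torrents_per_peer):
    """Least-squares fit of m against ln(rank); returns p = exp(1/slope)."""
    counts = sorted(torrents_per_peer, reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two peers")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, counts))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("counts must decrease with rank")
    # Under the assumed model the slope equals 1 / ln(p).
    return math.exp(1.0 / slope)


if __name__ == "__main__":
    # Idealized synthetic data generated from the assumed model.
    n_peers, true_p = 1000, 0.85
    ideal = [1 + math.log(i / n_peers) / math.log(true_p)
             for i in range(1, n_peers + 1)]
    print(estimate_participation_probability(ideal))
```

On real traces the counts are integers and noisy, so the fit only approximates p; the synthetic data above is exact by construction.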
Inter-torrent Relation Graph: How Can Torrents Help Each Other? • Case 1: some peers in torrent i have downloaded j • Case 2: some peers in torrent j have downloaded i • Edge weight W_{i,j}: number of such peers • [Figures: weighted out-degree and weighted in-degree vs. torrent size (# of online peers), per torrent; trace and model]
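The relation graph can be built directly from per-peer records. In this sketch each peer lists the torrents it is currently in and the torrents it has already downloaded; the edge (i, j) counts the peers in torrent i that have downloaded j. The record layout and field names are assumptions for illustration.

```python
from collections import defaultdict


def build_relation_graph(peers):
    """Return {(i, j): weight} where weight is the number of peers
    currently in torrent i that have already downloaded torrent j."""
    weights = defaultdict(int)
    for peer in peers:
        for i in peer["online_in"]:
            for j in peer["downloaded"]:
                if i != j:
                    weights[(i, j)] += 1
    return dict(weights)


def weighted_out_degree(weights, torrent):
    """Sum of the weights on edges leaving `torrent`."""
    return sum(w for (i, _), w in weights.items() if i == torrent)


if __name__ == "__main__":
    peers = [
        {"online_in": {"i"}, "downloaded": {"j"}},
        {"online_in": {"i"}, "downloaded": {"j", "k"}},
        {"online_in": {"j"}, "downloaded": {"i"}},
    ]
    graph = build_relation_graph(peers)
    print(graph)
    print(weighted_out_degree(graph, "i"))
```

An edge with a large weight marks a torrent whose online peers hold many complete copies of another torrent's file, which is where collaboration has the most to offer.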
Single-torrent vs. Multi-torrent Model • Single-torrent model • Longer seed service time → lower download failure rate • Seed service time is limited, but inter-arrival time grows exponentially • Small improvement • Multi-torrent model • Old peers come back multiple times • Higher peer arrival rate, shorter peer inter-arrival time • Significant improvement
Single-torrent vs. Multi-torrent Model • Downloading failure ratio at torrent death, baseline: 0.1 • Single-torrent model, seeds stay 10 times longer: 0.01 • Multi-torrent model: 1×10^-6 ≈ 0 • Inter-torrent collaboration is much more effective than stimulating seeds to serve longer
Outline • BitTorrent mechanism and our methodology • Modeling and characterization of single-torrent system • Modeling and characterization of multi-torrent system • Inter-torrent collaboration • Tracker site overlay • Instant incentive for collaboration • Conclusion
Tracker Site Overlay • Neighbor-in: torrents that can serve me • Neighbor-out: torrents that I can serve (peer list) • Self-organized P2P network (a logical structure) • An instance of the inter-torrent relation graph • A built-in mechanism for content search, covering 99%+ of torrents • Trackerless BitTorrent: uses DHT to store meta file • [Diagram: nodes A, B, C, D linked by neighbor-in and neighbor-out edges]
Incentive for Inter-Torrent Collaboration • Instant incentive: similar to the “tit-for-tat” principle • Neighboring cycle detection • Neighboring cycle construction • Bandwidth trading: get one chunk, serve multiple peers • [Diagram: Jack and Tom, file A and file D, nodes A, B, C, D; caption "Thanks Jack!"]
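Cycle detection on a "can serve" graph can be sketched with a breadth-first search. The slides do not specify the detection algorithm, so this is an assumed approach: starting from one node, it looks for the shortest chain of "can serve" edges that leads back to the start, so that every participant in the chain both gives and receives. The graph layout and function name are hypothetical.

```python
from collections import deque


def find_neighboring_cycle(can_serve, start):
    """Return the shortest cycle through `start` as a list of nodes
    [start, ..., last], where last can serve start, or None."""
    parents = {}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in can_serve.get(node, ()):
            if nxt == start:
                # Walk the parent links back to the start node.
                path = [node]
                while path[-1] != start:
                    path.append(parents[path[-1]])
                path.reverse()
                return path
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


if __name__ == "__main__":
    # Hypothetical graph: A can serve B, B can serve C, C can serve A.
    graph = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"]}
    print(find_neighboring_cycle(graph, "A"))
    print(find_neighboring_cycle(graph, "D"))
```

Breadth-first search returns a shortest cycle, which keeps the number of parties that must cooperate at once as small as possible.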
Conclusion • Extensive analysis and modeling to study the behaviors of BT-like systems • Tracker trace and .torrent downloading trace • Mathematical model • BitTorrent system has its limitations due to exponentially decreasing peer arrival rate • Service availability, performance stability, and fairness • Graph based multi-torrent model • System design for inter-torrent collaboration
Torrent Lifespan • Extract the two time quantities from the trace [symbols not recoverable] • Get the two model parameters (one carries subscript 0; symbols not recoverable) using linear regression • Lifespan model verified by measurement • [Figure: torrent lifespan (hour) vs. torrents; trace and model]
Torrent Population • Total population: [formula not recoverable] • Model verified by measurement • Observations: • The population of most torrents is small (102 on average) • Downloading failure ratio • Small population → large R_fail • [Figure: torrent population vs. rank of torrents (in non-ascending order of modeling results); trace and model]
Torrent Evolution: Fluid Model • Basic equation set • Parameters • Resolution
Peer Request Pattern: Summary • Multi-torrent environment: an open model • Torrent birth rate: 0.9454 per hour (nearly constant) • Peer birth rate: 19.37 per hour (nearly constant) • Torrent request rate (for all peers over all torrents): 133.39 per hour (nearly constant) • Actually increases slowly according to BigChampagne • Peer request pattern • Lifecycle: downloading, seeding, sleeping, …, next request with probability p • Peer participation probability: 0.85 • Request rate (for different torrents by a peer): Poisson-like
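The summary figures can be cross-checked with a little arithmetic. Dividing the torrent request rate by the peer birth rate gives the average number of torrents a peer requests. Separately, if a peer keeps requesting another torrent with probability p after each one, the number of requests is geometrically distributed with mean 1 / (1 - p). Treating the lifecycle as exactly geometric is an assumption made for this check; p = 0.8551 is the fitted value reported earlier in the slides.

```python
def avg_torrents_from_rates(request_rate, peer_birth_rate):
    """Average requests per peer in an open system with constant rates."""
    return request_rate / peer_birth_rate


def avg_torrents_from_probability(p):
    """Mean of a geometric number of requests with continuation
    probability p (assumed model of the peer lifecycle)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return 1.0 / (1.0 - p)


if __name__ == "__main__":
    from_rates = avg_torrents_from_rates(133.39, 19.37)
    from_p = avg_torrents_from_probability(0.8551)
    print(from_rates, from_p)
```

Both routes give a little under 7 torrents per peer, so the rate measurements and the participation probability agree with each other under this reading.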
Tracker Site Overlay • Table size • Node degree distribution • Similar to unstructured P2P networks • Many content search and msg routing algorithms • Flooding • Random walk • … • Trackerless BitTorrent: uses DHT to store meta file
Simulation Experiments • Compared without and with inter-torrent collaboration • Content availability (downloading failure ratio): R_fail → 0 • Performance stability (downloading speed): more stable • Service fairness (contribution ratio): more balanced • Inter-torrent collaboration can improve BitTorrent performance