A Measurement Study of Napster and Gnutella. A more complete analysis of the measurements is to appear in Multimedia Computing and Networking (MMCN) 2002. A tech report is available online at: http://www.cs.washington.edu/homes/gummadi/p2ptechreport.pdf.
Krishna P. Gummadi, Stefan Saroiu and Steven D. Gribble, U. of Washington

1. Motivation
• Lots of research and industrial excitement:
  • Chord [MIT], Tapestry, CAN [UCB], Jxta [SUN], Herald, Past [MSR], Publius [AT&T]
• A distributed infrastructure largely composed of peers with voluntary, dynamic, ad-hoc membership.
• Peers have symmetric roles (serving, downloading and routing) throughout the system.
• No knowledge regarding the fundamental characteristics of the peers participating in the network.
• This knowledge can help in evaluating the effectiveness of different schemes.

2. Measurement Methodology
• Our measurements proceeded in three stages:
  • Periodically crawl Napster and Gnutella:
    • discover peers, IPs, overlay topology, and any available metadata about peers
  • Feed output from the crawl into custom measurement tools:
    • measure bottleneck bandwidth to/from peers using SProbe
    • measure IP latency to/from peers
    • track content and degree of sharing, where possible
  • Sub-sample the population to measure lifetime:
    • track availability of peers at application and IP level

3. Results
How many peers have server-like behavior? High upstream bandwidth? Low latency? High availability?
• The majority of peers (>50%) connect through cable or DSL modems.
• On average, peers have low upstream bandwidths compared to downstream bandwidths, a feature more representative of a client than a server.
• Large variation in IP-level latencies. For a fraction of peers (~20%), transmission delay << latency, implying congestion.
• Session durations are strikingly similar in both systems. Median session: ~60 mins. Hence, content on a peer is unlikely to remain available without replication.

What is the extent of free-riding?
• Modem (<64 Kbps) users share fewer files and do more downloads than broadband users.
• Sharing fewer files: the top 7% of nodes share more than the bottom 75% in Gnutella. 40-60% of peers in Napster share only 10-15% of files.

How many peers lie about their bandwidth?
• A large fraction of peers (~25%) choose not to report their bandwidth; they are either unaware of it or have no incentive to report it.
• Peers have an incentive to report lower bandwidths, and a significant fraction do so.
• Lack of knowledge is universal.

4. Conclusions
• Peers’ characteristics are very heterogeneous. A system should delegate responsibilities to its peers based on their characteristics.
• The system should measure the characteristics of a peer rather than rely on self-reports from the peers themselves.

5. Future Work
• Apply the results of these measurements to evaluate several proposed distributed index systems.
• Analyze content lifetime patterns and the geographical distribution of the content and peers.

6. More Information
A more complete analysis of the measurements is to appear in Multimedia Computing and Networking (MMCN) 2002. A tech report is available online at: http://www.cs.washington.edu/homes/gummadi/p2ptechreport.pdf
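
The free-riding comparison in the Results (top 7% of nodes vs. bottom 75%) can be illustrated with a minimal sketch. It assumes that "share" means the number of files a peer offers. The function name `share_of_files` and the per-peer file counts are hypothetical, made up for illustration only. They are not the measured data, and this is not the authors' analysis code.

```python
from typing import Sequence


def share_of_files(file_counts: Sequence[int], fraction: float, top: bool) -> float:
    """Fraction of all shared files held by the top (or bottom) `fraction`
    of peers, with peers ranked by how many files they share."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    total = sum(file_counts)
    if total == 0:
        return 0.0
    # Descending order selects the heaviest sharers first; ascending the lightest.
    ranked = sorted(file_counts, reverse=top)
    k = round(len(ranked) * fraction)  # number of peers in the selected group
    return sum(ranked[:k]) / total


# Hypothetical per-peer file counts for 100 peers (NOT measured data):
# many peers share nothing, a few share a lot.
counts = [0] * 60 + [5] * 15 + [20] * 18 + [500] * 7

top_7 = share_of_files(counts, 0.07, top=True)
bottom_75 = share_of_files(counts, 0.75, top=False)
print(f"top 7% hold {top_7:.1%} of files, bottom 75% hold {bottom_75:.1%}")
```

Run against crawl output (one file count per discovered peer), the same two calls would show whether a small group of peers carries most of the shared content.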