Measurement, Modeling, and Analysis of a Peer-to-Peer File-Sharing Workload

Measurement, Modeling, and Analysis of a Peer-to-Peer File-Sharing Workload. K. P. Gummadi, R. J. Dunn, et al., ACM SOSP '03. Presented by Min Choi (mchoi@camars.kaist.ac.kr). Outline: trace methodology and analysis, user characteristics, client activities, object dynamics.

Presentation Transcript


  1. Measurement, Modeling, and Analysis of a Peer-to-Peer File-Sharing Workload • K. P. Gummadi, R. J. Dunn, et al., ACM SOSP '03 • Presented by Min Choi (mchoi@camars.kaist.ac.kr)

  2. Outline • Trace methodology and analysis • User characteristics • Client activities • Object dynamics • Why the Kazaa workload is not Zipf • A model of P2P file-sharing workloads • A study of bandwidth-saving techniques • Conclusion

  3. Trace Methodology • Passively collect Kazaa traffic at the border between the campus network and the Internet • Query traffic was not captured because it is encrypted. File transfers are HTTP transfers with Kazaa-specific headers • Summary statistics of the trace:

  4. Kazaa Users Are Patient • Transfer time: the difference between the start time and the end time of a request • Small objects: <10MB (mostly audio files) • Large objects: >100MB (typically video files)

  5. Users Slow Down as They Age • Do people become hungrier for content as they gain experience with Kazaa? • Older clients requested fewer bytes because of: • Attrition: the population declines as clients age • Slowing down: older clients ask for less

  6. Client Activity • It is difficult to quantify the availability of clients in a P2P system • Client activity includes: • Activity fraction: time spent in transfers / duration of the client's lifetime. This is a lower bound on availability • Average session length: the typical duration of a session
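The activity fraction defined on this slide can be sketched in a few lines of Python. This is only an illustration: the function name, the `(start, end)` tuple format, and the decision to merge overlapping transfers so that concurrent downloads are not double-counted are assumptions, not details taken from the slides.

```python
def activity_fraction(transfers, first_seen, last_seen):
    """Fraction of a client's lifetime spent in at least one transfer.

    transfers: list of (start, end) times in seconds (hypothetical format).
    Overlapping transfers are merged so busy time is counted once
    (an assumption of this sketch).
    """
    lifetime = last_seen - first_seen
    if lifetime <= 0:
        return 0.0
    busy = 0.0
    cur_start = cur_end = None
    for start, end in sorted(transfers):
        if cur_end is None or start > cur_end:
            # Close the previous busy interval and open a new one.
            if cur_end is not None:
                busy += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlap: extend the current busy interval.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        busy += cur_end - cur_start
    return busy / lifetime
```

Because a client may be online without transferring anything, the value returned here can only underestimate availability, which is why the slide calls it a lower bound.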

  7. Object Characteristics • Kazaa is not one workload • Kazaa is a blend of workloads with different properties • 3 ranges of objects: small (<10MB), medium (10MB to 100MB), and large (>100MB) • The majority of requests are for small objects • Most bytes transferred are due to large objects
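The contrast between request share and byte share can be illustrated with a small Python sketch. It uses the 10 MB and 100 MB boundaries given on the "Kazaa Users Are Patient" slide; the function names, the decimal definition of a megabyte, and the toy input are assumptions for illustration, not trace data.

```python
MB = 10**6  # assumption: decimal megabytes

def size_class(size_bytes):
    """Bucket an object by size using the small/medium/large boundaries."""
    if size_bytes < 10 * MB:
        return "small"
    if size_bytes <= 100 * MB:
        return "medium"
    return "large"

def shares(sizes):
    """Return (request share, byte share) per size class for a list of sizes."""
    if not sizes:
        raise ValueError("need at least one request")
    req = {"small": 0, "medium": 0, "large": 0}
    byt = dict(req)
    for s in sizes:
        c = size_class(s)
        req[c] += 1
        byt[c] += s
    n, total = len(sizes), sum(sizes)
    return ({c: v / n for c, v in req.items()},
            {c: v / total for c, v in byt.items()})
```

With a made-up mix of eight 4 MB requests and two 700 MB requests, small objects account for 80% of requests while large objects account for nearly all bytes, which is the shape of the effect the slide describes.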

  8. Kazaa Object Dynamics • Multimedia objects are immutable, which shapes object dynamics • Kazaa clients fetch objects at most once • A Kazaa client requests a given object at most once 94% of the time • A Kazaa client requests a given object at most twice 99% of the time • Most requests are for old (repeated) objects • An object is old if at least one month has passed since the first request for the object • 72% of requests for large objects are for old objects • 52% of requests for small objects are for old objects
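The "old object" definition on this slide translates directly into a pass over a request log. The sketch below is a hedged illustration: the `(timestamp, object_id)` record format and the choice of 30 days as "one month" are assumptions.

```python
MONTH = 30 * 24 * 3600  # assumption: one month = 30 days, in seconds

def old_request_fraction(requests):
    """Fraction of requests that target an 'old' object.

    requests: list of (timestamp_seconds, object_id) tuples (hypothetical
    format). A request is old if at least MONTH has passed since the first
    observed request for the same object.
    """
    if not requests:
        return 0.0
    first_seen = {}
    old = 0
    for t, obj in sorted(requests):
        # setdefault records the first request time and returns it afterwards.
        first = first_seen.setdefault(obj, t)
        if t - first >= MONTH:
            old += 1
    return old / len(requests)
```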

  9. Kazaa Object Dynamics • The popularity of Kazaa objects is often short-lived • On the Web, the set of most popular pages remains stable • In Kazaa, popularity is fleeting • Audio files lose popularity faster than popular video files • The most popular Kazaa objects tend to be recently born objects • Newly born objects: objects that did not receive any requests during the first month of the trace

  10. Kazaa Is Not Zipf • Zipf's law: • The popularity of the $i$-th most popular object is proportional to $i^{-\alpha}$, where $\alpha$ is the Zipf coefficient • Kazaa is not Zipf • The most popular objects are less popular than Zipf would predict
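As a quick illustration of Zipf's law as stated on this slide, the following Python sketch normalizes the weights i^(-α) into a probability distribution over a finite catalog of n objects. The function name and the finite-catalog normalization are assumptions made for the example.

```python
def zipf_pmf(n, alpha=1.0):
    """Request probability of each rank 1..n under Zipf with coefficient alpha."""
    weights = [i ** -alpha for i in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]
```

With α = 1, the top-ranked object is requested twice as often as the second and three times as often as the third; plotted on log-log axes, the popularity curve is a straight line with slope -α.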

  11. Why Kazaa Is Not Zipf • Fetch-repeatedly vs. fetch-at-most-once • Simulate the two cases based on the same Zipf distribution • The result of fetch-at-most-once is similar to Kazaa • Non-Zipf workloads are also observed in web proxy caches and VoD servers
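The comparison on this slide can be reproduced in miniature. The sketch below draws requests from the same Zipf(1) distribution under both behaviors; the catalog size, client count, requests per client, and the use of rejection sampling to enforce fetch-at-most-once are all assumptions of this example, not the paper's simulator.

```python
import random
from itertools import accumulate

def simulate(n_objects, n_clients, reqs_per_client, at_most_once, seed=0):
    """Return per-object request counts, indexed by popularity rank - 1."""
    rng = random.Random(seed)
    # Cumulative Zipf(1) weights: rank r has weight 1/r.
    cum = list(accumulate(1.0 / r for r in range(1, n_objects + 1)))
    ids = list(range(n_objects))
    counts = [0] * n_objects
    for _ in range(n_clients):
        fetched = set()
        for _ in range(reqs_per_client):
            obj = rng.choices(ids, cum_weights=cum)[0]
            # Fetch-at-most-once: redraw until the client picks a new object.
            while at_most_once and obj in fetched:
                obj = rng.choices(ids, cum_weights=cum)[0]
            fetched.add(obj)
            counts[obj] += 1
    return counts

repeat = simulate(200, 50, 40, at_most_once=False)
once = simulate(200, 50, 40, at_most_once=True)
```

Under fetch-at-most-once, no object can receive more requests than there are clients, so the head of the popularity curve is flattened relative to the Zipf line, which matches the slide's observation that the most popular Kazaa objects are less popular than Zipf would predict.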

  12. A Model of P2P File-Sharing Workloads • Hypothesis: the underlying popularity of objects in a fetch-at-most-once system is driven by Zipf's law • A client requests 2 objects per day, choosing which object to fetch from Zipf(1) • Objects are born at rate λo; the popularity rank of each new object is selected from Zipf(1) • The total object population cannot be observed from the trace. Use back-inference: given that 18,000 distinct objects are requested in the trace, what is the total number of objects? Answer: 40,000

  13. Model Structure and Notation • Parameter values are chosen to reflect the measured data from the trace

  14. File-Sharing Effectiveness • How should an organization expect bandwidth demand to change over time, given a shared proxy server? • The hit rate of the proxy cache decreases in the fetch-at-most-once case • Fetch-at-most-once clients consume the most popular objects early

  15. New Object Arrivals Improve Hit Rate • Object updates on the Web lower the hit rate • New object arrivals are beneficial in a P2P system • Arrivals of popular objects increase the hit rate • With no arrivals, clients are forced to choose from the remaining unpopular objects

  16. New Clients Cannot Stabilize Performance • The infusion of new clients at a constant rate cannot compensate for the increasing number of old clients • To keep the hit rate constant, the client arrival rate would need to grow exponentially

  17. Model Validation • The underlying Zipf assumption cannot be validated directly • Use the proposed model to replicate the object popularity distribution in the trace • Estimate various parameters • The arrival rate of new objects is chosen to fit the measured data: λo = 5,475 objects per year

  18. Exploring Locality-aware Request Routing • A significant fraction of Internet bandwidth is consumed by Kazaa • How would exploiting locality help to save bandwidth? • Different ways to exploit locality: • A centralized proxy cache placed at the organization border • Request redirection: favor organization-internal peers • Centralized request redirection • Decentralized request redirection

  19. An Ideal Proxy Cache • Assume an ideal proxy: infinite capacity and bandwidth • 86% of external bandwidth would be saved • However, some may not want to store P2P file-sharing content in a proxy server due to legal issues
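The bandwidth saving of an ideal proxy corresponds to its byte hit rate over the trace. A minimal sketch of that calculation follows, assuming whole-object transfers and a hypothetical `(object_id, size_bytes)` record format; the actual trace analysis may handle partial transfers differently.

```python
def ideal_cache_byte_hit_rate(requests):
    """Byte hit rate of a cache with infinite capacity and bandwidth.

    requests: iterable of (object_id, size_bytes) in arrival order
    (hypothetical format). The first request for an object is a miss;
    every later request for it is served from the cache.
    """
    cached = set()
    hit_bytes = total_bytes = 0
    for obj, size in requests:
        total_bytes += size
        if obj in cached:
            hit_bytes += size
        else:
            cached.add(obj)
    return hit_bytes / total_bytes if total_bytes else 0.0
```

Since an infinite cache never evicts, the only misses are first-time fetches, so this figure is an upper bound on what any real proxy could save on the same request stream.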

  20. Benefits of Locality-Awareness • Trace-based simulation • Infinite storage capacity • At most 12 concurrent downloads • Upload bandwidth 500 Kb/s • External bandwidth 100 Kb/s • Clients are available only when they’re transferring (a very conservative assumption) • Cold misses: objects cannot be found in peers • Busy misses: objects found but the peer is unavailable due to concurrent transfers
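The miss categories on this slide suggest a simple per-request classification. The Python sketch below is one possible reading, not the paper's simulator: the data structures (`holders`, `available`, `active_transfers`) are hypothetical, and it lumps peers that are offline together with peers that have reached the concurrency limit under "busy_miss".

```python
MAX_CONCURRENT = 12  # concurrency limit from the slide

def classify_request(obj, holders, available, active_transfers):
    """Classify a request as 'hit', 'cold_miss', or 'busy_miss'.

    holders: dict mapping object id -> set of internal peers storing it.
    available: set of peers currently online.
    active_transfers: dict mapping peer -> number of ongoing transfers.
    All three structures are assumptions of this sketch.
    """
    peers = holders.get(obj, set())
    if not peers:
        return "cold_miss"  # no internal peer has the object
    for peer in peers:
        if peer in available and active_transfers.get(peer, 0) < MAX_CONCURRENT:
            return "hit"
    # The object exists internally, but every holder is offline or saturated.
    return "busy_miss"
```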

  21. Benefits of Locality-Awareness • Locality awareness obtained a 68% byte hit rate for large objects and a 37% byte hit rate for small objects • A substantial share of miss bytes (62% for large objects, 43% for small objects) is due to unavailable clients

  22. Benefits of Increased Availability • Most of the bytes served and consumed come from highly available peers • Adding availability to the most available hosts earns a higher hit rate than adding it to the least available hosts

  23. Conclusion • P2P file-sharing workloads are different from Web workloads • Users are patient • Aged clients demand less • Fetch-at-most-once • The proposed model suggests that client births and object births are the fundamental forces driving P2P workloads • There is significant locality in the Kazaa workload • Locality-aware peers would save 63% of external transfers even under conservative assumptions