
Early Measurements of a Cluster-based Architecture for P2P Systems

Yinglian Xie, Carnegie Mellon University. Balachander Krishnamurthy and Jia Wang, AT&T Labs-Research.


Presentation Transcript


  1. Early Measurements of a Cluster-based Architecture for P2P Systems • Yinglian Xie, Carnegie Mellon University • Balachander Krishnamurthy and Jia Wang, AT&T Labs-Research

  2. Motivation • Peer-to-peer (P2P) applications provide a new content service model • End hosts self-organize into an overlay network and share content with each other • For wide deployment of P2P applications • We need a scalable content location and routing scheme in the application layer • We need to study and understand P2P traffic patterns

  3. Recent Work • Existing approaches for content location • Napster: uses a centralized server • Gnutella: relies on flooding of queries • Recent designs • Distributed indexing schemes based on hash functions • CAN, Chord, Pastry, Tapestry

  4. Our Work • A cluster-based architecture for P2P systems (CAP) • Example application: distributed search (supports keyword searching) • Design: uses network-aware clustering • Early measurements of CAP • Trace analysis + simulations

  5. CAP System Design • Network-aware clustering • B. Krishnamurthy and J. Wang. On Network-Aware Clustering of Web Clients. In Proceedings of ACM SIGCOMM, August 2000 • An effective technique to group clients that are topologically close and under a common administrative domain • Apply network-aware clustering to P2P applications • An additional level in the hierarchy • Less dynamism • More scalability
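A minimal sketch of the grouping step, assuming (as in the cited SIGCOMM 2000 paper) that a client's cluster is the longest routing-table prefix matching its IP address. The function names and the prefix list below are hypothetical; the prefixes are documentation address ranges, not real BGP data.

```python
import ipaddress

def build_prefix_table(prefixes):
    """Parse CIDR strings and order them so the longest prefixes come first."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return sorted(nets, key=lambda n: n.prefixlen, reverse=True)

def cluster_of(ip, table):
    """Return the longest prefix in the table that contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for net in table:  # table is sorted longest-first, so first hit wins
        if addr in net:
            return net
    return None

def cluster_clients(ips, table):
    """Group client IPs by their matching prefix (None = unclustered)."""
    clusters = {}
    for ip in ips:
        clusters.setdefault(cluster_of(ip, table), []).append(ip)
    return clusters

# Hypothetical prefixes for illustration only.
PREFIXES = ["192.0.2.0/24", "192.0.2.128/25", "198.51.100.0/24"]
TABLE = build_prefix_table(PREFIXES)
```

With this table, `192.0.2.200` falls into the more specific `192.0.2.128/25` cluster rather than the enclosing `/24`. A linear scan is enough for a sketch; a real prefix table would more likely use a trie.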

  6. CAP Architecture • Three entities • Clustering server • Delegate • Client • Two operations • Node join and node leave • Query lookup
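The slide names the entities and operations but not their interfaces, so the following is only one possible reading. It assumes that the clustering server maps a client to its cluster's delegate, and that the delegate indexes the files shared inside that cluster. All class and method names are hypothetical, and `cluster_fn` stands in for the network-aware clustering step.

```python
class Delegate:
    """Keeps an index of the files shared by the clients of one cluster."""

    def __init__(self, cluster_id):
        self.cluster_id = cluster_id
        self.index = {}  # file name -> set of client ids sharing it

    def join(self, client, files):
        """Node join: register the files a client shares."""
        for name in files:
            self.index.setdefault(name, set()).add(client)

    def leave(self, client):
        """Node leave: drop the client from every index entry."""
        for name in list(self.index):
            self.index[name].discard(client)
            if not self.index[name]:
                del self.index[name]

    def lookup(self, name):
        """Query lookup inside the cluster: which clients hold this file?"""
        return set(self.index.get(name, ()))


class ClusteringServer:
    """Maps a client to the delegate of its cluster (assumed role)."""

    def __init__(self, cluster_fn):
        self.cluster_fn = cluster_fn  # hypothetical: client id -> cluster id
        self.delegates = {}

    def delegate_for(self, client):
        cid = self.cluster_fn(client)
        if cid not in self.delegates:
            self.delegates[cid] = Delegate(cid)
        return self.delegates[cid]
```

In this sketch a joining client first asks the clustering server for its delegate and then registers its files there, so churn among clients only touches one delegate's index.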

  7. Inter-cluster Routing • Each query has a maximum search depth • Each delegate keeps a neighbor list • Assigned randomly when the delegate joins the network • Updated gradually based on application requirements • Depth-first search among neighbors
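The routing rule on this slide can be sketched as a depth-limited, depth-first search over the delegates' neighbor lists. The data layout is an assumption for illustration: `neighbors` maps each delegate to its neighbor list and `index` maps each delegate to the set of files known in its cluster.

```python
def dfs_search(start, wanted, neighbors, index, max_depth):
    """Depth-limited DFS over delegates.

    Returns (delegate that knows the file, or None; delegates contacted).
    """
    visited = set()
    contacted = 0

    def visit(node, depth):
        nonlocal contacted
        visited.add(node)
        contacted += 1
        if wanted in index.get(node, ()):
            return node
        if depth == max_depth:  # maximum search depth reached, back up
            return None
        for nxt in neighbors.get(node, ()):
            if nxt not in visited:
                found = visit(nxt, depth + 1)
                if found is not None:
                    return found
        return None

    return visit(start, 0), contacted
```

For example, with `neighbors = {"A": ["B", "C"], "B": ["D"], "C": ["E"]}` and only `E` holding the file, a search from `A` with `max_depth=2` tries `B` and `D` first, backs up, and then reaches `E` through `C`. With `max_depth=1` the same search fails after contacting `A`, `B`, and `C`.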

  8. CAP Evaluation • Collect Gnutella traces, apply network-aware clustering in trace data analysis • To examine the potential advantage of using network-aware clustering • Trace-driven simulations • Measure CAP system performance based on real deployment (ongoing work)

  9. Collecting Gnutella Traces • Used a modified open-source Gnutella client (gnut) to passively monitor and log all Gnutella messages • Table 1: Traces with unlimited connections • Table 2: Traces with limited connections

  10. Cluster Distribution • CMU trace • 5/24/2001 – 5/25/2001, 799,386 IP addresses, 45,129 clusters • Clustering helps reduce query latency by caching repeated queries

  11. Client and Cluster Distribution over Time • Network-aware clustering helps reduce dynamism in the P2P network

  12. Simulation • Trace-driven simulation • Use Gnutella trace to generate “join, leave, search” • Assume the query distribution follows the file distribution • Performance metrics • Hit rate • Overhead • Search Latency
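A sketch of how the three metrics might be computed from per-search simulation records. The record layout is hypothetical. Overhead is taken as messages per search and latency as application-level hop count, following the definitions on the overhead slide; averaging latency over successful searches only is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SearchResult:
    found: bool          # did the search locate the file?
    messages: int        # messages generated by this search
    hops: Optional[int]  # application-level hops to the hit, None on a miss

def summarize(results):
    """Return hit rate, mean messages per search, and mean hops per hit."""
    total = len(results)
    if total == 0:
        return {"hit_rate": 0.0, "msgs_per_search": 0.0, "mean_hops": None}
    hits = [r for r in results if r.found]
    return {
        "hit_rate": len(hits) / total,
        "msgs_per_search": sum(r.messages for r in results) / total,
        "mean_hops": sum(r.hops for r in hits) / len(hits) if hits else None,
    }
```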

  13. Hit Rate • Use CMU trace • 1,000-node stationary network • 311 clusters • 4,615 search messages • 3,793 unique files

  14. Overhead and Search Latency • Overhead • Messages per search, forward operations per delegate • In Gnutella, overhead grows exponentially • In CAP, overhead grows linearly • Search Latency • Application level hop length • In CAP, search path length is short
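To illustrate the exponential versus linear contrast, the sketch below compares two idealized message counts as a function of search depth. Both models are assumptions, not the measured results from the talk. Flooding is modeled on a loop-free overlay where every node has `degree` neighbors and forwards a query to all neighbors except the sender. The sequential search is modeled as contacting one new delegate per step.

```python
def flooding_messages(degree, ttl):
    """Upper bound on messages for one flooded query.

    The origin sends `degree` copies; each receiver forwards to its other
    `degree - 1` neighbors until the TTL runs out.
    """
    return sum(degree * (degree - 1) ** (i - 1) for i in range(1, ttl + 1))

def walk_messages(depth):
    """Messages for a sequential search that contacts one delegate per step."""
    return depth
```

With `degree=4`, flooding costs 4 messages at TTL 1, 16 at TTL 2, and 4,372 at TTL 7, while the sequential model stays at 7 messages for depth 7.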

  15. Summary • CAP shows promise for increasing the stability and scalability of distributed applications • Ongoing work: we are implementing CAP, deploying it on machines around the world, and measuring its performance
