Early Measurements of a Cluster-based Architecture for P2P Systems

Early Measurements of a Cluster-based Architecture for P2P Systems Yinglian Xie Carnegie Mellon University Balachander Krishnamurthy, Jia Wang ATT Labs---Research

Motivation • Peer-to-peer(P2P) applications provide us with a new content service model • End-hosts self organized into an overlay network and share content with each other • For a wide deployment of P2P applications • We need a scalable content location and routing scheme in the application layer • We need to study and understand P2P traffic patterns

Recent Work • Existing approaches for content location • Napster: uses a centralized server • Gnutella: relies on flooding of queries • Recent designs • Distributed indexing schemes based on hash functions • CAN, Chord, Pastry, Tapestry

Our Work • A Cluster-based architecture (CAP) for P2P systems • Example application: distributed search (support keyword searching) • Design: using network-aware clustering • Early measurements of CAP • trace analysis + simulations

CAP System Design • Network-aware clustering • B. Krishnamurthy and J.Wang. On Network-Aware Clustering of Web Clients. In proceedings of ACM Sigcomm, August 2000 • An effective technique to group clients that are topologically close and under common administrative domain • Apply network-aware clustering to P2P applications • An additional level in the hierarchy • Less dynamism • More scalability

Clustering server delegate client CAP Architecture • Three entities • Clustering server • Delegate • Client • Two operations • Node join and node leave • Query lookup

Inter-cluster Routing • Each query has a maximum search depth • Each delegate keeps a neighbor list • Assigned randomly when the delegate joins the network • Updated gradually based on application requirements • Depth-first search among neighbors

CAP Evaluation • Collect Gnutella traces, apply network-aware clustering in trace data analysis • To examine the potential advantage of using network-aware clustering • Trace-driven simulations • Measure CAP system performance based on real deployment (ongoing work)

Collecting Gnutella Trace • A modified open source Gnutella client (gnut) to passively monitor and log all Gnutella messages Table 1 Traces with unlimited connections Table 2 Traces with limited connections

Cluster Distribution • CMU trace • 5/24/2001 – 5/25/2001, 799,386 IP addresses, 45,129 clusters • Clustering helps reduce query latency by caching repeated queries

Client and Cluster Distribution along Time • Network-aware clustering helps reduce dynamism in the P2P network

Simulation • Trace-driven simulation • Use Gnutella trace to generate “join, leave, search” • Assume the query distribution follows the file distribution • Performance metrics • Hit rate • Overhead • Search Latency

Hit Rate • Use CMU trace • 1,000 node stationary network • 311 clusters • 4,615search messages • 3,793 unique files

Overhead and Search Latency • Overhead • Messages per search, forward operations per delegate • In Gnutella, overhead grows exponentially • In CAP, overhead grows linearly • Search Latency • Application level hop length • In CAP, search path length is short

Summary • CAP is promising to increase stability and scalability of distributed applications Ongoing work: We are implementing CAP, deploying it in machines around the world, and measuring the performance

Early Measurements of a Cluster-based Architecture for P2P Systems

Early Measurements of a Cluster-based Architecture for P2P Systems

Presentation Transcript

Architecture-Based Drivers for System-of-Systems and Family-of-Systems Cost Estimating

Early LHC Measurements

Architecture-based Evolution of Software Systems

Early Recovery Cluster

Middleware for P2P architecture

Introduction of P2P systems

P2P Systems for Worm Detection

A NOVEL SOCIAL CLUSTER-BASED P2P FRAMEWORK FOR INTEGRATING VANETS WITH THE INTERNET

P2P Architecture

A Physical Query Algebra for DHT-based P2P Systems

Early LHC Measurements

Comparing P2P Systems

P2P Database Systems

Architecture Cluster

A Plug-In Architecture for Graph Based Collaborative Modeling Systems

Search in P2P architecture

Pure P2P architecture

Early Scientific Measurements

A NOVEL SOCIAL CLUSTER-BASED P2P FRAMEWORK FOR INTEGRATING VANETS WITH THE INTERNET

Standards-Based P2P Communications Systems

A small world architecture for p2p networks