
Characterizing Overlay Topologies & Dynamics in Peer-to-Peer Networks

Characterizing Overlay Topologies & Dynamics in Peer-to-Peer Networks. Daniel Stutzbach and Reza Rejaie (University of Oregon), Subhabrata Sen (AT&T Labs). IEEE Computer & Communications Workshop, Huntington Beach, October 25th, 2005.


Presentation Transcript


  1. Characterizing Overlay Topologies & Dynamics in Peer-to-Peer Networks
     Daniel Stutzbach, Reza Rejaie (University of Oregon)
     Subhabrata Sen (AT&T Labs)
     IEEE Computer & Communications Workshop, Huntington Beach, October 25th, 2005

  2. Motivation
     • P2P file-sharing systems are very popular in practice.
       • Several million simultaneous users collectively.
       • 60% of all Internet traffic [CacheLogic Research 2005]
       • Most use an unstructured overlay.
     • Understanding overlay properties & dynamics is important:
       • Understanding how existing P2P systems function
       • Developing and evaluating new systems
     • Unstructured overlays are not well understood.
     • We characterized overlay topology in Gnutella because:
       • Size: one of the largest P2P systems; more than 1 million users
       • Mature: In use for several years; older studies for comparisons
       • Open: No reverse-engineering needed
     http://mirage.cs.uoregon.edu/P2P

  3. Defining the Problem
     • Gnutella uses a two-tier overlay.
       • Improves scalability.
       • Ultrapeers form an unstructured mesh.
       • Leaf peers connect to the ultrapeers.
       • eDonkey, FastTrack are similar.
     • Studying the overlay requires snapshots.
       • Snapshots capture the overlay as a graph.
       • Individual snapshots reveal graph properties.
       • Consecutive snapshots reveal dynamics.
     • However, capturing accurate snapshots is difficult.
     [Figure: two-tier overlay; labels: Ultrapeer, Leaf, Top-level overlay]

  4. Challenges in Capturing Accurate Snapshots
     • Snapshots are captured iteratively by a crawler.
     • An ideal snapshot is instantaneous.
     • But the overlay is large and rapidly changing.
     • Captured snapshots are likely to be distorted.
     • Previous studies captured either
       • Complete snapshots with slow crawler => distorted
       • Partial snapshots => less distorted, but unrepresentative
     • Some types of analysis require the whole graph.
     • Increasing crawler speed reduces distortion in captured snapshots.

  5. Cruiser: a Fast Gnutella Crawler
     • Features:
       • Distributed, highly parallelized implementation
       • Dynamic adaptation to bandwidth & CPU constraints
     • Cruiser is orders of magnitude faster than other P2P crawlers:
       • Captures one million nodes in around 7 minutes
       • 140,000 peers/min, compared to 2,500 peers/min [Saroiu 02]
     • We investigated the effects of speed on distortion.
       • 4% node distortion and 15% edge distortion
     • Daniel Stutzbach and Reza Rejaie, “Capturing Accurate Snapshots of the Gnutella Network”, Global Internet Symposium, March 2005.
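
The crawl rates quoted on this slide can be cross-checked with simple arithmetic. The sketch below assumes an overlay of exactly one million peers and a constant crawl rate; `crawl_minutes` is a hypothetical helper, not part of Cruiser.

```python
def crawl_minutes(num_peers: int, peers_per_minute: float) -> float:
    """Time needed to visit every peer once at a constant crawl rate."""
    return num_peers / peers_per_minute


OVERLAY_SIZE = 1_000_000   # "one million nodes" from the slide
CRUISER_RATE = 140_000     # peers/min reported for Cruiser
EARLIER_RATE = 2_500       # peers/min reported for [Saroiu 02]

# About 7.1 minutes, consistent with "around 7 minutes" on the slide.
cruiser_time = crawl_minutes(OVERLAY_SIZE, CRUISER_RATE)
# 400 minutes (more than 6.5 hours) at the earlier rate.
earlier_time = crawl_minutes(OVERLAY_SIZE, EARLIER_RATE)
# Ratio of the two reported rates.
speedup = CRUISER_RATE / EARLIER_RATE
```

Under these assumptions the slower rate would stretch one full crawl over several hours, during which many peers would join and leave.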

  6. Data Set
     • More than 80,000 snapshots, over the past year.
     • To examine static properties, we focus on four:
     • To examine dynamic properties, we use slices:
       • Each slice is 2 days of ~500 back-to-back snapshots
       • Captured starting 10/14/04, 10/21/04, 11/25/04, 12/21/04, and 12/27/04

  7. Summary of Characterizations
     Graph Properties
     • Implementation heterogeneity
     • Degree Distribution:
       • Top-level degree distribution
       • Ultrapeer-leaf connectivity
       • Degree-distance correlation
     • Reachability:
       • Path lengths
       • Eccentricity
     • Small world properties
     • Resiliency
     Dynamic Properties
     • Existence of stable core:
       • Uptime distribution
       • Biased connectivity
     • Properties of stable core:
       • Largest connected component
       • Path lengths
       • Clustering coefficient

  8. Top-level Degree
     • This is the degree distribution among ultrapeers.
     • There are obvious peaks at 30 and 70 neighbors.
     • A substantial number of ultrapeers have fewer than 30.
     • What happened to the power-law reported by prior studies?
     [Figure: top-level degree distribution; annotations: Max 30 in most clients, Max 75 in some clients, Custom]
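
As an illustration of what "degree distribution among ultrapeers" means, the sketch below counts, for each ultrapeer, only its neighbors that are also ultrapeers. The adjacency-dict representation and the function name are assumptions made for this example, not the authors' tooling.

```python
from collections import Counter


def top_level_degree_distribution(adj, ultrapeers):
    """Map each top-level degree to the number of ultrapeers having it.

    adj: dict mapping every peer to the set of its neighbors.
    ultrapeers: set of peers in the top-level overlay.
    Leaf neighbors are ignored, so only ultrapeer-ultrapeer edges count.
    """
    return Counter(
        sum(1 for nbr in adj[u] if nbr in ultrapeers) for u in ultrapeers
    )


# Toy snapshot: three ultrapeers (u1..u3) and two leaves (l1, l2).
toy_adj = {
    "u1": {"u2", "u3", "l1"},
    "u2": {"u1"},
    "u3": {"u1", "l2"},
    "l1": {"u1"},
    "l2": {"u3"},
}
toy_dist = top_level_degree_distribution(toy_adj, {"u1", "u2", "u3"})
```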

  9. What happened to power-law?
     • When a crawl is slow, many short-lived peers report long-lived peers as neighbors.
     • But those neighbors are not all present at the same time.
     • Degree distribution from a slow crawl resembles prior results. [Ripeanu 02 ICJ]

  10. Shortest-Path Distances
     • Distribution of distances among ultrapeers (left)
       • 70% of distances are exactly 4 hops.
     • Distribution of distances among all peers (right)
       • Most distances are 5 or 6 hops.
       • Shows the effect of the two-tier structure with multiple parents
     • Despite large size, pair-wise distances are short.
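
A distribution of pair-wise hop distances like the one on this slide can be obtained from a snapshot with breadth-first search. This is a minimal sketch assuming an undirected, unweighted graph stored as an adjacency dict; the function names are hypothetical, and an all-pairs BFS on a million-node snapshot would need sampling or a faster implementation.

```python
from collections import Counter, deque


def hop_distances(adj, source):
    """Breadth-first search: hop count from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def distance_distribution(adj):
    """Count unordered reachable pairs of nodes by shortest-path length."""
    counts = Counter()
    for source in adj:
        for d in hop_distances(adj, source).values():
            if d > 0:
                counts[d] += 1
    # Every unordered pair was seen once from each endpoint.
    return {d: c // 2 for d, c in sorted(counts.items())}


# Toy example: a path a - b - c - d.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
path_dist = distance_distribution(path)
```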

  11. Is Gnutella a Small World?
     • Small worlds arise naturally in many places.
       • Movie actors, power grid, co-authors of papers
     • Small world graphs have short distances, but significant clustering, compared to a similar random graph.
     • Gnutella is a small world.
     • Very high clustering adversely affects flooding queries.
     • But Gnutella is not so clustered that performance suffers.
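
The clustering side of the small-world comparison can be sketched as follows. The code assumes the common convention that nodes with fewer than two neighbors contribute a coefficient of 0, and it approximates the "similar random graph" by a uniform random graph with the same node count and mean degree, whose expected clustering is mean degree / (n - 1). Both choices are assumptions for illustration; the slide does not say which definitions were used.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


def random_graph_clustering(adj):
    """Expected clustering of a uniform random graph of equal size and density."""
    n = len(adj)
    mean_degree = sum(len(nbrs) for nbrs in adj.values()) / n
    return mean_degree / (n - 1)


# Toy example: a triangle a-b-c with one pendant node d attached to c.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
observed = average_clustering(toy)
baseline = random_graph_clustering(toy)
```

On a real snapshot, an observed value far above the baseline, together with short path lengths, is what the small-world label refers to.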

  12. Resiliency to Node Failure
     • Ratio of connected peers after node failure.
     • The Gnutella topology is extremely resilient to random node failure.
     • It’s resilient even when the highest-degree nodes are removed.
     • Complex algorithms are not necessary to achieve resiliency.
     [Figure: ratio of connected peers after node removal; curves: Random, Highest degree first]
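
The two removal strategies named in the figure can be compared with a sketch like the one below. It assumes that "ratio of connected peers" means the share of surviving nodes that sit in the largest connected component; that reading, the function names, and the strategy labels are assumptions for illustration.

```python
import random


def largest_component_size(adj, removed):
    """Size of the largest connected component once `removed` nodes are gone."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        best = max(best, size)
    return best


def connected_ratio(adj, fraction, strategy="random", seed=0):
    """Share of surviving nodes in the largest component after removals."""
    nodes = list(adj)
    count = int(len(nodes) * fraction)
    if strategy == "random":
        removed = random.Random(seed).sample(nodes, count)
    else:  # "highest_degree_first"
        removed = sorted(nodes, key=lambda n: len(adj[n]), reverse=True)[:count]
    remaining = len(nodes) - count
    if remaining == 0:
        return 0.0
    return largest_component_size(adj, removed) / remaining


# Toy example: a star, which is fragile when its hub is targeted.
star = {"hub": {"a", "b", "c", "d"},
        "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"}}
```

A star collapses as soon as its hub is removed, which is the opposite of the behavior the slide reports for Gnutella.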

  13. Dynamic Properties
     • How does node churn affect overlay dynamics?
       • Are some “regions” of the overlay more stable?
       • How can we identify such a region?
     • Methodology:
       • Capture a long series of back-to-back snapshots
       • Estimate the uptime of individual peers in the last snapshot
       • Group peers with uptime higher than a threshold
       • Examine biased connectivity within each group
     [Figure: peer presence across snapshots over time; labels: Present for 5 snapshots, Present for 2 snapshots, Departed peer, Newly arrived peer]
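
The first three methodology steps can be sketched as follows. The code assumes that a peer's uptime is estimated as the number of consecutive snapshots, counted backwards from the last one, in which the peer appears, multiplied by the time between snapshots. That estimator and the function names are assumptions for illustration; the slide does not spell out the exact procedure.

```python
def estimate_uptimes(snapshots, interval_minutes):
    """Estimate the uptime (in minutes) of every peer in the last snapshot.

    snapshots: list of sets of peer identifiers, oldest first.
    interval_minutes: assumed spacing between consecutive snapshots.
    """
    if not snapshots:
        return {}
    uptimes = {}
    for peer in snapshots[-1]:
        consecutive = 0
        for snap in reversed(snapshots):
            if peer not in snap:
                break  # the peer was absent here, so its current session began later
            consecutive += 1
        uptimes[peer] = consecutive * interval_minutes
    return uptimes


def stable_core(uptimes, threshold_minutes):
    """Peers whose estimated uptime exceeds the threshold."""
    return {peer for peer, up in uptimes.items() if up > threshold_minutes}


# Toy series of three snapshots taken 7 minutes apart.
series = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}]
toy_uptimes = estimate_uptimes(series, interval_minutes=7)
toy_core = stable_core(toy_uptimes, threshold_minutes=10)
```

With roughly 500 snapshots per 2-day slice, as stated on the Data Set slide, the spacing would be about 5.8 minutes, which also bounds how precisely an uptime can be estimated this way.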

  14. Stable Core
     • Most peers have a short uptime.
     • Other peers have been around for a long time.
     • Stable core: a set of peers with uptime higher than a threshold (T).
     • Higher threshold => more stable group of peers
     [Figure: stable core regions; labels: T > 20 h, T > 10 h]

  15. Biased Connectivity
     • Hypothesis: long-lived nodes tend to be more connected to other long-lived nodes
       • Rationale: Once connected, they stay connected.
       • Long-lived peers have more opportunities to become neighbors.
     • To quantify bias in the connectivity of the stable core:
       • Randomize the edges to create a graph without biased connectivity.
       • Compare the edges in the observed stable core with the randomized graph.
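
One way to build the randomized comparison graph is a degree-preserving shuffle using repeated double-edge swaps, sketched below. The slide only says that edges are randomized; preserving each node's degree, the swap procedure, and the function names are assumptions for illustration. The input must contain at least two edges.

```python
import random


def count_core_edges(edges, core):
    """Number of edges with both endpoints inside the stable core."""
    return sum(1 for u, v in edges if u in core and v in core)


def randomize_edges(edges, swaps, seed=0):
    """Shuffle edges with double-edge swaps, keeping every node's degree."""
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edge_list}
    for _ in range(swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        # Proposed swap: (a, b), (c, d) -> (a, d), (c, b)
        if len({a, b, c, d}) < 4:
            continue  # shared endpoint: could create a self-loop
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue  # would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_list


def core_edge_comparison(edges, core, swaps, seed=0):
    """Return (observed, randomized) counts of edges inside the core."""
    observed = count_core_edges(edges, core)
    shuffled = randomize_edges(edges, swaps, seed)
    return observed, count_core_edges(shuffled, core)


# Toy graph: a 6-cycle with one chord; peers 1, 2, 3 form the "core".
toy_edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1), (1, 4)]
toy_shuffled = randomize_edges(toy_edges, swaps=50, seed=1)
```

An observed count consistently above the randomized count, averaged over many seeds, would indicate connectivity biased toward the core.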

  16. Stable Core Edges
     • 20% to 40% more edges in the stable core compared to random.
     • Connectivity is biased in an onion-like pattern: peers are more likely to connect to other peers with the same or higher uptime.
     • We examined other properties of the stable core.
     • Despite high churn, there is a relatively stable “backbone”.

  17. Summary
     • Characterizations of Gnutella overlay based on recent and accurate snapshots.
     • Graph properties:
       • The degree distribution in Gnutella is not power law.
       • Gnutella exhibits small world characteristics.
       • Gnutella is resilient.
     • Dynamic properties:
       • There is a stable core within the overlay topology.
       • Peer churn causes the stable core to exhibit an onion-like biased connectivity.
       • This effect is likely to occur in other unstructured P2P systems.
     • Daniel Stutzbach, Reza Rejaie, Subhabrata Sen, “Characterizing Unstructured Overlay Topologies in Modern P2P File-Sharing Systems”, Internet Measurement Conference, Berkeley, 2005

  18. Future Work
     • Examining underlying causes of the biased connectivity.
     • Exploring long-term trends in overlay properties.
     • Characterizing churn
     • Characterizing properties of other widely-deployed P2P systems
       • Kad (a DHT with more than 1 million users)
       • BitTorrent
     • Developing sampling techniques for P2P
