School of Computing Science, Simon Fraser University
CMPT 880: Peer-to-Peer Systems
Mohamed Hefeeda
17 January 2005
Announcements • Initial round: you have four slots • Jan 24, Jan 26, Jan 31, and Feb 2 • Ashok will present the paper on Jan 26 (Wednesday) • Need a volunteer to present the Pastry paper next Monday (an easy one) • Did you read the survey paper? Please do.
Last Lectures • P2P is an active research area with many potential applications in industry and academia • In the P2P computing paradigm, peers cooperate to achieve desired functions • Simple model for P2P systems • Peers form an abstract layer called an overlay • The peer software architecture model may have three components
P2P Systems: Simple Model [Figure: system architecture, in which peers form an overlay according to the P2P Substrate, alongside the software architecture model on a peer: P2P Application / Middleware / P2P Substrate / Operating System / Hardware]
Peer Software Architecture Model [Figure: layered stack on a peer: P2P Application / Middleware / P2P Substrate / Operating System / Hardware] • A software client installed on each peer • Three components: • P2P Substrate • Middleware • P2P Application
P2P Substrate • A key component, which • Manages the Overlay • Allocates and discovers objects • P2P Substrates can be • Structured • Unstructured • Based on the flexibility of placing objects at peers
Structured P2P Substrates • Objects are rigidly assigned to peers • Objects and peers have IDs (usually by hashing some attributes) • Objects are assigned to peers based on IDs • Peers in the overlay form a specific geometrical shape, e.g., tree, ring, hypercube, butterfly network • The shape (to some extent) determines • How neighbors are chosen, and • How messages are routed
Structured P2P Substrates (cont’d) • The substrate provides a Distributed Hash Table (DHT)-like interface (see the sketch below) • insertObject(key, value), findObject(key), … • In the literature, many authors refer to structured P2P substrates as DHTs • It also provides peer management (join, leave, fail) operations • Most of these operations are done in O(log n) steps, where n is the number of peers
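To make the interface above concrete, here is a minimal sketch, assuming a toy substrate with a flat list of peers; peer_id, ToyDHT, and the direct owner lookup are illustrative simplifications (a real substrate routes to the owner in O(log n) steps instead of scanning a global peer list).

import hashlib

def peer_id(name, bits=16):
    # Hash a peer name or object key into a shared ID space.
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** bits)

class ToyDHT:
    def __init__(self, peer_names):
        self.peers = {peer_id(p): {} for p in peer_names}

    def _owner(self, key):
        # Owner = peer whose ID is the closest successor of the key's ID
        # (wrapping around), mimicking consistent hashing.
        kid = peer_id(key)
        ids = sorted(self.peers)
        return next((i for i in ids if i >= kid), ids[0])

    def insert_object(self, key, value):
        self.peers[self._owner(key)][key] = value

    def find_object(self, key):
        return self.peers[self._owner(key)].get(key)

dht = ToyDHT(["peerA", "peerB", "peerC"])
dht.insert_object("song.mp3", "http://peerA/song.mp3")
print(dht.find_object("song.mp3"))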
Structured P2P Substrates (cont’d) • DHTs: efficient search and a guarantee of locating any stored object • However, • No support for partial-name and keyword queries • Maintenance overhead; even O(log n) may be too much in very dynamic environments • Ex: Chord, CAN, Pastry, Tapestry, Kademlia (Overnet)
Example: Content Addressable Network (CAN)[Ratnasamy 01] • Nodes form an overlay in d-dimensional space • Node IDs are chosen randomly from the d-space • Object IDs (keys) are chosen from the same d-space • Space is dynamically partitioned into zones • Each node owns a zone • Zones are split and merged as nodes join and leave • Each node stores • The portion of the hash table that belongs to its zone • Information about its immediate neighbors in the d-space
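A minimal sketch of the per-node state just described, assuming d = 2 and rectangular zones; CANNode, key_to_point, and the constants are illustrative names for this course, not identifiers from the CAN paper.

import hashlib

D = 2          # dimensions
SIDE = 8       # coordinate space is [0, SIDE) in each dimension

def key_to_point(key):
    # Hash a key to a point in the d-dimensional space.
    h = hashlib.sha1(key.encode()).hexdigest()
    return tuple(int(h[8 * i:8 * (i + 1)], 16) % SIDE for i in range(D))

class CANNode:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi      # zone corners, e.g. (0, 0) and (4, 8)
        self.store = {}                # portion of the hash table in this zone
        self.neighbors = []            # (node, zone) info of adjacent nodes

    def owns(self, point):
        return all(self.lo[i] <= point[i] < self.hi[i] for i in range(D))

node = CANNode((0, 0), (4, 8))
p = key_to_point("K1")
print(p, node.owns(p))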
2-d CAN: Dynamic Space Division [Figure: an 8×8 coordinate space (0–7 on each axis) partitioned into zones owned by nodes n1–n5]
2-d CAN: Key Assignment [Figure: keys K1–K4 hashed to points in the same space; each key is stored at the node whose zone contains its point]
2-d CAN: Routing (Lookup) [Figure: a lookup for K4 forwarded greedily across neighboring zones until it reaches the node that owns K4]
CAN: Routing • Nodes keep 2d = O(d) state information (neighbor coordinates, IPs) • Constant, does not depend on the number of nodes n • Greedy routing (sketched below) • Route to the neighbor closest to the destination • On average, a lookup takes O(d n^(1/d)) hops, which is O(log n) when d = (log n)/2
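A sketch of the greedy rule, assuming made-up zones and neighbor lists; forwarding toward the neighbor whose zone center is closest to the key's point is an approximation of the paper's rule, which compares zone coordinates directly.

import math

def center(zone):
    lo, hi = zone
    return tuple((lo[i] + hi[i]) / 2 for i in range(len(lo)))

def contains(zone, p):
    lo, hi = zone
    return all(lo[i] <= p[i] < hi[i] for i in range(len(lo)))

def greedy_route(zones, neighbors, start, point):
    # zones: node -> (lo, hi); neighbors: node -> list of adjacent nodes.
    node, path = start, [start]
    while not contains(zones[node], point):
        # Forward to the neighbor closest to the destination point.
        node = min(neighbors[node],
                   key=lambda n: math.dist(center(zones[n]), point))
        path.append(node)
    return path

zones = {
    "n1": ((0, 0), (4, 4)), "n2": ((0, 4), (4, 8)),
    "n3": ((4, 0), (8, 4)), "n4": ((4, 4), (8, 8)),
}
neighbors = {
    "n1": ["n2", "n3"], "n2": ["n1", "n4"],
    "n3": ["n1", "n4"], "n4": ["n2", "n3"],
}
print(greedy_route(zones, neighbors, "n1", (6.5, 7.0)))  # n1 -> n2 -> n4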
CAN: Node Join • New node finds a node already in the CAN • (bootstrap: one (or a few) dedicated nodes outside the CAN maintain a partial list of active nodes) • It finds a node whose zone will be split • Choose a random point P • Forward a JOIN request to P through the existing node • The node that owns P splits its zone and sends half of its routing table to the new node • Neighbors of the split zone are notified
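A sketch of the zone split at the heart of the join, assuming for brevity that keys are already points in the 2-d space (in real CAN the key is hashed to a point first); split_zone and join are illustrative names.

def split_zone(lo, hi):
    # Split along the longest dimension; return the two halves.
    d = max(range(len(lo)), key=lambda i: hi[i] - lo[i])
    mid = (lo[d] + hi[d]) / 2
    hi_a = list(hi); hi_a[d] = mid          # owner keeps the lower half
    lo_b = list(lo); lo_b[d] = mid          # new node takes the upper half
    return (lo, tuple(hi_a)), (tuple(lo_b), hi)

def join(owner_zone, owner_store):
    old_zone, new_zone = split_zone(*owner_zone)
    in_new = lambda p: all(new_zone[0][i] <= p[i] < new_zone[1][i]
                           for i in range(len(p)))
    # Half of the routing/hash table moves to the new node.
    new_store = {k: v for k, v in owner_store.items() if in_new(k)}
    for k in new_store:
        del owner_store[k]
    return old_zone, new_zone, new_store

zone = ((0.0, 0.0), (8.0, 4.0))
store = {(1.0, 1.0): "K1", (6.0, 2.0): "K2"}
print(join(zone, store))   # K2 falls in the new (upper) half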
CAN: Node Leave, Fail • Graceful departure • The leaving node hands over its zone to one of its neighbors • Failure • Detected by the absence of heartbeat messages, which are sent periodically in regular operation • Neighbors initiate takeover timers proportional to the volume of their zones (see the sketch below) • The neighbor with the smallest timer takes over the zone of the dead node and notifies the other neighbors so they cancel their timers (some negotiation between neighbors may occur) • Note: the (key, value) entries stored at the failed node are lost • Nodes that insert (key, value) pairs periodically refresh (re-insert) them
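The takeover race can be sketched as follows; TIMER_SCALE and the zones are made-up values, and the negotiation step is omitted.

TIMER_SCALE = 0.1  # hypothetical seconds per unit of zone volume

def volume(zone):
    lo, hi = zone
    v = 1.0
    for a, b in zip(lo, hi):
        v *= (b - a)
    return v

def takeover_winner(neighbor_zones):
    # neighbor_zones: node -> zone. Smaller zone => smaller timer => fires
    # first; the winner then notifies the others, which cancel their timers.
    timers = {n: TIMER_SCALE * volume(z) for n, z in neighbor_zones.items()}
    winner = min(timers, key=timers.get)
    return winner, timers[winner]

neighbors = {"n2": ((0, 4), (4, 8)), "n3": ((4, 0), (8, 2))}
print(takeover_winner(neighbors))  # n3 has the smaller zone, so it wins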
CAN: Discussion • Scalable • O(log n) steps for operations (with d = (log n)/2) • State information is O(d) at each node • Locality • Nodes are neighbors in the overlay, not in the physical network • Suggestion (for better routing, sketched below) • Each node measures the RTT between itself and its neighbors • Forward the request to the neighbor with the maximum ratio of progress to RTT • Maintenance cost • Logarithmic • But may still be too much for very dynamic P2P systems
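A sketch of the RTT-aware forwarding rule suggested above; the coordinates and RTT values below are made up for illustration.

import math

def pick_next_hop(here, dest, neighbors):
    # neighbors: list of (coords, rtt_ms). Among neighbors that make
    # progress toward dest, pick the one maximizing progress / RTT.
    d_here = math.dist(here, dest)
    best, best_ratio = None, 0.0
    for coords, rtt in neighbors:
        progress = d_here - math.dist(coords, dest)
        if progress > 0 and progress / rtt > best_ratio:
            best, best_ratio = (coords, rtt), progress / rtt
    return best

here, dest = (1.0, 1.0), (7.0, 7.0)
neighbors = [((3.0, 1.0), 5.0),    # small progress, but small RTT
             ((4.0, 4.0), 80.0)]   # big progress, but big RTT
print(pick_next_hop(here, dest, neighbors))  # the low-RTT neighbor wins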
Unstructured P2P Substrates • Objects can be anywhere → loosely-controlled overlays • The loose control • Makes the overlay tolerant of the transient behavior of nodes • When a peer leaves, for example, nothing needs to be done because there is no structure to restore • Enables the system to support flexible search queries • Queries are sent in plain text and every node runs a mini database engine • But, we lose on searching • Search usually uses flooding, which is inefficient • Some heuristics exist to enhance performance • No guarantee of locating a requested object (e.g., rarely requested objects) • Ex: Gnutella, Kazaa (super node), GIA [Chawathe et al. 03]
Example: Gnutella • Peers are called servents • All peers form an unstructured overlay • Peer join • Find an active peer already in Gnutella (e.g., by contacting well-known Gnutella hosts) • Send a Ping message through the active peer • Peers willing to accept new neighbors reply with Pong • Peer leave, fail • Just drop out of the network! • To search for a file (sketched below) • Send a Query message to all neighbors with a TTL (= 7) • Upon receiving a Query message • Check the local database and reply with a QueryHit to the requester • Decrement the TTL and, if it is nonzero, forward the query to all neighbors
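A minimal sketch of this TTL-limited flooding, assuming a toy graph; flood_query and the peer names are illustrative, and real Gnutella deduplicates by message ID rather than a shared seen set.

from collections import deque

def flood_query(graph, files, start, wanted, ttl=7):
    # graph: peer -> neighbors; files: peer -> set of file names.
    hits, seen = [], {start}
    queue = deque([(start, ttl)])
    while queue:
        peer, t = queue.popleft()
        if wanted in files.get(peer, set()):
            hits.append(peer)           # QueryHit back to the requester
        if t > 0:
            for nbr in graph[peer]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append((nbr, t - 1))
    return hits

graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
files = {"d": {"song.mp3"}, "c": {"other.avi"}}
print(flood_query(graph, files, "a", "song.mp3", ttl=2))  # ['d']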
Flooding in Gnutella [Figure: a query flooding outward from the requester to all neighbors, and from them to their neighbors, illustrating the scalability problem]
Heuristics for Searching [Yang and Garcia-Molina 02] • Iterative deepening (see the sketch below) • Multiple BFS with increasing TTLs • Reduces traffic but increases response time • Directed BFS • Send to “good” neighbors (the subset of neighbors that returned many results in the past); requires keeping history • Local Indices • Keep a small index over the files stored on neighbors (within some number of hops) • May answer queries on their behalf • Saves the cost of sending queries over the network • Open issue: keeping the indices current
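A sketch of iterative deepening, assuming the same toy flooding model as before; flood_query is repeated here so the example is self-contained, and the TTL schedule is made up.

from collections import deque

def flood_query(graph, files, start, wanted, ttl):
    # Same TTL-limited flooding as in the Gnutella sketch above.
    hits, seen, queue = [], {start}, deque([(start, ttl)])
    while queue:
        peer, t = queue.popleft()
        if wanted in files.get(peer, set()):
            hits.append(peer)
        if t > 0:
            for nbr in graph[peer]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append((nbr, t - 1))
    return hits

def iterative_deepening(graph, files, start, wanted, ttls=(1, 3, 5, 7)):
    # Re-flood with growing TTLs and stop at the first depth that yields
    # hits: less traffic for popular objects, more delay for rare ones.
    for ttl in ttls:
        hits = flood_query(graph, files, start, wanted, ttl)
        if hits:
            return hits, ttl
    return [], ttls[-1]

graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
files = {"b": {"x"}}
print(iterative_deepening(graph, files, "a", "x"))  # (['b'], 1)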
Heuristics for Searching: Super Node • Used in Kazaa (signaling protocols are encrypted) • Studied in [Chawathe 03] • Relatively powerful nodes play a special role • They maintain indexes over other peers
Unstructured Substrates with Super Nodes [Figure: two-tier overlay in which Super Nodes (SN) are interconnected at the top tier and each Ordinary Node (ON) attaches to one SN]
Example: FastTrack Networks (Kazaa) • Most of the info/plots in the following slides are from Understanding Kazaa by Liang et al. • By far the most popular network (~3 million active users in a typical day) sharing 5,000 terabytes • Kazaa traffic exceeds Web traffic • Two-tier architecture (with Super Nodes and Ordinary Nodes) • Each SN maintains an index of the files stored at the ONs attached to it (see the sketch below) • Each ON reports to its SN the following metadata on each file: • File name, file size, ContentHash, file descriptors (artist name, album name, …)
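A sketch of what that per-file report and the SN-side index might look like; all field and function names are illustrative, since the real wire format is encrypted.

from dataclasses import dataclass, field

@dataclass
class FileMetadata:
    file_name: str
    file_size: int                    # bytes
    content_hash: str                 # Kazaa's ContentHash identifies the file
    descriptors: dict = field(default_factory=dict)  # artist, album, ...

index = {}   # SN-side index: content_hash -> list of (ON address, metadata)

def report(on_addr, meta):
    # An ON uploads its metadata to the SN, which files it in the index.
    index.setdefault(meta.content_hash, []).append((on_addr, meta))

report("10.0.0.5:1214", FileMetadata("song.mp3", 4_200_000, "abc123",
                                     {"artist": "X", "album": "Y"}))
print(index["abc123"][0][0])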
FastTrack Networks (cont’d) • Mainly two types of traffic • Signaling • Handshaking, connection establishment, uploading metadata, … • Encrypted! (some reverse engineering efforts) • Over TCP connections between SN—SN and SN—ON • Analyzed in [Liang et al. 04] • Content traffic • Files exchanged, not encrypted • All through HTTP between ON—ON • Detailed Analysis in [Gummadi et al. 03]
Kazaa (cont’d) • File search • The ON sends a query to its SN • The SN replies with a list of IPs of ONs that have the file • The SN may forward the query to other SNs (see the sketch below) • Parallel downloads then take place between the supplying ONs and the receiving ON
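A sketch of this SN-side lookup with illustrative structures (the real FastTrack protocol is encrypted); forwarding is limited to one hop here to keep the example small.

def sn_lookup(sn_index, peer_sns, query, forwarded=False):
    # sn_index: query -> list of ON IPs; peer_sns: other SNs' indexes.
    results = list(sn_index.get(query, []))
    if not forwarded:
        for other in peer_sns:                 # one-hop forwarding only
            results += sn_lookup(other, [], query, forwarded=True)
    return results

sn_a = {"song.mp3": ["10.0.0.5", "10.0.0.9"]}
sn_b = {"song.mp3": ["10.0.1.7"]}
print(sn_lookup(sn_a, [sn_b], "song.mp3"))
# The requesting ON can then fetch different parts of the file from
# several of these ONs in parallel over HTTP.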
FastTrack Networks (cont’d) • Measurement study of Liang et al. • Hook three machines to Kazaa and wait until one of them is promoted to SN • Connect the other two (ONs) to that SN • Study several properties • Topology structure and dynamics • Neighbor selection • Super node lifetime • …
Kazaa: Topology Structure [Liang et al. 04] • Each SN accepts 100–160 connections from ONs • Since there are ~3M nodes, we have ~30,000 SNs (~3M ONs / ~100 ONs per SN) • Each SN keeps 30–50 connections to other SNs • So each SN connects to ~0.1% of the total number of SNs (about 40 / 30,000)
Kazaa: Topology Dynamics [Liang et al. 04] • Average ON–SN connection duration • ~1 hour, after removing very short-lived connections (~30 sec) used for “shopping” for SNs • Average SN–SN connection duration • ~23 min, which is short because of • Connection shuffling between SNs to allow ONs to reach a larger set of objects • SNs searching for other SNs with smaller loads • SNs connecting to each other from time to time to exchange SN lists (each SN stores 200 other SNs in its cache)
Kazaa: Neighbor Selection [Liang et al. 04] • When an ON first joins, it gets a list of 200 SNs • The ON considers locality and SN workload in selecting its future SN • Locality • 40% of ON–SN connections have RTT < 5 msec • 60% of ON–SN connections have RTT < 50 msec • For reference, the RTT between the eastern US and Europe is ~100 msec
Kazaa: Lifetime and Signaling Overhead [Liang et al. 04] • Super node average lifetime is ~2.5 hours • Overhead: • 161 Kb/s upstream • 191 Kb/s downstream • Most SNs are on high-speed connections (campus networks or cable)
Kazaa vs. Firewalls, NAT [Liang et al. 04] • The default port WAS 1214 • Easy for firewalls to filter out Kazaa traffic • Now, Kazaa uses dynamic ports • Each peer chooses its own random port • An ON reports its port to its SN • Ports of SNs are part of the SN refresh list exchanged among peers • Too bad for firewalls! • Network Address Translator (NAT) • A requesting peer cannot establish a direct connection with a serving peer behind a NAT • Solution: connection reversal (sketched below) • Send to the SN of the NATed peer, which already has a connection with it • The SN tells the NATed peer to establish a connection with the requesting peer! • The transfer then occurs happily through the NAT
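A sketch of connection reversal as described above: the requester asks the NATed peer's SN to relay a "call me back" message over the SN's existing connection with that peer. Message types and field names are illustrative.

def request_file(requester_addr, target_on, sn):
    # A direct connection to a NATed ON would fail, so signal via its SN.
    sn.relay(target_on, {"type": "CONNECT_BACK", "to": requester_addr})

class SuperNode:
    def __init__(self):
        self.on_links = {}            # ON id -> callback over existing TCP link

    def relay(self, on_id, msg):
        self.on_links[on_id](msg)     # deliver over the ON-initiated connection

def nated_on_handler(msg):
    if msg["type"] == "CONNECT_BACK":
        # Outbound connections from behind a NAT succeed, so the NATed peer
        # connects out to the requester and the transfer proceeds over HTTP.
        print("connecting out to", msg["to"])

sn = SuperNode()
sn.on_links["on42"] = nated_on_handler
request_file("192.0.2.10:8080", "on42", sn)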
Kazaa: Lessons [Liang et al. 04] • Distributed design • Exploits heterogeneity • Load balancing • Locality in neighbor selection • Connection shuffling • If a peer searches for a file and does not find it, it may try again later and find it! • Efficient gossiping algorithms • To learn about other SNs and perform shuffling • Kazaa uses a “freshness” field in the SN refresh list, so a peer ignores stale data • Consider peers behind NATs and firewalls • They are everywhere!
Project Discussion • Refer to the handout
Papers Flash Overview • Refer to the Course Reading List web page