
Study on Network Size Estimation Schemes for Peer-to-Peer Networks

Study on Network Size Estimation Schemes for Peer-to-Peer Networks. 2008/02/19 Hosik Cho hscho@mmlab.snu.ac.kr. Some Questions. How many people in this room? Why do you think that? How many people in this campus? Can you count them all? How many nodes in a P2P network over the world?.


Presentation Transcript


  1. Study on Network Size Estimation Schemes for Peer-to-Peer Networks 2008/02/19 Hosik Cho hscho@mmlab.snu.ac.kr

  2. Some Questions • How many people in this room? • Why do you think that? • How many people in this campus? • Can you count them all? • How many nodes in a P2P network over the world?

  3. Contents • Peer to Peer networks • Network size estimation • Estimation methods • Unstructured P2P • Structured P2P • Conclusion

  4. P2P networks • A peer-to-peer overlay network connects peers in a logical manner on top of IP. • Unstructured P2P: Gnutella, Freenet • Structured P2P: Chord, CAN, Pastry, … • P2P applications • File sharing systems (KaZaA, Gnutella) • Video over IP (CoolStreaming) • Voice over IP (Skype)

  5. P2P networks • Characteristics • Scalable • Self-organizing capability • Resilience to failure • Fully decentralized • As a result, system monitoring and obtaining global statistics become much more complex.

  6. Network size estimation • The network size (N) is needed for • Load balancing • Restricted broadcasting • Determining network parameters • For unstructured P2P networks, most approaches are based on broadcasting. • For structured P2P networks, the size can be directly inferred from the density of identifiers.
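
The density idea for structured P2P can be sketched as follows. This is a minimal illustration, not the presenter's implementation: it assumes identifiers are uniformly distributed on a ring of size `ring_size`, and the function name `estimate_size_from_density` and its parameters are hypothetical. If `k` consecutive nodes cover an arc of length `arc`, the whole ring should hold roughly `k * ring_size / arc` nodes.

```python
def estimate_size_from_density(sorted_ids, start, k, ring_size):
    """Estimate N from the arc covered by k successors of one node.

    sorted_ids: node identifiers in increasing order (hypothetical input;
                a real node would only know its own successors).
    start:      index of the node doing the estimation.
    k:          number of successors inspected (1 <= k < len(sorted_ids)).
    ring_size:  size of the identifier space, e.g. 2**32.
    """
    n = len(sorted_ids)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of nodes")
    first = sorted_ids[start]
    last = sorted_ids[(start + k) % n]
    # Arc length measured clockwise, wrapping around the ring if needed.
    arc = (last - first) % ring_size
    # k nodes per `arc` identifiers, scaled up to the whole ring.
    return k * ring_size / arc
```

A larger `k` lowers the variance of the estimate, but the estimate stays biased if identifiers are not actually uniform.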

  7. Related Works • Unstructured P2P • Sample & Collide • Hops Sampling • Gossip-based aggregation • Structured P2P • Token passing • Neighbor sampling • Finger sampling

  8. Sample&Collide (1) • “Birthday Paradox”: in a group of 23 people, the probability that two of them share a birthday is at least 50%. • The initiator samples nodes uniformly at random until a sample returns a node that has already been selected. • The number of samples X drawn until this collision is on the order of √(2n) (more precisely, E[X²] ≈ 2n). • The system size is estimated as X²/2.
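
The birthday-paradox estimator can be tried out with a small simulation. This sketch assumes an idealized uniform sampler (`rng.randrange`) in place of the random-walk sampling used in a real overlay, and the function names are hypothetical. It averages X²/2 over several independent trials, since a single trial is very noisy.

```python
import random


def samples_until_collision(n, rng):
    """Draw nodes uniformly from 0..n-1 until one repeats.

    Returns X, the number of draws including the colliding one.
    """
    seen = set()
    count = 0
    while True:
        node = rng.randrange(n)
        count += 1
        if node in seen:
            return count
        seen.add(node)


def estimate_size(n, trials, rng):
    """Average the per-trial estimate X^2 / 2 over `trials` runs.

    `n` is the true size, used only to drive the simulated sampler.
    """
    total = 0.0
    for _ in range(trials):
        x = samples_until_collision(n, rng)
        total += x * x / 2
    return total / trials


if __name__ == "__main__":
    rng = random.Random(42)
    print(estimate_size(10_000, 2_000, rng))
```

Each trial costs only about √n samples, which is what makes the scheme attractive for large networks.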

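  9. Sample&Collide (2) • The initiator node sets a timer T > 0 • It sends the message to a randomly chosen neighbor • Each receiving node i picks a random number U uniformly in (0, 1) and decrements T by −log(U)/d_i, where d_i is the node's degree • If T > 0, the node forwards the message to a randomly chosen neighbor • If T ≤ 0, the node returns its ID to the initiator (the sample)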
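
The timer-based random walk can be sketched as below. This is an illustrative simulation under stated assumptions: the overlay is given as a dictionary mapping each node to its neighbor list, U is drawn uniformly from (0, 1], and the names `build_graph` and `sample_node` are hypothetical. Dividing by the degree makes high-degree nodes hold the message for less time, which offsets the walk visiting them more often, so the returned sample is close to uniform once T is large enough.

```python
import math
import random


def build_graph(edges):
    """Build an undirected adjacency-list graph from (a, b) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    return graph


def sample_node(graph, initiator, timer, rng):
    """Return one node sampled by a timer-driven random walk."""
    current = rng.choice(graph[initiator])  # initiator sends to a random neighbor
    t = timer
    while True:
        u = 1.0 - rng.random()  # uniform in (0, 1], so log(u) is defined
        # -log(u) is positive, so the timer shrinks; the step is
        # smaller on average at high-degree nodes.
        t -= -math.log(u) / len(graph[current])
        if t <= 0:
            return current  # timer expired: this node is the sample
        current = rng.choice(graph[current])
```

For example, `sample_node(build_graph([(0, 1), (1, 2), (2, 0)]), 0, 10.0, random.Random(1))` returns one of the three node IDs.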

  10. HopsSampling (1) • Probabilistic polling approach • An initiator spreads messages in the network and estimates the system size based on the replies it gets back. • If hopCount < minHopsReporting, a response is sent with probability 1. • Else, the response is sent with probability 1/2^(hopCount − minHopsReporting). • If minHopsReporting = 2, only 25% of the nodes at distance 4 will report back.
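
The reply rule can be written as a small function. This is a direct transcription of the two cases on the slide; the function name `report_probability` is hypothetical.

```python
def report_probability(hop_count, min_hops_reporting):
    """Probability that a node at distance hop_count replies to the initiator."""
    if hop_count < min_hops_reporting:
        return 1.0  # nearby nodes always reply
    # Farther nodes reply with a probability that halves at every extra hop.
    return 1.0 / 2 ** (hop_count - min_hops_reporting)


if __name__ == "__main__":
    # Slide example: minHopsReporting = 2, distance 4.
    print(report_probability(4, 2))
```

Halving the reply probability per hop keeps the number of replies reaching the initiator manageable even though the number of nodes typically grows quickly with distance.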

  11. HopsSampling (2) • The initiator node sets hopCount = 0 • It sends the message to its neighbors • If hopCount < minHopsReporting, a receiving node sends a response • Else, it sends a response with a probability that decreases with hopCount.
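
The whole polling round can be simulated in a few lines. This sketch makes several assumptions that are not on the slides: the message spreads as a breadth-first flood, so hopCount equals the shortest-path distance; a reply from a node that answered with probability p is weighted by 1/p at the initiator; and the function name `hops_sampling_estimate` is hypothetical.

```python
import random
from collections import deque


def hops_sampling_estimate(graph, initiator, min_hops_reporting, rng):
    """Estimate the network size from probabilistic replies.

    graph: dict mapping each node to an iterable of its neighbors.
    """
    hop = {initiator: 0}
    queue = deque([initiator])
    estimate = 1.0  # the initiator counts itself
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor in hop:
                continue  # already reached by an earlier message
            hop[neighbor] = hop[node] + 1
            queue.append(neighbor)
            # Number of hops beyond the always-report radius.
            excess = max(0, hop[neighbor] - min_hops_reporting)
            if rng.random() < 1.0 / 2 ** excess:
                # A reply sent with probability 1/2^excess stands in
                # for 2^excess nodes at that distance (assumed weighting).
                estimate += 2 ** excess
    return estimate
```

With this weighting the estimate is unbiased on a connected graph, but its variance grows with the network diameter, because distant nodes reply rarely and carry large weights.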

  12. Gossip-based (1) • Epidemic-based approach • If exactly one node of the system holds the value 1 and all the other nodes hold 0, the average is 1/N. • An initiator takes the value 1 and starts gossiping. • The reached nodes participate in the process by setting their value to 0. • At each cycle, each node in the network chooses one of its neighbors and exchanges its estimation parameter with it.

  13. Gossip-based (2) • Estimation ← (Estimation + neighbor’s_Estimation)/2 • To provide correct estimations, this algorithm needs to wait for a certain number of rounds to elapse before computing the size estimation. • This period is the time required for the gossip to propagate through the whole network and for the values to converge.
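
The averaging protocol can be sketched as a round-based simulation. Several simplifications are assumed here: every node starts at 0 rather than joining when first reached, nodes act one after another within a round, both partners adopt the pairwise average, and the function name `gossip_size_estimate` is hypothetical. Because each exchange preserves the sum of all values (which is 1), every value converges to 1/N and each node can estimate N as the reciprocal of its own value.

```python
import random


def gossip_size_estimate(graph, initiator, rounds, rng):
    """Return each node's size estimate after `rounds` gossip cycles.

    graph: dict mapping each node to a list of its neighbors.
    """
    value = {node: 0.0 for node in graph}
    value[initiator] = 1.0  # exactly one node holds the value 1
    nodes = list(graph)
    for _ in range(rounds):
        rng.shuffle(nodes)  # random activation order within the cycle
        for node in nodes:
            peer = rng.choice(graph[node])
            # Both partners adopt the average of their two values.
            average = (value[node] + value[peer]) / 2
            value[node] = average
            value[peer] = average
    # A node not yet reached still holds 0 and cannot estimate N.
    return {
        node: (1.0 / v if v > 0 else float("inf"))
        for node, v in value.items()
    }
```

Reading the estimate too early gives values far from N, which is why the protocol waits for a number of rounds before computing the size.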

  14. N Estimation in S-P2P • Assumptions • IDs are uniformly distributed. • Each node knows the total number of nodes (N) in the system. • Nodes do not leave and join frequently.

  15. Basic approaches • [Figure: (a) Token passing; (b) Neighbor sampling]

  16. N Estimation in S-P2P • In actually deployed systems, • Nodes join and leave frequently. • A node must estimate how long a query takes to reach its destination, which is O(log N). • Proximity-based identifiers are adopted for efficient routing. • AS number • Geographic location

  17. Uniformity of Identifiers • [Figure: two panels, “Myth” and “Real”]

  18. Estimation result (1) • [Figure: two panels, “Uniformly distributed IDs” and “Proximity IDs”]

  19. Extended approach • Structured P2P maintains fingers, routing tables, contacts, etc. • Estimate N more precisely using structural information.

  20. Estimation result (2) • [Figure: two panels, “Uniformly distributed IDs” and “Proximity IDs”]

  21. Conclusion • For unstructured P2P • Tradeoff between the quality of the estimate and the associated overhead. • A proper algorithm should be applied according to its objectives and applications. • For structured P2P • Distribution of identifiers may be skewed. • Use of structural information will make the estimation results more accurate.

  22. References • D. Psaltoulis, D. Kostoulas, I. Gupta, K. Birman, and A. Demers, “Practical algorithms for size estimation in large and dynamic groups,” PODC 2004. • D. Kostoulas, D. Psaltoulis, I. Gupta, K. Birman, and A. Demers, “Decentralized schemes for size estimation in large and dynamic groups,” IEEE NCA’05, 2005. • L. Massoulie, A.-M. Kermarrec, E. Le Merrer, and A.J. Ganesh, “Peer counting and sampling in overlay networks: random walk methods,” Technical report MSR-TR-2005-156, 2005. • G.S. Manku, M. Bawa, and P. Raghavan, “Symphony: Distributed Hashing in a Small World,” USITS 2003.
