370 likes | 484 Views
Differentiated Admission for Peer-to-Peer Systems. å³ä¿Šèˆˆ é«˜é›„å¤§å¸ è³‡è¨Šå·¥ç¨‹å¸ç³» 助ç†æ•™æŽˆ ( ä¸å¤®ç ”究院 資訊科å¸ç ”究所 åˆè˜åŠ©ç ”究員 ). wuch@nuk.edu.tw November 17, 2004. Joint Work with Prof. HT Kung, Harvard University, USA. Outline. I. Introduction to P2P II . Motivation and Main Ideas
E N D
Differentiated Admission forPeer-to-Peer Systems 吳俊興 高雄大學 資訊工程學系 助理教授 (中央研究院 資訊科學研究所 合聘助研究員) wuch@nuk.edu.tw November 17, 2004 Joint Work with Prof. HT Kung, Harvard University, USA
Outline I. Introduction to P2P II. Motivation and Main Ideas • Freeloader Problem and Differentiated Admission III. System Design • Reputation-based, differentiated admission control • Eigenvector-based reputation system • Adaptation system of willingness-to-serve • Sampling technique • Distributed trust-enhancement scheme IV. Summary and Concluding Remarks
Peer-to-peer (P2P) Model “Peer-to-Peer (P2P) is a way of structuring distributed applications such that the individual nodes have symmetric roles. Rather than being divided into clients and servers each with quite distinct roles, in P2P applications a node may act as both a client and a server.” Excerpt from the Charter of Peer-to-Peer Research Group, IETF/IRTF, June 24, 2003 http://www.irtf.org/charters/p2prg.html Heterogeneous and Autonomous I. Introduction to P2P
Request Server Clients Service Client-server Model Clients and servers each with distinct roles U11 U12 S U13 U21 • The server and the network become the bottlenecks and points of failure • DDoS • Flash Crowd U22 U32 U31 I. Introduction to P2P
Content Distribution Networks U11 U12 S CR1 U13 CRa CRb Overlay Network U21 CR2 U22 Hosting + Hierarchical Proxies+ DNS Request Routing (Akamai, CacheFlow, etc.) CR3 Content Router or Peer Node CR U32 U31 Name: www2.microsoft.akadns.net (www.microsoft.com.nsatc.net) Addresses: 207.46.249.221, 207.46.134.221, 207.46.249.189, 207.46.134.189, … Aliases: www.microsoft.com, www.microsoft.akadns.net Name: www.yahoo.akadns.net Addresses: 66.218.71.93, 66.218.70.50, 66.218.71.91, 66.218.71.86, 66.218.71.89, … Aliases: www.yahoo.com Name: www.google.akadns.net Addresses: 66.102.11.104, 66.102.11.99 Aliases: www.google.com I. Introduction to P2P
Content Distribution Networks (Cont.) “Serverless ICP” (Fabless IC Design House) ICPs ICPs ParadigmShift Content NetworkProviders (CNPs) “Content Foundry” (Semiconductor Foundry) ISPs ISPs I.C. =Internet Contents ICPs need contents, servers, equipments, IT experts, etc. Now, ICPs need only contents and trusted CNPs. I. Introduction to P2P
P2P Users U13 Content Search Content Transfer Napster U11 U12 DS U13 U21 U22 sharing MP3 files U32 U31 Any two P2P nodes can exchange contents directly + Running P2P software is much easier than maintaining servers + Reduce the load of server, even without servers I. Introduction to P2P
Paradigm Shift of Computing System Models 1980~ Terminal-Mainframe (Super-computing) 1990~ Client-Server (Micro-computing /Personal Computer) 2000~ Peer-to-Peer (Macro-computing) ADSL/100M+ Ethernet RS-232 Dialup/10M Ethernet Linux/Windows 2K Windows 31/95 VT100/DOS I. Introduction to P2P
P2P Applications • P2P File Swapping (Sharing) • Napster, FreeNet, Gnutella, KaZaA, eDonkey, EZPeer, Kuro • P2P Communication • NetNews (NNTP), Instant Messaging (IM), Skype • P2P Lookup Services and Their Applications (Distributed Hash Tables and Global Repositories) • IRIS, Chord/CFS, Tapestry/OceanStore, Pastry/PAST, CAN • P2P Overlay Networking • (Inter-domain Routing – BGP), RON, PDF, Detour, LRR • P2P Multimedia Streaming (Application-layer Multicast) • CoopNet, Zigzag, Narada, P2Cast, Streamlla • Proxies and Content Distribution Networks • Squid, Akamai, DigitIsland, Amplicast • Overlay Testbed • PlanetLab, NetBed/EmuLab • Other Areas • P2P Gaming, Grid Computing I. Introduction to P2P
Most Popular Titles in Windows • From CNet’s download.com,Week Ending Jan 25, 2004 • Kazaa is the Top 1 title in the history, more than 2M downloads a week. • Top downloads in SourceForge: eMule, BitTorrent • Taiwan: Kuro, EzPeer, eDonkey, … I. Introduction to P2P - Status
Example of P2P File Swapping: KaZaA As of July 2003 • 77% of surveyed companies had at least one installation of file-swapping software (AssetMetrix, July 16, 2003) • Today more KaZaA traffic than Web traffic! I. Introduction to P2P - Status
Skype – P2P VoIP Background • Created by Niklas Zennstrom and Janus Friis, founders of KaZaA • Beta launched August 29, 2003 • As of 16 Nov 2004: 34,642,017 downloads and 2,551,163,700 Minutes served (1,771,641 days; 4853 years) • Uses 3-16 KB/s while calling and 0-0.5 KB/s while idle Features • High-quality Internet conference calls • Firewall and NAT traversal • Global decentralized user directory • Intelligent routing for encrypted calls through effective path • Encrypting all calls and instant messages end-to-end
Directory P P Example of Centralized P2P Systems:Napster • Announced in January 1999 by Shawn Fanning for sharing MP3 files and pulled plug in July 2001 • Centralized server for search, direct file transfer among peer nodes • Proprietary client-server protocol and client-client protocol • Relying on the user to choose a ‘best’ source • Disruptive, proof of concepts • IPR and firewall issues I. Introduction to P2P – Basic Models
Example of Decentralized P2P Systems: Gnutella • Open source • 3/14/2000: Released by NullSoft/AOL, almost immediately withdrawn, and became open source • Message flooding: serverless, decentralized search by message broadcast, direct file transfer using HTTP • Limited-scope query 2 2 2 2 1 1 3 Client 2 C S 1 1 Server 3 Servent (=Server + Cli ent) I. Introduction to P2P – Basic Models
Example of Unstructured P2P Systems: Freenet • Ian Clarke, Scotland, 2000 • Distributed depth-first search, Exhaustive search • File hash key, lexicographically closest match • Store-and-forward file transfer • Anonymity • Open source 5 6 C S 3 2 1 4 I. Introduction to P2P – Basic Models
Supernodes inoverlay mesh C S Example of Hybrid P2P Systems: FastTrack / KaZaA • Proprietary software developed by FastTrack in Amsterdam and licensed to many companies • Summer 2001, Sharman networks, founded in Vanuatu, acquires FastTrack • Hierarchical supernodes (Ultra-peers) • Dedicated authentication server and supernode list server • From user’s perspective, it’s like Google. • Encrypted files and control data transported using HTTP • Parallel download • Automatically switch to new server I. Introduction to P2P – Basic Models
0 1 15 14 2 13 3 12 4 11 5 10 6 9 7 8 Example of Structured P2P Systems:Chord • Frans Kaashoek, et. al., MIT, 2001 • IRIS: Infrastructure for Resilient Internet Systems, 2003 • Distributed Hashing Table • Scalable lookup service • Hyper-cubic structure I. Introduction to P2P – Basic Models
Challenges to P2P Systems In an ideal P2P system • Each node contributes as he can and consumes as he need • It cooperates to perform much more stably, securely and efficiently than any single one computer In a real open P2P system • There are many nodes that consume many more resources than they contribute • Some nodes or groups of nodes who may attack or cheat the other normal nodes of the system These may probably make the system malfunctioned, unfair or inefficient II. Freeloader Problem
P1 P8 P2 P3 P7 P6 P4 P5 Fair Contributions of Resourcesby Participating Nodes 8-node transaction example • P5 is likely a freeloader because he consumes much more resources than he contributes • P6 is very nice Fairness is a well-known concern in P2P communities II. Freeloader Problem
Peer Selection Issues In an open P2P, it happens often that Server Selection a requesting peer (client) needs to decide which servers it should request service from Client Selection a supplying peer (server) needs to decide which clients it should grant requests first I want P&Q I have X A A B B I want Z I have X I want Y I have X D D C C so many requests S I want X I want X
Related Works • Access Control List • White list and black list • Payment-based Approaches • Mojo Notion • Reputation-based Approaches • Free Haven, Reputella, NodeRank, PeerTrust, EigenTrust, PeerRank II. Freeloader Problem
P1 P8 P2 P5 ← P6 P5 ← P6← P7 P5 ← P6← P7←P8 P5 ← P4 P5 ← P4← P3 P5 ← P4← P3←P2 P3 P7 P4 P6 P4 ← P3←P2 ←P1 ←P8← P7← P6 P3 ←P2 ←P1 ←P8← P7 and P1 ←P5 P5 P2 P3 P4 P5 P6 P7 P8 P1 P1 0 0 1 1 0 0 0 0 P2 0 0 1 1 2 0 0 0 0 0 0 1 3 0 0 0 P3 0 0 0 0 4 0 0 0 P4 2 0 0 0 0 0 0 0 P5 0 0 0 2 4 0 0 0 P6 P7 0 0 2 1 3 0 0 0 P8 0 0 1 1 2 0 0 0 Service credit matrix PeerRank - Reputation Ranking For illustration, we assume that a node will receive 2 credits for providing content and 1 credit for transporting content. Assume there are 9 transactions: Computed reputation rankings II. Freeloader Problem – Main Ideas
An Approach to Solving the Freeloader’s Problem Admission System: An Overview When node X receives a request from node Y for content, X triggers a series of steps: • X grants the request with a probability based on X’s current willingness-to-serve parameter r • If granted, X determines whether Y should be “admitted” or “denied” by figuring out Y’s service and usage reputation ranking from a set of sampling nodes • If admitted, X sends Y the requested content and uses a third-party node to record X’s credits for trust-enhancement purposes II. Freeloader Problem – Main Ideas
Admission System: Main Ideas • A reputation-based, differentiated admission control that allows a node to receive a level of service based on its service and usage reputations in the past • An eigenvector-based method that derives service and usage reputations of nodes by computing the largest eigenvalue / eigenvector pairs of the credit matrix associated with past transactions • An adaptation system that allows a node to take care of its own interest by contributing resources at a level just sufficient for its desired level of service • A sampling technique that uses top service and usage nodes, as well as benchmark nodes, to reduce the cost of computing service and usage reputations of nodes • A distributed trust-enhancement scheme that uses third-party nodes to manage and store credits required by reputation computations II. Freeloader Problem – Main Ideas
rold rnew A r1 r2 r1 r2 r3 r4 a21r1 r1’ r2 ’ r3 ’ r4 ’ a11 a12 a13 a14 a21 a22 a23 a24 a31 a32 a33 a34 a41 a42 a43 a44 a12r2 a24r4 = a23r3 a13r3 a14r4 a32r2 a31r1 a41r1 a42r2 r1’=a11r1+a12r2+a13r3+a14r4 a34r4 r3 r4 x’=Ax a43r3 Eigen-systems Ax=lx x: eigen-vector l: eigen-value II. Freeloader Problem – Main Ideas
Idea 1: Eigenvector-based reputation system Computing Reputations with Eigenvectors Notations S: service credit matrix vector s: servicereputations U = ST:usage credit matrix vector u: usagereputations An iterative method of computing s and u: Node X’s service reputation s(i+1) = Su(i)and u(i) = Us(i)That is, s(i+1) = SUs(i)ors(i+1) = SSTs(i) • Note that we can view the latter iteration as the power method of computing the largest eigenvalue/eigenvector pair of SST • This implies that s is the eigenvector of SST corresponding to the largest eigenvalue of SST. Similarly, u is the eigenvector of STS corresponding to the largest eigenvalue of STS The matrix formulism here parallels to that used in ranking web pages (ref: Kleinberg HITS algorithm) III. System Design
Idea 2: Differentiatedadmission control Reputation-based Admission • Transactions → Credit Matrices → Reputations → Rankings • After an allowed transaction is complete, the participating nodes will update their credits to reflect their roles in the transaction. Thus, the service and usage credit matrices will change accordingly. These changes will in turn affect future reputation rankings of the nodes • The reputation ranking of nodes is comparative • We use s or u to denote both reputation and ranking depending on the context • Increasing the ranking of a node implies decreasing the rankings of some other nodes • A request from node Y will be denied by node X if Y’s usage reputation u is above A% while its service reputation s is below B% • A and B are certain preconfigured thresholds with A>B(For example, A and B can be 80 and 20, respectively) • While being denied of receiving content, Y can still continue providing content and transport services, thereby improving its service reputation • A freeloader which has a low service reputation (s < B%) and a high usage reputation (u > A%) will be denied of service • The freeloader will need to either provide an increased level of service (to increase s) or reduce its content usage (to decrease u) in order to be readmitted again
willingness-to-server desired service level C success rate s Idea 3: Adaptation system Nodes’ Adaptation in Willingness to Serve Goal: search for a minimum level of contribution a node needs to provide in order to receive a desired level of service from other nodes • willingness-to-serve r: a parameter determining the probability at which the node will grant arriving service requests • desired service level C: a desired level of service. C is constrained by the system-wide parameters A and B: 1 - B*(1-A) • observed success rates: the percentage of its content or transport requests that are granted by requested nodes Intuitively, when a node finds that its success rate is above a desired service level C, it will decrease its willingness-to-serve parameter r. On the other hand, when the node finds that its is below C, it will increase its r
Decrease r Increase r u>A & s<B Admit& NotMore Deny (uA or sB) & ≥C u>A & s<B ≥C <C (uA or sB) & <C Admit& More Increase r Nodes’ Finite State Machine for the Adaptation of Willingness to Serve A node increases its willingness-to-serve parameter r → This will improve its service reputation ranking s and cause some of the other nodes to drop their service reputation rankings or raise their usage reputation rankings → Some of these nodes may then enter the “Deny” state → These nodes will then increase their r thereby allowing the current node to improve its success rate “poor man's equilibrium” problem: if the two admit states were combined, many nodes could be in this combined “Admit” state with small s and very low r. These nodes would never be able to increase their r since they cannot enter the “Deny” state due to their u being below A%
Decrease r Increase r u>A & s<B Admit& NotMore Deny (uA or sB) & ≥C u>A & s<B ≥C <C (uA or sB) & <C Admit& More Increase r Newcomer Issue For a new node, its initial state is configured to be “Admit & More” and its initial r is 0. That means it can receive services without providing services initially. • If a new node makes many service requests such that its u is above A% and its s is below B%, it will enter the “Deny” state • As long as its success rate s does not reach its desired level of service C, it will stay at the “Admit & More” state and increase its r to raise s • When its s reaches C, it will enter the “Admit & Not More” state and try to reduce its r
System Bootstrap • Initially there have not been any transactions. Thus: • the service or usage credit matrix is initially a zero matrix • the success rate s of every node is zero • Before the success rate s of a node reaches C,it will increase its r to raise s • That means each node will try to obtain its desired level of service by providing more services
.8 .4 Desired Level of Service C Success Rate Willingness 1 0 1 0 Round # Convergence of Nodes’ Adaptation:A Simulation Result • A = .8, B = .2, C = .4 or .8, and # transactions per round = 10,000 • When in the “Admit & Not More” state, a node decreases r using r← .95 * r, else it increases r using r ← max (r + .05, 1).Except when r = 1, the decreasing rate is smaller than the increasing rate • The simulation results above show that • s generally tracks changes in C as the node will adapt its r value • A new node will be able to adapt its r to achieve its desired level of service C within about 16 rounds
usage reputation higher than PAand service reputation lower than PB Decrease ρ Increase ρ Admit& NotMore Deny • usage reputation not higher than PA • or service reputation not lower than PB • ≥ C ≥ C < C Admit& More • usage reputation not higher than PA • or service reputation not lower than PB • < C Increase ρ Idea 4: Sampling technique Sampling Heuristic: Use of Benchmark Nodes To determine the current state of a node X, we will only compare X to two benchmark nodes: Node PA has its usage-reputation ranking percentile at A% and Node PB has its service-reputation ranking percentile at B% Need not know the exact rankings of a node.
A Top-nodes Sampling Heuristic In computing reputation, we will only use a sampling set, consisting of top service and usage nodes, node X, and two benchmark nodes PA and PB Receive a request from node X Form a sampling set Construct the service creditmatrix of the sampling set Compute reputation Compare X to PA and PB Background processes consider all nodes to select top service and usage nodes and benchmark nodesPA and PB
Effectiveness of the Sampling Heuristic: A Simulation Result When transactions exhibit the “hub” phenomenon, it generally suffices to use a small number of top nodes, such as 10, in the sampling set Load: mixtures of multiple Laplace distributions for requesting and requested nodes
Idea 5: Distributed trust-enhancement scheme Distributed Trust-enhancement Using a Third-party Node • For a transaction of sending content from Pi to Pj • The associated service and usage credits are stored at a third-party node Ph, where h = hash (i, j) • We can use a distributed hash table mechanism (DHT) such as Chord to maintain the credit matrix • Generally we can assume that Pi and Pj have no management authority on data stored at third-party nodes
Summary and Concluding Remarks • A distributed admission system with following features: • reputation-based admission control Consider the service and usage reputation rankings of a requesting node as well as the willingness of the requested node • service and usage reputations computed with eigenvectors Consider the authorities of individual nodes • the 3-state model and nodes’ adaptation on r Search for a minimum level of contribution a node needs to provide • use of sampling with top nodes and benchmark nodes Reduce the information needed to determine current state of a node • trust-enhancement with third-party nodes Provide protection against possible cheating of the system • Further research into control strategies on r and various application-specific scenarios would be useful IV. Conclusion