Fighting Freeloaders in Decentralized P2P File Sharing Systems Ivan Osipkov
Introduction • Gnutella/KaZaA: 70% of users are freeloaders, 1% of peers serve 50% of requests, and file popularity follows a Zipf distribution • Need to spread files according to popularity and curb freeloading, in order to provide QoS and eliminate central nodes • Properties sought in a solution: minimize polling; distributed solution; collusion resistant; Sybil attack resistant
User Behavior in Gnutella • Average user is online for 1 hour • User identity may change, or several identities may be present at the same time • Evolutionary Prisoner’s Dilemma simulations: • New users should be treated with distrust, but they should be able to start quickly • A user needs some information about other peers’ interactions. Otherwise, the game will not scale as the population grows and non-optimal strategies will be followed.
Proposed Solutions • Polling: overhead; incomplete picture; DoS; who’ll participate? • Witnesses: overhead; who’ll participate? • Our goals: no polling; avoid public service
Contributions • Offline participation evaluation. No 3rd party brokers or polling: only a PKI is assumed • Upload activity proportional to download activity. Popular files spread more, thus improved load balance • Progressive taxation on accumulated credit. Easy to start participating • Distributed data for undeniable collusion detection • Works on top of a Gnutella-like protocol and can be integrated with it
Offline Participation Evaluation • If peer A sends file F to peer B, A obtains from B a signed receipt for the upload, {A,B,upld,T1,T2}sign(B), and B obtains from A a signed receipt for the download, {A,B,dld,T1,T2}sign(A). • The receipts state the time interval (T1,T2) of the file transfer and the IDs of the peers involved. • Each peer presents its receipts when requesting a download • The credit of a peer is calculated from its receipts, and the peer is put into a priority queue accordingly • [Figure: A sends F to B; A and B exchange the two signed receipts]
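The receipt exchange above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the `Receipt` fields, the helper names, and the payload encoding are all hypothetical, and HMAC is used only as a stdlib stand-in for the public-key signatures that the assumed PKI would provide. Following the slide's diagram, the "upld" receipt is signed by B and the "dld" receipt by A.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Receipt:
    uploader: str      # peer A, who sent the file
    downloader: str    # peer B, who received it
    kind: str          # "upld" or "dld", as in the slide's diagram
    start: float       # T1
    end: float         # T2
    signer: str        # peer whose signature vouches for the receipt
    signature: bytes


def _payload(uploader, downloader, kind, start, end):
    # Hypothetical encoding of {A,B,kind,T1,T2}
    return f"{uploader}|{downloader}|{kind}|{start}|{end}".encode()


def sign(key: bytes, payload: bytes) -> bytes:
    # ASSUMPTION: HMAC stands in for a real public-key signature.
    return hmac.new(key, payload, hashlib.sha256).digest()


def exchange_receipts(a, b, key_a, key_b, start, end):
    """Return (upload receipt signed by B, download receipt signed by A)."""
    upld = Receipt(a, b, "upld", start, end, b,
                   sign(key_b, _payload(a, b, "upld", start, end)))
    dld = Receipt(a, b, "dld", start, end, a,
                  sign(key_a, _payload(a, b, "dld", start, end)))
    return upld, dld


def verify(receipt: Receipt, keys: dict) -> bool:
    """Check that the receipt's fields match the signer's signature."""
    expected = sign(keys[receipt.signer],
                    _payload(receipt.uploader, receipt.downloader,
                             receipt.kind, receipt.start, receipt.end))
    return hmac.compare_digest(expected, receipt.signature)
```

With real signatures, `verify` would need only the signer's public key, which is what lets any peer check a presented receipt offline.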
Credit Calculation • Given a time frame (s,t) and current time T, the credit contribution of this time frame is f(s,t,T), where f "ages" (shrinks) as T increases. • Credit is the sum of the contributions of all upload receipts, minus the contributions of download receipts and of "unaccounted time intervals" • Old receipts contribute little
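The slides do not give the formula for f, so the following Python sketch picks one purely for illustration: an exponentially decaying weight integrated over the receipt's interval. The aging rate `LAMBDA`, the penalty factor `IDLE_PENALTY`, and the accounting window start `start` are all assumptions, not values from the talk.

```python
import math

LAMBDA = 0.1        # ASSUMPTION: aging rate per hour
IDLE_PENALTY = 1.5  # ASSUMPTION: unaccounted time is charged more than downloads


def f(s, t, T, lam=LAMBDA):
    """Aged contribution of interval (s, t) seen at time T >= t.

    Hypothetical choice: the integral of exp(-lam * (T - u)) for u in [s, t].
    """
    return (math.exp(-lam * (T - t)) - math.exp(-lam * (T - s))) / lam


def unaccounted(intervals, start, T):
    """Gaps in [start, T] that no receipt interval covers."""
    gaps, cursor = [], start
    for s, t in sorted(intervals):
        if s > cursor:
            gaps.append((cursor, s))
        cursor = max(cursor, t)
    if cursor < T:
        gaps.append((cursor, T))
    return gaps


def credit(uploads, downloads, start, T, lam=LAMBDA):
    """Uploads add credit; downloads and unaccounted gaps subtract it."""
    up = sum(f(s, t, T, lam) for s, t in uploads)
    down = sum(f(s, t, T, lam) for s, t in downloads)
    idle = sum(f(s, t, T, lam) for s, t in unaccounted(uploads + downloads, start, T))
    return up - down - IDLE_PENALTY * idle
```

Under this choice, a peer that withholds a download receipt turns that interval into an unaccounted gap and is charged at the higher penalty rate, so hiding downloads does not pay.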
Properties of Credits • Old receipts can be discarded • Dumping download receipts leads to "unaccounted time frames", for which the peer may be charged even more. • The more credit a peer has, the more it loses to aging. Thus downloads cost more and uploads generate less credit: PROGRESSIVE TAXATION • As a consequence of the formulas, credit has upper and lower limits.
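The progressive-taxation effect and the upper limit can be illustrated with a small Python sketch. It assumes a hypothetical exponential aging rule (existing credit decays by exp(-lam * hours), and an upload of that length adds (1 - exp(-lam * hours)) / lam); the slides do not state the actual formulas, so the numbers are only indicative.

```python
import math


def credit_after_upload(credit, hours, lam=0.1):
    """Credit after uploading continuously for `hours`.

    ASSUMPTION: exponential aging with rate `lam` per hour.
    """
    decay = math.exp(-lam * hours)
    return credit * decay + (1.0 - decay) / lam


def net_gain(credit, hours, lam=0.1):
    """How much the same upload raises credit, given the credit already held."""
    return credit_after_upload(credit, hours, lam) - credit


def credit_cap(lam=0.1):
    """Upper limit on credit under this aging rule."""
    return 1.0 / lam
```

Under these assumptions the net gain equals (1/lam - credit) * (1 - exp(-lam * hours)), so a peer with no credit gains the most from an hour of uploading, and a peer at the cap of 1/lam gains nothing.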
General Bandwidth • Let b Kb/sec be the "unit" bandwidth. A user with n*b bandwidth creates n virtual (but related) peers. Each virtual peer can be involved in only a single transaction at a time, which should be finished without interruption. • If peer A has n*b bandwidth and B has m*b, where n<m, then n*b can be dedicated on each side to the download, but accounting is done on each virtual peer separately.
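A short Python sketch of the virtual-peer idea follows. The naming scheme for virtual peers ("A#0", "A#1", ...) and both helper functions are hypothetical, and the sketch ignores virtual peers that are already busy in another transaction.

```python
def virtual_peers(peer_id, bandwidth_kbps, unit_kbps):
    """Split a peer with n*b bandwidth into n virtual peers of bandwidth b."""
    n = int(bandwidth_kbps // unit_kbps)
    return [f"{peer_id}#{i}" for i in range(n)]


def pair_for_transfer(uploader_vps, downloader_vps):
    """Pair up min(n, m) virtual peers; each pair keeps its own receipts."""
    return list(zip(uploader_vps, downloader_vps))
```

For example, with a unit of 64 Kb/sec, a 256 Kb/sec uploader and a 128 Kb/sec downloader yield two pairs, so 2*b is dedicated on each side and two separate sets of receipts are produced.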
New Users • New users get more credit from uploads than old users do. Thus a new user that first provides uploads to others can quickly obtain a viable amount of credit. • Need to have some (popular) files initially • Dynamic vs potential credit • Should initial credit be given?
Collusion Attacks • If undetectable from receipts then they raise total credit insignificantly • If detectable then proof is undeniable: use intersection of receipts • Data needed for proof is distributed but within similar time-frame • Peers holding the compromising receipts are interested in prosecution
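One possible reading of "use intersection of receipts" is that two receipts for different transfers which name the same (virtual) peer and overlap in time contradict the rule that a virtual peer handles one transaction at a time. That reading is an assumption, not something the slides confirm; the Python sketch below shows how such a check could look, with a hypothetical simplified `Receipt` type.

```python
from itertools import combinations
from typing import NamedTuple


class Receipt(NamedTuple):
    uploader: str
    downloader: str
    start: float
    end: float


def overlaps(r1: Receipt, r2: Receipt) -> bool:
    """True if the two time intervals intersect in more than a single point."""
    return r1.start < r2.end and r2.start < r1.end


def conflicting_pairs(receipts):
    """Find pairs of distinct transfers that share a peer and overlap in time."""
    conflicts = []
    # The upload and download copies of one transfer are identical here,
    # so deduplicate before comparing.
    for r1, r2 in combinations(sorted(set(receipts)), 2):
        shared = {r1.uploader, r1.downloader} & {r2.uploader, r2.downloader}
        if shared and overlaps(r1, r2):
            conflicts.append((r1, r2, sorted(shared)))
    return conflicts
```

Because each receipt in a conflicting pair is signed, the pair itself would serve as the proof, and the peers holding those receipts are the ones who can produce it.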
Collusion Attacks (Cont’d) • [Timeline figure: colluders A and B and a third peer C, over times T1 to T5] • B: lousy service and no receipt from A • A and B can’t dld/upld to others during this time • B finally downloads
Collusion Attacks: Conclusion • In the beginning, it takes B three rounds to download, i.e. bandwidth usage is 33% (at most) • Continuous collusion will force bandwidth usage down to 25% • What if A says “listen B, I’ll download your files for you, just keep giving me receipts”? • Simulate longer downloads?
Sybil Attacks • Not alleviated by the above mechanisms • Need to do clustering (e.g. using IP addresses which should be pinged) when calculating total credit. • Or use central CA
Simulations Setup • 200 peers • Files have Zipf popularity distribution • Topology is random with average degree 3.4 • Every user has initial files chosen at random but with popularity in mind • 4 Levels of priority queue based on credit • Gnutella routing/discovery simulated • 65% freeloaders • 25% honest peers • 5% servers • 5% newcomers
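The setup above could be instantiated roughly as follows in Python. Only the peer count, the role mix, the average degree, the Zipf popularity, and the four priority levels come from the slide; the Zipf exponent, the credit thresholds, and all function names are hypothetical.

```python
import random

N_PEERS = 200
MIX = {"freeloader": 0.65, "honest": 0.25, "server": 0.05, "newcomer": 0.05}


def build_population(n=N_PEERS, mix=MIX):
    """Assign a role to each peer according to the stated percentages."""
    roles = []
    for role, share in mix.items():
        roles += [role] * round(n * share)
    return roles


def zipf_weights(n_files, exponent=1.0):
    """Popularity of the file at rank r is proportional to 1 / r**exponent."""
    return [1.0 / rank ** exponent for rank in range(1, n_files + 1)]


def initial_files(rng, n_files, k, exponent=1.0):
    """Draw up to k distinct initial files, biased toward popular ones."""
    weights = zipf_weights(n_files, exponent)
    return set(rng.choices(range(n_files), weights=weights, k=k))


def random_topology(rng, n=N_PEERS, avg_degree=3.4):
    """Random undirected graph with n * avg_degree / 2 edges."""
    target = round(n * avg_degree / 2)
    edges = set()
    while len(edges) < target:
        a, b = rng.sample(range(n), 2)
        edges.add((min(a, b), max(a, b)))
    return edges


def priority_level(credit, thresholds=(-5.0, 0.0, 5.0)):
    """Map credit to one of 4 queue levels (0 = lowest). Thresholds are assumed."""
    return sum(credit >= t for t in thresholds)
```

Passing an explicit `random.Random(seed)` as `rng` keeps runs reproducible, which helps when comparing strategies across simulations.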
BitTorrent? • You give to me, I give to you… NOW! • The 2 peers need to have what each one needs (usually part of the same file) • Used to offload work from the originator and increase bandwidth for large files • Applicable to our case?
Future work • A reputation system is a must: need to take action on misbehavior. It is orthogonal to the "economic" system. Reputations should be used when calculating credit; a similar approach can be used • Receipts should be inspected (lazily?) for violations. This data is distributed, and peers holding it are interested in detecting misbehavior. Receipts exposing violations have similar time frames. • Analysis and simulations of collusion are needed • File-spreading simulations