A Simulation Study of P2P File Pollution Prevention Mechanisms
Chia-Li Huang, Polly Huang
Network & Systems Laboratory, Department of Electrical Engineering, National Taiwan University
Outline
• Background
• Problem
• Methodology
• Simulation Environment & Results
• Conclusion
Overview of a P2P file-sharing system
• P2P file-sharing system with search capability
• A user issues a query with keywords to search for a file
• A file in the system is described by meta-data: song title, length, and encoding scheme of songA
• Different versions of songA exist (mp3, wma, …)
• A hash function maps each version of songA to its hash value
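The relation between meta-data, versions, and hash values can be sketched as follows. This is a minimal illustration, not the simulator's code: the `SharedFile` class, its field names, and the use of SHA-1 are assumptions, since the slides do not say which hash function is used.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SharedFile:
    """Hypothetical model of one shared file (names are assumptions)."""
    title: str       # meta-data: song title
    length_s: int    # meta-data: length in seconds
    encoding: str    # meta-data: encoding scheme, e.g. "mp3"
    content: bytes   # the actual file bytes

    def hash_value(self) -> str:
        # The version identifier depends only on the content,
        # not on the meta-data. SHA-1 is an assumed choice.
        return hashlib.sha1(self.content).hexdigest()


mp3 = SharedFile("songA", 215, "mp3", b"mp3-bytes-of-songA")
wma = SharedFile("songA", 215, "wma", b"wma-bytes-of-songA")
mp3_copy = SharedFile("songA", 215, "mp3", b"mp3-bytes-of-songA")

# Identical content gives the same version; different content gives another.
same_version = mp3.hash_value() == mp3_copy.hash_value()
different_version = mp3.hash_value() != wma.hash_value()
```

Because the hash covers only the content, two files with the same title can still be different versions, which is what makes a mismatch between meta-data and content possible.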
How a user searches for a file
• Peer1 sends a query for songA to the P2P network
• The network returns the responses for songA
• Peer1 randomly chooses a source from the responses for download
Pollution in file sharing systems
• Definition of a polluted file
  • The meta-data description doesn’t match the content
• Current P2P networks are full of polluted files [1]
  • Unintentional
  • Intentional
[1] J. Liang, R. Kumar, Y. Xi, and K. Ross, “Pollution in P2P file sharing systems,” in Proceedings of IEEE Infocom, 2005.
Problem
• Pollution in a P2P system causes the following problems
  • Reduced content availability
  • Increased redundant traffic
• Several anti-pollution mechanisms exist
  • Which one is better?
Methodology
• Simulation study of anti-pollution mechanisms
  • Extends a P2P simulator [2]
• Existing anti-pollution mechanisms
  • Peer reputation system: choose a reputable peer to download the file from
    • EigenTrust [3]
  • Object reputation system: choose a reputable version of a file to download
    • Credence [4]
• Different pollution attacks
• User behavior
[2] M. Schlosser and S. Kamvar, “Simulating a file-sharing P2P network,” in Proc. of SemPGRID, 2003.
[3] S. D. Kamvar, M. T. Schlosser, and H. Garcia-Molina, “The EigenTrust algorithm for reputation management in P2P networks,” in Proceedings of the Twelfth International World Wide Web Conference.
[4] K. Walsh and E. G. Sirer, “Experience with an object reputation system for peer-to-peer filesharing,” in Proceedings of Networked System Design and Implementation (NSDI), May 2006.
Peer Reputation System: EigenTrust
• Rate a peer by its uploading history, as seen from the whole system
• Choose a reputable peer to download from
• Local reputation (Cij): Peer i's rating of Peer j, based on the good and bad files Peer i has downloaded from Peer j
  • Each peer stores a list of its local reputations (for example, Peer 1 stores C12 and C14)
• Global reputation (Ti): the local reputations that other peers hold about Peer i, weighted by those peers' own global reputations
  • Example: T1 = C21 * T2 + C31 * T3
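The global reputation rule T1 = C21 * T2 + C31 * T3 can be sketched as a repeated weighted sum over all peers. The slide's formula for Cij did not survive extraction, so the normalization below follows the basic form described in the EigenTrust paper [3] (good minus bad downloads, floored at zero, normalized per rater). The function names, the uniform fallback for a peer with no positive experience, and the sample counts are assumptions for illustration only.

```python
def normalize_local(good, bad):
    """Local reputation c[i][j] from peer i's good/bad downloads from peer j."""
    n = len(good)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        scores = [max(good[i][j] - bad[i][j], 0) for j in range(n)]
        total = sum(scores)
        for j in range(n):
            # Assumption: with no positive experience, trust everyone equally.
            c[i][j] = scores[j] / total if total > 0 else 1.0 / n
    return c


def global_reputation(c, iterations=100):
    """Iterate T_i = sum_j C_ji * T_j, starting from a uniform vector."""
    n = len(c)
    t = [1.0 / n] * n
    for _ in range(iterations):
        t = [sum(c[j][i] * t[j] for j in range(n)) for i in range(n)]
    return t


# Hypothetical download history for three peers (index 0, 1, 2).
good = [[0, 4, 1], [3, 0, 2], [5, 1, 0]]
bad = [[0, 0, 3], [0, 0, 0], [0, 0, 0]]

c = normalize_local(good, bad)
t = global_reputation(c)
```

In this made-up history, peer 0 received more bad than good files from peer 2, so it gives peer 2 no local reputation, and peer 2 ends up with the lowest global reputation.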
Object Reputation System: Credence
• Calculate an object (file) reputation from weighted votes
• After a download, the user votes the file as clean or polluted
• Choose a reputable version for download
• Example for a query for song A issued by P1
  • P1 sends the query and a vote-gather query for song A to P2, P3, P4, and P5
  • P1 receives the responses (Version1, Version2) and the vote-responses (Vote_P2 to Vote_P5) for song A
  • Each vote is weighted by the correlation, positive or negative, between P1 and the voter
  • P1 randomly chooses a source of the selected version
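The weighted-vote idea can be sketched as below. This is a simplified stand-in, not Credence's actual statistic: the weight here is just the agreement rate between two voting histories rescaled to [-1, 1], and all names, vote histories, and version labels are hypothetical. Votes are +1 for clean and -1 for polluted.

```python
def correlation(my_votes, their_votes):
    """Agreement-based weight in [-1, 1] over objects both peers voted on."""
    shared = set(my_votes) & set(their_votes)
    if not shared:
        return 0.0  # no common history: the vote carries no weight
    agree = sum(1 for obj in shared if my_votes[obj] == their_votes[obj])
    return (2 * agree - len(shared)) / len(shared)


def version_reputation(my_votes, votes_by_peer, version):
    """Sum of the votes on one version, each weighted by voter correlation."""
    total = 0.0
    for history in votes_by_peer.values():
        if version in history:
            total += correlation(my_votes, history) * history[version]
    return total


# P1's own past votes and the vote-responses it gathered (made-up data).
my_votes = {"h1": +1, "h2": -1}
votes_by_peer = {
    "P2": {"h1": +1, "h2": -1, "v1": +1},            # agrees with P1
    "P3": {"h1": -1, "h2": +1, "v1": -1, "v2": +1},  # disagrees with P1
    "P4": {"h1": +1, "v2": -1},                      # agrees with P1
}

rep_v1 = version_reputation(my_votes, votes_by_peer, "v1")
rep_v2 = version_reputation(my_votes, votes_by_peer, "v2")
best_version = max(["v1", "v2"], key=lambda v: version_reputation(my_votes, votes_by_peer, v))
```

A negatively correlated voter is still useful: P3's "polluted" vote on v1 counts in favor of v1, because P3 has consistently voted the opposite of P1.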
Pollution Attacks
• Prevalent pollution attacks [5]
  • Decoy Insertion
  • Hash Corruption
[Figure: a clean file of SongA under Decoy Insertion and under Hash Corruption]
[5] F. Benevenuto, C. Costa, M. Vasconcelos, V. Almeida, J. Almeida, and M. Mowbray, “Impact of peer incentives on the dissemination of polluted content,” in SAC ’06.
User Behavior
• Slackness [6]
  • The period of time between download completion and the quality check
  • Bimodal distribution
• Awareness [6]
  • The probability that a user correctly recognizes a polluted file
  • No clear characteristic is observed
    • High-awareness probability = 0.8
    • Low-awareness probability = 0.2
[6] U. Lee, M. Choi, J. Cho, M. Y. Sanadidi, and M. Gerla, “Understanding pollution dynamics in P2P file sharing,” in Proceedings of the 5th International Workshop on Peer-to-Peer Systems (IPTPS’06), 2006.
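One way to model these two user parameters is sketched below. The awareness probabilities 0.8 and 0.2 come from the slide. Everything about the slackness model is an assumption: the slide only says the distribution is bimodal, so the mixture of two exponentials and its parameters (`p_prompt`, `prompt_mean`, `slack_mean`) are hypothetical placeholders.

```python
import random

HIGH_AWARENESS = 0.8  # from the slide
LOW_AWARENESS = 0.2   # from the slide


def sample_slackness(rng, p_prompt=0.5, prompt_mean=1.0, slack_mean=20.0):
    """Delay between download completion and quality check.

    Assumed bimodal model: prompt users check quickly, slack users
    check late. All three parameters are hypothetical.
    """
    mean = prompt_mean if rng.random() < p_prompt else slack_mean
    return rng.expovariate(1.0 / mean)


def detects_pollution(rng, awareness):
    """True if the user correctly recognizes a polluted file."""
    return rng.random() < awareness


rng = random.Random(42)
delay = sample_slackness(rng)
noticed = detects_pollution(rng, HIGH_AWARENESS)
```

Until the quality check happens, a polluted file stays in the user's shared folder, and a low-awareness user may keep it even after the check.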
Simulator Description
• P2P query-cycle-based simulator
  • In a cycle, each peer issues one query and repeats downloading until satisfied
• Extensions
  • Types of attacks: Decoy Insertion, Hash Corruption
  • Anti-pollution mechanisms: EigenTrust, Credence
  • User behavior: slackness, awareness
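The query-cycle loop can be sketched as follows. This is a rough illustration under strong assumptions, not the extended simulator: sources are picked at random (the baseline), the user detects a polluted file immediately and perfectly (so slackness and awareness are ignored), and `run_query_cycle` and `responses_for` are hypothetical names.

```python
import random


def run_query_cycle(peers, responses_for, rng):
    """One cycle: each peer issues one query and retries until satisfied.

    responses_for(peer) returns (source, is_clean) pairs for that peer's query.
    Returns (successful downloads, total download trials).
    """
    successes = 0
    trials = 0
    for peer in peers:
        candidates = list(responses_for(peer))
        rng.shuffle(candidates)  # random source selection
        for _source, is_clean in candidates:
            trials += 1
            if is_clean:
                successes += 1
                break  # satisfied: stop downloading
    return successes, trials


rng = random.Random(7)
peers = ["P1", "P2", "P3"]


def responses(peer):
    # Hypothetical response list: one clean source among two polluted ones.
    return [("S1", False), ("S2", True), ("S3", False)]


successes, trials = run_query_cycle(peers, responses, rng)
```

Every trial beyond the first for a peer is a download of a polluted file, which is the redundant traffic the anti-pollution mechanisms try to avoid.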
Simulation Setup
Table 1. File size distribution of P2P traffic [10]
[8] S. D. Kamvar, M. T. Schlosser, and H. Garcia-Molina, “The EigenTrust algorithm for reputation management in P2P networks,” in Proceedings of the Twelfth International World Wide Web Conference.
[9] K. Walsh and E. G. Sirer, “Experience with an object reputation system for peer-to-peer filesharing,” in Proceedings of Networked System Design and Implementation (NSDI), May 2006.
[10] N. Leibowitz, M. Ripeanu, and A. Wierzbicki, “Deconstructing the Kazaa network,” in Proceedings of the Third IEEE Workshop on Internet Applications (WIAPP), 2003.
Critical Evaluation Parameters
• Evaluate different anti-pollution mechanisms under the following scenarios
Evaluation metrics
• Successful downloading rate (per cycle) = total successful downloads / total download trials
• Redundant traffic (per cycle)
• Reduced-traffic ratio (compared to random selection) = redundant traffic reduced by using Mj / redundant traffic generated by random selection
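The two ratios can be written as small helper functions. This is a sketch under one stated assumption: "redundant traffic reduced by using Mj" is read as the redundant traffic under random selection minus the redundant traffic under mechanism Mj. The function names and the sample numbers are made up for illustration.

```python
def successful_downloading_rate(successful_downloads, download_trials):
    """Successful downloads divided by download trials, for one cycle."""
    if download_trials == 0:
        return 0.0
    return successful_downloads / download_trials


def reduced_traffic_ratio(redundant_with_mechanism, redundant_with_random):
    """Share of random selection's redundant traffic that the mechanism avoids.

    Assumption: reduced traffic = redundant(random) - redundant(mechanism).
    """
    reduced = redundant_with_random - redundant_with_mechanism
    return reduced / redundant_with_random


# Made-up numbers: 40 clean downloads out of 50 trials in one cycle,
# and 30 MB of redundant traffic versus 120 MB under random selection.
rate = successful_downloading_rate(40, 50)
ratio = reduced_traffic_ratio(30.0, 120.0)
```

A ratio of 0 means the mechanism saves nothing over random selection, and a negative value would mean it generates more redundant traffic than random selection does.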
Simulation Result
• Compare the performance of different anti-pollution mechanisms under different scenarios
  • EigenTrust
  • Credence
  • Random
Successful Downloading Rate
• Credence is more sensitive to the type of attack
• Under the Decoy-Insertion attack: Credence > EigenTrust
  • Credence identifies a clean version before download
• Under the Hash-Corruption attack: EigenTrust > Credence
  • EigenTrust rates peers, not the hash value
• Results converge after 100 cycles
Observation 1: User awareness
• User awareness is critical to anti-pollution mechanisms
• Reasons
  1. Fewer peers share clean files
  2. Fewer peers operate the reputation system correctly
[Figure panels: Credence, EigenTrust]
Observation 2: User slackness
• The longer a user holds a polluted file, the more chances it has to be downloaded
• User slackness has a negative effect on anti-pollution mechanisms
Discussion
• User behavior has a significant effect on anti-pollution mechanisms
• Credence performs better under Decoy Insertion, while EigenTrust performs better under Hash Corruption
• The type of attack can’t be predicted
• Suggestion: a hybrid anti-pollution mechanism
Hybrid Anti-pollution Mechanism
• A query for songA is sent to the P2P network, which returns a response list
• Step 1: Select a reputable version with the object reputation mechanism
• Step 2: Select a reputable peer with the peer reputation mechanism
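The two selection steps can be sketched as one function. This is an illustration only: it assumes the version and peer reputations have already been computed by the object and peer reputation mechanisms, and it always picks the highest-rated candidate, whereas a real mechanism may choose probabilistically. The name `hybrid_select` and the sample data are hypothetical.

```python
def hybrid_select(responses, version_reputation, peer_reputation):
    """Pick a source from the response list of one query.

    responses: list of (peer, version) pairs.
    Returns the chosen (peer, version).
    """
    # Step 1: select the most reputable version (object reputation).
    versions = sorted({version for _peer, version in responses})
    best_version = max(versions, key=lambda v: version_reputation[v])

    # Step 2: among sources of that version, select the most reputable peer.
    sources = [peer for peer, version in responses if version == best_version]
    best_peer = max(sources, key=lambda p: peer_reputation[p])
    return best_peer, best_version


# Made-up response list and reputations for a query for songA.
responses = [("P2", "v1"), ("P3", "v1"), ("P4", "v2")]
version_reputation = {"v1": 2.0, "v2": -2.0}
peer_reputation = {"P2": 0.1, "P3": 0.6, "P4": 0.9}

choice = hybrid_select(responses, version_reputation, peer_reputation)
```

P4 has the highest peer reputation but only offers the poorly rated version, so the choice falls on P3, the most reputable source of the reputable version.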
Successful Downloading Rate
• Choosing both a reputable version and a reputable source counters different types of attacks
• The hybrid mechanism performs best under both attacks
[Figure panels: Decoy Insertion, Hash Corruption]
Reduced-Traffic Ratio
• The hybrid mechanism generates more control traffic
• Trade-off between pollution traffic and control traffic
• The trade-off is worthwhile
[Figure panels: Decoy Insertion, Hash Corruption]
Conclusion
• Both peer reputation and object reputation systems are necessary
• User behavior has a significant influence on anti-pollution mechanisms