SPAM DETECTION IN P2P SYSTEMS
Team Matrix: Abhishek Ghag, Darshan Kapadia, Pratik Singh

AGENDA
• Refresher
• Software Design
• Progress
• Demo

REFRESHER
• Basics of P2P
• Overview of Paper 1
• Overview of Paper 2
• Overview of Paper 3
• Proposal
• References

Working of Napster
• A centralized server and a pool of clients.
• Clients register themselves with the server.
• The server obtains each client's IP address and list of files.
• When a client queries, the server returns the IP addresses of peers.
• The client downloads directly from a peer.
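
The registration and lookup steps above can be sketched as a small in-memory index. This is only an illustration under stated assumptions: the class name `NapsterIndex`, its methods, and the sample IP addresses are hypothetical, exact file-name matching stands in for whatever matching a real server uses, and no networking is modeled.

```python
class NapsterIndex:
    """Hypothetical central index (an assumption, not Napster's actual code)."""

    def __init__(self):
        # Maps each registered peer's IP address to the set of files it shares.
        self.files_by_peer = {}

    def register(self, ip, files):
        # A client registers; the server records its IP address and file list.
        self.files_by_peer[ip] = set(files)

    def query(self, filename):
        # The server answers with the IP addresses of peers holding the file.
        # The download itself would then happen directly between peers.
        return sorted(ip for ip, files in self.files_by_peer.items()
                      if filename in files)


index = NapsterIndex()
index.register("10.0.0.1", ["song.mp3", "talk.mp3"])
index.register("10.0.0.2", ["song.mp3"])
peers = index.query("song.mp3")
print(peers)
```

The server only stores metadata here; file contents never pass through it, which matches the direct peer-to-peer download in the last bullet.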

Progress
• Studied the structure and working of Napster systems in detail.
• Trying to build our own system based on the study.
• Studied the algorithm for spam detection in detail.

Query Processing
1. The client issues a query.
2. The server compares the query with its own files.
3. On a match, the server returns the system identifier and descriptor.
4. The client groups the individual results by key.
5. The client ranks the groups according to some ranking function.
6. The client downloads the file and becomes a server for it.
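
Steps 4 and 5 can be sketched as follows. The result layout (host, key, descriptor), the function names, and the choice of group size as the ranking function are all assumptions made for illustration; the slides only say "some ranking function".

```python
from collections import defaultdict


def group_by_key(results):
    """Step 4: collect individual results that share the same file key."""
    groups = defaultdict(list)
    for host, key, descriptor in results:
        groups[key].append((host, descriptor))
    return dict(groups)


def rank_by_group_size(groups):
    """Step 5, with an assumed ranking function: larger groups come first.

    Ties are broken by key so the order is deterministic.
    """
    return sorted(groups, key=lambda k: (-len(groups[k]), k))


# Hypothetical results returned by three servers for one query.
results = [
    ("10.0.0.1", "k1", "live concert song"),
    ("10.0.0.2", "k1", "concert song"),
    ("10.0.0.3", "k2", "song remix"),
]
groups = group_by_key(results)
ranking = rank_by_group_size(groups)
print(ranking)
```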

Algorithm for Spam Detection
For Type 2 and 3 Spam
5a. Groups are ranked by cosine similarity (or some other query-dependent ranking function).
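
A minimal sketch of the cosine similarity used in step 5a, assuming raw term-frequency vectors over whitespace-separated, lowercased terms. The slides do not specify the term weighting, so that choice and the function name are assumptions.

```python
import math
from collections import Counter


def cosine_similarity(query, descriptor):
    """Cosine of the angle between two term-frequency vectors."""
    q = Counter(query.lower().split())
    d = Counter(descriptor.lower().split())
    # Dot product over the query's terms; Counter returns 0 for missing terms.
    dot = sum(q[t] * d[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    # An empty query or descriptor has no direction; treat it as no match.
    return dot / norm if norm else 0.0


score = cosine_similarity("live song", "song")
print(round(score, 4))
```

A descriptor that shares no terms with the query scores 0.0, which places it at the bottom of a query-dependent ranking.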
For Type 1 and 4 Spam
5b. Identify the top-M results as candidate results.
5c. Re-rank the top-M results by either NumUniqueTerms or Jaccard/Cosine distance. Results lower in the order are more likely to be Type 1 spam than those higher up.
5d. Identify the top-N results, where N < M, as the new candidate results.
5e. Re-rank the top-N results by their per-host file replication degree. Results lower in the order are more likely to be Type 4 spam than those higher up.
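
Steps 5b to 5e can be sketched as a two-stage re-ranking. Several details are assumptions, not taken from the slides: the group layout (a dict with `score`, `descriptors`, and `hosts`, one host entry per replica), the ascending sort direction in both stages (so groups with more unique terms or higher replication per host end up lower), and the definition of per-host replication degree as replicas divided by distinct hosts.

```python
def num_unique_terms(group):
    """Number of distinct terms across all descriptors in a group."""
    return len({t for d in group["descriptors"] for t in d.lower().split()})


def per_host_replication(group):
    """Assumed definition: replicas per distinct host (hosts must be non-empty)."""
    hosts = group["hosts"]
    return len(hosts) / len(set(hosts))


def rerank(groups, m, n):
    if not n < m:
        raise ValueError("N must be smaller than M")
    # 5b: take the top-M groups by the initial query-dependent score.
    top_m = sorted(groups, key=lambda g: -g["score"])[:m]
    # 5c: re-rank by NumUniqueTerms (ascending is an assumption).
    by_terms = sorted(top_m, key=num_unique_terms)
    # 5d: keep the top-N of the re-ranked list.
    top_n = by_terms[:n]
    # 5e: re-rank by per-host replication degree (ascending is an assumption).
    return sorted(top_n, key=per_host_replication)


# Hypothetical groups for illustration only.
groups = [
    {"key": "a", "score": 0.9, "descriptors": ["rock song", "rock song"],
     "hosts": ["h1", "h2"]},
    {"key": "b", "score": 0.8, "descriptors": ["rock song", "free movie crack"],
     "hosts": ["h3", "h4"]},
    {"key": "c", "score": 0.7, "descriptors": ["rock song live"],
     "hosts": ["h5", "h5", "h5"]},
    {"key": "d", "score": 0.1, "descriptors": ["rock"], "hosts": ["h6"]},
]
final_keys = [g["key"] for g in rerank(groups, m=3, n=2)]
print(final_keys)
```

In this example, group "d" is dropped at the top-M cut, group "b" falls out at the top-N cut because its replicas carry unrelated descriptors, and group "c" ranks last because all of its replicas sit on a single host.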