Detecting Web Spam with CombinedRank. Abhita Chugh, Ravi Tiruvury.
Motivation
• TrustRank
  • Starts with a seed set of good pages
  • Propagates trust to each page reachable from the trusted seed set
  • Drawback: can assign a high "Trust" score to a spam page!
• Anti-Trust Rank
  • Starts with a seed set of spam pages
  • Propagates distrust to each page reachable from the spam seed set in the inverse webgraph
  • Benefit: assigns high "Anti-Trust" scores to spam pages
[Figure: example webgraph with nodes 0 to 7, each labeled with its TrustRank score (x.yz); legend: good pages, bad pages]
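The two propagation schemes above can be sketched as one seed-biased PageRank routine run on two graphs: the webgraph for TrustRank and the inverse webgraph for Anti-Trust Rank. This is a minimal illustration, not the authors' implementation. The damping factor of 0.85, the fixed iteration count, the handling of pages without outlinks, and the four-page toy graph are all assumptions that do not come from the slides.

```python
def biased_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Seed-biased PageRank. `graph` maps each page to the pages it links to.

    Assumptions (not from the slides): damping=0.85, a fixed number of
    iterations, and score mass at pages without outlinks is simply dropped.
    """
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    seed_set = set(seeds)
    # Teleportation goes only to the seed pages, split uniformly among them.
    teleport = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    score = dict(teleport)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            out = graph.get(n, [])
            if not out:
                continue
            share = damping * score[n] / len(out)
            for t in out:
                nxt[t] += share
        score = nxt
    return score


def invert(graph):
    """Return the inverse webgraph: every link u -> v becomes v -> u."""
    inverse = {}
    for src, targets in graph.items():
        inverse.setdefault(src, [])
        for t in targets:
            inverse.setdefault(t, []).append(src)
    return inverse


def trust_rank(graph, good_seeds, **kwargs):
    # Trust flows forward along links, starting from the good seed pages.
    return biased_pagerank(graph, good_seeds, **kwargs)


def anti_trust_rank(graph, spam_seeds, **kwargs):
    # Distrust flows backward along links, starting from the spam seed pages.
    return biased_pagerank(invert(graph), spam_seeds, **kwargs)


# Hypothetical toy webgraph: "good" links to "a", and both "a" and "b" link to "spam".
toy_graph = {"good": ["a"], "a": ["spam"], "b": ["spam"], "spam": []}
trust = trust_rank(toy_graph, good_seeds=["good"])
anti_trust = anti_trust_rank(toy_graph, spam_seeds=["spam"])
```

In this toy graph the spam page still receives a nonzero Trust score because it is reachable from the good seed, which is the drawback named above. Its Anti-Trust score is the highest in the graph, and the pages that link toward it pick up distrust as well.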
CombinedRank
• Combines TrustRank and Anti-Trust Rank
• Each node in the webgraph has a Trust score and an Anti-Trust score
• High Trust Score & High Anti-Trust Score => Potentially Spam!
• CombinedRank = α * (TrustRank) - β * (Anti-Trust Rank)
• Best results with α = 1.0 and β = 0.8
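As a rough sketch of the formula above, the function below combines per-page scores using the reported α = 1.0 and β = 0.8 as defaults. The function name, the dictionary representation, and the example scores are hypothetical. The slides do not say whether the two scores are normalized before they are combined, so none is applied here.

```python
def combined_rank(trust, anti_trust, alpha=1.0, beta=0.8):
    """CombinedRank = alpha * TrustRank - beta * Anti-Trust Rank, per page.

    `trust` and `anti_trust` map page -> score. A page missing from either
    mapping is treated as having score 0.0 there (an assumption).
    """
    pages = set(trust) | set(anti_trust)
    return {
        p: alpha * trust.get(p, 0.0) - beta * anti_trust.get(p, 0.0)
        for p in pages
    }


# Hypothetical scores: "shop" has high Trust AND high Anti-Trust.
example_trust = {"news": 0.30, "shop": 0.25, "blog": 0.10}
example_anti_trust = {"news": 0.02, "shop": 0.40, "blog": 0.05}

combined = combined_rank(example_trust, example_anti_trust)
# Pages sorted from most to least suspicious (lowest CombinedRank first).
most_suspicious_first = sorted(combined, key=combined.get)
```

With these made-up numbers, the page with both a high Trust score and a high Anti-Trust score ends up with the lowest CombinedRank, matching the "Potentially Spam" case in the list above.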