Google, we’ve got a problem Elizeu Santos-Neto ECE/UBC - Predictable Computing Systems Prof. Sathish Golapakrishnan
Spam • Multiple variants • E-mail, web spam, link spam, tag spam, RSS feed spam, blog spam, etc. • Blogs are an easy target and tool • How? • A spam blog (example) • Comment spam (example)
What are the effects? • Search ranking manipulation • Link farms • Keyword spoofing • User frustration: survey (Schroeder et al.) • 25% have seen colleagues kicking their computers • 2% confess to having hit the person next to them
How to tame spam? • Content analysis • nofollow attribute • Spam-proof ranking strategies • “Report Spam” buttons • Hybrid solutions
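To make the nofollow bullet concrete, here is a minimal sketch of how a blog platform might force `rel="nofollow"` onto every link in a user-submitted comment, so that comment links pass no ranking credit to the linked page. It uses only Python's standard `html.parser`; the names `NofollowRewriter` and `add_nofollow` are hypothetical, and this is not a description of how Blogger or any search engine implements the idea. Note that the sketch drops HTML comments and declarations, and it is not a full sanitizer.

```python
from html import escape
from html.parser import HTMLParser


class NofollowRewriter(HTMLParser):
    """Re-emit HTML, forcing rel="nofollow" onto every <a> tag (hypothetical helper)."""

    def __init__(self):
        # Keep character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _render(self, tag, attrs, closing=">"):
        if tag == "a":
            # Drop any rel value the commenter supplied, then add nofollow.
            attrs = [(k, v) for k, v in attrs if k != "rel"]
            attrs.append(("rel", "nofollow"))
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in attrs
        )
        return f"<{tag}{rendered}{closing}"

    def handle_starttag(self, tag, attrs):
        self.out.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> stay self-closing.
        self.out.append(self._render(tag, attrs, closing=" />"))

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def add_nofollow(comment_html: str) -> str:
    """Return comment_html with rel="nofollow" on every anchor."""
    parser = NofollowRewriter()
    parser.feed(comment_html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(add_nofollow('<a href="http://example.com/x">cheap loans</a>'))
```

The attribute only removes the ranking incentive for comment spam; the spam link itself still appears on the page, which is why the slide lists it alongside content analysis and “Report Spam” buttons rather than instead of them.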
Google, we’ve got a problem! • http://googlecustomsearch.blogspot.com/ • “Unusual” posts appeared • Design was completely changed • Several spam links and comments
What happened? • Hypothesis: operators ignored the messages about spam detection. • How does Blogger's spam detection work? (intuition)
[Diagram labels: Spam Detector, Blogs, Blog Owner, “Where is my blog?”]
Conclusions and Final Comments • Even Google is not immune to operator failures • Also, the mechanism seems to make wrong assumptions about the speed of operator feedback • Should spam handling turnaround time be proportional to the volume of visitors? • A predefined trusted set of blogs?
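The last two questions can be read as a policy sketch: give the owner of a flagged blog more time to respond when the blog has more visitors, and never remove a blog automatically if it belongs to a predefined trusted set. The Python below is one possible interpretation, not Blogger's actual mechanism. The names `FlaggedBlog` and `review_deadline_days`, and every numeric default, are assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class FlaggedBlog:
    """A blog the spam detector has flagged (hypothetical record)."""
    url: str
    daily_visitors: int


def review_deadline_days(
    blog: FlaggedBlog,
    trusted: Set[str],
    base_days: int = 20,                 # assumed minimum response window
    visitors_per_extra_day: int = 1000,  # assumed scaling factor
    max_days: int = 90,                  # assumed upper bound
) -> Optional[int]:
    """Days the owner gets to respond before automatic removal.

    Returns None for trusted blogs, meaning "never remove automatically;
    escalate to a human instead".
    """
    if blog.url in trusted:
        return None
    # The window grows linearly with traffic, up to a cap.
    extra_days = blog.daily_visitors // visitors_per_extra_day
    return min(base_days + extra_days, max_days)


if __name__ == "__main__":
    trusted = {"http://googlecustomsearch.blogspot.com/"}
    small = FlaggedBlog("http://example.blogspot.com/", daily_visitors=50)
    print(review_deadline_days(small, trusted))
```

Under this reading, a heavily visited blog gets a longer window because a wrongful removal affects more readers, while the trusted set covers cases where no timeout should apply at all.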
References • Schroeder et al.: Collecting, Analysing, and Exploiting Failure Data from Real, Large Systems. Google Tech Talks, October 2006. • Spam blog: http://raulypennington2006.blogspot.com/2007/09/hard-money-mortgage-california-ca.html • Spam comment: simply a link to a spam web page in the comments • NetworkWorld.com: http://www.networkworld.com/news/2007/080807-google-mistakes-own-blog-for.html • Risks Digest: http://catless.ncl.ac.uk/Risks/24.80.html#subj4