Three years of spam mutation John Graham-Cumming Independent Consultant France
Who am I? • Author of POPFile • Created in August 2002 • Machine sorting of email: spam, ham, you-name-it • Jolt Productivity Award in 2005 • Founding speaker at MIT Spam Conference • 2003: “The Spammers’ Compendium” • 2004: “How to beat an adaptive spam filter” • 2005: “People and Spam” • Frequent speaker, writer, coder on anti-spam
Why spam? • Spam found me • Started investigating machine classification of email in 2000 • All my email has been sorted automatically for the last 6 years • Spam presents a problem: it changes • Spammers are adversarial • They react (very fast: hours to days) to changes in spam filtering technology
Are spammers stupid? • I don’t think so • And I hope to convince you in this talk • Spammers innovate constantly • The products they sell • How they send the spam (fixed IPs, broken web forms, open proxies, zombie networks, …) • The content of their messages • This talk focuses on the content of spam messages
Once upon a time… • It was fairly easy to filter spam • It came from fixed IP addresses: use a blacklist • The From address was not forged: easy to filter • The spam contained keywords that could be blacklisted • penis, viagra, etc. • Simplistic filtering on From and content is now useless (has been for a few years) • Only complex algorithms can filter on spam content • Hashing, weighted heuristics, ‘Bayes’, …
In 2003 POPFile worked great… • … except for spam • Because spammers were trying to avoid filters by all sorts of trickery • Some tricks you are familiar with… • VIAGRA has become V1@GRA • PENIS has become PE.NIS • ENLARGER is written ENARELGR • Most others are hidden from the reader
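A minimal sketch of why simple keyword blacklisting stopped working, and how a filter might undo the most basic substitutions. The blacklist words come from the slides; the substitution table and function names are assumptions made for illustration, not POPFile's actual method.

```python
BLACKLIST = {"viagra", "penis"}  # keywords mentioned on the slides

def naive_keyword_filter(text: str) -> bool:
    """Return True if any whitespace-separated token is a blacklisted word."""
    return any(token.strip(".,!?").lower() in BLACKLIST for token in text.split())

# Hypothetical substitution table: maps look-alike characters back to letters.
SUBSTITUTIONS = str.maketrans({"1": "i", "@": "a", "0": "o", "3": "e", "$": "s"})

def normalize(token: str) -> str:
    """Undo simple look-alike substitutions and drop non-letters inside a token."""
    token = token.lower().translate(SUBSTITUTIONS)
    return "".join(ch for ch in token if ch.isalpha())

def normalizing_filter(text: str) -> bool:
    """Return True if any token matches the blacklist after normalization."""
    return any(normalize(token) in BLACKLIST for token in text.split())

print(naive_keyword_filter("Buy V1@GRA now"))   # the obfuscated word slips past
print(normalizing_filter("Buy V1@GRA now"))     # caught after normalization
print(normalizing_filter("Enlarge your PE.NIS"))
```

Normalization of this kind does nothing against scrambled spellings such as ENARELGR, which is one reason content filtering moved to statistical methods.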
The Spammers’ Compendium • www.jgc.org/tsc • Informal collection of spammer tricks • All about content of spam, not how it’s sent • Been collecting since 2003 • Now a total of 55 tricks • All have been seen in the wild • All have humorous(*) names • (*) OK, I thought they were funny :-)
Fun facts • 80% of spam uses HTML • Colors, images, fonts… • … and tricks • 80% of spam uses at least one content trick • Invisible Ink, Camouflage, … • Spam and spam filters are in an arms race • Irony: Spammer tricks often make spam easier to filter • Who spells Viagra V1@GGR@?
The Spam Zeitgeist (1) • Spammers realize that spam filters spot their tricks, so they are trying… • Short plain text emails with a URL • Anti-spammer response: URL blacklist • Spammer response: use a redirector • Anti-spammer response: follow the redirector • Spammer response: use Geocities with a complex page that reloads using encoded JavaScript
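The first two moves of this exchange can be sketched as follows. This is an illustrative sketch only: the host names use the reserved `.example` domain and are hypothetical, as are the function and variable names.

```python
import re
from urllib.parse import urlparse

URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")
BLACKLISTED_HOSTS = {"pills.example"}  # hypothetical blacklist entry

def blacklisted_urls(text: str) -> list:
    """Return the URLs in text whose host name is on the blacklist."""
    hits = []
    for url in URL_PATTERN.findall(text):
        host = (urlparse(url).hostname or "").lower()
        if host in BLACKLISTED_HOSTS:
            hits.append(url)
    return hits

# Direct link: the blacklist catches it.
print(blacklisted_urls("Visit http://pills.example/buy now"))
# Same destination behind a (hypothetical) redirector: the host no longer matches.
print(blacklisted_urls("Visit http://redirector.example/?to=abc now"))
```

Catching the second message requires actually following the redirect, which is the next anti-spammer response on the slide.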
The Spam Zeitgeist (2) • Spammers realize that spam filters read their mail… • Spammer response: send an image instead of text • Anti-spammer response: checksum the images • Spammer response: make random modification of image and number of images • Anti-spammer response: perform OCR on images • Spammer response: add random noise to images
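A small sketch of why checksumming images is so easy to defeat. The slides do not say which checksum was used; SHA-256 is assumed here, and the byte string is a stand-in for real image data.

```python
import hashlib

def checksum(data: bytes) -> str:
    """Return a hex digest identifying this exact byte sequence."""
    return hashlib.sha256(data).hexdigest()

# Stand-in for the bytes of a spam image (hypothetical data).
original = bytes(range(256)) * 4

# The spammer flips a single bit somewhere in the image.
modified = bytearray(original)
modified[100] ^= 0x01
modified = bytes(modified)

known_spam_images = {checksum(original)}

print(checksum(original) in known_spam_images)  # the unmodified image is recognized
print(checksum(modified) in known_spam_images)  # the altered image is not
```

A one-bit change is invisible to the recipient but produces an unrelated digest, so every randomized copy looks new to the filter.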
Example: Hypertextus Interruptus • A once popular trick that has fallen out of favor • Use HTML's commenting mechanism to break up bad words • HTML comments are written <!-- comment --> and the entire sequence is ignored and not displayed. • Easy to break up a word like Viagra: V<!-- banana -->i<!-- wumpus -->a<!-- dinosaur -->g<!-- potato -->r<!-- amtrak -->a
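Undoing this trick is straightforward, which may be part of why it fell out of favor. A minimal sketch, assuming a regular expression is good enough for well-formed comments (a real filter would more likely use an HTML parser):

```python
import re

# Matches an HTML comment, non-greedily, including across line breaks.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)

def strip_comments(html: str) -> str:
    """Remove HTML comments so the text reads as a browser would display it."""
    return COMMENT.sub("", html)

sample = ("V<!-- banana -->i<!-- wumpus -->a<!-- dinosaur -->"
          "g<!-- potato -->r<!-- amtrak -->a")
print(strip_comments(sample))
```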
Example: Invisible Ink (1) • Hide lots of good words using white font on a white background • Before: Buy Viagra Now! Please see the attached spread sheet for our current sales forecast
Example: Invisible Ink (2) • Use HTML <font color=xyz> tag to make the good words disappear • After: Buy Viagra Now! Please see the attached spread sheet for our current sales forecast • <font color=white> … </font>
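One way a filter might separate what the reader sees from what is hidden, sketched with Python's standard `html.parser`. The class name, the assumption of a white page background, and the exact-match color comparison are all illustrative choices, not a description of any particular filter.

```python
from html.parser import HTMLParser

class VisibleTextExtractor(HTMLParser):
    """Split text into visible and hidden parts based on <font color=...>."""

    def __init__(self, background: str = "white"):
        super().__init__()
        self.background = background.lower()  # assumed page background
        self.color_stack = []                 # colors of the open <font> tags
        self.visible = []
        self.hidden = []

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            color = dict(attrs).get("color")
            self.color_stack.append(color.lower() if color else None)

    def handle_endtag(self, tag):
        if tag == "font" and self.color_stack:
            self.color_stack.pop()

    def handle_data(self, data):
        # The innermost <font> that sets a color decides the text color.
        current = next((c for c in reversed(self.color_stack) if c), None)
        target = self.hidden if current == self.background else self.visible
        target.append(data)

html = ("Buy Viagra Now!<font color=white>Please see the attached "
        "spread sheet for our current sales forecast</font>")
extractor = VisibleTextExtractor()
extractor.feed(html)
extractor.close()
print("visible:", "".join(extractor.visible))
print("hidden: ", "".join(extractor.hidden))
```

Only the visible text would then be passed to the classifier, and the mere presence of hidden text can itself be treated as a sign of spam.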
Example: Camouflage (1) • Like Invisible Ink but use slightly different colors (almost white on white) • Before: Buy Viagra Now! Please see the attached spread sheet for our current sales forecast
Example: Camouflage (2) • Like Invisible Ink but use slightly different colors (almost white on white) • Add a colored background: Buy Viagra Now! Please see the attached spread sheet for our current sales forecast • <body bgcolor=#CC9900> … </body>
Example: Camouflage (3) • Like Invisible Ink but use slightly different colors (almost white on white) • Finally, color the text almost the same: Buy Viagra Now! Please see the attached spread sheet for our current sales forecast • <font color=#BB8811> … </font>
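Camouflage defeats an exact color comparison, so a filter has to ask whether two colors are merely close. A rough sketch using Euclidean distance in RGB space; the threshold of 40 and the tiny named-color table are assumptions chosen for illustration, and the two hex colors are the ones from the slides.

```python
NAMED_COLORS = {"white": "#FFFFFF", "black": "#000000"}  # tiny illustrative subset

def to_rgb(color: str) -> tuple:
    """Convert '#RRGGBB' (or a known color name) to an (r, g, b) tuple."""
    color = NAMED_COLORS.get(color.lower(), color)
    digits = color.lstrip("#")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))

def color_distance(a: str, b: str) -> float:
    """Euclidean distance between two colors in RGB space."""
    return sum((x - y) ** 2 for x, y in zip(to_rgb(a), to_rgb(b))) ** 0.5

def is_camouflaged(text_color: str, background: str, threshold: float = 40.0) -> bool:
    """Treat text as hidden when its color is within threshold of the background."""
    return color_distance(text_color, background) < threshold

print(is_camouflaged("#BB8811", "#CC9900"))  # colors from the slides
print(is_camouflaged("#000000", "#FFFFFF"))  # ordinary black on white
```

RGB distance is only a crude stand-in for how different two colors look to a person; a production filter would likely want a perceptual measure.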
Example: The Matrix • Prevent spam filter from reading the text in a spam by writing it vertically • Buy Generic Viagra Cheap
Example: The Matrix • Prevent spam filter from reading the text in a spam by writing it vertically:
B G V C
u e i h
y n a e
  e g a
  r r p
  i a
  c
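If a filter can recover the grid of characters, reading it the way a person does is just a transpose. A minimal sketch, assuming the rows have already been extracted from the HTML (that extraction is the hard part and is not shown):

```python
from itertools import zip_longest

# Rows as they would appear on screen, one character per column (assumed layout).
rows = ["BGVC", "ueih", "ynae", " ega", " rrp", " ia ", " c  "]

def read_columns(rows: list) -> list:
    """Read a grid of characters top to bottom, column by column."""
    columns = zip_longest(*rows, fillvalue=" ")
    return ["".join(column).strip() for column in columns]

print(read_columns(rows))
```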
Example: Catch a Wave (1) • Split a sentence into two lines and then put them back together again • Start with: Increase your sexual desire
Example: Catch a Wave (2) • Split a sentence into two lines and then put them back together again • Make two lines: “Inc se yo se al des” on top and “rea ur xu ire” underneath • <tr align=bottom><td></td><td>rea</td> … <td>ire</td></tr>
Example: Catch a Wave (3) • Split a sentence into two lines and then put them back together again • Back together again: Increase your sexual desire • <tr align=bottom><td rowspan=2>Inc</td><td></td> … <td></td></tr>
Example: The Rake (1) • Break up a word with random letters, then move the letters out of the way • Start with: Viagra
Example: The Rake (2) • Break up a word with random letters, then move the letters out of the way • Sprinkle in some random letters: Vxifatgyrka
Example: The Rake (3) • Break up a word with random letters, then move the letters out of the way • Mark each letter to be moved out of the way: Vxifatgyrka • <span style="float:right">x</span>
Example: The Rake (4) • Break up a word with random letters, then move the letters out of the way • The end result: Viagra xftyk
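A sketch of how a filter might undo The Rake by pulling the floated letters out of the text. It assumes the exact markup shown on the slide, `<span style="float:right">`; real spam varies the markup, so a regular expression like this is illustrative rather than robust.

```python
import re

# Matches a span floated to the right and captures its contents.
FLOATED = re.compile(r'<span\s+style="float:\s*right">(.*?)</span>',
                     re.IGNORECASE | re.DOTALL)

def split_rake(html: str) -> tuple:
    """Return (text left in place, letters that were floated away)."""
    moved = "".join(FLOATED.findall(html))
    in_place = FLOATED.sub("", html)
    return in_place, moved

def floated(letter: str) -> str:
    """Wrap a letter in the float markup from the slide."""
    return f'<span style="float:right">{letter}</span>'

# Rebuild the slide's example: Vxifatgyrka with x, f, t, y, k floated away.
sample = ("V" + floated("x") + "i" + floated("f") + "a" + floated("t")
          + "g" + floated("y") + "r" + floated("k") + "a")
print(split_rake(sample))
```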
Example: Whiter shade of pale (1) • Concatenate words using random greyed out letters • Start with: Offshore pharmacy online now
Example: Whiter shade of pale (2) • Concatenate words using random greyed out letters • Add random letters between words: OffshoreGpharmacyUonlineInow
Example: Whiter shade of pale (3) • Concatenate words using random greyed out letters • Grey or white out the letters: OffshoreGpharmacyUonlineInow • <font color=lightgrey>G</font>
Image Spam • Spammers currently like image spam because… • Text based filters are getting really good • Hard to read the text in an image • URL blacklists are catching a lot of spam • Recipient is usually asked to type in a URL in the image • Hard to automatically extract that URL for blacklisting • An image is hard for a machine to interpret • Lots of latitude for obscuring the image
Will it end? • No • People buy from spam • Pew Internet Trust 2003 Survey: 7% • My 2004 Survey: 1% • But a 0.001% response rate is enough to break even • End users are seeing less and less spam • Spam has moved from an end-user problem to a sysadmin problem
Resources • www.jgc.org • All my writing on spam (and other topics) • Subscribe to my newsletter • The Spammers’ Compendium • Anti-spam Tool League Table • www.extravalent.com • My commercial web site (polymail) • Need an anti-spam OEM engine? • Now you know who to ask…
Fun Stuff: SpamOrHam.org • Testing the hypothesis that people are better at spam filtering than machines • Bill Yerazunis: 99.84% • My 2004 Survey: 99.46% • Read randomly selected spam or Enron emails and click Spam, Ham or I’m not sure • www.spamorham.org • You could even win an ‘enlarger’!
Thank you • Thank you for listening • Questions?