
Three years of spam mutation


Presentation Transcript


  1. Three years of spam mutation • John Graham-Cumming • Independent Consultant • France

  2. Who am I? • Author of POPFile • Created in August 2002 • Machine sorting of email spam, ham, you-name-it • Jolt Productivity Award in 2005 • Founding speaker at MIT Spam Conference • 2003: “The Spammers’ Compendium” • 2004: “How to beat an adaptive spam filter” • 2005: “People and Spam” • Frequent speaker, writer, coder on anti-spam

  3. Why spam? • Spam found me • Started investigating machine classification of email in 2000 • All my email has been sorted automatically for the last 6 years • Spam presents a problem: it changes • Spammers are adversarial • They react (very fast: hours to days) to changes in spam filtering technology

  4. Are spammers stupid? • I don’t think so • And I hope to convince you in this talk • Spammers innovate constantly • The products they sell • How they send the spam (fixed IPs, broken web forms, open proxies, zombie networks, …) • The content of their messages • This talk focuses on the content of spam messages

  5. Once upon a time… • It was fairly easy to filter spam • It came from fixed IP addresses: use a blacklist • The From address was not forged: easy to filter • The spam contained keywords that could be blacklisted • penis, viagra, etc. • Simplistic filtering on From and content is now useless (has been for a few years) • Only complex algorithms can filter on spam content • Hashing, weighted heuristics, ‘Bayes’, …

  6. In 2003 POPFile worked great… • … except for spam • Because spammers were trying to avoid filters by all sorts of trickery • Some tricks you are familiar with… • VIAGRA has become V1@GRA • PENIS has become PE.NIS • ENLARGER is written ENARELGR • Most others are hidden from the reader
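The visible tricks on this slide can often be undone before a filter looks at the words. What follows is a minimal sketch, not POPFile's actual code: the `LOOKALIKES` table and both function names are assumptions made for illustration, and mapping digits back to letters would also mangle legitimate numbers.

```python
import re

# Hypothetical table of look-alike characters mapped back to letters.
LOOKALIKES = str.maketrans({"1": "i", "@": "a", "0": "o", "$": "s", "3": "e"})

def normalize(token):
    """Lower-case a token, undo look-alike substitutions, drop non-letters."""
    token = token.lower().translate(LOOKALIKES)
    return re.sub(r"[^a-z]", "", token)

def is_inner_shuffle(token, word):
    """True if token has the same first and last letter as word and the
    same letters overall (so only the inner letters were rearranged)."""
    token, word = token.lower(), word.lower()
    return (
        len(token) == len(word)
        and token[0] == word[0]
        and token[-1] == word[-1]
        and sorted(token) == sorted(word)
    )

print(normalize("V1@GRA"))                       # viagra
print(normalize("PE.NIS"))                       # penis
print(is_inner_shuffle("ENARELGR", "ENLARGER"))  # True
```

This is also why such tricks can make spam easier to filter: a token that only matches after normalization is itself a strong signal.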

  7. The Spammers’ Compendium • www.jgc.org/tsc • Informal collection of spammer tricks • All about content of spam, not how it’s sent • Been collecting since 2003 • Now a total of 55 tricks • All have been seen in the wild • All have humorous(*) names • (*) OK, I thought they were funny :-)

  8. Fun facts • 80% of spam uses HTML • Colors, images, fonts… • … and tricks • 80% of spam uses at least one content trick • Invisible Ink, Camouflage, … • Spam and spam filters are in an arms race • Irony: Spammer tricks often make spam easier to filter • Who spells Viagra V1@GGR@?

  9. Growth of Known Spammer Tricks

  10. The Spam Zeitgeist (1) • Spammers realize that spam filters spot their tricks, so they are trying… • Short plain text emails with a URL • Anti-spammer response: URL blacklist • Spammer response: use redirector • Anti-spammer response: follow redirector • Spammer response: use Geocities with complex page that reloads using encoded Javascript
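The first two moves in this exchange can be sketched in a few lines. This is an illustrative sketch only: the `BLACKLIST` contents, the `.example` hosts and the function name are hypothetical, and a real filter would take the further step of following the redirect over the network.

```python
import re
from urllib.parse import urlsplit

# Rough URL pattern, good enough for short plain-text messages.
URL = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)

# Hypothetical blacklist of spam-advertised hosts.
BLACKLIST = {"pills.example"}

def blacklisted_hosts(text, blacklist=BLACKLIST):
    """Return the hosts of URLs in text that appear on the blacklist."""
    hosts = {urlsplit(url).hostname for url in URL.findall(text)}
    return {host for host in hosts if host in blacklist}

direct = "Cheap meds: http://pills.example/buy"
redirected = "Cheap meds: http://redirect.example/?to=http://pills.example/buy"

print(blacklisted_hosts(direct))      # {'pills.example'}
print(blacklisted_hosts(redirected))  # set()
```

The second message slips through because the only host the filter sees belongs to the redirector, which is why the next response is to follow the redirect.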

  11. The Spam Zeitgeist (2) • Spammers realize that spam filters read their mail… • Spammer response: send an image instead of text • Anti-spammer response: checksum the images • Spammer response: make random modification of image and number of images • Anti-spammer response: perform OCR on images • Spammer response: add random noise to images
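To see why random modification defeats image checksums, consider this small sketch. The byte string stands in for a real image file, and the choice of SHA-256 is an assumption, since the slides do not name a hash.

```python
import hashlib

def image_checksum(data):
    """Hex digest used as the image's fingerprint."""
    return hashlib.sha256(data).hexdigest()

# Stand-in for the bytes of a spam image already seen by the filter.
original = bytes(range(256)) * 4
known_spam = {image_checksum(original)}

# The spammer flips a single bit somewhere in the image.
tweaked = bytearray(original)
tweaked[100] ^= 0x01
tweaked = bytes(tweaked)

print(image_checksum(original) in known_spam)  # True
print(image_checksum(tweaked) in known_spam)   # False
```

One changed bit yields an unrelated digest, so exact checksums only catch byte-identical copies. That limitation is what pushes filters toward OCR, and spammers toward adding noise.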

  12. Example: Hypertextus Interruptus • A once popular trick that has fallen out of favor • Use HTML's commenting mechanism to break up bad words • HTML comments are written <!-- comment --> and the entire sequence is ignored and not displayed • Easy to break up a word like Viagra: V<!-- banana -->i<!-- wumpus -->a<!-- dinosaur -->g<!-- potato -->r<!-- amtrak -->a
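A filter can undo this trick by deleting comments before tokenizing, which is probably why it fell out of favor. The following is a minimal sketch with an assumed function name; a regular expression is not a full HTML parser and is used here only to keep the example short.

```python
import re

# Non-greedy match so each comment is removed separately.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)

def strip_comments(html):
    """Remove HTML comments, leaving the text a browser would render."""
    return COMMENT.sub("", html)

spam = ("V<!-- banana -->i<!-- wumpus -->a<!-- dinosaur -->"
        "g<!-- potato -->r<!-- amtrak -->a")
print(strip_comments(spam))  # Viagra
```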

  13. Example: Invisible Ink (1) • Hide lots of good words using white font on a white background • Before: Buy Viagra Now! Please see the attached spreadsheet for our current sales forecast

  14. Example: Invisible Ink (2) • Use HTML <font color=xyz> tag to make the good words disappear • After: Buy Viagra Now! Please see the attached spreadsheet for our current sales forecast • <font color=white> … </font>
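One way a filter might counter Invisible Ink is to separate the text a reader can see from text set in the background color. This is a rough sketch under simplifying assumptions: the class name is invented, the background is taken to be white unless stated otherwise, and only `<font color=…>` is handled (no CSS, no color-name/hex equivalence).

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Split text into visible and hidden parts based on <font color>."""

    def __init__(self, background="white"):
        super().__init__()
        self.background = background.lower()
        self.color_stack = []   # one entry per open <font> tag
        self.visible = []
        self.hidden = []

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            color = dict(attrs).get("color")
            self.color_stack.append(color.lower() if color else None)

    def handle_endtag(self, tag):
        if tag == "font" and self.color_stack:
            self.color_stack.pop()

    def handle_data(self, data):
        # The innermost <font> that sets a color decides the text color.
        current = next((c for c in reversed(self.color_stack) if c), None)
        target = self.hidden if current == self.background else self.visible
        target.append(data)

parser = VisibleText()
parser.feed("Buy Viagra Now!<font color=white>Please see the attached "
            "spreadsheet for our current sales forecast</font>")
parser.close()
print("".join(parser.visible))  # Buy Viagra Now!
```

Only the visible part would be scored. The presence of any hidden text is also a useful spam signal on its own.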

  15. Example: Camouflage (1) • Like Invisible Ink but use slightly different colors (almost white on white) • Before: Buy Viagra Now! Please see the attached spreadsheet for our current sales forecast

  16. Example: Camouflage (2) • Like Invisible Ink but use slightly different colors (almost white on white) • Add a colored background: Buy Viagra Now! Please see the attached spreadsheet for our current sales forecast • <body bgcolor=#CC9900> … </body>

  17. Example: Camouflage (3) • Like Invisible Ink but use slightly different colors (almost white on white) • Finally, color the text almost the same: Buy Viagra Now! Please see the attached spreadsheet for our current sales forecast • <font color=#BB8811> … </font>
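Camouflage defeats an exact color comparison, so a filter would need to measure how close two colors are. Here is a sketch using plain Euclidean distance in RGB space; the threshold of 64 and the function names are assumptions for illustration, not values from the talk.

```python
def parse_hex(color):
    """Convert '#RRGGBB' into an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))

def color_distance(a, b):
    """Euclidean distance between two '#RRGGBB' colors."""
    return sum((x - y) ** 2 for x, y in zip(parse_hex(a), parse_hex(b))) ** 0.5

def is_camouflaged(foreground, background, threshold=64.0):
    """Treat text as hidden when it is too close to the background color."""
    return color_distance(foreground, background) < threshold

# The colors from the slides: each channel differs by only 0x11 (17).
print(is_camouflaged("#BB8811", "#CC9900"))  # True
print(is_camouflaged("#000000", "#FFFFFF"))  # False
```

For the slide's colors the distance is about 29 out of a possible 442 or so, well inside any plausible "too similar to read" threshold.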

  18. Example: The Matrix • Prevent spam filter from reading the text in a spam by writing it vertically: Buy Generic Viagra Cheap

  19. Example: The Matrix • Prevent spam filter from reading the text in a spam by writing it vertically:
      B G V C
      u e i h
      y n a e
        e g a
        r r p
        i a
        c
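Recovering the message from a vertical layout amounts to reading the grid column by column. The sketch below assumes the filter has already extracted the rendered rows as strings, one character per column, which is the hard part in practice. The `rows` list and function name are illustrative.

```python
from itertools import zip_longest

def read_columns(rows):
    """Read a character grid top to bottom, one word per column."""
    columns = zip_longest(*rows, fillvalue=" ")
    return ["".join(column).strip() for column in columns]

# The grid from the slide, one character per column.
rows = [
    "BGVC",
    "ueih",
    "ynae",
    " ega",
    " rrp",
    " ia",
    " c",
]
print(read_columns(rows))  # ['Buy', 'Generic', 'Viagra', 'Cheap']
```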

  20. Example: Catch a Wave (1) • Split a sentence into two lines and then put them back together again • Start with: Increase your sexual desire

  21. Example: Catch a Wave (2) • Split a sentence into two lines and then put them back together again • Make two lines: Inc se yo se al des / rea ur xu ire • <tr align=bottom><td></td><td>rea</td> … <td>ire</td></tr>

  22. Example: Catch a Wave (3) • Split a sentence into two lines and then put them back together again • Back together again: Increase your sexual desire • <tr align=bottom><td rowspan=2>Inc</td><td></td> … <td></td></tr>

  23. Example: The Rake (1) • Break up a word with random letters, then move the letters out of the way • Start with: Viagra

  24. Example: The Rake (2) • Break up a word with random letters, then move the letters out of the way • Sprinkle in some random letters: Vxifatgyrka

  25. Example: The Rake (3) • Break up a word with random letters, then move the letters out of the way • Mark each letter to be moved out of the way: Vxifatgyrka • <span style="float:right">x</span>

  26. Example: The Rake (4) • Break up a word with random letters, then move the letters out of the way • The end result: Viagra xftyk
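A filter that wants to see what the reader sees could drop the floated spans before stripping the remaining tags. This is a simplified sketch: the pattern only recognizes a `style` attribute that consists solely of `float:right`, the function name is invented, and real CSS handling would need a proper parser.

```python
import re

# Matches <span style="float:right">…</span>, quoted or unquoted.
FLOATED = re.compile(
    r'<span\s+style\s*=\s*["\']?\s*float\s*:\s*right\s*;?\s*["\']?\s*>.*?</span>',
    re.IGNORECASE | re.DOTALL,
)
TAG = re.compile(r"<[^>]+>")

def word_in_place(html):
    """Return the text left in reading position once floated spans are gone."""
    return TAG.sub("", FLOATED.sub("", html))

# "Vxifatgyrka" with each random letter wrapped in a floated span.
spam = "".join(
    f'<span style="float:right">{ch}</span>' if ch in "xftyk" else ch
    for ch in "Vxifatgyrka"
)
print(word_in_place(spam))  # Viagra
```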

  27. Example: Whiter shade of pale (1) • Concatenate words using random greyed out letters • Start with: Offshore pharmacy online now

  28. Example: Whiter shade of pale (2) • Concatenate words using random greyed out letters • Add random letters between words: OffshoreGpharmacyUonlineInow

  29. Example: Whiter shade of pale (3) • Concatenate words using random greyed out letters • Grey or white out the letters: OffshoreGpharmacyUonlineInow • <font color=lightgrey>G</font>

  30. Image Spam • Spammers currently like image spam because… • Text based filters are getting really good • Hard to read the text in an image • URL blacklists are catching a lot of spam • Recipient is usually asked to type in a URL in the image • Hard to automatically extract that URL for blacklisting • An image is hard for a machine to interpret • Lots of latitude for obscuring the image

  31. Example: Simple Image Spam

  32. Example: OCR Resistant Image Spam

  33. Example: Chop GUI

  34. Example: Chop GUI

  35. Will it end? • No • People buy from spam • Pew Internet Trust 2003 Survey: 7% • My 2004 Survey: 1% • But a 0.001% response rate is break even • End users are seeing less and less spam • Spam has moved from an end-user problem to a sysadmin problem

  36. Resources • www.jgc.org • All my writing on spam (and other topics) • Subscribe to my newsletter • The Spammers’ Compendium • Anti-spam Tool League Table • www.extravalent.com • My commercial web site (polymail) • Need an anti-spam OEM engine? • Now you know who to ask…

  37. Fun Stuff: SpamOrHam.org • Testing the hypothesis that people are better at spam filtering than machines • Bill Yerazunis: 99.84% • My 2004 Survey: 99.46% • Read randomly selected spam or Enron emails and click Spam, Ham or I’m not sure • www.spamorham.org • You could even win an ‘enlarger’!

  38. Thank you • Thank you for listening • Questions?
