
Can Spam

A presentation by Bodek Frak & Craig Brown. OUCC 2004.


Presentation Transcript


  1. A presentation by… Bodek Frak & Craig Brown Can Spam - OUCC 2004 -

  2. What is Spam? • Unsolicited: • You did not ask for it • Commercial: • Trying to sell you something (legal/illegal) • Bulk: • Sent in large quantities • UCE: Unsolicited commercial email • UBE: Unsolicited bulk email

  3. What is NOT Spam? • E-mails from mailing lists that a person subscribed to and does not know how to unsubscribe from. • E-mails from on-line shopping companies where the person has shopped. • E-mails generated by viruses mailing themselves as attachments. • “Chain letters” sent to you by your friends or people you barely know….

  4. False Positive / Negative • False Positives: legitimate e-mails that get mistakenly identified as spam. • False Negatives: spam e-mails that slip undetected past the filters. • “For most users, missing legitimate e-mail is an order of magnitude worse than receiving spam, so a filter that yields false positives is like an acne cure that carries a risk of death to the patient.”
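
The two error types above can be measured on a hand-labelled sample of mail. The following is a minimal sketch, not something from the presentation: the function name and the input format (pairs of "is really spam" and "was flagged as spam") are assumptions made for illustration.

```python
def filter_error_rates(results):
    """Compute false-positive and false-negative rates for a spam filter.

    results: iterable of (is_spam, flagged_as_spam) boolean pairs
             (a hypothetical format chosen for this sketch).
    """
    results = list(results)
    ham_total = sum(1 for is_spam, _ in results if not is_spam)
    spam_total = sum(1 for is_spam, _ in results if is_spam)
    # False positive: legitimate mail that the filter flagged as spam.
    false_positives = sum(1 for is_spam, flagged in results if not is_spam and flagged)
    # False negative: spam that slipped past the filter.
    false_negatives = sum(1 for is_spam, flagged in results if is_spam and not flagged)
    return {
        "false_positive_rate": false_positives / ham_total if ham_total else 0.0,
        "false_negative_rate": false_negatives / spam_total if spam_total else 0.0,
    }


# Made-up sample: 2 spam messages (1 missed), 4 legitimate messages (1 wrongly flagged).
sample = [
    (True, True), (True, False),
    (False, False), (False, True), (False, False), (False, False),
]
rates = filter_error_rates(sample)
```

Because the slide weighs a false positive roughly an order of magnitude worse than a false negative, the two rates are reported separately rather than merged into one accuracy figure.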

  5. Why? • Not regulated as much as telemarketing or paper mail. • Anonymous nature of e-mail (SMTP limitations) and the Internet. • Cheap: The sheer volume of spam sent means that even tiny response rates, reportedly 0.0001%, let junk mailers turn a profit. • Number of people using e-mail is rising.
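
The "cheap" argument is simple arithmetic. The sketch below uses the 0.0001% response rate quoted on the slide; the message volume, cost per message, and revenue per response are hypothetical numbers picked only to show the shape of the calculation.

```python
# 0.0001% expressed as a fraction (the only figure taken from the slide).
RESPONSE_RATE = 0.0001 / 100


def expected_responses(messages_sent, response_rate=RESPONSE_RATE):
    """Expected number of recipients who respond."""
    return messages_sent * response_rate


def campaign_profit(messages_sent, cost_per_message, revenue_per_response,
                    response_rate=RESPONSE_RATE):
    """Revenue from responders minus the cost of sending."""
    revenue = expected_responses(messages_sent, response_rate) * revenue_per_response
    return revenue - messages_sent * cost_per_message


# Hypothetical campaign: 10 million messages at $0.00001 each, $20 per response.
responses = expected_responses(10_000_000)           # about 10 responders
profit = campaign_profit(10_000_000, 0.00001, 20.0)  # about $100 with these made-up inputs
```

With a sending cost this low, one response per million messages is enough to stay in the black, which is why the slide later lists making the business model unprofitable as a goal of filtering.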

  6. How? • Bots searching web sites and newsgroup postings for anything with “@” sign. • Directory harvest attacks. • Some dishonest companies sell their customer data. • “Unsubscribe link” • Guessing: peter@yahoo.com, peter@hotmail.com, peters@uwindsor.ca, etc. • Viruses and worms stealing addresses from PC’s address book.

  7. Implications • Storage and bandwidth costs • Administrative overhead • E-mail delivery delays • E-mail server performance decreased • Lost employee productivity • Frustration (keyboards/monitors) 

  8. Frustration

  9. The Battle • Legal initiatives (new laws, lawsuits) Problem: most spam comes from outside N. Am • More effective and widespread spam filters (spam business model no longer profitable) Problem: high cost, complexity • Next generation (“not-so-simple”) SMTP Problem: will take time before widely adopted • Better user awareness

  10. Approaches • Outsource: Redirect all your incoming mail through a company that filters out the spam and delivers good mail to your users • In-house: Hardware/software spam control solution deployed at the network level (server-side) or client level • Plan B: Switch to paper and pen …. 

  11. Two Extremes • Stop spam before it reaches the user • All mail is suspect and scrutinized by the gateway/server for point of origin and content; spam is discarded. • Pros: • - Avoids wasting user’s time • - Saves internal network bandwidth and server storage, processing power • Cons: • - False positives never detected (high impact) • - User not in control

  12. Two Extremes • Let everything in • Point of origin and content not scrutinized but users are given tools to deal with unwanted mail • Pros: • - User makes his own spam decisions • - Minimal impact of false positives • Cons: • - Lost bandwidth, storage, processing power • - User learning curve

  13. Spam Detection • SMTP Level Rules: - DNS Lookup (connecting host, return address) - RBL: DNS real-time blackhole lists (controversial) • Content Checking Rules (score): • - SMTP headers • - Content (subject line & body) • Whitelists/Blacklists • Challenge-Response
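
As an illustration of the RBL check listed above: DNS-based blackhole lists are conventionally queried by reversing the octets of the connecting host's IPv4 address, appending the list's zone, and looking the name up. An answer means the address is listed; a failed lookup means it is not. This is a minimal sketch of that convention. The zone `dnsbl.example.org` is a placeholder, not a real list, and the `resolve` parameter is an assumption added so the logic can be exercised without network access.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to query: reversed IPv4 octets followed by the list zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the blackhole list answers for this address."""
    try:
        resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No such name: the address is not on the list.
        return False
    return True


query = dnsbl_query_name("192.0.2.99", "dnsbl.example.org")
```

A gateway would run a check like this at SMTP connection time, before the message body is accepted, which is what makes it an SMTP-level rule rather than a content rule.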

  14. Bayesian Filtering (Probability) • Pros: • Widely acknowledged to be the best way to catch spam • Learns all the time, and takes into account your valid e-mails (known as “ham”) • Takes a statistical approach and does not rely solely on static updates from the vendor • Cons: • My spam may not be the same as your spam • Takes time to build the spam/ham database • Adds complexity for the user
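
To make the "learns all the time" point concrete, here is a minimal naive Bayes sketch. It is a textbook word-count classifier with add-one smoothing, not the algorithm of any product named in this presentation; the class name, whitespace tokenizer, and training messages are all assumptions for illustration.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Tiny word-count naive Bayes classifier (illustrative only)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Naive tokenizer: lowercase and split on whitespace.
        return text.lower().split()

    def train(self, text, label):
        """Add one user-classified message; label is 'spam' or 'ham'."""
        self.message_counts[label] += 1
        self.word_counts[label].update(self.tokenize(text))

    def spam_probability(self, text):
        """Estimate P(spam | words) from the messages seen so far."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Smoothed prior, so an untrained filter returns 0.5.
            log_p = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                count = self.word_counts[label][word]
                # Add-one smoothing keeps unseen words from zeroing the product.
                log_p += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = log_p
        # Normalise in log space to avoid underflow on long messages.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


bayes = NaiveBayesSpamFilter()
bayes.train("cheap pills buy now", "spam")
bayes.train("buy cheap watches now", "spam")
bayes.train("meeting agenda for monday", "ham")
bayes.train("lecture notes for monday", "ham")
```

The sketch also shows the cons from the slide: the filter is useless until it has seen enough of both classes, and because it is trained on one person's mail, its word statistics may not transfer to another user.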

  15. Two Strategies • All external e-mails are legitimate except those from blacklisted (blocked) addresses and/or domains and those that do not pass the rules (more common) • All external e-mails are spam except those from whitelisted (protected) addresses and/or domains (more effective)
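
The difference between the two strategies is only the default decision. A minimal sketch, assuming list entries are stored in lowercase and can be either full addresses or bare domains (the function names and sample entries are hypothetical):

```python
def sender_matches(sender, entries):
    """True if the sender's full address or its domain appears in entries."""
    sender = sender.lower()
    domain = sender.rpartition("@")[2]
    return sender in entries or domain in entries


def accept_default_allow(sender, blacklist, passes_rules=True):
    """Strategy 1: accept unless the sender is blacklisted or the rules fail."""
    return not sender_matches(sender, blacklist) and passes_rules


def accept_default_deny(sender, whitelist):
    """Strategy 2: reject unless the sender is whitelisted."""
    return sender_matches(sender, whitelist)


blacklist = {"spammer.example", "bulk@ads.example"}
whitelist = {"uwindsor.ca", "friend@mail.example"}
```

Under the second strategy a new, legitimate correspondent is rejected until someone adds them to the whitelist, which is the price paid for its higher effectiveness.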

  16. Best Practices • Never make a purchase from an unsolicited e-mail • If you do not know the sender of an unsolicited e-mail message, delete it. • Never respond to any spam messages or click on any links in the message. • Avoid using the preview functionality of your e-mail client software.

  17. Best Practices • When sending e-mail messages to a large number of recipients, use the blind copy (BCC) field to conceal their e-mail addresses • Never provide your e-mail address on websites, newsgroup lists or other online forums. • Never give your primary e-mail address to anyone or any site you don’t trust. • Have and use one or two secondary e-mail addresses.

  18. Resources • Sophos “Field Guide To Spam” http://www.sophos.com/spaminfo/explained/fieldguide.html • SpamNews http://www.petemoss.com • Inventor of Bayesian Filtering http://www.paulgraham.com • Bayesian Spam Filtering http://email.about.com/cs/bayesianfilters

  19. The University of Windsor Problem • Currently receiving over 200,000 e-mail messages per day • In-house filters deleting up to 100,000 e-mails per day • Many more spam messages getting through • Our end users are frustrated!

  20. Spam Solutions • We looked at 3 categories of software solutions • - E-mail server add-on • - SMTP gateway products • - Client side products • Hardware based solutions • - Complete solution that consists of both hardware and software in one package

  21. E-Mail Server Add-On • A solution implemented on the e-mail server • Pros: • Works in conjunction with e-mail server • Vendor specific • Cons: • Vendor specific • - The University of Windsor would need at least two solutions, one for faculty/staff, the other for students

  22. SMTP-Based Gateway Products • An independent solution usually in front of e-mail servers • Pros: • Vendor independent • Identifies spam before it hits the e-mail servers • Reduces load on e-mail servers • Cons: • Usually more expensive, in some cases requires purchase of new hardware

  23. Client Side Products • Installed on the user’s PC • Works with e-mail client to filter/tag suspected spam when messages are being downloaded from the mail server. • Pros: • Provides individual with complete control over spam filtering • Cons: • Possible deployment / support issues

  24. Our Requirements • Minimal involvement of IT staff in maintaining the solution • Significantly higher rate of catching spam when compared to our current “in-house” solution • - Currently identifying between 10% and 30% of total e-mail volume as spam • More advanced set of features and options “out of the box” • Technical support from vendor, including upgrades/updates • Ability for end users to control how their spam is dealt with • LDAP compliant

  25. Essential Spam Detection Techniques • Keyword search and proximity search in subject/body • Keyword search using pattern matching or heuristic analysis • Message format analysis • Statistical Analysis (Including Bayesian filtering) • Blacklist of known bad e-mail and IP addresses • Whitelist • Open proxy lists, DNS verification • Ability to filter viral attachments (not essential!)

  26. Spam Engine Settings • Settings can be changed system-wide • Settings can be changed for groups of users • Settings can be changed for individual users
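
One plausible way to combine the three levels is to let the most specific setting win: user over group over system. The slide does not state a precedence, so that ordering, the function name, and the setting keys below are all assumptions.

```python
from collections import ChainMap


def effective_settings(system, group=None, user=None):
    """Merge settings so that user overrides group, and group overrides system."""
    # ChainMap searches its mappings left to right, so the first one has priority.
    return dict(ChainMap(user or {}, group or {}, system))


system_defaults = {"action": "quarantine", "threshold": 50, "digest": True}
group_overrides = {"threshold": 60}
user_overrides = {"action": "tag"}

merged = effective_settings(system_defaults, group_overrides, user_overrides)
```

Keys that nobody overrides fall through to the system-wide value, so administrators only need to record the differences at the group and user levels.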

  27. Actions on Spam • Delete the message • Quarantine the message in either a per-system quarantine or a per-user quarantine • Tag the message within the message header • Tag the message by adding something to the subject line
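
The two tagging actions can be sketched with Python's standard `email` package. The header names `X-Spam-Flag` and `X-Spam-Score` and the `[SPAM]` prefix are illustrative choices, not the output format of any product discussed here.

```python
from email.message import EmailMessage


def tag_message(msg, score, subject_prefix="[SPAM]"):
    """Mark a message as suspected spam in its headers and its subject line."""
    # Header tag: invisible to most users, but easy for client-side rules to match.
    msg["X-Spam-Flag"] = "YES"
    msg["X-Spam-Score"] = f"{score:.1f}"
    # Subject tag: visible to the user. Subject may appear only once,
    # so remove the old header before setting the new one.
    subject = msg.get("Subject", "")
    del msg["Subject"]
    msg["Subject"] = f"{subject_prefix} {subject}".strip()
    return msg


message = EmailMessage()
message["From"] = "sender@example.com"
message["Subject"] = "Cheap pills"
message.set_content("Buy now")
tag_message(message, 93.0)
```

Tagging leaves the final decision with the user, at the cost of still delivering and storing the message.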

  28. Updates to the Software • User or administrator updates by editing rules • User or administrator trains spam engine through human identification of spam • Vendor keeps engine updated through periodic updates

  29. Product Selection • Using the criteria developed by the committee, a list of products that fit the criteria was established. • These products were identified through magazine articles, newsletters, other campus solutions, and previous vendor contacts

  30. Product Selection • A number of products were eliminated from the list • Reasons for elimination included: • - Operating System (Our in-house expertise is with UNIX/Sendmail) • - Scalability • - Lack of essential features

  31. Finally – A Shortlist! • Spam Assassin • BrightMail • Can-It Pro • PureMessage

  32. Decisions, Decisions … • The committee reviewed the short-listed products, viewed presentations, and tested two solutions • The solutions tested were: • - Can-It Pro • - PureMessage

  33. Live Evaluation • Each product installed on a test server • Members of committee had e-mail directed to test server • Can-It Pro • - Very low false-positive rate • - Acceptable spam capture rate out of the box • PureMessage • - High spam capture rate out of the box • - Very low false-positive rate

  34. The Final Decision • PureMessage was chosen to be our campus-wide spam solution • High spam capture rate • Low false-positive rate • Low maintenance • Offers all the features we required • Fit our budget

  35. Implementation • Solution will run on four Sun Sunfire V440 Servers • 2 servers for incoming mail, 2 servers for outgoing mail, 1 server for spooling • Currently waiting for server hardware to arrive • Current implementation target: Summer 2004 • Support staff have been added to the current pilot to test

  36. Default Configuration • Default configuration will be “Quarantine” • Other options include “Tag & Deliver” and “Opt-Out”

  37. Default Configuration • True spam (messages scoring > 90%) will be deleted • Possible spam (messages scoring 50-90%) will be quarantined • Legitimate mail (messages scoring < 50%) will be delivered • A large number of messages will never reach the messaging servers • Daily digest will be mailed each morning, detailing messages in quarantine from the previous day • End users will have on-demand access to their quarantine through a web-based interface
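
The three score bands above amount to a simple threshold function. This is a minimal sketch of that mapping only; it is not PureMessage configuration syntax, and the function name and return values are assumptions.

```python
def action_for_score(score):
    """Map a spam score (0-100, in percent) to the default action."""
    if score > 90:
        return "delete"      # true spam
    if score >= 50:
        return "quarantine"  # possible spam, held for the user's daily digest
    return "deliver"         # treated as legitimate


actions = [action_for_score(s) for s in (12, 50, 90, 97)]
```

The boundary values 50 and 90 fall in the quarantine band here, which matches the "50-90%" wording on the slide; only scores strictly above 90 are deleted unseen.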

  38. Default Configuration • In addition to the PureMessage AntiSpam solution, we also purchased the PureMessage Policy Manager • This will allow us to filter out unwanted attachments that may be viral – EXE, COM, BAT, PIF, SCR, etc.
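
Extension-based attachment filtering of the kind described above can be sketched with the standard `email` package. The blocked set contains only the extensions named on the slide (its "etc." is left open), and the function name is an assumption; this is not how the PureMessage Policy Manager is configured.

```python
import os
from email.message import EmailMessage

# Extensions named on the slide; a real policy would likely list more.
BLOCKED_EXTENSIONS = {".exe", ".com", ".bat", ".pif", ".scr"}


def blocked_attachments(msg):
    """Return the filenames of attachments whose extension is on the blocked list."""
    blocked = []
    for part in msg.iter_attachments():
        filename = part.get_filename() or ""
        extension = os.path.splitext(filename)[1].lower()
        if extension in BLOCKED_EXTENSIONS:
            blocked.append(filename)
    return blocked


mail = EmailMessage()
mail.set_content("See attached.")
mail.add_attachment(b"MZ", maintype="application", subtype="octet-stream",
                    filename="invoice.PIF")
mail.add_attachment(b"notes", maintype="application", subtype="octet-stream",
                    filename="notes.txt")
found = blocked_attachments(mail)
```

Matching on the filename alone is cheap but easy to evade (for example by renaming a file or placing it inside an archive), so it complements virus scanning rather than replacing it.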

  39. Changes to E-Mail Handling • In addition to spam filtering, other initiatives have been proposed to curb the amount of spam on campus. These include: • - Only accepting mail addressed to valid addresses • - Only accepting mail with valid return addresses (domain must resolve in DNS)
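
Both proposed checks can be sketched in a few lines. Assumptions in this sketch: the set of valid recipients is available as a lowercase set (on campus it would presumably come from LDAP), "resolves in DNS" is approximated by a plain hostname lookup rather than an MX query, and the `resolve` parameter exists only so the logic can be tested offline.

```python
import socket


def recipient_is_valid(address, directory):
    """Accept mail only for addresses present in the directory (a set of lowercase strings)."""
    return address.lower() in directory


def return_domain_resolves(return_address, resolve=socket.gethostbyname):
    """Accept mail only if the domain of the return address resolves in DNS."""
    local, at, domain = return_address.rpartition("@")
    if not at or not local or not domain:
        return False  # not of the form local@domain
    try:
        resolve(domain)
    except (socket.gaierror, UnicodeError):
        return False
    return True


directory = {"peter@uwindsor.ca"}
```

Rejecting unknown recipients at the gateway also blunts directory harvest attacks, since guessed addresses are refused before any message is stored.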

  40. Implementation Issues • Some users on other non-ITS controlled servers may not be in our LDAP database • Possible issues with the handling of ListServs • Possible issues with Shared Mailboxes
