Spam Andy Nguyen 5/17/2004
What is Spam? • Unsolicited means that the recipient has not granted verifiable permission for the message to be sent. Bulk means that the message is sent as part of a larger collection of messages, all having substantively identical content. • A message is spam only if it is both Unsolicited and Bulk (UBE, unsolicited bulk email) • Unsolicited email on its own is normal email (examples include first-contact enquiries, job enquiries, sales enquiries) • Bulk email on its own is normal email (examples include subscriber newsletters, discussion lists, information lists) • Technical Definition of “Spam”: • An electronic message is "spam" IF: (1) the recipient's personal identity and context are irrelevant because the message is equally applicable to many other potential recipients; AND (2) the recipient has not verifiably granted deliberate, explicit, and still-revocable permission for it to be sent; AND (3) the transmission and reception of the message appears to the recipient to give a disproportionate benefit to the sender. Source: www.spamhaus.org
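The "both Unsolicited and Bulk" rule can be written as a simple predicate. This is only an illustrative sketch: the `Message` class and its two flags are hypothetical names, and deciding whether a real message is unsolicited or bulk is the hard part that the code takes as given.

```python
from dataclasses import dataclass


@dataclass
class Message:
    # Hypothetical flags; in practice these are judgments, not stored fields.
    unsolicited: bool  # recipient granted no verifiable permission
    bulk: bool         # one of many substantively identical messages


def is_spam(msg: Message) -> bool:
    """A message is spam only if it is BOTH unsolicited AND bulk."""
    return msg.unsolicited and msg.bulk


job_enquiry = Message(unsolicited=True, bulk=False)  # normal email
newsletter = Message(unsolicited=False, bulk=True)   # normal email
mass_mailing = Message(unsolicited=True, bulk=True)  # UBE
```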
Effects of Spam • Bandwidth Loss • Connection Expense • Unnecessary disk space usage • Over-flowing user mail boxes • Loss of productivity • Fraud • Costs estimated at $1 Billion/year • Nearly 30% of AOL’s mail is Spam
Spammers • Use automated tools that analyze online content • Methods • Looking through UseNet for email addresses • Mailing lists • Web pages (guest books, forums, etc.) • Dictionary attacks on user and domain names, using predictable email addresses • E-mail directories, white pages (Big Foot) • Chat Rooms
Spam Defense • Types of Defense: • Educational • Technical • Legal/Economic • Issues for Technical Spam Solutions: • Deployment
Blacklisting • Blocking mail from servers that are known to be bad • Can stop e-mail before it is sent out • Uses a DNS-based distribution scheme • Issues: • Account hopping: spammers use free e-mail addresses, spoof e-mail addresses, or send through open relays/non-blacklisted servers to hide their point of origin • Should you trust the administrators of these blacklists? • Blacklist listing policies differ • A compromised blacklist can blacklist the entire Internet (0/0), or allow everyone through • What about new/unknown mail servers? • May also prevent good mail from coming through
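As a rough illustration of the DNS-based distribution scheme: DNS blacklists are conventionally queried by reversing the octets of the connecting server's IPv4 address and looking that name up under the blacklist's zone. The sketch below assumes a placeholder zone, `dnsbl.example.org`, which is not a real list; `is_listed` needs network access and is shown only to indicate how the lookup might be wired up.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name a blacklist client looks up for an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the zone answers with an address record for this IP.

    Needs network access. Any resolution failure is treated as "not listed",
    which also hides timeouts; a production check would tell these apart.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True


# 192.0.2.99 is a documentation address; the zone is a placeholder.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
# prints 99.2.0.192.dnsbl.example.org
```

Because the receiving server can run this check when the connection arrives, it can refuse the sender before accepting any message content.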
Spam Poisoning • Defense against e-mail harvesting • Instead of user@example.com, use user@exampleREMOVETHIS.com • Using images • Generating fake web pages, with fake addresses • Issues: • Once an address is revealed, all effort spent concealing it is wasted • Harvesters use search engines to find email addresses
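The REMOVETHIS trick from this slide can be sketched as a pair of helpers. The function names `munge` and `unmunge` are made up for illustration. Note that `unmunge` is exactly what a harvester could also do once the marker is known, which is part of why the defense is weak.

```python
def munge(address: str, marker: str = "REMOVETHIS") -> str:
    """Insert a marker before the top-level domain of an e-mail address."""
    local, at, domain = address.rpartition("@")
    host, dot, tld = domain.rpartition(".")
    if not at or not dot:
        raise ValueError(f"not a usable e-mail address: {address!r}")
    return f"{local}@{host}{marker}.{tld}"


def unmunge(address: str, marker: str = "REMOVETHIS") -> str:
    """What a human reader does by hand: delete the marker."""
    return address.replace(marker, "")


print(munge("user@example.com"))  # prints user@exampleREMOVETHIS.com
```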
Distributed, Collaborative Filtering • When a system receives spam, either from a user or a “spam trap”, the message is hashed and passed to the closest server • This mechanism maintains a distributed and constantly updating library of bulk mail • Issues: • Users can abuse the service and submit legitimate email • Spammers randomize their spam to change checksums (adding random strings etc.)
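A minimal sketch of the hash-and-report idea, assuming a single in-memory server and a plain SHA-256 digest over lightly normalized text. The `DigestServer` class is hypothetical, and deployed systems are not claimed to work this way. It also shows why randomization defeats exact checksums.

```python
import hashlib


def digest(message: str) -> str:
    """Hash a message after lowercasing and collapsing whitespace."""
    normalized = " ".join(message.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class DigestServer:
    """Hypothetical stand-in for the shared library of reported bulk mail."""

    def __init__(self) -> None:
        self._reports: dict[str, int] = {}

    def report(self, message: str) -> None:
        key = digest(message)
        self._reports[key] = self._reports.get(key, 0) + 1

    def report_count(self, message: str) -> int:
        return self._reports.get(digest(message), 0)


server = DigestServer()
spam = "Buy cheap watches now!"
server.report(spam)
server.report("buy   CHEAP watches now!")  # same digest after normalization

# A random suffix changes the digest, so the randomized copy looks unseen.
randomized = spam + " xk3f9q"
```

Requiring several independent reports before acting on a digest is one possible way to limit the damage from users who submit legitimate mail.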
Content Filtering • Destination-based defense • Based on the content of the message • Bayesian Approach • Issues: • Processing load on mail server • Doesn’t address bandwidth and storage issues • Accuracy isn’t 100%. Is this acceptable? • Spammers may run their e-mails through the filters in order to bypass them • Privacy issues
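To make the Bayesian approach concrete, here is a small naive Bayes sketch with add-one smoothing. It is a toy under stated assumptions: the tokenizer, the class name `NaiveBayesFilter`, and the four training messages are all invented for illustration, and both classes must be trained before scoring.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9$']+", text.lower())


class NaiveBayesFilter:
    def __init__(self) -> None:
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text: str) -> float:
        """P(spam | words), assuming words are independent given the class."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior plus log likelihood with add-one (Laplace) smoothing.
            score = math.log(self.message_counts[label] / total_messages)
            for word in tokenize(text):
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            log_score[label] = score
        # Turn the two log scores into a probability; clamp to avoid overflow.
        diff = max(min(log_score["ham"] - log_score["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


nb = NaiveBayesFilter()
nb.train("cheap pills buy now", "spam")
nb.train("buy cheap watches now", "spam")
nb.train("meeting agenda for monday", "ham")
nb.train("lunch on monday with the team", "ham")
```

With this training data, `nb.spam_probability("buy cheap pills")` comes out above 0.5 and `nb.spam_probability("monday meeting agenda")` below it. Where to set the cut-off is the accuracy trade-off raised in the issues list.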
Pricing Functions • Basic Idea: • “If I don’t know you and you want to send me a message, then you must prove that you spent, say, ten seconds of CPU time, just for me and just for this message” • Proof of effort takes some time to compute but is easily verifiable • Function based on a large number of scattered memory accesses • Issues: • What about legitimate mailing lists? • Attackers could just compromise many machines to send out the mail (similar to DDoS) • Where would you deploy this? Between sender and mail server, or server-to-server?
Internet Mail 2000 • New mailing protocol • Changes “push” architecture to a “pull” architecture • Mail stored on sender’s server • Issues: • New attacks are possible • Global deployment would be required
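The shift from "push" to "pull" can be sketched with an in-memory model. This is not the Internet Mail 2000 protocol itself, only an illustration of the storage model under assumed names (`SenderStore`, `Notice`): the sender's server keeps the message body, the recipient receives only a small notice, and the body is fetched on demand.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Notice:
    """The only thing pushed to the recipient: where to pull the message from."""
    sender_host: str
    message_id: str


class SenderStore:
    """Hypothetical sender-side server that holds mail until it is fetched."""

    def __init__(self, host: str) -> None:
        self.host = host
        self._messages: dict[str, tuple[str, str]] = {}  # id -> (recipient, body)

    def post(self, recipient: str, body: str) -> Notice:
        message_id = secrets.token_hex(8)
        self._messages[message_id] = (recipient, body)
        return Notice(self.host, message_id)

    def fetch(self, message_id: str, recipient: str) -> str:
        stored_recipient, body = self._messages[message_id]
        if stored_recipient != recipient:
            raise PermissionError("message is not addressed to this recipient")
        return body


store = SenderStore("mail.sender.example")
notice = store.post("bob@example.org", "Lunch on Monday?")
body = store.fetch(notice.message_id, "bob@example.org")
```

In this model the storage cost of a bulk mailing stays with the sender, who has to keep every message available until recipients choose to pull it.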
Discussion • Certified e-mail? • National opt-out list? • Human skill challenges? • Payment methods (charging a small fee when sending e-mail) • Possible legislation • Which approach do you think is best? Or should we use a combination of mechanisms?