SureMail Notification Overlay for Email Reliability

SureMail: Notification Overlay for Email Reliability • Sharad Agarwal, Venkat Padmanabhan, Dilip A. Joseph • 8 March 2006

Outline • Email loss problem • Design philosophy • SureMail design • SureMail robustness to security attacks • SureMail implementation

What is Email Loss? • Email loss : sent email not received • Silent email loss • Loss w/o notification (no bounceback / DSN) • Why? • Aggressive spam filters • 90% corp. emails thrown away (blacklist) • AOL’s strict whitelist rules (must send 100/day) • Bouncebacks contribute to spam • Complex mail architecture upgrades / failures • SMTP reliability is per hop, not end-to-end

How Much Email Loss? • Even loss of 1 email / user / year is bad • If it’s an important email • To really measure loss • Monitor many users’ send & receive habits • Count how many sent emails not received • Count how many bouncebacks received • Difficult to find enough willing participants that email each other across multiple domains

Prior Work • “The State of the Email Address” • Afergan & Beverly, ACM CCR 01.2005 • Rely on bouncebacks; similar to “dictionary” attack • 25% of tested domains send bouncebacks • 1 sender • 0.1% to 5% loss, across 1468 servers, 571 domains • “Email dependability” • Lang, UNSW B.E. thesis 11.2004 • 40 accounts, 16 domains receive emails from 1 sender • Empty body, sequence number as subject • 0.69% silent loss

Our Email Loss Study • Methodology • Controller composes email, sends • Our code for SMTP sending • Outlook for receiving (both inbox & junk mail) • Parse sent and received emails into SQL DB • Match on {sender,receiver,subject,attachment} • Heuristics for parsing bouncebacks • Want • Many sending, receiving accounts • Real email content
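The matching step in the methodology above can be sketched roughly as follows. This is a minimal illustration, not the study's actual code: the dictionary field names and the `loss_rate` helper are assumptions, and the SQL database is replaced by in-memory lists.

```python
from collections import Counter

def match_key(email):
    """Key used to pair a sent email with a received one (fields from the slide)."""
    return (email["sender"], email["receiver"], email["subject"], email["attachment"])

def loss_rate(sent, received):
    """Fraction of sent emails for which no matching received email exists."""
    remaining = Counter(match_key(e) for e in received)
    lost = 0
    for e in sent:
        key = match_key(e)
        if remaining[key] > 0:
            remaining[key] -= 1   # consume one matching received copy
        else:
            lost += 1             # nothing in inbox or junk folder matches
    return lost / len(sent) if sent else 0.0

sent = [
    {"sender": "a@x.org", "receiver": "b@y.org", "subject": "s1", "attachment": None},
    {"sender": "a@x.org", "receiver": "b@y.org", "subject": "s2", "attachment": "doc"},
    {"sender": "a@x.org", "receiver": "c@z.org", "subject": "s1", "attachment": None},
    {"sender": "a@x.org", "receiver": "c@z.org", "subject": "s3", "attachment": "pdf"},
]
received = sent[:3]  # the fourth email never arrived
```

Using a counter rather than a set keeps repeated sends of the same subject to the same receiver from masking a loss.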

Experiment Details • Email accounts • 36 send, 42 receive • Junk filters off if possible • Email subject & body • Enron corpus subset • 1266 emails w/o spam • Email attachment • 70% no attachment • jpg,gif,ppt,doc,pdf,zip,htm • marketing,technical,funny

Email Loss Results

Loss Rates by Account • Loss rate 1.82% to 0.82%

Loss Rates by Attachment • Nothing stands out

Loss Rates by Subject/Body • ~50-250 emails sent per subject • Without 35% case : loss rate 1.82% to 1.79%

Summary of Findings • Email loss rates are high • 1.82% loss • 0.71% conservative silent loss ( 1 / 140 ) • Difficult to disambiguate cause of loss • Difference between domains (filters or servers?) • No difference between mailboxes • No difference between attachments • Only 1 body had abnormally high loss

We Found Email Loss; Now What? • Can try to fix email architecture, but • Hard to know exactly what is problem • Spam filters continually evolve; not perfect • Some architectures are very complicated • How many email systems are out there? • The current system mostly works

Fixing the Architecture • Improve email delivery infrastructure • more reliable servers • e.g., cluster-based (Porcupine [Saito ’00]) • server-less systems • e.g., DHT-based (POST [Mislove ’03]) • total switchover might be risky • “Smarter” spam filtering • moving target → mistakes inevitable • non-content-based filtering still needed to cope with spam load

Email Notifications • DSN / bouncebacks • Most spam filters don’t generate DSN on drop • Bogus DSNs due to spam w/ bogus sender • Some MTAs block DSN for privacy • MTA crash may not generate DSN • No DSN for loss between MTA and MUA • MDN / read receipts • Expose private info (when read, when online) • Can help spammers

Notification Design Requirements • Cause minimal MTA/MUA disruption • Cause minimal user disruption • Preserve asynchronous operation • Preserve user privacy • Preserve repudiability • Maintain spam and virus defenses • Minimize traffic overhead

SureMail Design Requirements • Cause minimal MTA/MUA disruption • No MTA modification; no Outlook modification • Cause minimal user disruption • User notified only on loss • Preserve asynchronous operation • Preserve user privacy • Only receiver is notified of loss • Preserve repudiability • No PKI / authentication • Maintain spam and virus defenses • Emails not modified • Minimize traffic overhead • 85 byte notification per email

Basic Operation • Sender S sends email to receiver R • S also posts notification to overlay • R periodically downloads new email • R also downloads notifications from overlay • Notification without matching email → loss • delay : median 26s, mean 276s, max 36.6 hrs
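The receiver-side check described above can be sketched as follows. This is an illustrative sketch only: SHA-256 stands in for the unspecified hash, and the `grace_seconds` parameter is an assumption, added because a notification may arrive before a slow email does.

```python
import hashlib

def h1(message: bytes) -> str:
    """Stand-in for the message hash carried in a notification (assumed SHA-256)."""
    return hashlib.sha256(message).hexdigest()

def find_lost(notifications, inbox, now, grace_seconds=3600):
    """Return hashes of notified emails that have not shown up in the inbox.

    notifications: iterable of (message_hash, posted_at) pairs from the overlay
    inbox: iterable of raw message bytes R has downloaded
    A notification counts as a loss only once it is older than the grace period.
    """
    received = {h1(m) for m in inbox}
    return [h for h, posted_at in notifications
            if h not in received and now - posted_at >= grace_seconds]

inbox = [b"message a"]
notifications = [
    (h1(b"message a"), 0),     # email arrived: not lost
    (h1(b"message b"), 0),     # old notification, no email: lost
    (h1(b"message c"), 3500),  # too recent to judge
]
lost = find_lost(notifications, inbox, now=3600)
```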

SureMail Overview • [Figure: Sender S, Recipient R, and the notification overlay. Labels: Register, Verify, GetNotifications, Request lost message, “You’ve Lost Mail!”; Dnot=H1(R), Dreg=H2(R); notification contents H1(Mnew), H1(Mold), T, MAC([T,H1(Mnew)],H2(Mold))]

SureMail Overview • Emails, MTAs, MUAs unmodified • Parallel notification overlay system • Decentralized; limited collusion • Agnostic to actual implementation • end-host-based (e.g., always-on user desktops) • infrastructure-based (e.g., “NX servers”) • Prevent notification snooping & spam • Email based registration • Reply based shared secret

Email-Based Registration • Goal: prevent hijacking of R’s notifications • Only R can receive emails sent to R • Limited collusion among notification nodes • One-time operation for initial registration • R sends registration request to H2(R), H3(R) • H2(R), H3(R) email registration secrets to R • To retrieve notifications at H1(R) • R uses registration secrets with H1(R); H1(R) verifies with H2(R) H3(R), sends back notifications • Neither H1(R), H2(R), H3(R) can associate notifications with R, unless they collude
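One way to picture how R's address selects its notification node H1(R) and registration nodes H2(R), H3(R) is the sketch below. It is hypothetical: the slides do not specify the hash functions or how node identifiers are derived, so the salted SHA-256 and the `overlay_node` and `roles_for` helpers are assumptions for illustration.

```python
import hashlib

def overlay_node(recipient: str, k: int, num_nodes: int) -> int:
    """Hypothetical H_k(R): index of the overlay node playing role k for recipient R."""
    data = f"{k}|{recipient.strip().lower()}".encode("utf-8")
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes

def roles_for(recipient: str, num_nodes: int) -> dict:
    """Map a recipient to its notification node and its two registration nodes."""
    return {
        "notification": overlay_node(recipient, 1, num_nodes),   # H1(R)
        "registration": [overlay_node(recipient, 2, num_nodes),  # H2(R)
                         overlay_node(recipient, 3, num_nodes)], # H3(R)
    }

roles = roles_for("alice@example.com", num_nodes=1024)
```

In a small overlay two roles can land on the same node, which would weaken the limited-collusion assumption; a real deployment would need to handle that case.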

Reply-Based Shared Secret • Goal: prevent notification spoofing & spam • Only R & S know their email conversations • S rarely converses with spammers • Reply detection • S sends Mold to R, R replies with M’old • S uses H(Mold) to “prove” identity to R in future • Notification for Mnew from S to R • H1(Mnew),H1(Mold),T,MAC([T,H1(Mnew)],H2(Mold)) • Only R can identify S • Shared secret can be continually refreshed
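The notification format above, H1(Mnew), H1(Mold), T, MAC([T,H1(Mnew)],H2(Mold)), can be sketched as follows. The concrete primitives are assumptions (domain-separated SHA-256 for H1 and H2, HMAC-SHA-256 for the MAC, an 8-byte timestamp), so field sizes will not match the 85-byte figure quoted earlier.

```python
import hashlib
import hmac

def H1(m: bytes) -> bytes:
    """Public identifier of a message (assumed: SHA-256 with a 0x01 prefix)."""
    return hashlib.sha256(b"\x01" + m).digest()

def H2(m: bytes) -> bytes:
    """MAC key derived from a message only S and R have seen (assumed construction)."""
    return hashlib.sha256(b"\x02" + m).digest()

def make_notification(m_new: bytes, m_old: bytes, t: int):
    """Sender side: build (H1(Mnew), H1(Mold), T, MAC([T, H1(Mnew)], H2(Mold)))."""
    h_new = H1(m_new)
    tag = hmac.new(H2(m_old), t.to_bytes(8, "big") + h_new, hashlib.sha256).digest()
    return (h_new, H1(m_old), t, tag)

def verify_notification(notification, known_old_messages) -> bool:
    """Receiver side: find Mold by its H1 value, then recompute and compare the MAC."""
    h_new, h_old, t, tag = notification
    for m_old in known_old_messages:
        if H1(m_old) == h_old:
            expected = hmac.new(H2(m_old), t.to_bytes(8, "big") + h_new,
                                hashlib.sha256).digest()
            return hmac.compare_digest(tag, expected)
    return False  # R has never exchanged Mold: treat as spoofed or spam

note = make_notification(b"new email body", b"earlier replied-to email", t=1141776000)
```

An overlay node or eavesdropper sees only hashes and a MAC. Without Mold it cannot derive H2(Mold), so it can neither forge a valid notification nor learn who sent it.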

Attacks Defeated by Design • X cannot retrieve H1(R) notifications • H1(R) cannot identify R • H2(R), H3(R) cannot see R’s notifications • If they don’t collude; can increase to 3 nodes • X, H(R) cannot identify S • X, H(R) cannot learn Mnew, Mold • X cannot annoy R with bogus notifications • X cannot masquerade post to H1(R) as S

First Time Sender • What if FTS email is lost? • FTS & spammer generally indistinguishable • But perhaps FTS knows I who knows R • Email networks have small world properties • I makes shared secret SI with all known parties • FTS sends email to R • Posts multiple notifications • One for every SI it has learned

Other Issues • Reply-detection: • “in-reply-to” header may not always help • indirect checks based on text similarity • Reducing overhead: • post notifications only for “important” emails • delay posting in hope of receiving implicit ACK (reply) or NACK (bounce-back) • Mobility: • reply-based shared secret can be regenerated • web-mail • Can support mailing lists
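The reply-detection idea above (header check first, text similarity as a fallback) might look like the sketch below. The function name, the `min_block` and `threshold` values, and the quote-stripping rule are all assumptions; the slides do not describe the actual heuristic.

```python
from difflib import SequenceMatcher

def is_reply(reply_headers: dict, reply_body: str,
             orig_message_id: str, orig_body: str,
             min_block: int = 8, threshold: float = 0.6) -> bool:
    """Guess whether a message is a reply to an earlier message we sent."""
    # Direct check: the In-Reply-To header names the original message.
    if orig_message_id and reply_headers.get("In-Reply-To") == orig_message_id:
        return True
    if not orig_body:
        return False
    # Indirect check: how much of the original text is quoted in the reply?
    unquoted = "\n".join(line.lstrip("> ").rstrip() for line in reply_body.splitlines())
    matcher = SequenceMatcher(None, orig_body, unquoted, autojunk=False)
    # Ignore short incidental matches; count only substantial shared runs.
    covered = sum(b.size for b in matcher.get_matching_blocks() if b.size >= min_block)
    return covered / len(orig_body) >= threshold

original = "Can we meet on Friday at 10am to review the draft?"
quoted_reply = "Sure, Friday works.\n\n> Can we meet on Friday at 10am to review the draft?"
unrelated = "Lunch menu for next week is attached."
```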

SureMail Implementation • Reply detection heuristic for shared secret • Notification service • Centralized server running • Chord based DHT running • Notification posting, retrieving • Grab in/out bound email via Outlook MAPI call • No modification to Outlook binaries • XML notification put/get commands • Simple Win32 GUI

SureMail GUI • Client UI will see emails, will post & retrieve notifications • E.g. running on two machines netprofa@microsoft.com and netprofa@gmail.com • [Screenshot labels: “Lost!”, “Not lost”]

Notification Results

Summary • Email does get lost! • ~40 accounts, 158000 emails, 0.71%-0.91% silent loss • SureMail • Client based – unmodified email, servers, clients; no PKI • User intervention only on lost email • Keeps repudiability, privacy, asynchronous, spam & virus defense • Separate notification overlay robust • Simple, small message format • No virus, malware, spam filters needed • Provides failure independence • Status • ACM Hotnets 05; ACM Sigcomm 06 submission • Prototype implementation
