SureMail: Notification Overlay for Email Reliability • Sharad Agarwal & Venkat Padmanabhan, Microsoft Research • Dilip Antony Joseph, UC Berkeley • HotNets 2005
• HotNets air ticket confirmation: “We have sent it through again. If you do not receive it with in an hour or two, please let us know.” • Funding proposal: “No I never got and I never acked it… My last mail from you was [on] 3/10/2004.” • IMC 2005 decision notification: “I recd reviews for one paper (#X) but not that of #Y.” • IMAP server upgrade problems: “Some unanticipated migration problems occurred that may have caused some lost or delayed email.”
Silent Email Loss • Silent email loss: • email “vanishes” without sender/recipient knowledge • missed opportunities, misunderstanding, or worse • Nontrivial problem • anecdotal evidence • measurement studies • 0.69% loss rate [Lang & Moors 2004] • 0.1-5% loss rate [Afergan & Beverly 2005] • commercial offerings to address the problem • e.g., Pivotal Veracity, Zenprise
Silent Email Loss • Why email loss? • spam filtering: big problem → aggressive filtering • MS: 90% of emails discarded before hitting user mailboxes • AOL: 100 emails per month to maintain IP white-listing • server failures and upgrades • SMTP is not end-to-end reliable • (Non-)Delivery status notifications • compound the spam problem • raise privacy concerns • So email loss is often silent
Fixing the Problem • Improve the email delivery infrastructure • more reliable servers • e.g., cluster-based (Porcupine [Saito ’00]) • server-less systems • e.g., DHT-based (POST [Mislove ’03]) • total switchover might be risky • “Smarter” spam filtering • moving target → mistakes inevitable • non-content-based filtering still needed to cope with spam load
SureMail • SureMail addresses the problem from the outside • add separate notification overlay • email delivery infrastructure left undisturbed • users can benefit without operator cooperation • Design goals: • minimize demands on infrastructure and users • preserve asynchronous operation and privacy (no worse than it is today) • maintain defenses against spam and viruses • minimize overhead
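Basic Operation [Diagram: Sender S posts notification [S,H(M)] to the notification server; Recipient R issues GetNotifications; unmatched notifications appear in R’s Missing Items Folder, from which R can request the lost message]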
Notification Overlay • Decentralized • limited collusion among the constituent nodes • Efficient notification server lookup • e.g., R → H(R) in a DHT setup • Agnostic to actual implementation • end-host-based (e.g., always-on user desktops) • infrastructure-based (e.g., “NX servers”)
Challenges • Privacy • information about users’ email habits could be leaked • Notification spam • spammers can spoof notifications and burden users • annoyance attacks discredit notifications in general • Even the notification infrastructure isn’t trusted • No universal PKI for email users
SureMail Goals • Protect the recipient’s identity • attacker shouldn’t be able to retrieve R’s notifications or learn the volume of notifications intended for R • Protect the sender’s identity • attacker shouldn’t be able to learn S’s identity or monitor the volume of notifications posted by S • Block notification spam • attacker shouldn’t be able to spoof notifications
Assumptions • No email eavesdroppers • privacy is moot otherwise • Limited collusion among notification nodes • needed only to avoid leaking notification volume info
Key Mechanisms #1: Email-based handshake #2: Decoupled registration and notification #3: Email-based shared secret #4: Reply-based shared secret
#1: Email-based handshake • Goal: prevent hijacking of R’s identity • Only R can receive emails sent to R • One-time operation for initial registration • Send email to R to establish registration secret shared with the notification overlay • R can then use registration secret to authenticate itself to the notification overlay
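The handshake can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the class `RegistrationNode`, the `outbox` list that stands in for real email delivery, the HMAC challenge-response, and the choice to index secrets by a hash of R's address are all hypothetical.

```python
import hashlib
import hmac
import secrets


class RegistrationNode:
    """Toy stand-in for the registration node; `outbox` replaces real email."""

    def __init__(self):
        self.secrets_by_key = {}  # hash of R's address -> registration secret
        self.outbox = []          # (address, secret) pairs that would be emailed

    def register(self, address: str) -> None:
        # The secret travels only by email, so only whoever can read mail
        # addressed to `address` ever learns it.
        secret = secrets.token_bytes(32)
        key = hashlib.sha256(address.encode("utf-8")).hexdigest()
        self.secrets_by_key[key] = secret
        self.outbox.append((address, secret))

    def verify(self, key: str, nonce: bytes, proof: bytes) -> bool:
        # Check that the caller knows the secret without it being re-sent.
        secret = self.secrets_by_key.get(key)
        if secret is None:
            return False
        expected = hmac.new(secret, nonce, hashlib.sha256).digest()
        return hmac.compare_digest(expected, proof)


def make_proof(secret: bytes, nonce: bytes) -> bytes:
    """R's side: evidence of the registration secret for a fresh nonce."""
    return hmac.new(secret, nonce, hashlib.sha256).digest()
```

R would read the secret from the registration email once, then call `make_proof` with a fresh nonce each time it retrieves notifications.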
#2: Decoupled registration & notification • Goal: prevent snooping on recipient identity • Limited collusion among notification nodes • Registration at Dreg=H(H(R)) • Notification posted at Dnot=H(R) • R contacts Dnot to retrieve notifications for H(R) • Dnot can find Dreg without knowing R • Neither Dnot nor Dreg can associate notifications with R, unless they collude
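A minimal sketch of how the two overlay keys relate, assuming SHA-256 for H and a lower-cased, trimmed address as input (both are assumptions; the slides do not name a hash function or a canonical address form). The function name `overlay_keys` is hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    """H: assumed here to be SHA-256."""
    return hashlib.sha256(data).digest()


def overlay_keys(recipient: str) -> tuple[bytes, bytes]:
    """Return (Dnot, Dreg) for recipient address R."""
    # Normalising the address is an assumption, made so that S and R
    # derive the same key from the same mailbox.
    r = recipient.strip().lower().encode("utf-8")
    d_not = h(r)      # Dnot = H(R): where notifications are posted
    d_reg = h(d_not)  # Dreg = H(H(R)): where R registers
    return d_not, d_reg
```

Because Dreg is computed from H(R) alone, the notification node can locate the registration node without ever learning R itself.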
#3: Email-based shared secret • Goal: prevent snooping on sender identity • Email Mold from S to R is known only to S and R • H(Mold) could serve as implicit identifier of S to R • But it doesn’t quite serve as authenticator for S: • Dnot knows H(Mold), so it could spoof notifications from S • even other attackers could do so by first sending Mspam purporting to be from S
#4: Reply-based shared secret • Goal: block spoofing of notifications • Users rarely have conversations with spammers • R remembers (hashes of) recent emails from S that it has replied to • If S receives a reply to Mold it had sent R, Mold can serve as a shared secret between S and R • S could use H1(Mold) as an implicit identifier… • … and H2(Mold) as an authenticator • Hard for a spammer (even Dnot) to spoof
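One way R's memory of replied-to messages might look is sketched below. The slides only require H1 and H2 to be distinct hashes of Mold; deriving them with HMAC-SHA256 under two fixed labels, the `ReplyLog` class, and its capacity bound are assumptions made for illustration.

```python
import hashlib
import hmac
from collections import OrderedDict


def h1(m_old: bytes) -> bytes:
    """Implicit identifier: posted with the notification so R can find Mold."""
    return hmac.new(b"suremail-h1", m_old, hashlib.sha256).digest()


def h2(m_old: bytes) -> bytes:
    """Authenticator: never posted, so seeing h1(Mold) does not reveal it."""
    return hmac.new(b"suremail-h2", m_old, hashlib.sha256).digest()


class ReplyLog:
    """R's bounded record of recent messages it has replied to."""

    def __init__(self, capacity: int = 1000):
        self.capacity = capacity
        self.entries = OrderedDict()  # h1(Mold) -> (sender, h2(Mold))

    def record_reply(self, sender: str, m_old: bytes) -> None:
        key = h1(m_old)
        self.entries[key] = (sender, h2(m_old))
        self.entries.move_to_end(key)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # forget the oldest entry

    def lookup(self, identifier: bytes):
        """Return (sender, authenticator), or None for an unknown identifier."""
        return self.entries.get(identifier)
```

A spammer who never received a reply from R has no entry in this log, so its notifications fail the lookup.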
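Putting it all together [Diagram: Recipient R registers with Dreg=H(H(R)); Sender S posts a notification built from H1(M), H2(Mold) and H1(Mold) to Dnot=H(R); R issues GetNotifications to Dnot, which verifies with Dreg; unmatched notifications appear in R’s Missing Items Folder, from which R can request the lost message]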
Other issues • Reply-detection: • “in-reply-to” header may not always help • indirect checks based on text similarity • Reducing overhead: • post notifications only for “important” emails • hold off on posting notification in the hope of receiving an implicit ACK (reply) or NACK (bounce-back) • First-time “legitimate” senders: • they are indistinguishable from spammers • Mobility: • reply-based shared secret enables secure migration without state transfer
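The "hold off on posting" idea could be sketched as a small queue on the sender's side. The class name `DeferredPoster`, the hold period, and the use of plain numeric timestamps are assumptions; the slides do not specify how long to wait.

```python
class DeferredPoster:
    """Delay posting a notification in case a reply or bounce arrives first."""

    def __init__(self, hold_seconds: float):
        self.hold_seconds = hold_seconds
        self.pending = {}  # message hash -> time at which to post

    def sent(self, msg_hash: str, now: float) -> None:
        # Start the hold timer when the email goes out.
        self.pending[msg_hash] = now + self.hold_seconds

    def acked(self, msg_hash: str) -> None:
        # A reply (implicit ACK) or bounce (NACK) means the loss, if any,
        # is no longer silent, so no notification is needed.
        self.pending.pop(msg_hash, None)

    def due(self, now: float) -> list:
        # Return and forget the hashes whose hold period has expired.
        ready = sorted(h for h, t in self.pending.items() if t <= now)
        for h in ready:
            del self.pending[h]
        return ready
```

The sender would call `due` periodically and post notifications only for the hashes it returns.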
Status • Ongoing measurement experiment • Design being refined • Implementation in the works
Discussion #1: Should the notification system be folded into the email infrastructure? • Separation is advantageous: • provides failure independence • keeps the notification layer simple • small, fixed format notifications don’t require the same kind of processing as virus-laden email • provides engineering convenience
Discussion #2: Is there a social benefit to silent email loss because of the plausible deniability it provides? • Any such benefit is far outweighed by the costs • Should cars be slightly unreliable because of the excuse it would give people when they miss an engagement? • It is the asynchronous nature of email that is key
Anecdotes • Funding proposal: “No I never got and I never acked it… My last mail from you was [on] 3/10/2004.” • Response to self-managing networks summit invitation: “Yesterday's email did not bounce back, wonder where it is!” • IMC 2005 decision notification: “I recd reviews for one paper (#X) but not that of #Y.” • IMAP server upgrade problems: “Some unanticipated migration problems occurred that may have caused some lost or delayed email.”
Basic Operation • Senders post notifications to overlay, in addition to sending emails as usual • Intended recipients periodically download notifications intended for them • A notification without a matching email suggests possible email loss
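The core check amounts to a set difference between notified and received message hashes. The sketch below assumes SHA-256 over the raw message bytes; a real client would probably need to hash a canonical form that survives header rewriting in transit, which is not modelled here. The names `digest` and `find_missing` are hypothetical.

```python
import hashlib


def digest(message: bytes) -> str:
    """Hash of a message (SHA-256 over raw bytes is an assumption)."""
    return hashlib.sha256(message).hexdigest()


def find_missing(notified: set[str], inbox: list[bytes]) -> set[str]:
    """Hashes that were announced by senders but match no received email."""
    received = {digest(m) for m in inbox}
    return notified - received
```

Each hash returned by `find_missing` would be a candidate entry for the recipient's Missing Items folder.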
Putting it all together • Registration: • R contacts Dreg=H(H(R)) to register • Dreg sends R an email to set up registration secret • Posting notifications: • upon sending email M to R, S posts notification N to Dnot=H(R) • N = [Encrypt(H2(Mold), H1(M)), H1(Mold)]
Putting it all together • Retrieving notifications: • R asks Dnot for the notifications corresponding to H(R) and presents evidence of registration secret • Dnot contacts Dreg to verify evidence, before returning the notifications to R • R uses H1(Mold) to identify Mold and compute the encryption key H2(Mold) • R discards bogus notifications and checks for missing emails corresponding to the remaining notifications
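The posting and retrieval steps can be sketched end to end as follows. The slides do not specify the Encrypt primitive; the toy HMAC-based pad-and-tag construction here is an assumption chosen to keep the example within the standard library, and should not be read as a vetted cipher. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def h1(m: bytes) -> bytes:
    return hmac.new(b"suremail-h1", m, hashlib.sha256).digest()


def h2(m: bytes) -> bytes:
    return hmac.new(b"suremail-h2", m, hashlib.sha256).digest()


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Toy encryption of one 32-byte value, with an integrity tag."""
    nonce = secrets.token_bytes(16)
    pad = hmac.new(key, b"pad" + nonce, hashlib.sha256).digest()
    body = _xor(plaintext, pad)
    tag = hmac.new(key, b"tag" + nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def decrypt(key: bytes, blob: bytes):
    """Return the plaintext, or None if the tag does not verify."""
    nonce, body, tag = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(key, b"tag" + nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    pad = hmac.new(key, b"pad" + nonce, hashlib.sha256).digest()
    return _xor(body, pad)


def make_notification(m: bytes, m_old: bytes):
    """S's side: N = [Encrypt(H2(Mold), H1(M)), H1(Mold)]."""
    return encrypt(h2(m_old), h1(m)), h1(m_old)


def check_notification(n, replied_to: list, inbox: list) -> str:
    """R's side: classify N as 'bogus', 'delivered', or 'missing'."""
    blob, ident = n
    # Use H1(Mold) to find which replied-to message the sender means.
    m_old = next((m for m in replied_to if h1(m) == ident), None)
    if m_old is None:
        return "bogus"
    h1_m = decrypt(h2(m_old), blob)
    if h1_m is None:
        return "bogus"  # poster did not know H2(Mold)
    return "delivered" if any(h1(m) == h1_m for m in inbox) else "missing"
```

In this sketch a node that has only seen H1(Mold) cannot produce a blob that decrypts under H2(Mold), which is the property the reply-based secret is meant to provide.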
#1: Protecting the recipient’s identity • Goal: • only R should be able to retrieve notifications intended for it • attackers shouldn’t be able to learn even the volume of notifications intended for R • Key idea: • Email-based handshake: • prevents “hijacking” of R’s identity • Decoupling registration from notification: • prevents bad DHT node from associating notifications with R
#1: Protecting the recipient’s identity • Registration: • R contacts Dreg = H(H(R)) to register • Dreg sends R an email to set up a shared secret • Posting notifications: • Upon sending email M to R, S posts notification N = [H(M),S] to Dnot = H(R) • Retrieving notifications: • R presents an authenticator to Dnot and asks for the notifications corresponding to H(R) • Dnot contacts Dreg to verify the authenticator, before returning the notifications • R checks if emails are missing and presents the corresponding S to the user
SureMail Goals #1: protecting the recipient’s identity #2: protecting the sender’s identity #3: blocking notification spam
#2: protecting the sender’s identity • Goal: • attackers shouldn’t be able to learn S’s identity or monitor the volume of notifications posted by S • clearly N = [H(M),S] won’t do • Key idea: email-based shared secret • assuming no eavesdroppers, an email Mold from S to R is known only to S and R • so H(Mold) could serve as an authenticator and identifier of S to R
#2: protecting the sender’s identity • Posting notifications: • S’s identity is made implicit in the notification • N = [H(M), H(Mold)] • Retrieving notifications: • R stores hashes of emails received (recently) from various senders • it searches for H(Mold) to identify S • if H(Mold) can’t be found, the notification is ignored
#3: blocking notification spam • Goal: • prevent spammers from posting bogus notifications and burdening users with false reports of email loss • A malicious DHT node could itself be a spammer
#3: blocking notification spam • Current scheme is vulnerable to attack • Easy for malicious DHT node Dnot to generate spam: • Dnot has ready access to H(Mold) • it could spam R with bogus notifications purporting to be from S • Another spammer (say X) could also do so: • X sends R a message Mspam purporting to be from S • X can then use H(Mspam) as the implicit identifier in notifications • R may assume it is really missing an email from S • Spammers gain indirectly by annoying R and S, thereby discrediting notifications in general
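The replay weakness of the N = [H(M), H(Mold)] format can be illustrated with a short sketch. SHA-256 for H, the dictionary `received_from`, and the function `identify_sender` are assumptions made for the example.

```python
import hashlib


def h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def identify_sender(notification, received_from: dict):
    """R's lookup: map the H(Mold) field to a known sender, else None."""
    _h_m, h_m_old = notification
    return received_from.get(h_m_old)


# R has recently received Mold from S and stored its hash.
m_old = b"lunch on friday?"
received_from = {h(m_old): "S"}

# A genuine notification posted by S and stored at Dnot.
genuine = (h(b"new message from S"), h(m_old))

# Dnot never sees Mold itself, but it does see H(Mold) inside the genuine
# notification and can copy it into a notification for a message that
# was never sent.
forged = (h(b"message that was never sent"), genuine[1])
```

Here `identify_sender(forged, received_from)` attributes the forged notification to S, so R would be told it is missing an email that never existed.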
#3: blocking notification spam • Key idea: reply-based shared secret • users rarely engage in conversations with spammers • so if S receives a reply to a message Mold that it had sent R, S could use H(Mold) as an implicit identifier • hard for a spammer to spoof the identifier • special construction to prevent spoofing by Dnot • Notification format: • N = [Encrypt(H″(Mold), H(M)), H(Mold)] • R uses H(Mold) to identify Mold and compute the encryption key H″(Mold)
PKI-based Design • R↔DA: RegisterRecipient(R,A) • S↔DN: PostNotification(H(R),N) • N = [H(M), TTL, E(Rpb,S), Sg(Spv,H(M),TTL)] • R↔DN: CheckNotification(H(R),A) • DN: find DA = H(H(R)) • DN↔DA: AuthenticateRequest(H(R),A) • DN↔R: ReturnNotifications() • R: identify notifications corresponding to missing email and notify user if S is trusted
Notation • Sender S • Recipient R • Message M • Notification N • Crypto operations: • H: hash(…) • Sg: sign(key,…) • E: encrypt(key,…)
Overhead • duplication of effort with respect to email delivery
#3: preserve privacy of email content • attacker shouldn’t be able to learn email content
Silent Email Loss • Silent email loss is a non-negligible problem • silent loss = no notification to sender or recipient • imposes significant cost, degrades user experience • Several causes • spam filtering • MS IT: 90% of emails discarded off the bat • failures and upgrades • Measurement studies • 0.5-1.0% silent loss ([Lang ’04], [Afergan ’05]) • ongoing measurement study at MSR • quantify email delays & loss across ~25 domains
SureMail Overview • SureMail adds separate notification system • orthogonal to email delivery infrastructure • email is still subject to checking by spam filters, virus scanners • doesn’t create backdoor for malware • bounds worst case performance • Asynchronous operation • senders post notifications = hash(message) • recipients check for them • preserves the privacy of email (unlike read receipts)
SureMail Design • Reply-based shared secret • A sends an email M to B (A→B) • B replies to A’s email (B→A) • A uses hash(M) to “prove” to B that it is legitimate • shared secret is continually refreshed • Reply-based shared secret helps: • avoid burdening users with notification spam • maintain privacy of notifications