Introducing Email Permission Keys (EPK) - a unique key embedded in an email address that helps reduce false positives in spam filters and maintains user trust.
Email Permission Keys Adrian McElligott, CEO, Geobytes, Inc. Boston, March 2008
We have failed! They have failed! The real cost of lost messages • If the objective of today's Spam filters is to reduce the user's exposure to Spam, then for most users they have failed. • Exposing a user to Spam in their junk folder is still exposing the user to Spam. • If the user is routinely checking their junk folder then the filter is of diminished value. New Term: Lost Message Rate
Why are users still checking their junk folders? • False positives • False positives • False positives • False positives • They are bored • Because Spam Filters need them to
User Trust Oscillation The problem with dynamic re-training via "is Spam" and "is not Spam" buttons.
Introducing Email Permission Keys • An Email Permission Key (EPK) is a unique key that is embedded in an email address in such a way that it is likely to be retained during normal use, and is therefore available to be extracted at a later date when that email address is used to send a message back to the original user.
What are CaseKeys? • They are a type of email permission key that uses the CAsE of the LeTTerS that make up an email address to embed a unique key into every instance of that email address.
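The slides do not specify how a key is mapped onto letter case, so the following is only a minimal sketch under an assumed scheme: each ASCII letter in the address carries one bit of an integer key (uppercase = 1, lowercase = 0), and non-letters are skipped. The function names `embed_case_key` and `extract_case_key` are hypothetical and are not taken from the CaseKeys product.

```python
def _carries_bit(ch: str) -> bool:
    # Assumption: only ASCII letters carry a bit, so changing case never changes length.
    return ch.isascii() and ch.isalpha()


def embed_case_key(address: str, key: int) -> str:
    """Write the bits of `key` into the case of the letters of `address`."""
    capacity = sum(_carries_bit(ch) for ch in address)
    if key < 0 or key >= (1 << capacity):
        raise ValueError("key does not fit in the letters of this address")
    out = []
    bit = 0
    for ch in address:
        if _carries_bit(ch):
            # Bit set -> uppercase, bit clear -> lowercase.
            out.append(ch.upper() if (key >> bit) & 1 else ch.lower())
            bit += 1
        else:
            out.append(ch)
    return "".join(out)


def extract_case_key(address: str) -> int:
    """Read the key back from the case of the letters of `address`."""
    key = 0
    bit = 0
    for ch in address:
        if _carries_bit(ch):
            if ch.isupper():
                key |= 1 << bit
            bit += 1
    return key


keyed = embed_case_key("john.smith@example.com", 0b1011)
print(keyed)                    # JOhN.smith@example.com
print(extract_case_key(keyed))  # 11
```

Lowercasing the keyed address gives back the ordinary address, so under this assumed scheme the key only survives as long as the case is preserved.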
How CaseKeys Help • CaseKeys automate the “Is not Spam” button. • CaseKeys identify messages that would otherwise be false positives. • CaseKeys provide the “feedback” required to dynamically train the filter in real-time.
Conclusion • CaseKeys are a proactive approach to the false positives problem. • CaseKeys automate the “Is not Spam” button. • CaseKeys allow filters to maintain the user’s trust at levels that would be otherwise unsustainable.
Questions • Q1. How does this affect the user? • Q2. What proportion of incoming messages would be likely to contain a CaseKey? • Q3. How does this reduce spam? • Q4. How do CaseKeys work with systems that do not preserve all or part of the character casing? • Q5. Does publishing a CaseKeyed email address on your web site result in Spam being falsely white listed? • Q6. What is the advantage over white listing outbound recipients? • Q7. How does this reduce solicited bulk email false positives? • Q8. How does this reduce first contact false positives?
The End • The remaining slides are just here to assist with answering questions, and generating discussion.
Correction • Common misconception.
Q. How does this affect the user? • The technology is transparent to the user - the user does not have to be concerned with the Case - this happens for them automatically. • No one has to “type in” the CaseKey. CaseKeys are distributed automatically with outgoing messages. • The CaseKey is automatically preserved when the recipient clicks reply, or adds the sender to their address book. • Most modern mail readers display the sender’s Display Name, rather than email address, so often recipients don’t even see the CaseKeyed representation of the sender’s email address.
What proportion of incoming legitimate messages would contain a CaseKey? • Geobytes conducted a two-year trial of CaseKeys. Approximately 90% of the messages received over the trial period contained a valid CaseKey. • How people obtain an email address can be sorted into 3 categories. • Through a message that they receive - which they may reply to or add to their contacts - either way the CaseKey is preserved. • From an online resource - web page, news group, ezine, forum, etc. - in which case a CaseKey would be present. Note: Public CaseKeys auto-expire. • They type it in - from a business card, over the phone, from off-line media. Direct "typing in" of an email address is error prone and unreliable - so users tend to avoid it. • We have found that the vast majority of email messages contain recipient email addresses that fall into one of the first two "categories of acquisition" and therefore benefit from CaseKeys technology. Very few messages contain addresses that have been typed in, and this is the only category that does not benefit from CaseKeys.
Q. How does this reduce spam? • Whenever a user has to check their Spam folder, then they are still being exposed to all of their Spam - only the folder name is different. • CaseKeys may well be the difference between a system that users trust and one that they don't - the difference between exposure to all of the Spam, or no Spam.
Q. What about systems that do not preserve all of the character casing? • Most email systems (>85%) do preserve the case of the entire address, which is adequate to reduce a filter's false positive rate by over 80% and to automate filter training. • If 100% preservation is required then we use a hybrid of CaseKeys with “Display Name Annexing”. • Messages that do not contain a valid CaseKey are not disadvantaged by the CaseKeys subsystem, they just don’t directly benefit from it. They do however indirectly benefit from the filter tuning that automating the “Is not Spam” button provides. • RFC 2821 states that the local-part of a mailbox, which includes any ‘plus addressing’, "MUST BE treated as case sensitive".
Q. Does publishing a CaseKey result in Spam being falsely white listed? • CaseKeys that are published on web pages are set to auto expire. • In the event that a CaseKey does fall into the wrong hands and results in a False Negative, the user clicking “Is Spam” would invalidate the CaseKey.
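The two safeguards on this slide, auto-expiry of published keys and invalidation when the user clicks “Is Spam”, could be tracked with a small registry. The sketch below is an illustration only: the class `CaseKeyRegistry`, its method names, and the use of a time-to-live in seconds are all assumptions, since the deck does not describe how keys are stored.

```python
import time


class CaseKeyRegistry:
    """Hypothetical store of issued keys and their optional expiry times."""

    def __init__(self):
        # Maps key -> expiry timestamp in seconds, or None for "never expires".
        self._expiry = {}

    def issue(self, key, ttl_seconds=None, now=None):
        """Record a key; published keys would be issued with a ttl_seconds."""
        now = time.time() if now is None else now
        self._expiry[key] = None if ttl_seconds is None else now + ttl_seconds

    def is_valid(self, key, now=None):
        """A key is valid if it was issued, not revoked, and not yet expired."""
        now = time.time() if now is None else now
        if key not in self._expiry:
            return False
        expires = self._expiry[key]
        return expires is None or now < expires

    def report_spam(self, key):
        """Called when the user clicks 'Is Spam' on a message carrying this key."""
        self._expiry.pop(key, None)


registry = CaseKeyRegistry()
registry.issue(11, ttl_seconds=3600)  # key published on a web page, expires in an hour
registry.issue(42)                    # key sent in an ordinary outgoing message
registry.report_spam(42)              # user flagged a message that carried key 42
print(registry.is_valid(11), registry.is_valid(42))
```

Passing `now` explicitly keeps the expiry logic easy to exercise without waiting for real time to pass.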
Q. How does this reduce first contact false positives? • The user interface may provide a facility to issue the user with a unique CaseKey for the purpose of publication on a web page. CaseKeys that are issued for this purpose are set to auto expire. • An AJAX service automatically cycles the CaseKeys on a web page.
The advantage of CaseKeys over just white listing outbound recipients • You can expire CaseKeys, and while you can blacklist an email address you can’t issue the compromised user a new email address. • CaseKeys embed the key in the sender's address, which propagates when the message is forwarded to a third user. • Many users have multiple addresses feeding to the same inbox, so a reply may come from a different email address.
Q. How does this reduce solicited bulk email false positives? • A facility can be provided whereby the user can manually issue a unique CaseKey for the purpose of registering with a newsletter or online service.
New Term: Lost Message Rate (LMR) • Is the percentage of legitimate messages that are mistaken for Spam. Traditionally the industry has used the statistical term “false positive” which does not truly reflect the proportion of legitimate messages that the filter is losing. Return To: The real cost of lost messages
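The definition can be worked through with the numbers used in the deck's first quiz question (100 legitimate messages, one misfiled). The sketch below assumes, as that quiz implies, that a quoted "false positive" figure is sometimes computed over all mail processed, Spam included; the function names and the Spam volumes are hypothetical and chosen only for illustration.

```python
def lost_message_rate(legit_total, legit_flagged):
    """LMR: share of legitimate messages that were mistaken for Spam."""
    return legit_flagged / legit_total


def rate_over_all_mail(legit_total, legit_flagged, spam_total):
    """Assumed alternative: misfiled legitimate mail as a share of all mail processed."""
    return legit_flagged / (legit_total + spam_total)


legit, misfiled = 100, 1
print(lost_message_rate(legit, misfiled))  # 0.01, i.e. 1% regardless of Spam volume

# The same single lost message looks smaller as the Spam volume grows.
for spam in (900, 999_900):
    print(spam, rate_over_all_mail(legit, misfiled, spam))
```

Under that assumption the second figure shrinks toward zero as Spam volume rises, while the LMR stays at 1%.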
New Term: Display Name Annexing (DNA) • Is a type of email permission key that appends or encodes a unique key within the Display Name portion of the email address. • A typical display name key may look something like this: "John Smith 12345" <john.smith@example.com> where 12345 is the key.
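For the example address above, appending and recovering the key could look like the following sketch. It uses `parseaddr` and `formataddr` from Python's standard `email.utils` module; the helper names `annex_key` and `extract_annexed_key`, and the assumption that the key is a trailing run of digits separated from the name by whitespace, are illustrative only.

```python
import re
from email.utils import formataddr, parseaddr


def annex_key(display_name, address, key):
    """Build a From-style value with the key appended to the display name."""
    return formataddr((f"{display_name} {key}", address))


def extract_annexed_key(header_value):
    """Return (display name without key, address, key or None)."""
    name, address = parseaddr(header_value)
    # Assumed format: the key is the final whitespace-separated token, all digits.
    match = re.fullmatch(r"(.*\S)\s+(\d+)", name)
    if match is None:
        return name, address, None
    return match.group(1), address, match.group(2)


print(extract_annexed_key('"John Smith 12345" <john.smith@example.com>'))
# ('John Smith', 'john.smith@example.com', '12345')
```

Because the key sits in the display name rather than in the letter case, it is unaffected by systems that fold the address to lowercase, which is why the deck pairs it with CaseKeys when full preservation is required.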
Quick Quiz Question 1 If you receive 100 legitimate messages and your spam filter misplaces one of them in your Spam folder, then what is the filter's false positive rate? • One in a Billion • One in a Million • One in a Thousand • Could be any of the above depending on how much Spam you get.
Quick Quiz Question 2 What is the difference between being exposed to Spam in your inbox and being exposed to Spam in your Spam folder? • It takes longer to sort through two folders. • It is quicker to sort through two folders. • It is a perception, feel-good thing, it is less aggravating to be exposed to Spam in your Spam folder. • Either way you are still exposed to Spam