An Effective Defense Against Spam Laundering Mengjun Xie, Heng Yin, Haining Wang Presented by Dustin Christmann March 4, 2009
Outline • Introduction • Spam Laundering • Anti-Spam Techniques • Proxy-Based Spam Behavior • DBSpam • DBSpam Evaluation • Potential Evasions
Introduction What is spam? Classic definition: a canned precooked meat product made by the Hormel Foods Corporation, introduced in 1937. “SPAM” is commonly said to stand for “SPiced hAM.” Modern definition: the abuse of electronic messaging systems to send unsolicited bulk messages indiscriminately.
Introduction So how did we get from one definition to the other? A 1970 Monty Python sketch, entitled “Spam.”
Spam Laundering [Figure: two diagrams, one labeled “Email relay” and “MTA”, the other labeled “Proxy” and “MTA”]
Anti-Spam Techniques Three main categories: • Recipient-oriented techniques • Sender-oriented techniques • HoneySpam
Recipient-oriented Techniques Two main categories: • Content-based techniques • Non-content-based techniques
Content-Based Techniques • Email address filters • Heuristic filters • Machine-learning based filters
Non-content-based Techniques • DNSBLs • MARID • Challenge-Response • Tempfailing • Delaying • Sender Behavior Analysis
Sender-oriented Techniques • Usage regulation • Cost-based approaches
HoneySpam • Based on honeyd • Set up • Fake web servers • Fake open proxies • Fake relays • Log the users of these fake servers as spam sources
Proxy-based Spam Behavior Normal email transmission [Figure labels: MTA, router, corporate / campus / home network]
Proxy-based Spam Behavior Proxy-based spam [Figure labels: MTA, router, corporate / campus / home network]
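Connection Correlation • In proxy-based spam, there is a one-to-one mapping between the upstream connection (spammer to proxy) and the downstream connection (proxy to MTA) • In normal email transmission, there is only one connection, with no upstream counterpart • Problems with correlating connections directly • Upstream encryption • Overhead • Timing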
Packet Symmetry • Message symmetry • SMTP message from downstream connection results in TCP message to upstream connection • Packet symmetry • One packet from downstream connection results in one packet to upstream connection • Exceptions
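The packet symmetry idea above can be sketched in a few lines of Python. This is only an illustration, not the DBSpam implementation: the function name `correlated_replies`, the use of bare timestamps, and the `max_delay` window of 50 ms are all assumptions made for the example. Each SMTP reply arriving on the downstream connection is paired with at most one later packet on the upstream connection.

```python
from bisect import bisect_left


def correlated_replies(reply_times, upstream_times, max_delay=0.05):
    """Return one boolean per SMTP reply (in time order).

    True means an unused upstream packet followed the reply within
    max_delay seconds, i.e. the 1:1 packet symmetry was observed.
    All names and the default delay are hypothetical.
    """
    upstream = sorted(upstream_times)
    next_unused = 0  # index of the first upstream packet not yet paired
    results = []
    for t in sorted(reply_times):
        # First upstream packet at or after the reply that is still unpaired.
        i = max(next_unused, bisect_left(upstream, t))
        if i < len(upstream) and upstream[i] - t <= max_delay:
            results.append(True)
            next_unused = i + 1  # enforce one-to-one pairing
        else:
            results.append(False)
    return results


if __name__ == "__main__":
    print(correlated_replies([1.0, 2.0, 3.0], [1.01, 2.02, 3.5]))
```

Because the check looks only at packet timing and counts, it does not need to read the upstream payload, which is why upstream encryption does not get in its way.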
DBSpam Goals: • Fast detection of spam laundering with high accuracy • Breaking spam laundering via throttling or blocking after detection • Support for spammer tracking and law enforcement • Support for spam message fingerprinting • Support for global forensic analysis
Deployment of DBSpam • At a network vantage point where it can monitor the bi-directional traffic Single-homed network:
Deployment of DBSpam Multi-homed network
Design of Spam Laundering Detection • With proxy-based spam transmission, number of incoming SMTP reply packets = number of outgoing TCP packets • This can also occur with normal traffic, but only rarely • The Sequential Probability Ratio Test (SPRT) is used to decide between the two cases
SPRT • Can be viewed as a one-dimensional “random walk” starting between two boundaries • One boundary defines “spam connection” • Other boundary defines “not a spam connection”
SPRT • Each observation pushes the walk in one direction or the other • Observation of correlated SMTP-TCP packets pushes the walk toward “spam connection” • Observation of no correlation pushes the walk toward “not a spam connection” • When the walk hits either boundary, the test ends
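The random walk described above can be sketched as follows. This is a minimal illustration that assumes Wald's standard boundary approximations and Bernoulli observations (correlated or not correlated); the deck does not spell out the exact boundaries DBSpam uses. The values θ1 = 0.99 and β* = 0.01 match the figure caption in this deck, while θ0 = 0.2 and α* = 0.005 are illustrative assumptions, as is the function name `sprt`.

```python
import math


def sprt(observations, theta0=0.2, theta1=0.99, alpha=0.005, beta=0.01):
    """Run a sequential probability ratio test over boolean observations.

    observations: iterable of booleans, True = correlated packet pair seen.
    theta1 / theta0: probability of a correlation under "spam" / "not spam".
    alpha / beta: desired false positive / false negative probabilities.
    Returns (decision, number_of_observations_used).
    """
    # Wald's approximate boundaries for the log-likelihood ratio.
    lower = math.log(beta / (1 - alpha))    # "not a spam connection"
    upper = math.log((1 - beta) / alpha)    # "spam connection"
    step_up = math.log(theta1 / theta0)                 # correlated
    step_down = math.log((1 - theta1) / (1 - theta0))   # not correlated

    walk = 0.0
    used = 0
    for used, correlated in enumerate(observations, start=1):
        walk += step_up if correlated else step_down
        if walk >= upper:
            return "spam", used
        if walk <= lower:
            return "not spam", used
    return "undecided", used


if __name__ == "__main__":
    print(sprt([True] * 10))
    print(sprt([False] * 10))
```

With these assumed parameters, a connection that shows a correlation on every observation is flagged after four observations, and one that never shows a correlation is cleared after two.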
SPRT • The average number of observations required to reach a decision depends on four parameters: • α* (the desired probability of false positives) • β* (the desired probability of false negatives) • θ1 (the probability of observing a correlation when the connection is a spam connection) • θ0 (the probability of observing a correlation when it is not)
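How the four parameters combine can be made concrete with Wald's approximation for the expected sample size under H1 (the connection is a spam connection): E[N|H1] ≈ (β* ln(β*/(1−α*)) + (1−β*) ln((1−β*)/α*)) / (θ1 ln(θ1/θ0) + (1−θ1) ln((1−θ1)/(1−θ0))). The sketch below evaluates that textbook formula; it is assumed here to be the quantity on this slide, and the function name and the sample values of θ0 and α* are hypothetical.

```python
import math


def expected_observations_h1(theta0, theta1, alpha, beta):
    """Wald's approximation of E[N | H1] for a Bernoulli SPRT.

    theta0 / theta1: probability of a correlation under H0 / H1.
    alpha / beta: desired false positive / false negative probabilities.
    """
    lower = math.log(beta / (1 - alpha))
    upper = math.log((1 - beta) / alpha)
    # Expected log-likelihood-ratio step per observation when H1 is true.
    drift = (theta1 * math.log(theta1 / theta0)
             + (1 - theta1) * math.log((1 - theta1) / (1 - theta0)))
    # Under H1 the walk ends at the lower boundary with probability beta.
    return (beta * lower + (1 - beta) * upper) / drift


if __name__ == "__main__":
    for theta0 in (0.1, 0.2, 0.5):
        n = expected_observations_h1(theta0, theta1=0.99, alpha=0.005, beta=0.01)
        print(f"theta0={theta0}: E[N|H1] is about {n:.2f}")
```

The formula shows the trend one would expect: the closer θ0 is to θ1, the smaller each step of the walk and the more observations are needed; a stricter α* pushes the “spam connection” boundary further away and has the same effect.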
SPRT E[N|H1] vs. θ0 and α* (θ1 = 0.99, β* = 0.01)
Noise Reduction • Maintain a set of external IP addresses that appear in each time window • Over M consecutive time windows, single out the external IP addresses that appear at least K times • Can further reduce the incidence of false positives dramatically, depending on the selection of M and K
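The K-out-of-M filtering above can be sketched with a sliding window of per-window IP sets. This is an illustration under assumptions: the function name `persistent_suspects`, the choice to report a result after every window, and the example addresses (taken from the documentation range 203.0.113.0/24) are not from the original work.

```python
from collections import Counter, deque


def persistent_suspects(windows, m, k):
    """For each time window, return the external IPs seen in at least
    k of the most recent m windows (including the current one).

    windows: iterable of sets of external IP addresses, one set per window.
    """
    recent = deque(maxlen=m)   # the last m per-window sets
    counts = Counter()         # in how many recent windows each IP appears
    results = []
    for flagged in windows:
        if len(recent) == m:
            # The oldest window is about to fall out of the deque.
            for ip in recent[0]:
                counts[ip] -= 1
                if counts[ip] == 0:
                    del counts[ip]
        recent.append(set(flagged))
        counts.update(recent[-1])
        results.append({ip for ip, c in counts.items() if c >= k})
    return results


if __name__ == "__main__":
    trace = [
        {"203.0.113.5", "203.0.113.9"},
        {"203.0.113.5"},
        {"203.0.113.5", "203.0.113.7"},
        {"203.0.113.7"},
    ]
    for i, suspects in enumerate(persistent_suspects(trace, m=3, k=2), start=1):
        print(i, sorted(suspects))
```

An address that trips the detector once by chance (203.0.113.9 in the example) never reaches K appearances and is filtered out, which is the source of the reduction in false positives.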
DBSpam Evaluation • Evaluation at College of William & Mary • Two off-campus PCs as spam sources • Two PCs in different campus subnets running SOCKS and HTTP proxies • Spam “sink” in dark net • Traces run in two different months • N-* traces include no spam traffic • S-*-C traces include encrypted spam; S-*-A and S-*-B traces include unencrypted spam
DBSpam Evaluation SPRT Detection Time
DBSpam Evaluation Distribution of N|H0
DBSpam Evaluation CDF of Detection Time for SPRT
DBSpam Evaluation Accuracy of SPRT
DBSpam Evaluation Accuracy of SPRT after noise reduction
DBSpam Evaluation Resource Consumption
Potential Evasions • Fragmenting SMTP replies at the proxy • Change the 1:1 packet symmetry into 1:2 or 1:3 • Inserting random delays at the proxy • Randomly change the 1:1 packet symmetry into 1:0 or 1:2
Strengths • Simple to implement • Moves spam detection closer to the source, reducing network traffic • Works even when the spam traffic is encrypted • Detects proxy-based spam quickly • Few false positives
Weaknesses • Easy to evade by breaking packet symmetry • Can be thwarted by short SMTP dialogs • Must be installed at ISP edge • Too resource-intensive for embedded systems