Use HoneyStat! Don’t become a stat… Local Worm Detection using Honeypots Justin Miller Jan 25, 2007
Original Paper HoneyStat: Local Worm Detection Using Honeypots By: D.Dagon, X.Qin, G.Gu, W.Lee, J.Grizzard, J.Levine, H.Owen Georgia Institute of Technology
Background • Worm detection systems • Detection in local networks • HoneyStat nodes • Data collection • Improvements in HoneyStat
Worm Detection • Relied on artifacts incidental to worm infection • Measure incoming scan rates • Filter results for small networks • Increase data collection • Global monitoring centers • Doesn’t help local networks
Worm Detection • Proposition: use honeypots to improve accuracy of alerts (local intrusion detection) • Honeypot – computer system set up as a trap for attackers
Honeypot • Network decoy • Distracts attackers • Gather early warnings about new attacks • Facilitate in-depth analysis of adversary’s strategy
Honeypot Use • Gather info about how human attackers operate • Labor-intensive log analysis (1:40) • Roughly 1 week of analysis per hour of logged data • Virtual honeypots • Used to prevent OS fingerprinting
Honeypot Use • Detect/disable worms (honeyd) • Not ready for early warning IDS • Know attack pattern • Catch zero day worms – already know system vulnerability
Worm Detection • Worm propagation proposals • Model to study worm spreading • Early detection proposals • Statistical models analyze repeated outgoing connections • Worm info collected at routers
Objective • Early worm detection challenges • Large space to monitor • Coordinated responses • Focus on local networks • Detection using local honeypots • Lower false positive rate of worms
Infection Cycles • 3 actions result from infection • Memory events • Network events • Disk events • Describe worm installation on compromised system
Memory Events • Begins with probe for victim • Provides port • Victim shell listens on port 4444 • Honeypot acknowledges incoming packets • Infection begins corrupting process
Network Events • Blaster shell remains open for only one connection • Instructs victim to download “egg” program • Honeypot initiates TCP or UDP traffic
Disk Events • Occur after Blaster “egg” is downloaded • Disk writes – become active after system reboot • Not all worms have disk writes
Data Capture • Most worms follow similar cycle • Traditional worm detection • Usually at start or end of cycle • Activity in middle of cycle can be tracked • Intrusion detection based on scan rates has high rate of noise
HoneyStat Node • Minimal honeypot created in an emulator • Covers large address space • Honeypots remain idle until HoneyStat event occurs
HoneyStat Data • Data recorded includes: • OS/patch level of host • Type of event • Trace file of all prior network activity
HoneyStat Events • Events forwarded to analysis node • Usually central server • Places alert events in queue • Perform statistical analysis
Data Analysis • Check if event corresponds to an active honeypot • Update previous event to include new event • Reset honeypot if event involved Network Events (downloading an egg or initiating outgoing scans)
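The bookkeeping steps above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the names `HoneyStatEvent`, `AnalysisNode`, `honeypot_id`, and `completed` are hypothetical, and the record fields simply mirror the "HoneyStat Data" slide (OS/patch level, event type, prior network trace).

```python
from dataclasses import dataclass, field
from enum import Enum


class EventType(Enum):
    MEMORY = "memory"
    NETWORK = "network"
    DISK = "disk"


@dataclass
class HoneyStatEvent:
    honeypot_id: str          # hypothetical identifier for the emulated host
    os_patch_level: str       # OS/patch level of the host
    event_type: EventType     # memory, network, or disk event
    trace: list = field(default_factory=list)  # prior network activity


class AnalysisNode:
    """Hypothetical analysis node that tracks events per honeypot."""

    def __init__(self):
        self.active = {}      # honeypot_id -> events seen since last reset
        self.completed = []   # event histories kept for later correlation
        self.reset_log = []   # honeypots that have been reset

    def handle(self, event):
        history = self.active.get(event.honeypot_id)
        if history is None:
            # First event from this honeypot: it is now active.
            self.active[event.honeypot_id] = [event]
        else:
            # Honeypot already active: extend its record with the new event.
            history.append(event)
        if event.event_type is EventType.NETWORK:
            # Egg download or outgoing scan: reset so it cannot infect others.
            self.reset(event.honeypot_id)

    def reset(self, honeypot_id):
        history = self.active.pop(honeypot_id, None)
        if history is not None:
            self.completed.append(history)
        self.reset_log.append(honeypot_id)
```

Archiving the history before the reset is a design choice of this sketch, so that the correlation step still has the full event sequence after the honeypot returns to its idle state.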
Data Analysis • Analysis node examines basic properties of the event • HoneyStat event is correlated with other observed events • Search for worm pattern • Objective: Zero-day worms • Statistical analysis identifies worm behavior
Logistic Regression • Analyzes port correlation • Non-linear transformation of linear regression model • Honeypot event is dichotomous • Awake (1) or asleep (0)
Logistic Regression • Model is binary expectation of the honeypot state • j: counter for honeypot events • i: counter for each individual port traffic for a specific honeypot
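The model equation itself did not survive on this slide. For orientation, a standard binary logistic model written with the slide's indices would look like the following. The symbols $Y_j$ (state of the honeypot at event $j$) and $x_{i,j}$ (traffic observed on port $i$ before event $j$) are assumptions introduced here for illustration, and the exact form used in the paper may differ.

```latex
\Pr(Y_j = 1) = \frac{1}{1 + e^{-z_j}},
\qquad
z_j = \beta_{0,j} + \sum_{i} \beta_{i,j}\, x_{i,j}
```

Here $Y_j = 1$ means the honeypot is awake and $Y_j = 0$ means it is asleep, which matches the dichotomous outcome described on the previous slide.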
Logistic Regression • Measures inverse of time between honeypot events • Resolve equation after each event • Identify candidate ports that explain why honeypots become active • Also finds traffic patterns • Traffic measured for last 5 minutes
Logistic Analysis • Estimate $\beta_{i,j}$ coefficients by maximum likelihood estimation (MLE) • Find coefficients that minimize prediction error • Find which variables significantly affect honeypot activity • Single variable = ALERT!
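To make the estimation step concrete, here is a small pure-Python sketch that fits a logistic model by gradient ascent on the log-likelihood and then looks for the port whose coefficient dominates. The data are synthetic and invented for illustration (normalized traffic levels on ports 135, 139, and 445), the function names are hypothetical, and ranking coefficients by magnitude is a simplification that stands in for a proper significance test.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.5, steps=2000):
    """Batch gradient ascent on the log-likelihood; returns [b0, b1, ...]."""
    n_features = len(X[0])
    beta = [0.0] * (n_features + 1)
    for _ in range(steps):
        grad = [0.0] * (n_features + 1)
        for row, label in zip(X, y):
            z = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
            err = label - sigmoid(z)
            grad[0] += err
            for k, x in enumerate(row):
                grad[k + 1] += err * x
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta


# Synthetic example: normalized traffic on each port in the window
# before the observation, and whether the honeypot was awake (1) or asleep (0).
PORTS = [135, 139, 445]
X = [
    [0.9, 0.2, 0.5], [0.8, 0.7, 0.1], [0.7, 0.1, 0.9], [0.9, 0.6, 0.4],
    [0.6, 0.5, 0.5], [0.2, 0.8, 0.6], [0.1, 0.3, 0.2], [0.3, 0.9, 0.8],
    [0.2, 0.1, 0.4], [0.4, 0.4, 0.6],
]
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

beta = fit_logistic(X, y)
# Candidate port: the one with the largest-magnitude coefficient.
candidate = max(zip(PORTS, beta[1:]), key=lambda pc: abs(pc[1]))[0]
print("coefficients:", dict(zip(PORTS, beta[1:])))
print("candidate port:", candidate)
```

In this toy data set only the port 135 traffic separates awake from asleep observations, so its coefficient stands out. That is the "single variable" situation the slide associates with an alert.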
Practical Aspects • Properly identify worm outbreaks • Low false positive rate • Sample data from 6 honeypots active during Blaster worm
Worm Detection • Logit Analysis of Multiple HoneyStat Events
Worm Detection • Scans on ports 135, 139, 445 • No test can focus on 135 alone • Leads to pattern for 1 worm • Require: 10 sample events • Not sure of effective sample size
Benefits • Accurate data stream • Events result from successful attack • Reduces amount of data to process • Detects zero day worms • Detects the ports a worm uses to enter and exit • Finds presence and also explains worm activity
False Positives • Identify wrong network traffic • Worm present, HoneyStat identifies wrong source • Repeated human break-ins could be identified as a worm • Disregard manual break-ins • These are more dangerous than robotic worms
Sample Data • Tested HoneyStat on the Internet • Injected a worm attack at Georgia Tech • Log from 2002-2004 • Random sample of 250+ synthetic honeypot events • 0 false positives
HoneyStat as IDS • Low false positive rate • Good for local IDS • Effectively detects worms using random scan techniques • Will attack honeypots
HoneyStat as IDS • What about non-random worms? • $\Omega$ = entire IPv4 space ($2^{32}$) • $T$ = # of potential victims • $N$ = total vulnerable machines • $n_i$ = # of victims at time $i$ • $s$ = scan rate
HoneyStat as IDS • $k_{i+1} = s\,n_i\,T/\Omega$ • # scans entering space $T$ at time $(i+1)$ • $P = 1 - (1 - 1/T)^{k_{i+1}}$ • Probability of host being hit
HoneyStat as IDS • Worm propagation equation: $n_{i+1} = n_i + [N - n_i]\left(1 - (1 - 1/T)^{s\,n_i\,T/\Omega}\right)$ • $T$ and $\Omega$ are big, so $(1 - 1/T)^{k} \approx 1 - k/T$, reducing to: • $n_{i+1} \approx n_i + [N - n_i]\,s\,n_i/\Omega$ • Same as previous models
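The propagation recurrence can be checked numerically. The sketch below iterates the full equation and, next to it, the large-$T$ approximation $n_{i+1} \approx n_i + (N - n_i)\,s\,n_i/\Omega$, which follows from $(1 - 1/T)^k \approx 1 - k/T$. The parameter values (N, s, T, the starting count, and the number of ticks) are hypothetical and chosen only for illustration, as are the function names.

```python
OMEGA = 2 ** 32  # size of the IPv4 address space


def step_exact(n_i, N, s, T, omega=OMEGA):
    # Scans landing in the target space T during this tick.
    k = s * n_i * T / omega
    # Probability that a given host in T is hit at least once.
    p_hit = 1.0 - (1.0 - 1.0 / T) ** k
    return n_i + (N - n_i) * p_hit


def step_approx(n_i, N, s, omega=OMEGA):
    # Large-T approximation: p_hit is roughly s * n_i / omega.
    return n_i + (N - n_i) * s * n_i / omega


def simulate(step, n0, ticks):
    counts = [float(n0)]
    for _ in range(ticks):
        counts.append(step(counts[-1]))
    return counts


# Hypothetical parameters, for illustration only.
N = 1_000_000   # vulnerable machines
S = 10          # scans per infected host per tick
T = 2 ** 24     # size of the space the worm targets

exact = simulate(lambda n: step_exact(n, N, S, T), n0=10, ticks=3000)
approx = simulate(lambda n: step_approx(n, N, S), n0=10, ticks=3000)
print("exact  infected after 3000 ticks:", round(exact[-1]))
print("approx infected after 3000 ticks:", round(approx[-1]))
```

With these values the two curves stay close to each other, which is the sense in which the reduced model matches the earlier random-scan models.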
HoneyStat as IDS • Machines can be multihomed • Each searches hundreds of IP addresses • Local early worm detection • $D = 2^{11}$ • $\alpha = 0.25$ • First victim found after 0.19% of vulnerable hosts are infected
Contributions • Statistical techniques used in worm detection • Previously applied time series-based statistical analysis • Logistic regression detects worm outbreaks
Weakness • Honeypot evasion • Attackers have worms detect and avoid honeypot traps • Attackers make observations about victim’s machine • Effective sample size unknown
Improvements • Reduce traffic length (logistic) measured < 5 minutes • Studies recent network events • Improve quality of data • Avoid linear identification of multiple worms • Best Subsets logistic regression • Study effective sample size
Conclusion • Further research for local IDS • Logistic regression detects worm outbreaks • Honeypots create accurate alerts • 3 classes: memory, disk, network events • Logit analysis eliminates noise • Extensive data traces identify worm activity
Questions?