Fast Portscan Detection Using Sequential Hypothesis Testing. Authors: Jaeyeon Jung, Vern Paxson, Arthur W. Berger, and Hari Balakrishnan. Publication: IEEE Symposium on Security and Privacy 2004. Presenter: Ryan Cunningham.
A quick note • All images and equations taken directly from the publication
Port scanning • Network reconnaissance technique • Usually a prelude to an attack • Difficult to detect • Traffic difficult to distinguish from regular traffic • Stealth scans can occur very slowly • Some scans are legitimate • Search engine spiders • SSH, peer-to-peer applications, etc.
Previous detection techniques • Limit distinct connection attempts from one IP • Network Security Monitor • Snort • Also detects malformed packets • Limit failed connection attempts from one IP • Bro • Sensitive to service on specific port • Robertson et al. showed threshold very important
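The threshold idea on this slide can be sketched as a per-source counter of failed connection attempts. This is only an illustration of the general approach, not the actual logic of Snort, Bro, or Network Security Monitor; the function name `threshold_scanners`, the event tuple layout, and the threshold value are all assumptions made for the example.

```python
from collections import defaultdict

def threshold_scanners(events, max_failed_hosts=3):
    """Flag sources whose failed attempts reach more than max_failed_hosts distinct hosts.

    events: iterable of (src_ip, dst_ip, succeeded) tuples (hypothetical layout).
    """
    failed = defaultdict(set)  # src_ip -> distinct destinations with a failed attempt
    flagged = set()
    for src, dst, succeeded in events:
        if not succeeded:
            failed[src].add(dst)
            if len(failed[src]) > max_failed_hosts:
                flagged.add(src)
    return flagged

# One source fails against five distinct hosts; another fails only once.
events = [("10.0.0.9", "192.168.1.%d" % i, False) for i in range(1, 6)]
events += [("10.0.0.5", "192.168.1.1", True), ("10.0.0.5", "192.168.1.2", False)]
flagged = threshold_scanners(events)
```

The single fixed cutoff is what makes this approach sensitive to the choice of threshold: too low and benign hosts are flagged, too high and slow scans slip through.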
Previous detection techniques • Probabilistic model • Developed by Leckie et al. • Assesses the typical traffic a machine receives • Also assesses the traffic a remote machine is likely to send • Combines these probabilities • If the combined result crosses a threshold, an alert is raised • Generates too many false positives
Previous detection techniques • SPICE • Similar to the probabilistic model • Used to detect low-traffic “stealth” scans • Too computationally intensive for real-world use
Data set • Traffic from two sites • LBL • 6,000 hosts • Sparse address space (4.4% populated) • ICSI • 200 hosts • Dense address space (42% populated)
Data set • Anonymized TCP logs from Bro • Recorded for one 24 hour period • Bro NIDS flags for comparison and validation
Data set • Analysis of unsuccessful connection attempts
Data set • Analysis of the ratio of successful to unsuccessful connection attempts
Observations • Scans usually come from one host • Scans make lots of failed connection attempts and few successful connection attempts • Scans should ideally be detected quickly • False positive rate should be configurable
Sequential Hypothesis Testing • Proposed by Wald in the 1940s • Method of repeating a hypothesis test as sequential data is gathered • Decides between two hypotheses • Each time a data point arrives, decide • Accept H0 (in our case, benign traffic) • Accept H1 (in our case, port scan traffic) • Wait for more data (next connection attempt)
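Sequential Hypothesis Testing • We specify parameters $\alpha$ and $\beta$ • $\alpha \ge$ false positive rate (upper bound) • $\beta \le$ detection probability (lower bound) • We must estimate parameters $\theta_0$ and $\theta_1$ • $\theta_0$: probability a benign connection attempt is successful • $\theta_1$: probability a scanner connection attempt is successful
Sequential Hypothesis Testing • For each test, we compute the likelihood ratio: $\Lambda(Y) = \prod_{i=1}^{n} \frac{\Pr[Y_i \mid H_1]}{\Pr[Y_i \mid H_0]}$ • Where $Y_i = 0$ if the $i$-th connection attempt succeeds and $Y_i = 1$ if it fails, so that $\Pr[Y_i = 0 \mid H_j] = \theta_j$ and $\Pr[Y_i = 1 \mid H_j] = 1 - \theta_j$ for $j \in \{0, 1\}$
Sequential Hypothesis Testing • Compare likelihood ratio to the thresholds $\eta_1 = \beta / \alpha$ and $\eta_0 = (1 - \beta) / (1 - \alpha)$ • If • $\Lambda(Y) < \eta_0$ then this is benign traffic • $\Lambda(Y) > \eta_1$ then this is scan traffic • Otherwise, wait for another connection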
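The decision procedure on the last three slides can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function name `sprt_decide` is hypothetical, and the default values for the success probabilities (0.8 benign, 0.2 scanner) and for the error bounds (0.01 and 0.99) are illustrative assumptions, not values stated on the slides. The thresholds are Wald's standard approximations, the upper one being beta/alpha and the lower one (1 - beta)/(1 - alpha).

```python
def sprt_decide(outcomes, theta0=0.8, theta1=0.2, alpha=0.01, beta=0.99):
    """Run a sequential hypothesis test over one remote host's connection outcomes.

    outcomes: iterable of booleans, True if the connection attempt succeeded.
    Returns (decision, n): decision is "benign", "scanner", or "pending",
    and n is the number of outcomes consumed.
    """
    eta1 = beta / alpha              # upper threshold: accept H1 (scanner)
    eta0 = (1 - beta) / (1 - alpha)  # lower threshold: accept H0 (benign)
    ratio = 1.0                      # running likelihood ratio
    n = 0
    for success in outcomes:
        n += 1
        if success:
            ratio *= theta1 / theta0              # success is more likely under H0
        else:
            ratio *= (1 - theta1) / (1 - theta0)  # failure is more likely under H1
        if ratio > eta1:
            return "scanner", n
        if ratio < eta0:
            return "benign", n
    return "pending", n

scan_result = sprt_decide([False] * 10)           # ("scanner", 4)
benign_result = sprt_decide([True] * 10)          # ("benign", 4)
mixed_result = sprt_decide([False, True, False])  # ("pending", 3)
```

With these assumed values each failure multiplies the ratio by 4 and each success divides it by 4, so four consecutive failures push it to 256, above the upper threshold of 99. A production version would more likely accumulate log-likelihoods to avoid underflow on long sequences.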
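Sequential Hypothesis Testing • We can estimate the expected number of connections $N$ required to decide, given the error bounds $\alpha$, $\beta$ and the success probabilities $\theta_0$ (benign) and $\theta_1$ (scanner), with: $E[N \mid H_0] = \frac{\alpha \ln\frac{\beta}{\alpha} + (1-\alpha)\ln\frac{1-\beta}{1-\alpha}}{\theta_0 \ln\frac{\theta_1}{\theta_0} + (1-\theta_0)\ln\frac{1-\theta_1}{1-\theta_0}}$ and $E[N \mid H_1] = \frac{\beta \ln\frac{\beta}{\alpha} + (1-\beta)\ln\frac{1-\beta}{1-\alpha}}{\theta_1 \ln\frac{\theta_1}{\theta_0} + (1-\theta_1)\ln\frac{1-\theta_1}{1-\theta_0}}$ • Derivation is long and messy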
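Wald's approximation for the expected number of observations is easy to evaluate numerically. The sketch below assumes the standard form of that approximation for a test on success/failure outcomes. The function name `expected_observations` and the parameter values in the usage line are illustrative assumptions.

```python
import math

def expected_observations(theta0, theta1, alpha, beta):
    """Approximate E[N | H0] and E[N | H1] for the sequential test.

    theta0 / theta1: probability a connection succeeds for a benign host / scanner.
    alpha: bound on the false positive rate; beta: bound on the detection probability.
    """
    log_upper = math.log(beta / alpha)              # log of the upper threshold
    log_lower = math.log((1 - beta) / (1 - alpha))  # log of the lower threshold
    step_success = math.log(theta1 / theta0)              # log-ratio step on a success
    step_failure = math.log((1 - theta1) / (1 - theta0))  # log-ratio step on a failure

    # Expected log-ratio at stopping time divided by the expected step per observation.
    e_n_h0 = (alpha * log_upper + (1 - alpha) * log_lower) / (
        theta0 * step_success + (1 - theta0) * step_failure
    )
    e_n_h1 = (beta * log_upper + (1 - beta) * log_lower) / (
        theta1 * step_success + (1 - theta1) * step_failure
    )
    return e_n_h0, e_n_h1

# Assumed example values; both expectations come out at roughly 5.4 observations.
e_n_h0, e_n_h1 = expected_observations(theta0=0.8, theta1=0.2, alpha=0.01, beta=0.99)
```

Because the assumed parameters are symmetric, the two expectations coincide here; in general they differ.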
Results • Efficiency = true positive / total reported positive • Effectiveness = true positive / total actually positive
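The two metrics correspond to what is elsewhere called precision (efficiency) and recall (effectiveness). A small sketch of the calculation follows; the function name and the host labels are made up for illustration and are not taken from the paper's data.

```python
def efficiency_effectiveness(flagged, actual_scanners):
    """Return (efficiency, effectiveness) for a set of flagged hosts.

    efficiency    = true positives / total reported positives
    effectiveness = true positives / total actual positives
    """
    flagged, actual_scanners = set(flagged), set(actual_scanners)
    true_positives = len(flagged & actual_scanners)
    efficiency = true_positives / len(flagged) if flagged else 0.0
    effectiveness = true_positives / len(actual_scanners) if actual_scanners else 0.0
    return efficiency, effectiveness

# Hypothetical example: 4 hosts flagged, 5 real scanners, 3 in common.
eff, effv = efficiency_effectiveness({"a", "b", "c", "d"}, {"a", "b", "c", "e", "f"})
```

Here efficiency is 3/4 and effectiveness is 3/5: one flagged host was a false positive and two real scanners were missed.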
Results • Comparison with Snort and Bro • $\bar{N}$ = average number of local hosts scanned before a decision is made
Contributions • Extremely fast port scan detection algorithm • High accuracy • Low false positive rate • Sound statistical foundation • Sound evaluation of the weaknesses of their own approach • Good use of appendixes • Cure for insomnia
Weaknesses • Buffer of activity • Attacker can spoof multiple IP addresses • How is filled buffer dealt with? • Flush buffer • Attacker can use this to hide scan activity • Maintain larger buffer • Attacker can keep going until system crashes • Distributed port scans undetectable • Botnets are increasing in popularity
Weaknesses • Test assumes independent connection attempts • As suggested in the paper, an attacker could exploit knowledge of the system to connect to some systems while doing surveillance on others • No real-time testing conducted, only simulation • Reasoning is a little circular • Poor use of language
Improvements • Implement and test in real time • Perform the improvements suggested in the paper • Differentiate between different services • Differentiate between rejected and unanswered connection attempts • Use a honeypot to see whether the full three-way handshake is completed (to detect spoofed IPs) • Should have held out some of the data as a test set