An empirical approach to modeling uncertainty in Intrusion Analysis
Xinming (Simon) Ou (1), S. Raj Rajagopalan (2), Sakthi Sakthivelmurugan (1)
1 – Kansas State University, Manhattan, KS
2 – HP Labs, Princeton, NJ
A day in the life of a real SA
A system administrator notices abnormally high traffic: a TrendMicro server is communicating with known BotNet controllers. Network monitoring tools — a netflow dump and a memory dump — provide more clues: seemingly malicious code modules, and open IRC sockets with other TrendMicro servers. The SA concludes: these TrendMicro servers are certainly compromised!
Key challenge: how to deal with uncertainty in intrusion analysis?
An empirical approach
• In spite of the lack of theory or good tools, sysadmins are coping with attacks.
• Can we build a system that mimics what they do (for a start)?
• An empirical approach to intrusion analysis using existing reality
• Our goal: help a sysadmin do a better job, rather than replace them
Architecture overview
Observations (IDS alerts, netflow dump, syslog, server log, …) are mapped to their semantics in an internal model; a reasoning engine derives high-confidence conclusions with evidence, and targets subsequent observations.
Capture Uncertainty Qualitatively
• Arbitrarily precise quantitative measures are not meaningful in practice
• A small set of qualitative confidence levels — possible (p), likely (l), certain (c) — roughly matches the confidence levels practitioners actually use
Observation Correspondence
Maps observations (what you can see) to internal conditions (what you want to know), each with a confidence mode:

mode | observation                                    | internal condition
p    | obs(anomalyHighTraffic)                        | int(attackerNetActivity)
l    | obs(netflowBlackListFilter(H, BlackListedIP))  | int(compromised(H))
l    | obs(memoryDumpMaliciousCode(H))                | int(compromised(H))
l    | obs(memoryDumpIRCSocket(H1,H2))                | int(exchangeCtlMessage(H1,H2))
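The correspondence table above can be sketched as data plus a lookup function, e.g. in Python. The predicate names come from the slides; the tuple encoding and function names are our assumptions (the actual tool encodes these as Prolog facts).

```python
# Observation-correspondence rules: each maps an observed event to an
# internal condition, tagged with a confidence mode:
#   'p' = possible, 'l' = likely, 'c' = certain
OBS_CORRESPONDENCE = [
    ("p", "anomalyHighTraffic",
     lambda args: ("attackerNetActivity",)),
    ("l", "netflowBlackListFilter",            # args = (H, BlackListedIP)
     lambda args: ("compromised", args[0])),
    ("l", "memoryDumpMaliciousCode",           # args = (H,)
     lambda args: ("compromised", args[0])),
    ("l", "memoryDumpIRCSocket",               # args = (H1, H2)
     lambda args: ("exchangeCtlMessage", args[0], args[1])),
]

def map_observation(obs_name, args=()):
    """Return (internal_condition, mode) pairs implied by one observation."""
    return [(build(args), mode)
            for mode, name, build in OBS_CORRESPONDENCE if name == obs_name]
```

For example, `map_observation("memoryDumpIRCSocket", ("172.16.9.20", "172.16.9.1"))` yields the internal condition `exchangeCtlMessage` between the two hosts at mode `l`.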
Internal Model
Logical relations among internal conditions. Each rule has a direction of inference and a mode; Condition 1 infers Condition 2:

direction, mode | Condition 1              | Condition 2
f, p            | int(compromised(H1))     | int(probeOtherMachine(H1,H2))
f, l            | int(sendExploit(H1,H2))  | int(compromised(H2))
b, p            | int(sendExploit(H1,H2))  | int(compromised(H2))
b, c            | int(compromised(H1))     | int(probeOtherMachine(H1,H2))
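The internal model above can likewise be sketched as a rule table. Reading `f` as forward inference (from Condition 1 derive Condition 2) and `b` as backward, abductive use (from Condition 2 suggest Condition 1) is our interpretation; the string encoding is a simplification for illustration.

```python
# Internal-model rules as (direction, mode, condition 1, condition 2) rows.
# 'f' = forward (condition 1 infers condition 2);
# 'b' = backward (observing condition 2 suggests condition 1).
INTERNAL_MODEL = [
    ("f", "p", "compromised(H1)",    "probeOtherMachine(H1,H2)"),
    ("f", "l", "sendExploit(H1,H2)", "compromised(H2)"),
    ("b", "p", "sendExploit(H1,H2)", "compromised(H2)"),
    ("b", "c", "compromised(H1)",    "probeOtherMachine(H1,H2)"),
]

def rules_concluding(predicate):
    """All rules whose Condition 2 involves the given internal predicate."""
    return [(d, m, prem) for d, m, prem, concl in INTERNAL_MODEL
            if concl.startswith(predicate)]
```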
Reasoning Methodology • Simple reasoning • Observation correspondence and internal model are inference rules • Use inference rules on input observations to derive assertions with various levels of uncertainty • Proof strengthening • Derive high-confidence proofs from assertions derived from low-confidence observations
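The "simple reasoning" step can be sketched as forward chaining to a fixpoint over ground assertions. Taking the weaker of the premise's mode and the rule's mode for the derived assertion is our simplifying assumption, not the tool's stated semantics.

```python
# Minimal forward-chaining sketch: apply rules to observation-derived
# assertions until no new (condition, mode) pair appears.
ORDER = {"p": 0, "l": 1, "c": 2}   # possible < likely < certain

def derive(assertions, rules):
    """assertions: set of (condition, mode) pairs from observations.
    rules: dict mapping a ground condition to [(conclusion, rule_mode)]."""
    derived = set(assertions)
    frontier = list(derived)
    while frontier:
        cond, mode = frontier.pop()
        for concl, rule_mode in rules.get(cond, []):
            # Assumption: a derived assertion is no stronger than its weakest link.
            new = (concl, min(mode, rule_mode, key=ORDER.get))
            if new not in derived:
                derived.add(new)
                frontier.append(new)
    return derived
```

Given a `likely` control-message exchange and a rule saying that implies compromise at mode `l`, `derive` adds `(compromised, l)` to the assertion set.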
Example 1
Observation correspondence rule:
  l: obs(memoryDumpIRCSocket(H1,H2)) → int(exchangeCtlMessage(H1,H2))
Applying it (obsMap):
  obs(memoryDumpIRCSocket(172.16.9.20, 172.16.9.1))
  ⇒ int(exchangeCtlMsg(172.16.9.20, 172.16.9.1), l)
Example 2
Applying an internal-model rule (intRule) on top of Example 1:
  obs(memoryDumpIRCSocket(172.16.9.20, 172.16.9.1))
  ⇒ (obsMap) int(exchangeCtlMsg(172.16.9.20, 172.16.9.1), l)
  ⇒ (intRule) int(compromised(172.16.9.20), l)
Proof Strengthening
Two independent derivations, each showing "f is likely true" (from observations O1, O2, O3), strengthen to "f is certainly true".
Proof Strengthening
Combining two independent proofs of the same assertion A raises confidence: from A at mode l and an independent proof of A, conclude A at mode c.
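The strengthening operator can be sketched as a lookup table. strengthen(l, l) = c appears on the slides; the Treasure Hunt trace later in the deck also strengthens a `likely` proof with a `possible` one to `certain`; the remaining entries are our assumptions.

```python
# Sketch of the strengthening operator over confidence modes
# ('p' = possible, 'l' = likely, 'c' = certain).
STRENGTHEN = {
    ("l", "l"): "c",   # two independent likely proofs -> certain (slides)
    ("l", "p"): "c",   # mirrors the Treasure Hunt example trace
    ("p", "l"): "c",
    ("p", "p"): "l",   # assumption: two possible proofs -> likely
}

def strengthen(m1, m2):
    """Combine the modes of two independent proofs of the same conclusion."""
    if "c" in (m1, m2):
        return "c"     # anything combined with certain stays certain
    return STRENGTHEN[(m1, m2)]
```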
Proof Strengthening (example)
  obs(memoryDumpIRCSocket(172.16.9.20, 172.16.9.1))
  ⇒ (obsMap) int(exchangeCtlMsg(172.16.9.20, 172.16.9.1), l)
  ⇒ (intR) int(compromised(172.16.9.20), l)

  obs(memoryDumpMaliciousCode('172.16.9.20'))
  ⇒ (obsMap) int(compromised(172.16.9.20), l)

  ⇒ (strengthenedPf) int(compromised(172.16.9.20), c)    since strengthen(l, l) = c
Evaluation Methodology
• Test whether the empirically developed model derives similar high-confidence traces when applied to different scenarios
• Keep the model unchanged and apply the tool to different data sets
SnIPS (Snort Intrusion Analysis using Proof Strengthening) Architecture
• Done only once: the Snort rule repository is pre-processed into the observation correspondence rules and the internal model
• Snort alerts are converted to tuples and fed to the reasoning engine
• User query, e.g. which machines are "certainly" compromised?
• Output: high-confidence answers with evidence
Snort rule classtype
The internal predicate is mapped from the rule's "classtype" field:

alert tcp $EXTERNAL_NET any -> $HTTP_SERVERS $HTTP_PORTS (msg:"WEB-MISC guestbook.pl access"; uricontent:"/guestbook.pl"; classtype:attempted-recon; sid:1140;)

obsMap(obsRuleId_3615,
       obs(snort('1:1140', FromHost, ToHost)),
       int(probeOtherMachine(FromHost, ToHost)),
       ?).

The "?" marks the confidence mode as still undetermined at this stage.
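A sketch of this pre-processing step: pull the `sid` and `classtype` fields out of a Snort rule's text and emit an observation-correspondence tuple. The classtype-to-predicate table here is illustrative, not the tool's actual table; `"?"` stands for the not-yet-determined mode, as on the slide.

```python
import re

# Illustrative classtype -> internal predicate table (an assumption; the
# real mapping covers many more Snort classtypes).
CLASSTYPE_TO_INTERNAL = {
    "attempted-recon": "probeOtherMachine",
    "trojan-activity": "compromised",
}

def obs_map_from_rule(rule_text):
    """Return (snort_id, internal_predicate, mode) for one Snort rule."""
    sid = re.search(r"sid:\s*(\d+)", rule_text).group(1)
    ctype = re.search(r"classtype:\s*([\w-]+)", rule_text).group(1)
    predicate = CLASSTYPE_TO_INTERNAL.get(ctype, "unknown")
    return ("1:" + sid, predicate, "?")   # mode left undetermined here
```

Applied to the guestbook.pl rule above, this yields `("1:1140", "probeOtherMachine", "?")`.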
Snort rule documents
• Hints from the natural-language description of Snort rules:

  Impact: Information gathering and system integrity compromise. Possible unauthorized administrative access to the server. Possible execution of arbitrary code of the attackers choosing in some cases.
  Ease of Attack: Exploits exist.

These hints fill in the missing modes:

obsMap(obsRuleId_3615, obs(snort('1:1140', FromHost, ToHost)), int(probeOtherMachine(FromHost, ToHost)), l).
obsMap(obsRuleId_3614, obs(snort('1:1140', FromHost, ToHost)), int(compromised(ToHost)), p).
Automatically deriving Observation Correspondence
• Snort has about 9,000 rules
• The automatically derived mapping is just a baseline and needs to be fine-tuned
• It would make more sense for the rule writer to define the observation correspondence relation when writing a rule
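One way the documentation hints could be turned into modes is a keyword heuristic: scan a rule's prose and propose (internal predicate, mode) pairs, downgrading to `possible` when the wording itself hedges ("Possible …"). The keyword lists below are entirely our guess, for illustration only.

```python
def correspondence_from_doc(doc):
    """Map hinted internal predicates to modes based on wording strength.
    Keyword choices are illustrative assumptions, not the tool's rules."""
    text = doc.lower()
    hints = {}
    if "information gathering" in text:
        hints["probeOtherMachine"] = "l"    # stated plainly -> likely
    if "administrative access" in text or "arbitrary code" in text:
        # hedged wording ("possible ...") downgrades to possible
        hints["compromised"] = "p" if "possible" in text else "l"
    return hints
```

On the guestbook.pl documentation shown earlier, this reproduces the slide's modes: probe at `l`, compromise at `p`.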
Data set description
• Treasure Hunt (UCSB, 2002) – 4 hrs
  • Collected during a graduate-class experiment
  • Large variety of system monitoring data: tcpdump, syslog, Apache server log, etc.
• Honeypot (Purdue, 2008) – 2 hrs/day over 2 months
  • Collected for an e-mail spam analysis project
  • Single host running a misconfigured Squid proxy
• KSU CIS department network (2009) – 3 days
  • 200 machines, including servers and workstations
Some results from the Treasure Hunt data set
192.168.10.90 was certainly compromised!

| ?- show_trace(int(compromised(H), c)).
int(compromised('192.168.10.90'), c)
  strengthenedPf
    int(compromised('192.168.10.90'), p)            % a probe was sent from 192.168.10.90
      intRule_1
        int(probeOtherMachine('192.168.10.90','192.168.70.49'), p)
          obsRulePre_1
            obs(snort('122:1','192.168.10.90','192.168.70.49',_h272))
    int(compromised('192.168.10.90'), l)            % an exploit was sent to 192.168.10.90
      intRule_3
        int(sendExploit('128.111.49.46','192.168.10.90'), l)
          obsRuleId_3749
            obs(snort('1:1807','128.111.49.46','192.168.10.90',_h336))
Related work • Y. Zhai et al. “Reasoning about complementary intrusion evidence,” ACSAC 2004 • F. Valeur et al., “A Comprehensive Approach to Intrusion Detection Alert Correlation,” 2004 • Goldman and Harp, "Model-based Intrusion Assessment in Common Lisp", 2009 • C. Thomas and N. Balakrishnan, “Modified Evidence Theory for Performance Enhancement of Intrusion Detection Systems”, 2008
Summary
• Based on a real-life incident, we empirically developed a logical model for handling uncertainty in intrusion analysis
• Experimental results show:
  • The model simulates human reasoning and was able to extract high-confidence intrusion evidence
  • The model, empirically developed from one incident, was applicable to completely different data/scenarios
  • Reduction in the search space for analysis
Future Work
• Continue the empirical study and improve the current implementation
• Establish a theoretical foundation for the empirically developed method:
  • Modal logic
  • Dempster-Shafer theory
  • Bayesian theory
Thank you Questions? sakthi@ksu.edu xou@ksu.edu
Summarization
• Compact the information entering the reasoning engine
• Group similar "internal conditions" into a single "summarized internal condition"
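The grouping step can be sketched as follows: collapse repeated assertions about the same predicate and hosts into one summarized fact. Tagging the summary with the strongest mode seen in the group is our assumption about how the summarized fact is labeled.

```python
from collections import defaultdict

ORDER = {"p": 0, "l": 1, "c": 2}   # possible < likely < certain

def summarize(assertions):
    """assertions: iterable of (predicate, args, mode) triples.
    Returns one summarized fact per (predicate, args) group."""
    groups = defaultdict(list)
    for predicate, args, mode in assertions:
        groups[(predicate, args)].append(mode)
    # Assumption: the summarized fact carries the strongest mode in its group.
    return {key: max(modes, key=ORDER.get) for key, modes in groups.items()}
```

For instance, many `compromised('129.130.11.69')` assertions at modes `p` and `l` collapse to a single summarized fact at mode `l`.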
Output from the CIS data set
int(compromised('129.130.11.69'), c)
  strengthenedPf
    int(compromised('129.130.11.69'), l)
      intRule_1b
        int(probeOtherMachine('129.130.11.69','129.130.11.12'), l)
          sumFact summarized(86)
    int(compromised('129.130.11.69'), l)
      intRule_3f
        int(sendExploit('129.130.11.22','129.130.11.69'), c)
          strengthenedPf
            int(sendExploit('129.130.11.22','129.130.11.69'), l)
              sumFact summarized(109)
            int(skol(sendExploit('129.130.11.22','129.130.11.69')), p)
              IR_3b
                int(compromised('129.130.11.69'), p)
                  sumFact summarized(324)