Using Failure Information Analysis to Detect Enterprise Zombies • Zhaosheng Zhu, Vinod Yegneswaran, Yan Chen • Lab of Internet and Security Technology, Northwestern University • SRI International
Motivation • Increasing prevalence and sophistication of malware • Current solutions are a day late and a dollar short • NIDS • Firewalls • AV systems • Conficker is a great example! • Over 10M hosts infected across variants A/B/C
Related Work • BotHunter [USENIX Security '07] • Dialog Correlation Engine to detect enterprise bots • Models the lifecycle of bots: • Inbound Scan / Exploit / Egg Download / C&C / Outbound Scans • Relies on Snort signatures to detect the different phases • Rishi [HotBots '07]: Detects IRC bots based on nickname patterns • BotSniffer [NDSS '08] • Uses spatio-temporal correlation to detect C&C activity • BotMiner [USENIX Security '08] • Combines clustering with BotHunter and BotSniffer heuristics • All focus on successful bot communication patterns
Objective and Approach • Develop a complement to existing network defenses to improve their resilience and robustness • Signature independent • Malware family independent: no prior knowledge of malware semantics or C&C mechanisms needed • Malware class independent (detects more than bots) • Key idea: Failure Information Analysis • Observation: malware communication patterns result in abnormally high failure rates • Correlates network and application failures at multiple vantage points
Outline Motivations and Key Idea Empirical Failure Pattern Study: Malware and Normal Applications Netfuse Design Evaluations Conclusions
Malware Failure Patterns • Empirical survey of 32 malware instances with long-lived traces (5-8 hours) • Sources: SRI honeynet, spamtrap, and Offensive Computing • Spyware, HTTP botnet, IRC botnet, P2P botnet, worm • Application protocols studied: • DNS, HTTP, FTP, SMTP, IRC • 24/32 generated failures • 18/32 generated DNS failures • Mostly NXDOMAINs • DNS failures are part of normal behavior for some bots like Kraken and Conficker (which generate a new list of C&C rendezvous points every day)
Malware Failure Patterns (2) • SMTP failures are characteristic of most spam bots • Storm, Bobax, etc. • 550: recipient address rejected • HTTP failures • Generated by worms: Virut (DoS attacks) and Weby • Weby contacts remote servers to get configuration info • IRC failures • Channel removed from a public IRC server • Channel is full due to too many bots
Normal Applications Studied • Web crawler: news.sohu.com, amazon.com, bofa.com, imdb.com • P2P: BitTorrent, eMule • Video: youtube.com • HTTP 304/Not Modified errors whitelisted
Normal Applications Studied • For video traffic, there were no transport-layer failures • At the application level, the only failures were "HTTP 304/Not Modified" responses
Empirical Analysis Summary • High failure volumes are good indicators of malware • DNS failures (NXDomain messages) are common among malware • Malware failures tend to be persistent • Malware failure patterns tend to be repetitive (low entropy), while those of normal applications are diverse
Outline Motivations and Key Idea Empirical Failure Pattern Study: Malware and Normal Applications Netfuse Design Evaluations Conclusions
Netfuse Design • Netfuse: a behavior-based network monitor • Correlates network and application failures • Wireshark and L7 filters for protocol parsing • Multi-point failure monitoring • Netfuse components • FIA (Failure Information Analysis) Engine • DNSMon • SVM-based Correlation Engine • Clustering
[Architecture diagram: multi-point deployment across the enterprise network; FIA engines at the gateway and DNSMon produce failure scores that feed the SVM correlation engine and clustering]
FIA Engine • Wireshark: open-source protocol analyzer / dissector • Analyzes online and offline pcap captures • Supports most protocols • Uses port numbers to choose dissectors • Augmented with L7 protocol signatures • Automated decoding with payload signatures • Sample signature for HTTP: • http/(0\.9|1\.0|1\.1) [1-5][0-9][0-9] [\x09-\x0d -~]*(connection:|content-type:|content-length:|date:)|post [\x09-\x0d -~]* http/[01]\.[019]
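To make the payload-signature step concrete, here is a minimal sketch of applying the HTTP L7 signature above to a reassembled TCP payload. The helper name `looks_like_http` and the standalone use of Python's `re` module are illustrative assumptions; the actual FIA engine performs this inside Wireshark with L7-filter patterns.

```python
import re

# The L7-filter HTTP signature from the slide, matched case-insensitively
# against the first bytes of a reassembled TCP stream.
HTTP_SIG = re.compile(
    rb"http/(0\.9|1\.0|1\.1) [1-5][0-9][0-9] [\x09-\x0d -~]*"
    rb"(connection:|content-type:|content-length:|date:)"
    rb"|post [\x09-\x0d -~]* http/[01]\.[019]",
    re.IGNORECASE,
)

def looks_like_http(payload: bytes) -> bool:
    """Classify a payload as HTTP by content, regardless of TCP port."""
    return HTTP_SIG.search(payload) is not None

# An HTTP 404 on a non-standard port is still recognized as HTTP,
# so the FIA engine can count it as an application-level failure.
assert looks_like_http(b"HTTP/1.1 404 Not Found\r\nContent-Type: text/html\r\n")
```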
DNSMon • DNS servers are typically located inside enterprise networks • Suspicious domain lookups can't be traced back to the original clients from gateway traces • Especially true for NXDomain lookups • DNS caching • DNSMon tracks traffic between clients and the resolving DNS server • Provides a more comprehensive view of failure activity
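A minimal sketch of the kind of per-client NXDOMAIN accounting DNSMon performs, assuming a capture taken between clients and the internal resolver. Scapy and the helper name `nxdomain_counts` are illustrative choices, not the paper's implementation.

```python
from collections import Counter
from scapy.all import DNS, IP, rdpcap  # DNS rcode 3 = NXDOMAIN

def nxdomain_counts(pcap_path: str) -> Counter:
    """Count NXDOMAIN responses per client IP. Because the capture sits
    between clients and the resolver, each failed lookup is attributable
    to the client that issued it (unlike at the gateway)."""
    counts = Counter()
    for pkt in rdpcap(pcap_path):
        if (pkt.haslayer(IP) and pkt.haslayer(DNS)
                and pkt[DNS].qr == 1        # it is a response...
                and pkt[DNS].rcode == 3):   # ...saying the name does not exist
            counts[pkt[IP].dst] += 1        # the response goes back to the client
    return counts
```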
Correlation Engine • Integrates four failure scores • Composite Failure Score • Failure Divergence Score • Failure Entropy Score • Failure Persistence Score • Malware failures tend to be long-lived • SVM-based correlation using Weka
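The slide names Weka for the SVM step; as a hedged illustration, the sketch below substitutes scikit-learn's `SVC` to show how the four per-host scores (defined on the following slides) could be combined into a single infected/benign decision. The toy training vectors are invented for illustration.

```python
from sklearn.svm import SVC

# Each host is a 4-dimensional point:
# [composite, divergence, entropy, persistence].
train_X = [[0.9, 0.8, 0.1, 0.9],   # infected host: heavy, repetitive, persistent failures
           [0.1, 0.0, 0.7, 0.1]]   # benign host: few, diverse, short-lived failures
train_y = [1, 0]                   # 1 = malware, 0 = benign

clf = SVC(kernel="rbf").fit(train_X, train_y)
print(clf.predict([[0.8, 0.7, 0.2, 0.8]]))  # -> [1], flagged as infected
```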
Composite Failure Score • Estimates the severity of each host based on failure volume • Considers hosts with: • A large number of application failures (e.g., > 15 per minute), or • TCP RST / ICMP failure counts > 2 std. dev. above the mean of all hosts • Computes a weighted failure score based on each protocol's failure frequency
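A hedged sketch of this score. The 15-per-minute and 2-sigma gates follow the slide; the slide does not give the exact weighting, so the sketch assumes each protocol is weighted by its share of the host's total failures.

```python
import statistics

def composite_failure_score(host_failures, all_hosts_transport, app_rate_per_min):
    """Sketch under assumed input shapes:
    host_failures: {protocol: failure count} for one host.
    all_hosts_transport: TCP RST + ICMP failure counts, one per host."""
    mean = statistics.mean(all_hosts_transport)
    stdev = statistics.pstdev(all_hosts_transport)
    transport = host_failures.get("tcp_rst", 0) + host_failures.get("icmp", 0)
    # A host is only scored if it trips one of the two gates on the slide.
    if app_rate_per_min <= 15 and transport <= mean + 2 * stdev:
        return 0.0
    total = sum(host_failures.values())
    # Assumed weighting: each protocol's count weighted by its failure share.
    return sum((cnt / total) * cnt for cnt in host_failures.values())
```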
Failure Persistence Score • Motivated by the observation that malware failures tend to be long-lived • Split the time horizon into N parts and count the parts in which failures occur • In our experiments, N = 24
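This score is simple enough to state directly; a sketch, assuming a host's failures arrive as a list of timestamps within the observation window:

```python
def persistence_score(failure_timestamps, t_start, t_end, n_parts=24):
    """Split the window into N parts (N = 24 in the paper's experiments)
    and return the fraction of parts containing at least one failure."""
    width = (t_end - t_start) / n_parts
    occupied = {int((t - t_start) // width)
                for t in failure_timestamps if t_start <= t < t_end}
    return len(occupied) / n_parts
```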
Failure Divergence Score • Measures the degree of uptick in a host's failure profile • Newly infected hosts demonstrate strong, positive dynamics • EWMA algorithm with α = 0.5 • For each host, protocol, and date, compute the difference between the expected and actual value • Sum the divergence of each protocol for that host • Normalize by dividing by the maximum divergence value across all hosts
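A sketch of one plausible reading of this slide: an EWMA of past daily failure counts serves as the expected value, today's count minus that forecast is the divergence, summed over protocols and normalized across hosts. The input shape `{host: {protocol: [count_day0, ...]}}` is an assumption.

```python
def divergence_score(daily_counts_by_host, alpha=0.5):
    """Per-host divergence: today's failure count minus an EWMA forecast
    built from previous days, summed over protocols."""
    raw = {}
    for host, protos in daily_counts_by_host.items():
        total = 0.0
        for counts in protos.values():  # failures per day, oldest first
            expected = counts[0]
            for actual in counts[1:-1]:
                expected = alpha * actual + (1 - alpha) * expected
            total += counts[-1] - expected  # positive = uptick in failures
        raw[host] = total
    peak = max((abs(v) for v in raw.values()), default=1.0) or 1.0
    return {host: v / peak for host, v in raw.items()}
```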
Failure Entropy Score • Measures the degree of diversity in a host's failure profile • Malware failures tend to be redundant (low diversity) • TCP: track the server/port distribution of each client receiving failures • DNS: track domain-name diversity • HTTP/SMTP/FTP: track failure types and host names • ICMP is ignored • Compute a weighted average failure entropy score • Protocols that dominate a host's failure volume get higher weights
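A sketch of the entropy computation, assuming each protocol contributes a list of failure keys (domain names for DNS, server/port pairs for TCP, and so on) and that "weighted average" means volume-proportional weights:

```python
from collections import Counter
from math import log2

def entropy(items):
    """Shannon entropy (bits) of the empirical distribution of items,
    e.g. the domain names in a host's failed DNS lookups."""
    counts = Counter(items)
    n = sum(counts.values())
    return -sum((c / n) * log2(c / n) for c in counts.values())

def failure_entropy_score(failures_by_proto):
    """Volume-weighted average entropy across protocols: protocols
    contributing more failures get proportionally higher weight.
    A bot retrying one dead C&C domain scores near 0 (low diversity)."""
    total = sum(len(v) for v in failures_by_proto.values())
    if total == 0:
        return 0.0
    return sum((len(v) / total) * entropy(v)
               for v in failures_by_proto.values() if v)
```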
Outline Motivations and Key Idea Empirical Failure Pattern Study: Malware and Normal Applications Netfuse Design Evaluations Conclusions
Evaluation Traces • Malware I: 24 malware traces from the failure pattern study • Malware II: 5 new malware families (Peacomm, Mimail, Rbot, Bifrose, Kraken) + 3 families seen in training • Each run for 8 to 10 hours • Malware III: 242 traces selected from 5,000 malware sandbox traces based on duration and trace size • Institute traces: benign traces from a well-administered Class B (/16) network with hundreds of machines (one 5-day, one 12-day)
Performance Summary • Detection rate > 92% for trace sets I/II • Detection rate under 40% for trace set III • This set includes many types of malware, including adware whose failure patterns resemble those of benign applications • Its traces are short, many under 15 minutes • False positive rate < 5%
Clustering Results • Detected hosts are clustered based on their failure profiles • The 24 instances belong to 8 different types of malware
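The slide does not name the clustering algorithm; as one plausible illustration, hierarchical clustering over the per-host score vectors groups hosts with similar failure profiles.

```python
from scipy.cluster.hierarchy import fcluster, linkage

# Illustrative per-host score vectors [composite, divergence, entropy, persistence]:
# two hosts with near-identical profiles (same malware family?) and one outlier.
profiles = [[0.90, 0.80, 0.10, 0.90],
            [0.88, 0.79, 0.12, 0.91],
            [0.30, 0.90, 0.40, 0.50]]
labels = fcluster(linkage(profiles, method="average"), t=2, criterion="maxclust")
print(labels)  # e.g. [1 1 2]: the first two hosts land in the same cluster
```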
Conclusions • Failure Information Analysis • Signature-independent methodology for detecting infected enterprise hosts • Netfuse system • Four components: FIA Engine, DNSMon, Correlation Engine, Clustering • Correlation metrics: Composite Failure Score, Divergence Score, Failure Entropy Score, Persistence Score • Useful complement to existing network defenses