Analysis and Classification of Botnet Command and Control Communications By Evan Painter Presented on May 4, 2012
Outline • Scope • Background • C&C Analysis • Feature Extraction and Classification Engine • Classification Results • Conclusion
Thesis Scope • Analyze C&C traffic for three major types of botnet • Develop a proof-of-concept detection mechanism based solely on C&C traffic • Present results that show the mechanism is feasible
A botnet is: "A coordinated group of malware instances that are controlled via C&C channels" Gu et al. [4]
Types of Botnets • IRC-based: traditional, simple to build and maintain, relatively easy to detect, centralized C&C server, push structure • HTTP-based: still has a centralized server, blends better with ordinary traffic, pull structure • P2P-based: P2P network, decentralized C&C, push and pull structure
Botnet Lifecycle • Initial Infection • Bot Installation • Connection • Command and Control • Update/Maintenance
Detection Capabilities • Unknown Bot Detection • Encrypted Bot Detection • Protocol and Structure Independent • C&C Only Detection • Lone Bot Detection • Real-Time Detection • Network-Based
Flow-Based Temporal Anomaly Detection [Masud et al.] • Protocol and feature independent – untested • Multiple log file correlation – requires host-based system • Inbound and outbound monitoring • Observable C&C features • Bot-response • Bot-other • Bot-app
BotHunter [Gu et al.] • Built around the Snort IDS • Inbound and outbound monitoring • Dialog Correlation • Combines results of several sensors • Requires multiple alerts
Example Botnets • Agobot3 (IRC): well-known, large family of botnets • Zeus (HTTP): very successful “crimeware toolkit,” professionally designed and maintained • Immonia (P2P): “academic” example that allows service modules to be added to define what the bot does
Agobot (IRC) Analysis • Commands sent through IRC messages • Not encrypted (though encryption could be added) • Variety of unusual ports • Push-style botnet
Agobot Analysis • Ordinary IRC connection • The delay between a command and the bot's response is short • The delay between a command and the bot initiating a scan is long
Zeus (HTTP) Analysis • HTTP POST messages to retrieve commands • Encrypted payload data • Pull-style botnet • Bots act like a web browser connecting to a web server (the C&C server) • Length and timing of POST messages are very consistent • Responses are fast
Zeus Communication • All Zeus Communication uses HTTP POST messages
Zeus Periodic Comm. • About every 20.46 minutes, a series of POSTs is made
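The periodic POST bursts described above suggest a simple regularity check. The following is a minimal sketch, not the method used in the thesis: it groups request timestamps into bursts and reports the mean interval between bursts and its coefficient of variation. The function names and the 60-second `gap` threshold are assumptions made for illustration.

```python
from statistics import mean, pstdev


def burst_starts(timestamps, gap=60.0):
    """Group timestamps (seconds) into bursts and return each burst's first timestamp.

    A new burst starts whenever the silence since the previous request
    exceeds `gap` seconds (an assumed threshold, not taken from the slides).
    """
    starts = []
    previous = None
    for t in sorted(timestamps):
        if previous is None or t - previous > gap:
            starts.append(t)
        previous = t
    return starts


def periodicity(timestamps, gap=60.0):
    """Return (mean interval, coefficient of variation) between burst starts.

    Returns None when fewer than two intervals are available.
    """
    starts = burst_starts(timestamps, gap)
    intervals = [b - a for a, b in zip(starts, starts[1:])]
    if len(intervals) < 2:
        return None
    avg = mean(intervals)
    return avg, pstdev(intervals) / avg
```

A coefficient of variation near zero indicates very regular check-ins; ordinary browsing would be expected to produce a much larger value, although that expectation is not tested here.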
Immonia (P2P) Analysis • Acquires peerlist by IRC • All C&C messages have TCP PSH flag set • Response times are within 200 ms • Service execution time is slow • Outgoing packets aren’t seen for 15-30 seconds • Periodic peerlist updates are not consistent
Immonia Service Execution • This packet sequence is used for the servexec command
C&C Only Detection • All botnets must have C&C communication • Relying on propagation activity makes bots carried on mobile devices and removable media harder to detect • Scale-based solutions have to wait until a bot spreads; they cannot detect a single bot
Supervised Learning • Signature-based detection cannot reliably detect unknown botnets • Clustering has been shown to be effective for scale-based solutions, but not lone bot detection • Supervised learning has been shown to be effective for malware detection, and in other botnet detection solutions
Observable C&C Commands Presented by Masud et al.: • Bot-Response: causes the bot to respond to the outside IP that sent the command • Bot-Other: causes the bot to send a packet to a different outside host • Bot-App: causes the bot to launch an application (requires host-based solution) Category defined in this thesis: • Bot-Propagate: causes the bot to initiate some sort of spreading behavior (not necessarily present)
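The Bot-Response and Bot-Other categories above can be approximated from packet timing alone. The sketch below is one possible reading, not the thesis implementation: for every packet sent to a monitored host, it records the delay until that host next sends to the same peer (BR-time) and to any other peer (BO-time). The `Packet` type, the `label_responses` name, and the 1-second `window` are assumptions.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    time: float  # capture time in seconds
    src: str     # source IP address
    dst: str     # destination IP address


def label_responses(packets, host, window=1.0):
    """Return (inbound packet, BR-time, BO-time) for each packet sent to `host`.

    BR-time is the delay until `host` first sends back to the same peer;
    BO-time is the delay until `host` first sends to a different peer.
    Either value is None if nothing is seen within `window` seconds
    (the window length is an assumed parameter).
    """
    packets = sorted(packets, key=lambda p: p.time)
    results = []
    for i, inbound in enumerate(packets):
        if inbound.dst != host:
            continue
        br_time = bo_time = None
        for later in packets[i + 1:]:
            delay = later.time - inbound.time
            if delay > window:
                break
            if later.src != host:
                continue
            if later.dst == inbound.src and br_time is None:
                br_time = delay
            elif later.dst != inbound.src and bo_time is None:
                bo_time = delay
        results.append((inbound, br_time, bo_time))
    return results
```

Packet-level attribution like this is only a heuristic: an outgoing packet inside the window may be unrelated to the inbound one, which is one reason the features are later aggregated per flow.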
Packet-Level Features • Bot-response (BR) • BR-time • BR-size • Bot-other (BO) • BO-time • Packet length • Packet arrival time • TCP Window • Push Flag
Flow-Level Features Average and Variance of: • packet length • BR-size • BR-time • BO-time • Inter-arrival time Percentage of packets in the flow that are: • BR packets • BO packets • TCP PSH packets And: • Bits per second • Packets per second
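A few of the flow-level features listed above can be computed as follows. This is a minimal sketch under stated assumptions: each packet is a `(time, length, psh)` tuple, population variance is used (the slides do not say which variance estimator was chosen), and the BR/BO features are omitted because they would be averaged the same way once the packets are labeled. The `flow_features` name and the dictionary keys are hypothetical.

```python
from statistics import mean, pvariance


def flow_features(packets):
    """Compute flow-level features for one flow.

    `packets` is a list of (time_seconds, length_bytes, psh_flag) tuples
    and must contain at least two packets.
    """
    packets = sorted(packets)
    times = [t for t, _, _ in packets]
    lengths = [n for _, n, _ in packets]
    gaps = [b - a for a, b in zip(times, times[1:])]  # inter-arrival times
    duration = times[-1] - times[0]
    psh_count = sum(1 for _, _, psh in packets if psh)
    return {
        "len_avg": mean(lengths),
        "len_var": pvariance(lengths),
        "iat_avg": mean(gaps),
        "iat_var": pvariance(gaps),
        "psh_pct": 100.0 * psh_count / len(packets),
        "bits_per_sec": 8 * sum(lengths) / duration if duration > 0 else 0.0,
        "pkts_per_sec": len(packets) / duration if duration > 0 else 0.0,
    }
```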
Useful Features Average and Variance of: • packet length • BR-size • BR-time • BO-time • Inter-arrival time Percentage of packets in the flow that are: • BR packets • BO packets • TCP PSH packets And: • Bits per second • Packets per second • Maximum TCP Window
Feature Analysis (Example) [Figure: panels for Bot-Other Time and Bot-Response Time; X axis: average, Y axis: variance]
Classification Classification Algorithms: • Bayes Net • Naïve Bayes • SMO – SVM with Sequential Minimal Optimization • J48 Decision Tree
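To illustrate how a supervised classifier consumes flow-level feature vectors, here is a minimal Gaussian Naïve Bayes sketch in plain Python. It is not the implementation evaluated in the thesis; the function names, the `eps` variance floor, and the toy training values are assumptions made for illustration.

```python
import math
from collections import defaultdict
from statistics import mean, pvariance


def train(rows, labels, eps=1e-9):
    """Fit a class prior and per-feature Gaussian (mean, variance) for each label."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        # zip(*members) iterates over feature columns; eps avoids zero variance.
        stats = [(mean(col), pvariance(col) + eps) for col in zip(*members)]
        model[label] = (len(members) / len(rows), stats)
    return model


def predict(model, row):
    """Return the label with the highest log posterior for one feature vector."""
    def log_posterior(label):
        prior, stats = model[label]
        score = math.log(prior)
        for x, (mu, var) in zip(row, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


# Made-up feature vectors (average BR-time in seconds, percent PSH packets);
# these are illustrative values, not measurements from the thesis.
rows = [(0.10, 90), (0.15, 95), (0.12, 85), (1.0, 20), (1.5, 30), (1.2, 10)]
labels = ["bot", "bot", "bot", "benign", "benign", "benign"]
model = train(rows, labels)
```

With this toy model, `predict(model, (0.11, 88))` falls on the "bot" side and `predict(model, (1.3, 25))` on the "benign" side.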
Implications of Results • Bayes Net was most effective on average • Not the case in all unknown botnet tests • Zeus appears to be more difficult to detect • Lower detection rates in unknown tests • Immonia is very easy to detect • Probably due to its lack of evasion functionality and academic nature • This mechanism for botnet detection is feasible, but needs more work
Summary • Two contributions: • Analysis of three types of botnet communication • Proof of concept for a network-based, C&C-only detection mechanism
Future Work • Larger Data sets • Expand feature set • Classification algorithm optimization • Botnet network traffic data sets