Dynamic Application-Layer Protocol Analysis For Network Intrusion Detection Holger Dreger, TU München Anja Feldmann, T-Labs / TU Berlin Michael Mai, TU München Vern Paxson, ICSI / LBNL Robin Sommer, ICSI Presented by: Jim Spadaro
NIDS: State-of-the-Art • Protocol-specific traffic analysis • Semantic context for (much) better detection quality • How to decide which protocol to analyze? • Relies on well-known port numbers • (As in, HTTP if-and-only-if TCP port 80) • (or um maybe 8080 and 8000 and ….) • And if it’s not on a well-known port? • Perhaps use byte-level signatures to flag what protocol it appears to be
Problem • Applications use arbitrary ports! • Benign reasons • Lack of user privileges, obfuscation, multiple versions • Adversarial applications (maybe not so benign) • e.g. Skype bypassing firewalls • Malicious intent • Evasion of security monitoring • IRC-botnets on ports other than 666x/tcp • Pirate FTP-servers on ports other than 21/tcp • How to distinguish these?
Structure • Prevalence of the problem • Approach for dynamic analysis in NIDS • Applications of new capabilities • Performance evaluation
Prevalence of the Problem • Data • 24 hour full packet trace from MWN • 3.2 TB of data in 6.3 billion pkts, 137M TCP connections • Successful TCP connections: ~78% • Successful TCP connections on unprivileged ports: ~4% • UCB: University of California, Berkeley, 45,000 hosts • MWN: Munich Scientific Network, 50,000 hosts • LBNL: Lawrence Berkeley National Laboratory, 13,000 hosts
Existing NIDS Solutions • None known to fully address the problem • Bro, Snort, Dragon, and Intrushield all rely on port-based protocol analysis • Some can use signatures to detect inappropriate protocol use • Such detection is helpful, but has drawbacks • Does not distinguish benign off-port traffic from malicious: • Can only block BitTorrent entirely, not single out illegal file sharing • Can only block off-port IRC entirely, not detect botnets
Protocol Detection - Alternatives • Statistical approach • E.g., packet size distribution • Suitable for separating interactive/bulk traffic • e.g., distinguish chat from file transfers • Detect protocol patterns • Signatures (already implemented) • Relatively easy to implement: most NIDS have signature-matching infrastructure • e.g., Linux netfilter l7-filter • Very general signatures, not completely accurate • Maybe: Protocol detection by plausibility heuristics
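The signature alternative can be sketched in a few lines of Python. This is a minimal illustration, assuming hypothetical and deliberately loose patterns (they are not the l7-filter or Bro signatures); it shows how one payload can trigger more than one signature.

```python
import re

# Hypothetical, deliberately loose signatures; not taken from l7-filter or Bro.
SIGNATURES = {
    "http": re.compile(rb"^(GET|POST|HEAD) \S+ HTTP/1\.[01]\r\n"),
    "ftp": re.compile(rb"^220[ -].*ftp", re.IGNORECASE),
    "irc": re.compile(rb"^(NICK|USER|PASS) \S+"),
    "smtp": re.compile(rb"^220[ -].*smtp", re.IGNORECASE),
}

def match_signatures(payload: bytes) -> list[str]:
    """Return every protocol whose signature matches the start of the payload."""
    return sorted(name for name, sig in SIGNATURES.items() if sig.search(payload))

print(match_signatures(b"GET /index.html HTTP/1.1\r\nHost: example.org\r\n"))
# A greeting that mentions both protocols trips two loose signatures at once.
print(match_signatures(b"220 mail.example.org FTP/SMTP gateway ready\r\n"))
```

Returning a list rather than a single name keeps the ambiguity visible, so a later stage can decide which candidate is plausible.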
Protocol Detection: Signatures • Most (but not all) successful connections trigger expected signature • FTP: high percentage of false negatives ~ 21.7% • “Other port” matches: needs further investigation
Protocol Signatures: Well-known Ports • Some connections trigger more than one signature • Signature too general • Some inappropriate use of well-known ports
Observations • Imprecision of signatures: • False negatives highlight need for refined signatures and/or more context • False positives (e.g., multiple matches for single connection) highlight limits in discriminating power • Certain protocols are difficult to make signatures for • Telnet: many legitimate initial byte patterns • Problem is real: • If we just believe port numbers, numerous misidentifications
Structure • Prevalence of the problem • Approach for dynamic analysis in NIDS • Applications of new capabilities • Performance evaluation
Goals • Detection Scheme Independent • Currently predominantly use signatures • However, flexibility is maintained to allow other approaches, like heuristics • Dynamic Analysis • Some protocol detection schemes need more data than others • Analyzers should be disabled upon detecting a false positive • Modularity • Eases dealing with multiple network substacks • IP-within-IP tunnels • Efficiency • Improvements must retain performance • Customizability • Result must easily adapt to specific needs
Approach for Dynamic Analysis • Dynamic data path enhances flexibility and accuracy • Example: A packet is received on port 80/tcp, but really carries data for an IRC session • A traditional NIDS will still examine the packet as HTTP • Dynamic analysis can change the analysis to IRC even though the analysis was initialized for HTTP • Approach uses a PIA • Protocol Identification Analyzer
Dynamic Data Path • How can this be done? • Associate each connection with a tree structure • Each node represents an analyzer • Links represent data channels, with the parent node’s output channels connecting to its children’s input channels • The PIA instantiates the initial analyzers • Each analyzer can insert or remove other analyzers on its input and output channels • Thus, each analyzer can add further analyzers if it needs additional functionality • If the analyzer cannot determine which analyzer is needed, another PIA can be instantiated • An analyzer that cannot analyze the data it is being given can remove its subtree from the tree • Allows siblings on the tree to be run in parallel
Analyzer Tree Example • Example for an analyzer tree for an email connection: • The IP Analyzer determines the connection is TCP • The TCP Analyzer determines the connection looks like email • Analyzers for SMTP, POP, and IMAP are instantiated to analyze the data • Any analyzers that determine that they cannot analyze the data can remove themselves
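The email example can be sketched as a small tree in Python. This is an illustrative toy, not Bro's implementation: the `Analyzer` class, its `can_parse` predicate, and the one-line greeting checks are assumptions made for the sketch.

```python
class Analyzer:
    """A node in a per-connection analyzer tree (illustrative, not Bro's classes)."""

    def __init__(self, name, can_parse=lambda data: True):
        self.name = name
        self.can_parse = can_parse  # hypothetical plausibility check
        self.parent = None
        self.children = []

    def add_child(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def remove_subtree(self):
        """Detach this analyzer and everything below it from the tree."""
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent = None

    def deliver(self, data):
        """Analyze data; drop out if it cannot be parsed, else pass it down."""
        if not self.can_parse(data):
            self.remove_subtree()
            return
        for child in list(self.children):  # copy: children may remove themselves
            child.deliver(data)

    def names(self):
        return [self.name] + [n for c in self.children for n in c.names()]

root = Analyzer("IP")
tcp = root.add_child(Analyzer("TCP"))
tcp.add_child(Analyzer("SMTP", lambda d: d.startswith(b"220 ")))
tcp.add_child(Analyzer("POP3", lambda d: d.startswith(b"+OK")))
tcp.add_child(Analyzer("IMAP", lambda d: d.startswith(b"* OK")))

root.deliver(b"220 mail.example.org ESMTP ready\r\n")
print(root.names())  # only the analyzer that could parse the greeting remains
```

All three mail analyzers start in parallel; the two that cannot parse the server greeting detach themselves, leaving a single path through the tree.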
Technical Issues • Byte Streams vs Packet Streams • Protocols over TCP vs Other • Resolved by having both input channels for each analyzer • Starting an analyzer mid-connection • Resolved by buffering the start of each stream (Default 4KB)
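The buffering fix for mid-connection starts can be sketched as follows. This is a minimal Python illustration under the assumption that analyzers are plain callables taking bytes; `StreamBuffer` is a hypothetical name, and only the 4 KB default comes from the slide.

```python
BUFFER_LIMIT = 4096  # default from the slide: keep the first 4 KB of each stream

class StreamBuffer:
    """Keeps the start of a byte stream so a late analyzer can be caught up."""

    def __init__(self, limit=BUFFER_LIMIT):
        self.limit = limit
        self.data = bytearray()
        self.analyzers = []

    def feed(self, chunk: bytes):
        """Buffer the chunk (up to the limit) and forward it to live analyzers."""
        room = self.limit - len(self.data)
        if room > 0:
            self.data += chunk[:room]
        for analyzer in self.analyzers:
            analyzer(chunk)

    def attach(self, analyzer):
        """Replay the buffered prefix, then deliver all future chunks live."""
        analyzer(bytes(self.data))
        self.analyzers.append(analyzer)

received = []
buf = StreamBuffer()
buf.feed(b"NICK bot\r\n")        # arrives before any analyzer is chosen
buf.attach(received.append)      # analyzer started mid-connection
buf.feed(b"JOIN #chan\r\n")
print(b"".join(received))

small = StreamBuffer(limit=4)    # bytes beyond the limit are not retained
small.feed(b"abcdefgh")
```

In this sketch an analyzer attached after the stream has outgrown the limit sees a gap between the replayed prefix and the live data, which is why detection has to succeed within the buffered window.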
Implementation • Implemented in Bro NIDS • New “Protocol Identification Analyzer” (PIA) implements protocol-detection and buffering • Stock Bro has modular design suited to implementing the PIA • Required changing Bro’s notion of one-to-one static binding from transport analyzer to application analyzer(s) • Running in three large environments: • MWN, UCB, and LBNL
Implementation • PIA examines the first few KB of each connection for efficiency • Shown to be sufficient for protocol detection • Can activate analyzers in four ways: • Signatures • Connection port • Each analyzer can register a detection function • Allows arbitrary heuristics • Using a prediction table
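The four activation paths can be combined in one lookup, sketched below in Python. The port map, signatures, the Telnet heuristic, and the `predicted` table are all hypothetical stand-ins chosen for illustration; they are not Bro's actual data structures.

```python
import re

WELL_KNOWN_PORTS = {21: "ftp", 80: "http", 6667: "irc"}
SIGNATURES = {  # hypothetical loose signatures
    "http": re.compile(rb"^(GET|POST|HEAD) "),
    "irc": re.compile(rb"^(NICK|USER) "),
}
# Hypothetical detection function: Telnet option negotiation starts with IAC (0xff).
DETECTORS = {"telnet": lambda data: data.startswith(b"\xff")}
# Prediction table: (dst_ip, dst_port) -> analyzer, filled in by other analyzers.
predicted = {}

def analyzers_to_activate(dst_ip, dst_port, first_bytes):
    """Collect every analyzer suggested by port, signature, heuristic, or prediction."""
    chosen = set()
    if dst_port in WELL_KNOWN_PORTS:
        chosen.add(WELL_KNOWN_PORTS[dst_port])
    chosen.update(n for n, sig in SIGNATURES.items() if sig.search(first_bytes))
    chosen.update(n for n, fn in DETECTORS.items() if fn(first_bytes))
    if (dst_ip, dst_port) in predicted:
        chosen.add(predicted[(dst_ip, dst_port)])
    return sorted(chosen)

predicted[("192.0.2.10", 40000)] = "ftp-data"
print(analyzers_to_activate("192.0.2.10", 80, b"NICK bot\r\n"))
print(analyzers_to_activate("192.0.2.10", 40000, b"\x1f\x8b\x08"))
```

The first call starts both the port-suggested HTTP analyzer and the signature-suggested IRC analyzer; whichever fails to parse the traffic can then be removed.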
Deployment Trade-Offs • Protocol detection signatures • Loose signatures affordable • false positives fixed later • But too loose means slower • Analyzer is more expensive than pattern-matching • Improve accuracy with bidirectional signatures • Server must respond with the same protocol • Prevents attacker from intentionally triggering slow analyzers
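A bidirectional signature can be sketched as a pair of patterns, one per direction. This Python fragment is an assumption-laden illustration: `HTTP_SIG` and `bidirectional_match` are hypothetical names, and the patterns are simplified.

```python
import re

# Hypothetical signature pair: (client-side pattern, server-side pattern).
HTTP_SIG = (
    re.compile(rb"^(GET|POST|HEAD) \S+ HTTP/1\.[01]"),
    re.compile(rb"^HTTP/1\.[01] \d{3} "),
)

def bidirectional_match(sig, client_bytes, server_bytes):
    """Match only if both directions of the connection look like the protocol."""
    client_re, server_re = sig
    return bool(client_re.search(client_bytes)) and bool(server_re.search(server_bytes))

request = b"GET / HTTP/1.1\r\nHost: example.org\r\n\r\n"
print(bidirectional_match(HTTP_SIG, request, b"HTTP/1.1 200 OK\r\n"))
print(bidirectional_match(HTTP_SIG, request, b"220 ftp.example.org ready\r\n"))
```

Because the server side must agree, a client that merely sends an HTTP-looking request to a non-HTTP service does not activate the more expensive analyzer.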
Deployment Trade-Offs • At what point should an analyzer remove itself? • Real-world traffic is not perfect • Implementations can stretch protocol bounds • Should not parse the whole stream • Defeats the purpose of protocol analysis • Resolution: Analyzer should never disable itself • Generate Bro events on protocol violations • Allow user-level policy script to disable analyzer if necessary • E.g., after a certain number of violations
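The resolution above leaves the decision to policy. In Bro this would be a policy script reacting to events; the Python sketch below only mimics that division of labor, with a hypothetical `ViolationPolicy` class and an assumed threshold of three violations.

```python
from collections import Counter

MAX_VIOLATIONS = 3  # hypothetical threshold; the slide does not give a number

class ViolationPolicy:
    """User-level policy: disable an analyzer after repeated protocol violations."""

    def __init__(self, max_violations=MAX_VIOLATIONS):
        self.max_violations = max_violations
        self.counts = Counter()
        self.disabled = set()

    def on_protocol_violation(self, conn_id, analyzer):
        """Handle one violation event; return True once the analyzer is disabled."""
        key = (conn_id, analyzer)
        self.counts[key] += 1
        if self.counts[key] >= self.max_violations:
            self.disabled.add(key)
        return key in self.disabled

policy = ViolationPolicy(max_violations=2)
print(policy.on_protocol_violation("conn-1", "http"))  # first violation tolerated
print(policy.on_protocol_violation("conn-1", "http"))  # threshold reached
```

Keeping the counter outside the analyzer means a site can tolerate sloppy but benign implementations by raising the threshold without touching parser code.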
Structure • Prevalence of the problem • Approach for dynamic analysis in NIDS • Applications of new capabilities • Performance evaluation
New Capabilities • In summary, can now: • Detect connections on non-standard ports reliably • Includes protocols that use others as transport • E.g., distinguish Kazaa, BitTorrent, SOAP, etc. over HTTP • Inspect payload of FTP transfers • Detect IRC-based bots • This has successfully worked in the field
Reliable Real-Time Protocol Detection on non-Standard Ports • 1 day at UC Berkeley (MWN similar) • Connections on non-standard ports mainly HTTP • UCB: Split between real HTTP (e.g., Apache) and Gnutella • MWN: Similar, but more P2P (BitTorrent), also some FTP • Open HTTP proxies detected and closed • Open SMTP relay detected and closed
Payload Inspection of FTP Data Transfers • FTP data transfers use arbitrary ports • Identify based on prior PORT, PASV • Dynamically added to prediction table • Check connection payload using libmagic • Actual file type == expected file type? • E.g., could find rootkit tarball sent in .jpg • Determined using file analyzer • Extension: • Use same mechanism for SMTP (mail attachments)
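Both steps can be sketched in Python: parsing a PASV reply to learn where the data connection will go, and comparing a file's leading bytes with its extension. To stay self-contained the sketch uses a tiny hand-written magic-number table as a stand-in for libmagic; the function names and tables are assumptions for illustration.

```python
import os
import re

PASV_RE = re.compile(rb"^227 .*\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")

def parse_pasv(reply: bytes):
    """Extract (ip, port) from an FTP 227 reply; the port is p1 * 256 + p2."""
    m = PASV_RE.search(reply)
    if m is None:
        return None
    h1, h2, h3, h4, p1, p2 = (int(x) for x in m.groups())
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2

# Tiny magic-number table standing in for libmagic (assumption for this sketch).
MAGIC = {b"\xff\xd8\xff": "jpeg", b"\x1f\x8b": "gzip", b"\x89PNG\r\n\x1a\n": "png"}
EXTENSIONS = {".jpg": "jpeg", ".jpeg": "jpeg", ".gz": "gzip", ".tgz": "gzip", ".png": "png"}

def sniff_type(payload: bytes):
    """Guess the file type from the leading bytes, or None if unknown."""
    for magic, kind in MAGIC.items():
        if payload.startswith(magic):
            return kind
    return None

def type_mismatch(filename: str, payload: bytes) -> bool:
    """True if the extension promises one type but the content is another."""
    expected = EXTENSIONS.get(os.path.splitext(filename)[1].lower())
    return expected is not None and sniff_type(payload) != expected

predicted = {}  # prediction table: (ip, port) -> analyzer for the expected connection
endpoint = parse_pasv(b"227 Entering Passive Mode (192,168,1,2,19,137)\r\n")
predicted[endpoint] = "ftp-data"
print(endpoint)
print(type_mismatch("holiday.jpg", b"\x1f\x8b\x08\x00rest-of-tarball"))
```

A gzip stream behind a `.jpg` name is flagged, which corresponds to the rootkit-tarball case mentioned on the slide.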
Detecting IRC Based Botnets • Idea • Botnet communication often uses IRC • Botnet detector on top of IRC analyzer • Check nicknames • Check channel names • Check contact to identified bot-servers • Key consideration: must analyze IRC dialog seen off-port • Because lots of benign IRC runs off-port too… • > 100 bots found at MWN+UCB • MWN employs auto-blocking based on detector • Not as adept at detecting custom protocols
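The three checks can be sketched on top of parsed IRC lines. Everything specific in this Python example is hypothetical: the nickname pattern, the channel names, and the server address (taken from a documentation-only range) are placeholders, not the detector's real rules.

```python
import re

# Hypothetical bot-style nicknames such as "[DEU]-48291" or "USA|12345".
BOT_NICK_RE = re.compile(r"^(\[[A-Z]{2,3}\][|-]?\d{3,}|[A-Z]{2,3}\|\d{3,})$")
SUSPICIOUS_CHANNELS = {"#owned", "#botnet"}   # hypothetical channel names
KNOWN_BOT_SERVERS = {"203.0.113.7"}           # hypothetical, documentation address

def irc_bot_indicators(server_ip, lines):
    """Return the indicators triggered by one client's IRC dialog."""
    hits = []
    if server_ip in KNOWN_BOT_SERVERS:
        hits.append("known-server")
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        if parts[0] == "NICK" and BOT_NICK_RE.match(parts[1]):
            hits.append("nick")
        elif parts[0] == "JOIN" and parts[1].lower() in SUSPICIOUS_CHANNELS:
            hits.append("channel")
    return hits

print(irc_bot_indicators("198.51.100.1", ["NICK [DEU]-48291", "JOIN #owned"]))
print(irc_bot_indicators("198.51.100.1", ["NICK alice", "JOIN #python"]))
```

The function works on the IRC dialog itself and never looks at the port, which matches the key consideration that both bots and benign IRC run off-port.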
Performance • New framework does not add significant additional overhead • Performance cost is about 13.8% between PIA-Bro-M4K and Stock-Bro • Protocol detection (signature matching on all packets) is expensive but doable • Solutions: • Specialized hardware • Load balancing possible
Summary • Network traffic resists classification by port • General framework for dynamic protocol analysis • Use signatures to pre-filter for efficiency • Use application parsing to make high-quality decisions • Accurate enough for auto-blocking of bots on a large-scale network • Plus detection of illicit relays and servers • Integrated into Development Release 1.2 of Bro