Large scale IDS

1. Large scale IDS Network Intrusion Detection Deployment, Data Mining, and Management on a large scale

2. Who are we? Jeff Nathan jeff@wwti.com Contributing Snort Developer IDS Researcher Brian Caswell bmc@snort.org Snort Signature Maintainer Corporate IDS Team Leader

3. What are we discussing? No IDS is perfect Deployment concepts Sensor management Real-time IDS Data Management Data Mining Data Fusion Cost

4. No IDS is perfect Even ID systems have had problems Snort's ICMP payload printing issue BlackICE's ICMP DoS/kernel-level overflow Dragon's SNMP decoding DoS And we haven't started talking about detecting attacks yet... Snort was susceptible to DoS in sniffer mode due to an ICMP printing bug. BlackICE's handling of ICMP contained a kernel-level overflow that led to an exploitable condition. Dragon was susceptible to DoS due to a bug in SNMP decoding. Of all the major IDS products, only Snort received press after the DoS was discovered.

5. ID Systems are still evolving Resolving the ambiguities of passive detection Drawbacks of using a single detection mechanism Inline technologies Scalability is still not proven Did the end host accept the packet? Virtually every IDS developer will tell you they use multiple ID technologies to prove their own technology and expand their coverage. Inline technologies are new and unexplored in large environments where network availability is the prime directive. The scalability of existing systems into large environments has not been directly addressed by vendors.

6. No love between the children Vendors have unique detection capabilities Some ID systems will not integrate with others at all Correlating events between different systems is difficult Signature detection vs. pure protocol decoding (a semantic issue) Proprietary management/alerting/logging mechanisms do not integrate well Even with compatible output, correlation between systems is largely unavailable.

7. Wait, what about CVE? Does not cover everything an IDS looks at porno-fantastico, GOL! Busca el lubrificante CIEL Project CVE Sub-Project for IDS mappings Descriptions of detected attacks vary between vendors CVE Compatibility helps, but it isn't a complete solution CIEL = Common Intrusion Event List Porno-fantastico, GOL! Busca el lubrificante = kickass porn, SCORE! Get the lotion! (snort classification for potential pornographic materials)

8. Pre-deployment discussion How many people will view the output? What are their skill levels? Where should we place our IDS? How do I train my analysts? How many people ... that know IP forensics? Were the people that designed your network sane? If you don't have people trained in IP forensics, how DO you train them?

9. Deployment mechanics IDS Technologies "Gigabit" ID systems http://www.cs.um.edu.mt/~ssrg/Wallace.htm Considering multiple systems Managing a large number of sensors

10. Mechanics - Taps Good Ideal for monitoring critical pipes Fail open Nothing to manage Nothing to configure Bad Copper taps require high end switches Requires more rack space Cost Switches must be capable of spanning to combine tap ports

11. Mechanics - Switches Good Can be used inline Provides some degree of buffering Remotely managed High end switches can aggregate multiple tapped segments together Bad Fail closed Insufficient backplane bandwidth really hurts Oversubscription between mixed media and in overloaded switches

12. Mechanics - Spans Good Copy Ethernet frames from one physical port to another Can be used for both tap & switch-only deployments Can be modified by switch configuration (instead of moving cables) Bad Used for both tap & switch-only deployments Computationally expensive to the switch May deliver more data to a port than the media can handle

13. Mechanics - Load Balancing Good Allow for "practical" deployment into high-speed networks Easiest mechanism for deploying multiple sensors at the same location Bad Tap vendors don't work with load balancer vendors Little practical documentation for enterprise environments Introduce a possible point of data mangling Limited port density Requires taps Expensive

17. Mechanics - Sensor Management Number of solutions, most are very expensive Tivoli NSH Cfchange Rsync CVS

18. Mechanics - Sensor management with rsync Good Centrally managed Remote sensors can't log in to the "master" Bad Difficult to scale push Each configuration requires a separate rsync directory

19. Mechanics - Sensor management with CVS Good Centrally managed Pull generally scales better than push Multiple configurations managed together Entire operating systems can be managed via CVS Bad Difficult to manage Abuses CVS Management of OS adds much more complexity

20. Real-time IDS doesn't scale On a typical SDSL line: 5 alerts per minute 300 alerts per hour 7200 alerts per day On a typical T1: 50 alerts per minute 3000 alerts per hour 72000 alerts per day On a highly utilized DS3: 8 alerts per second 480 alerts per minute 28800 alerts per hour 691200 alerts per day

21. A non-scalable approach If each alert takes 30 seconds to examine, you need 120 analysts that work around the clock When will they eat? When will they sleep? When will they use the bathroom?
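The staffing arithmetic behind slides 20 and 21 can be sketched as below. This is a minimal sketch, not something from the talk: the function names are illustrative, and it assumes analysts examine alerts back to back with no breaks. Under those assumptions the DS3 rate from slide 20 (480 alerts per minute) at 30 seconds per alert works out to 240 analysts on shift at any moment; the figure of 120 quoted on the slide matches 15 seconds per alert.

```python
import math

MINUTES_PER_DAY = 24 * 60

def alerts_per_day(alerts_per_minute):
    """Daily alert volume for a steady per-minute alert rate."""
    return alerts_per_minute * MINUTES_PER_DAY

def analysts_needed(alerts_per_minute, seconds_per_alert):
    """Analysts who must be working at any instant to keep up.

    Each minute brings alerts_per_minute * seconds_per_alert seconds of
    review work, and one analyst supplies 60 seconds of work per minute.
    """
    return math.ceil(alerts_per_minute * seconds_per_alert / 60)

# Alert rates taken from slide 20, expressed per minute.
LINK_RATES = {"SDSL": 5, "T1": 50, "DS3": 480}

for link, rate in LINK_RATES.items():
    print(link, alerts_per_day(rate), analysts_needed(rate, 30))
```

Covering three shifts, holidays, and breaks multiplies the headcount further, which is the point of the slide.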

22. Stuck on the non-scalable? Better stock up on Red Bull and catheters for your SOC Look into purchasing stock in Red Bull GmbH

23. Data Management Data Format IDMEF Security MIBS Syslog Something that scales

24. Security MIBS Divides the alert space into different spaces: IP Layer Transport Layer Protocol Layer

25. Security MIBS - Example TCP SYN flood attack: tcpSYNFlood OBJECT Identifier ::= {iso 3.6.1.5.1.3.1.1} Sub-objects for additional information tcpSYNFlood.src OBJECT Identifier ::= {iso 3.6.1.5.1.3.1.1.1} tcpSYNFlood.dest OBJECT Identifier ::= {iso 3.6.1.5.1.3.1.1.1.2}

26. Security MIBS - good & bad Good ASN.1 is widely supported Widely documented SNMP is a standard Bad ASN.1 is difficult to implement Difficult to read SNMP is still immature SNMP v3 implementations are rare PROTOS anyone?

27. CIDF - Common Intrusion Detection Framework Initial DARPA Research by Teresa Lunt and Stuart Staniford-Chen among others S-Expressions Actually use Generalized Intrusion Detection Objects (GIDO) Encoded version of an S-expression Work spurred on the Intrusion Detection Working Group (IDWG)

28. CIDF - Example (Delete (Context (HostName "first.example.com") (Time "16:40:32 Jun 14 1998") ) (Initiator (UserName "joe") ) (Source (FileName "/etc/passwd") ) )
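To make the S-expression form concrete, here is a minimal sketch of a reader that turns the example above into nested Python lists. It handles only the readable S-expression text (parentheses, bare atoms, and double-quoted strings without escapes), not the encoded GIDO form, and the name `parse_sexpr` is illustrative rather than part of any CIDF library.

```python
import re

# Tokens: an opening paren, a closing paren, a quoted string, or a bare atom.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')

def parse_sexpr(text):
    """Parse a single S-expression into nested lists of strings."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise ValueError("missing closing parenthesis")
            pos += 1  # consume the ")"
            return items
        if tok == ")":
            raise ValueError("unexpected ')'")
        return tok.strip('"')

    result = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the expression")
    return result

example = (
    '(Delete (Context (HostName "first.example.com") '
    '(Time "16:40:32 Jun 14 1998")) '
    '(Initiator (UserName "joe")) '
    '(Source (FileName "/etc/passwd")))'
)
tree = parse_sexpr(example)
print(tree[0], tree[1][1], tree[2], tree[3])
```

Each list starts with the tag name, so `tree[0]` is the action and the remaining items are its clauses.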

29. CIDF - good & bad Good Very extensible in S-expression form Easily readable in S-expression form Bad Work stopped in '99 Not actually implemented anywhere Difficult to parse Not as efficient as other reporting formats

30. IDMEF - Intrusion Detection Message Exchange Format Primary usage Sensor to console Console to console Actual Implementations libidmef & BEEP snort-idmef prelude STAT (www.cs.ucsb.edu/~rsg/STAT/) Unanswered questions: Storage Viewing Data COTS implementations?

31. IDMEF - Example <IDMEF-Message version="0.3"> <Alert ident="abc123456789" impact="attempted-dos"> <Analyzer analyzerid="bc-sensor01"> <Node category="dns"> <name>sensor.bigcompany.com</name> </Node> </Analyzer> <CreateTime ntpstamp="0x12345678.0x98765432"> 2000-03-09T10:01:25.93464Z </CreateTime> <Source ident="a1a2" spoofed="yes"> <Node ident="a1a2-1"> <Address ident="a1a2-2" category="ipv4-addr"> <address>222.121.111.112</address>

32. IDMEF - Example (continued) </Address> </Node> </Source> <Target ident="b3b4"> <Node> <Address ident="b3b4-1" category="ipv4-addr"> <address>123.234.231.121</address> </Address> </Node> </Target> <Target ident="c5c6"> <Node ident="c5c6-1" category="nisplus"> <name>lollipop</name> </Node> </Target>

33. IDMEF - Example (still going...) <Target ident="d7d8"> <Node ident="d7d8-1"> <location>Cabinet B10</location> <name>Cisco.router.b10</name> </Node> </Target> <Classification origin="cve"> <name>CVE-1999-128</name> <url>http://www.cve.mitre.org/</url> </Classification> </Alert> </IDMEF-Message>
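A console that receives a message like the one on slides 31 to 33 has to pull a few fields out of it before it can store or display anything. The sketch below does that with Python's standard `xml.etree.ElementTree`, using a shortened copy of the example. It assumes the un-namespaced draft layout shown on the slides; `summarize_alert` and the keys of its result are illustrative names, not part of any IDMEF library.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the slide example (draft version 0.3, no XML namespace).
IDMEF_XML = """
<IDMEF-Message version="0.3">
  <Alert ident="abc123456789" impact="attempted-dos">
    <Analyzer analyzerid="bc-sensor01">
      <Node category="dns"><name>sensor.bigcompany.com</name></Node>
    </Analyzer>
    <Source ident="a1a2" spoofed="yes">
      <Node ident="a1a2-1">
        <Address ident="a1a2-2" category="ipv4-addr">
          <address>222.121.111.112</address>
        </Address>
      </Node>
    </Source>
    <Target ident="b3b4">
      <Node>
        <Address ident="b3b4-1" category="ipv4-addr">
          <address>123.234.231.121</address>
        </Address>
      </Node>
    </Target>
    <Classification origin="cve">
      <name>CVE-1999-128</name>
    </Classification>
  </Alert>
</IDMEF-Message>
"""

def summarize_alert(xml_text):
    """Extract sensor id, addresses, and classification from one alert."""
    root = ET.fromstring(xml_text.strip())
    alert = root.find("Alert")

    def addresses(tag):
        path = f"{tag}/Node/Address/address"
        return [node.text.strip() for node in alert.findall(path)]

    return {
        "sensor": alert.find("Analyzer").get("analyzerid"),
        "impact": alert.get("impact"),
        "sources": addresses("Source"),
        "targets": addresses("Target"),
        "classification": alert.findtext("Classification/name"),
    }

summary = summarize_alert(IDMEF_XML)
print(summary)
```

Targets identified only by name or location (such as the NIS+ node on slide 32) carry no address element, so a real consumer would need extra handling for them.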

34. Syslog Bastard stepchild of IDS alert delivery Unreliable No guarantee of delivery ASCII only format

35. Syslog - good & bad Good Easy to parse Human readable Widely supported Already deployed in your infrastructure Bad Difficult to secure Unreliable No guarantee of delivery

36. Data Exchange - A practical approach Requirements: Portable Small Flexible Handles the data you need Readable by your end system Compressible Human readable Dragon uses MySQL. Sourcefire uses an embedded DB. ISS uses Access (or MSSQL, depending on the version). SNP uses MSSQL.

37. Data Exchange - CSV How about CSV? Natively supported by most of the ID systems In the format we need for our data warehouse anyway
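As a rough illustration of CSV as the exchange format, the sketch below writes alerts to CSV text and reads them back with Python's standard `csv` module. The column set in `FIELDS`, the function names, and the sample alert (including its addresses and timestamp) are assumptions made for the example; the slides do not specify a column layout.

```python
import csv
import io

# Hypothetical column layout; adjust to whatever the warehouse expects.
FIELDS = ["timestamp", "sensor", "signature",
          "src_ip", "src_port", "dst_ip", "dst_port"]

def alerts_to_csv(alerts):
    """Serialize a list of alert dicts to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(alerts)
    return buf.getvalue()

def csv_to_alerts(text):
    """Parse CSV text back into a list of alert dicts (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))

alerts = [{
    "timestamp": "2002-01-01T15:00:00Z",
    "sensor": "sensor01",
    "signature": "WEB-IIS .bat? access",
    "src_ip": "192.0.2.10", "src_port": "1042",
    "dst_ip": "198.51.100.5", "dst_port": "80",
}]
csv_text = alerts_to_csv(alerts)
round_trip = csv_to_alerts(csv_text)
print(csv_text)
```

CSV carries no types, so ports and timestamps come back as strings and need converting when they are loaded into the warehouse.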

38. Data Storage Star Schema: Single "fact table" Multiple decode tables Why should we use this schema? Maximum flexibility Low maintenance Best performance for the most needed information
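A star schema of the kind described above can be sketched with Python's built-in `sqlite3`. Everything here is illustrative: the table and column names, the helper functions, and the sample rows are assumptions, and a production warehouse would use a full database server rather than an in-memory SQLite file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Decode tables: one row per distinct sensor or signature.
CREATE TABLE sensor    (sensor_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE signature (sig_id    INTEGER PRIMARY KEY, name TEXT UNIQUE);
-- Single fact table: one narrow row per alert, pointing at the decode tables.
CREATE TABLE alert_fact (
    alert_id  INTEGER PRIMARY KEY,
    ts        TEXT    NOT NULL,
    sensor_id INTEGER NOT NULL REFERENCES sensor(sensor_id),
    sig_id    INTEGER NOT NULL REFERENCES signature(sig_id),
    src_ip    TEXT,
    dst_ip    TEXT,
    dst_port  INTEGER
);
""")

def decode_id(table, key_col, name):
    """Return the decode-table id for name, inserting the row if needed."""
    conn.execute(f"INSERT OR IGNORE INTO {table} (name) VALUES (?)", (name,))
    row = conn.execute(
        f"SELECT {key_col} FROM {table} WHERE name = ?", (name,)).fetchone()
    return row[0]

def add_alert(ts, sensor, signature, src_ip, dst_ip, dst_port):
    conn.execute(
        "INSERT INTO alert_fact (ts, sensor_id, sig_id, src_ip, dst_ip, dst_port)"
        " VALUES (?, ?, ?, ?, ?, ?)",
        (ts, decode_id("sensor", "sensor_id", sensor),
         decode_id("signature", "sig_id", signature),
         src_ip, dst_ip, dst_port))

add_alert("15:00:00", "sensor01", "WEB-IIS .bat? access", "192.0.2.10", "198.51.100.5", 80)
add_alert("15:00:02", "sensor02", "WEB-IIS .bat? access", "192.0.2.10", "198.51.100.5", 80)
add_alert("15:01:00", "sensor01", "ICMP PING", "192.0.2.77", "198.51.100.9", 0)

# Typical warehouse question: which signatures fire most often?
top = conn.execute("""
    SELECT s.name, COUNT(*) AS hits
    FROM alert_fact f JOIN signature s USING (sig_id)
    GROUP BY s.name
    ORDER BY hits DESC, s.name
""").fetchall()
print(top)
```

The fact table stays small because repeated strings live once in the decode tables, and the most common queries are a single join away.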

40. Data Storage - Good & Bad

41. Data Mining Association Clustering Deviation Analysis Link or Tree abduction Neural Abduction Rule Abduction Statistical Analysis
- Association: analysis of cause-and-effect and the structure of relationships between datasets.
- Clustering: segments data into subsets that share common properties.
- Deviation Analysis: analyzes deviations from normal statistical behavior.
- Link or Tree abduction: discovers relationships between data sets and interesting connecting pattern properties.
- Neural Abduction: trains artificial neural networks to match data, then extracts node weights and structure (similar to abducted rule sets).
- Rule Abduction: IF-THEN-ELSE rules that describe associations, structures and the test rules.
- Statistical Analysis: determines the likelihood of characteristics and associations in selected data sets.
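Of these techniques, deviation analysis is the simplest to show in a few lines. The sketch below flags hourly alert counts that sit more than a chosen number of standard deviations from the mean. It illustrates the general idea only and is not how any particular product computes anomalies; the function name, the threshold of 2.0, and the sample counts are assumptions.

```python
from statistics import mean, stdev

def deviations(counts, threshold=2.0):
    """Return (index, z_score) for samples far from the mean of counts."""
    mu = mean(counts)
    sigma = stdev(counts)
    if sigma == 0:
        return []  # every sample is identical, nothing deviates
    scored = ((i, (c - mu) / sigma) for i, c in enumerate(counts))
    return [(i, z) for i, z in scored if abs(z) > threshold]

# Hypothetical alerts per hour from one sensor; hour 6 is a burst.
hourly_counts = [300, 310, 295, 305, 298, 302, 900, 301]
outliers = deviations(hourly_counts)
print(outliers)
```

A single large burst inflates the standard deviation it is measured against, so real deployments usually compare against a baseline built from earlier, quieter periods.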

42. Data Mining - Implementations Spade (Snort Preprocessor Plugin) Deviation Analysis CyberWolf Semi real-time rule abduction

43. Spade Good Semi Real-time Distributed Computation Bad Limited scope Only looks at TCP SYN packets Anomalies happen

44. CyberWolf Semi Real-time Rule Abduction User defined rules that create incident trouble tickets Currently deployed at FEMA and AFRL Example: $A connects to $B on dstport 80 $A attacks $B with NIMDA if ($B attacks * with NIMDA) { generate_incident();} FEMA = Federal Emergency Management Agency (the shadow government agency) AFRL = Air Force Research Laboratory
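The logic of the example rule (a host that was attacked with NIMDA and then starts attacking others with NIMDA is an incident) can be sketched in Python as below. This is not CyberWolf's rule language or API; the event tuple layout, the function name, and the sample addresses are assumptions made for illustration.

```python
def correlate_nimda(events):
    """Find hosts that were hit by NIMDA and later attacked with it.

    events is an iterable of (src, dst, signature) tuples in time order.
    Returns a list of (infected_host, original_attacker) incidents, with
    at most one incident per infected host.
    """
    attacked_by = {}  # victim -> first host seen attacking it with NIMDA
    reported = set()
    incidents = []
    for src, dst, signature in events:
        if signature != "NIMDA":
            continue
        if src in attacked_by and src not in reported:
            incidents.append((src, attacked_by[src]))
            reported.add(src)
        attacked_by.setdefault(dst, src)
    return incidents

events = [
    ("10.0.0.1", "10.0.0.2", "NIMDA"),     # $A attacks $B
    ("10.0.0.3", "10.0.0.4", "PORTSCAN"),  # unrelated noise
    ("10.0.0.2", "10.0.0.5", "NIMDA"),     # $B now attacks someone else
    ("10.0.0.2", "10.0.0.6", "NIMDA"),     # same host again, no new incident
]
incidents = correlate_nimda(events)
print(incidents)
```

Four raw alerts collapse into one incident ticket, which is the reduction the analysts need.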

45. Data Fusion Unified data view Enterprise wide view Plug & Play IDS Vulnerability Correlation

46. Alert Fusion RealSecure - HTTP_IE_BAT Snort - WEB-IIS .bat? access Apache - GET /args.bat?dir HTTP/1.0 If multiple alerts occur at roughly the same time, with the same SRC, DST, SRCPORT, and DSTPORT, it's probably the same thing
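The fusion heuristic on this slide can be sketched as grouping alerts by flow and time. The code below is a minimal illustration under stated assumptions: alerts are dicts with the field names shown, timestamps are seconds as floats, and a 5 second window counts as "the same time". None of this reflects how ARIS or Tivoli Risk Manager implement fusion.

```python
def fuse_alerts(alerts, window=5.0):
    """Group alerts that share a flow and fall within window seconds.

    A flow is (src, dst, sport, dport). An alert joins the latest group
    for its flow if it arrives within window seconds of that group's
    first alert; otherwise it starts a new group.
    """
    groups = []
    latest = {}  # flow key -> most recent group opened for that flow
    for alert in sorted(alerts, key=lambda a: a["time"]):
        key = (alert["src"], alert["dst"], alert["sport"], alert["dport"])
        group = latest.get(key)
        if group is None or alert["time"] - group[0]["time"] > window:
            group = []
            groups.append(group)
            latest[key] = group
        group.append(alert)
    return groups

# One web request seen by three systems, plus an unrelated flow.
flow = {"src": "192.0.2.10", "dst": "198.51.100.5", "sport": 1042, "dport": 80}
other = {"src": "192.0.2.99", "dst": "198.51.100.5", "sport": 2001, "dport": 80}
alerts = [
    {**flow, "time": 100.0, "sensor": "RealSecure", "name": "HTTP_IE_BAT"},
    {**flow, "time": 100.2, "sensor": "Snort", "name": "WEB-IIS .bat? access"},
    {**flow, "time": 100.4, "sensor": "Apache", "name": "GET /args.bat?dir HTTP/1.0"},
    {**other, "time": 100.3, "sensor": "Snort", "name": "WEB-IIS .bat? access"},
]
fused = fuse_alerts(alerts)
for group in fused:
    print([a["sensor"] for a in group])
```

A group containing only one sensor is also useful: it is the starting point for asking why the other systems stayed silent.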

47. Ooh, Alert Fusion Good Provides integrity checking Sensor $A caught this, sensor $B didn't. Why? Vendor $A caught this, vendor $B didn't. Why? Implemented by ARIS and Tivoli Risk Manager Can anyone say CIEL?

48. Vulnerability Correlation 3:00PM - .ida buffer overflow attempt against IP A previous vulnerability scan says "may be vulnerable to .ida buffer overflow" foreach my $cve (keys %{$sigs{$event}}) { if ($vulns{$dstip}{$cve} || $vulns{$srcip}{$cve}) { $priority++; } } Implemented by: Enterasys provides addon for Nessus Correlation ISS's SiteProtector Security Fusion Module
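The same correlation step, written out in Python for readers who do not use Perl. This is a sketch under assumptions: `sigs` maps a signature name to the CVE identifiers it detects, `vulns` maps an IP address to the CVE identifiers a previous scan reported for it, and the identifier `CVE-0000-0000` is a placeholder rather than a real entry.

```python
def reprioritize(event, sigs, vulns, base_priority=1):
    """Raise an alert's priority once per CVE the involved hosts are vulnerable to.

    The alert is never dropped: a host that is not known to be vulnerable
    simply keeps the base priority.
    """
    priority = base_priority
    for cve in sigs.get(event["signature"], ()):
        dst_vulnerable = cve in vulns.get(event["dst_ip"], set())
        src_vulnerable = cve in vulns.get(event["src_ip"], set())
        if dst_vulnerable or src_vulnerable:
            priority += 1
    return priority

# Placeholder data: the CVE id and addresses are illustrative only.
sigs = {".ida buffer overflow attempt": ["CVE-0000-0000"]}
vulns = {"198.51.100.5": {"CVE-0000-0000"}}

hit = {"signature": ".ida buffer overflow attempt",
       "src_ip": "192.0.2.10", "dst_ip": "198.51.100.5"}
miss = {"signature": ".ida buffer overflow attempt",
        "src_ip": "192.0.2.10", "dst_ip": "198.51.100.9"}

print(reprioritize(hit, sigs, vulns), reprioritize(miss, sigs, vulns))
```

Checking the source address as well as the destination, as the original fragment does, also raises the priority when a vulnerable internal host is the one launching the attack.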

49. Wait, what about ip360? Get an alert? Scan to see if you are vulnerable Not scalable Scan your network, only look for things you are vulnerable to Don't you want to know if you are being attacked, even if you are not vulnerable? Scan your network, change priorities of alerts if you are vulnerable Don't ignore data because your scanner told you so, but raise the priority if you need to

50. Cost IDS is expensive Providing visibility into large networks requires a well implemented system (with lots of expensive hardware) Post processing of alert data and data mining techniques require commercial databases Large networks require many more sensors It costs money to protect money A poorly implemented solution adds little to the overall security

51. Conclusion Prioritization of alert data is critical Effectively deploying IDS is complicated Effectively deploying IDS on a large scale is much more complicated Integrating multiple vendors' products will remain difficult until CIEL takes hold and we (the users) push the vendors to add support

52. Concluding the Conclusion Effectively managing IDS output requires trained analysts Dynamic reprioritization of alert data before it's passed to the alerting mechanism is important Vendors need to investigate data mining mechanisms for post processing of alert information Large scale deployments of various ID systems require an incredible amount of work
