
Model-based Intrusion Detection for SCADA Networks

Model-based Intrusion Detection for SCADA Networks. Steven Cheung, Bruno Dutertre, Martin Fong, Ulf Lindqvist, Keith Skinner, Alfonso Valdes (valdes@csl.sri.com).



  1. Model-based Intrusion Detection for SCADA Networks Steven Cheung, Bruno Dutertre, Martin Fong, Ulf Lindqvist, Keith Skinner, Alfonso Valdes (valdes@csl.sri.com) This work was produced in part with support from the Institute for Information Infrastructure Protection (I3P) research program. The I3P is managed by Dartmouth College, and supported under Award number 2003-TK-TX-0003 from the U.S. Department of Homeland Security, Science and Technology Directorate. Points of view in this document are those of the author(s) and do not necessarily represent the official position of the U.S. Department of Homeland Security, the Science and Technology Directorate, the I3P, or Dartmouth College.

  2. Presentation Outline • Background • SRI Overview • The I3P SCADA Project • Intrusion detection approaches • IDS in PCS • Defense enabled architecture • Model based detection • Detect deviations from Modbus spec • Detect invalid communication patterns • Detect changes in service usage patterns • Detector based on formal model • Conclusion

  3. Who we are: SRI is a world-leading independent R&D organization • Founded by Stanford University in 1946 • A nonprofit corporation • Independent in 1970; changed name from Stanford Research Institute to SRI International in 1977 • Sarnoff Corporation acquired in 1987 (formerly RCA Laboratories) • 2,000 staff members combined • 800 with advanced degrees • More than 15 offices worldwide • Consolidated 2005 revenue: $390 million • Locations: SRI headquarters, Menlo Park, CA; Sarnoff Corporation, Princeton, NJ; Sarnoff India; SRI Taiwan; SRI – State College, PA; SRI – Tokyo, Japan; SRI – Washington, D.C.

  4. What we do: We create solutions that address your needs • Customer-sponsored R&D: From discovery, study, and evaluation to custom solutions on demand • Licenses: Innovative technologies ready for use • Ventures: Spin-off companies to capitalize on new opportunities • Partnership programs: Value creation programs to maximize your success

  5. Our focus areas: Multidisciplinary teams leverage developments from SRI’s core technology and research areas • Advanced Materials, Microsystems, and Nanotechnology • Health, Education, and Economic Policy • Biotechnology • Information Technology • Engineering and Systems (Figure: the focus areas surround SRI’s Value Creation Process™)

  6. Computing: SRI invented the foundations of personal computing • 1964–1968: SRI’s Doug Engelbart and team invented the computer mouse and demonstrated the foundations of personal computing. • Today: SRI leads development of CALO, the Cognitive Assistant that Learns and Organizes, to revolutionize how computers support decision makers. (Photo: President Bill Clinton presents Doug Engelbart with the 2000 National Medal of Technology)

  7. Intelligent robotics: SRI has pioneered robotics for 40 years • 1966–1972: SRI’s Shakey was the first mobile robot capable of reasoning about its actions. Elected to the Robot Hall of Fame in 2004. • Today: SRI’s Centibots, one of the first and largest teams of mobile coordinated robots, can explore, map, and survey unknown environments.

  8. Internet and networks: SRI was there “before the beginning” • 1969: SRI received (from UCLA) the first logon to the ARPANET, predecessor of the Internet. • 1970–1992: SRI ran the Network Information Center (NIC), the domain name registration clearinghouse for all Internet computer hosts connecting to the ARPANET and Internet. SRI assigned all .com, .org, and .gov domain names. • 1987: SRI’s pioneering network intrusion detection technology protects against malicious attacks. • Today: SRI administers the Cyber Security R&D Center for the Department of Homeland Security. The Center develops security technology for protection of the U.S. cyber infrastructure through partnerships between government and private industry, the venture capital community, and the research community.

  9. The Critical Infrastructure of the United States

  10. What is the I3P? • The Institute for Information Infrastructure Protection, funded by Congress, managed by Dartmouth College with oversight from DHS – www.thei3p.org • Established in 2001 to identify and address critical research problems facing our nation’s information infrastructure • Consortium of over 25 universities, non-profit research institutions, and federal labs

  11. What is this Research Project? • Two-year applied research effort to improve cyber security for control systems/SCADA • Help industry better manage risk by • providing risk characterization • developing and demonstrating new cyber security tools and technologies • enhancing sustainable security practices for control systems

  12. Why is this Project Important? Control systems are critically important to the safe and efficient operation of infrastructure systems but are vulnerable to cyber attacks: • Control systems security problems and remediation approaches are different from IT • Effects of cyber attacks on operations and interdependent infrastructures not well understood

  13. Project Goals • Demonstrated improved cyber security in the oil and gas infrastructure sector • New research findings • New technologies • Significantly increased awareness of • Security challenges and solutions • The capabilities of the I3P and its members

  14. Intrusion Detection Approaches • Signature: Match traffic to a known pattern of misuse • Stateless: String matching, single packet • Stateful: Varying degrees of protocol and session reconstruction • Good systems are very specific and accurate • Typically does not generalize to new attacks • Anomaly: Alert when something “extremely unusual” is observed • Learning based, sometimes statistical profiling • In practice, not used much because of false alarms • Learning systems are also subject to concept drift

  15. Intrusion Detection Approaches (2) • Probabilistic (Statistical, Bayes): A middle ground, with probabilistically encoded models of misuse • Some potential to generalize • Specification based (some group this with anomaly detection): Alert when observed behavior is outside of a specification • High potential for generalization and leverage against new attacks

  16. Our Hypothesis • Compared to enterprise systems, control systems exhibit constrained behavior: • Fixed topology • Regular communication patterns • Limited number of protocols • Simpler protocols • As such, specification- and model-based IDS approaches may be more feasible • Such an approach nicely complements a signature system • Benefits are a compact, inherently generalized knowledge base and the potential to detect zero-day attacks

  17. Pattern Anomaly Detection • Binary patterns • Fixed length: TCP flags • Variable length • Patterns of categorical-valued features • (Counts of) system calls • Ports • Observation matches P1 in D and X, P2 in A and D, but X has a low hit count • => P2 is a better match • Observation is assigned the label of P2 • Depending on whether P2 is rare or previously labeled malicious, generate an alert • New P2 has a little “X” (Figure: library patterns P1 = {D, E, X} and P2 = {A, D}; new observation = {A, D, X}; updated P2 = {A, D, X})

  18. Bayes Net Algorithms • Describe the world in terms of conditional probabilities • Model observables as nodes in a directed graph • Children get π (prior) messages from parents • Parents get λ (likelihood) messages from children • At leaf nodes, λ messages correspond to observations • Belief state is updated as new evidence is observed (Figure: π and λ message propagation in a tree fragment with nodes A, B, C, D, X, and Y)

  19. Learning, adaptation • Bayes models have a network structure and node parameters • Conditional probability tables, or CPTs • CPT(i,j) = P(child state = j | parent state = i) • We did not try to learn structure • CPTs can be learned off-line or adaptively • For real-world data, there is no ground truth • We observed “hypothesis capture” on very long runs • eBayes has optional capability to generate new hypotheses if no existing ones fit • Stability of learning and hypothesis generation are still research issues for us

  20. Model Transition and Update • New sessions start with a default prior over normal and attack hypotheses • Inference results in new belief • “In progress” alerts may be generated • This passes through a temporal transition model • Tends to decay back to normal • But once a session is sufficiently suspicious, it will be reported • New inference updates belief (Figure: new observations feed Bayes inference, whose belief passes through the transition function before the next update; example session types are Mail, FTP, and DICT)
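
The update cycle on this slide can be illustrated with a toy two-hypothesis model. This is only a sketch under stated assumptions: the hypothesis order (normal, attack), the likelihood values, and the transition matrix `T` are invented for illustration and are not the eBayes parameters.

```python
def bayes_update(prior, likelihood):
    """Posterior belief proportional to prior times observation likelihood."""
    post = [p * l for p, l in zip(prior, likelihood)]
    total = sum(post)
    return [x / total for x in post]


def transition(belief, T):
    """Pass the belief through a row-stochastic temporal transition matrix."""
    n = len(belief)
    return [sum(belief[i] * T[i][j] for i in range(n)) for j in range(n)]


# Hypothetical values: index 0 = normal, index 1 = attack.
prior = [0.9, 0.1]
likelihood = [0.1, 0.9]          # the observation looks attack-like
T = [[1.0, 0.0],                 # normal stays normal
     [0.2, 0.8]]                 # attack belief decays back toward normal

belief = bayes_update(prior, likelihood)
next_prior = transition(belief, T)
```

A reporting threshold on the attack component of `belief` would model the "sufficiently suspicious" condition; the threshold value is a tuning choice the slide does not give.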

  21. Approaches Provide Complementary Protection

  22. Models and Detection Approaches • Signature and probabilistic IDS model misuse • Anomaly approach empirically models “normal” system usage and behavior • Specification-based approach models what is allowable under the protocol specification • Also models “normal”, but in a different sense from what is typically meant in anomaly detection • Drawbacks of specification-based models: • For general enterprise systems, constructing models is expensive and difficult (system complexity, complexity of user activity) • Inaccurate models can lead to false alarms and/or missed detections

  23. IDS In PCS • Barrier defenses (switches, firewalls, network segmentation) are essential, but • An orthogonal view is essential to detect when these have been bypassed or penetrated • One detection approach may not alert on a critical exploit • Correlation of related events is essential to provide the operator coherent situational awareness

  24. EMERALD IDS for PCS • Multi-algorithm IDS appliance • Pattern Anomaly • Bayes analysis of TCP headers • Stateful protocol eXperts • Complemented by custom ruleset SNORT • Alerts (potentially from multiple IDS appliances) forwarded to correlation framework • PCS Enhancements • Digital Bond PCS rule set • Model Based Detection

  25. Models for Characterizing Acceptable Behavior • Protocol level: based on MODBUS protocol spec, for single field and dependent fields • Network access patterns, based on analysis of topology configuration • Service usage patterns, based on learned valid MODBUS function codes for monitored devices

  26. Protocol Model: Individual fields • MODBUS function codes are one byte • 256 possible values, but • MSB is used by servers to indicate exception • 0 is not valid, so the valid range is 1-127 • Range is partitioned into public, user-defined, and reserved • With no further knowledge, can construct a “weak specification” • Many actual devices support a much more limited set of codes • Permits definition of a stronger, more tailored specification
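
The difference between the weak and the tailored specification can be sketched as two checks on a function code. The device name `plc-1` and its supported-code set are hypothetical; a real deployment would fill the table from device documentation or observed traffic.

```python
def is_exception_response(code: int) -> bool:
    """Servers set the most significant bit to signal an exception."""
    return 0x80 <= code <= 0xFF


def weak_spec_ok(code: int) -> bool:
    """Weak specification: any request function code in 1-127."""
    return 1 <= code <= 127


# Hypothetical per-device table of supported function codes.
SUPPORTED = {"plc-1": {1, 2, 3, 4, 5, 6, 15, 16}}


def strong_spec_ok(device: str, code: int) -> bool:
    """Stronger specification: the code must also be supported by the device."""
    return weak_spec_ok(code) and code in SUPPORTED.get(device, set())
```

A code that passes the weak check but fails the strong one (for example 43 sent to `plc-1`) is protocol-legal yet unexpected for that device, which is the kind of event the tailored specification is meant to surface.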

  27. Protocol Model: Dependent Fields • Encode acceptable values of a field given the value of another field • Example dependent fields include length, subfunction codes, and arguments • For example, “read coils” function implies the length field is 6 • For other function codes, length varies but a range can be specified • Specifications for multiple ADUs: future work
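
A dependent-field check for a single Modbus/TCP ADU might look like the following sketch. The "read coils implies length 6" rule is from the slide; the range for function 16 (write multiple registers) is derived here from the public Modbus specification's 1-123 register limit and should be treated as an assumption, as should the rule table name `LENGTH_RULES`.

```python
import struct

# function code -> (min, max) allowed value of the MBAP length field
LENGTH_RULES = {
    0x01: (6, 6),      # read coils: unit id + function + address + quantity
    0x10: (9, 253),    # write multiple registers (assumed range, see lead-in)
}


def length_field_ok(adu: bytes) -> bool:
    """Check the MBAP length field against the function code of a request."""
    if len(adu) < 8:
        return False
    # MBAP header: transaction id, protocol id, length, unit id; then function code.
    _tid, _pid, length, _unit, function = struct.unpack(">HHHBB", adu[:8])
    rule = LENGTH_RULES.get(function)
    if rule is None:
        return True  # no dependent-field rule recorded for this code
    low, high = rule
    # The length field counts every byte after itself, so it must also
    # agree with the number of bytes actually present.
    return low <= length <= high and length == len(adu) - 6


good = struct.pack(">HHHBBHH", 1, 0, 6, 1, 0x01, 0, 8)
bad = struct.pack(">HHHBBHH", 1, 0, 7, 1, 0x01, 0, 8)
```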

  28. Detecting Unusual Communication Patterns • Specification of network access policies • Comm between Admin LAN and PCS LAN is restricted to that between Admin historian and PCS historian • PCS Master may communicate with Modbus PLC using Modbus-TCP • PCS historian may communicate with PCS Master • Domain controller may provide services to other hosts in the PCS LAN • Detection of exceptions is via SNORT rules • More complex networks (more devices) can be accommodated via IP address assignment with appropriate subnet masks
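
The slide states that the deployed detection uses SNORT rules; the sketch below only restates the four access policies as a Python predicate to make the logic explicit. All IP addresses and the `HOSTS` table are hypothetical, and treating the historian rules as bidirectional is an assumption. Port 502 is the standard Modbus-TCP port.

```python
from ipaddress import ip_address, ip_network

# Hypothetical addressing for the testbed roles named on the slide.
PCS_LAN = ip_network("10.0.2.0/24")
HOSTS = {
    "admin_historian": ip_address("10.0.1.10"),
    "pcs_historian": ip_address("10.0.2.10"),
    "pcs_master": ip_address("10.0.2.20"),
    "modbus_plc": ip_address("10.0.2.30"),
    "domain_controller": ip_address("10.0.2.5"),
}
MODBUS_TCP_PORT = 502


def flow_allowed(src: str, dst: str, dport: int) -> bool:
    """Return True if the flow matches one of the stated access policies."""
    s, d = ip_address(src), ip_address(dst)
    h = HOSTS
    if {s, d} == {h["admin_historian"], h["pcs_historian"]}:
        return True  # the only permitted Admin LAN <-> PCS LAN traffic
    if s == h["pcs_master"] and d == h["modbus_plc"] and dport == MODBUS_TCP_PORT:
        return True  # master talks Modbus-TCP to the PLC
    if {s, d} == {h["pcs_historian"], h["pcs_master"]}:
        return True  # historian and master may communicate
    if d == h["domain_controller"] and s in PCS_LAN:
        return True  # PCS hosts may use domain controller services
    return False
```

Under this sketch, a compromised PCS historian sending Modbus-TCP to the PLC is flagged, because only the PCS master is allowed to do so.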

  29. Detecting Changes in Server/Service Availability • EMERALD Bayes component includes TCP service monitoring • New service discovery (suspicious in a “stable” system) • Service up/down/distress • Modifies probability models and makes the component more accurate • EMERALD SCADA includes analogous capability for MODBUS function codes • Alerts when a device responds to a new function code (MODBUS service discovery) • Alerts when a function code previously considered valid for a device results in error replies
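
The two MODBUS alerts described above can be sketched as a small monitor over request/response pairs. The class name, alert strings, and the choice to add a newly discovered code to the learned set after alerting once are all assumptions made for illustration, not details of EMERALD.

```python
from typing import Dict, Optional, Set


class FunctionCodeMonitor:
    """Tracks which function codes each device is known to answer."""

    def __init__(self, learned: Dict[str, Set[int]]):
        self.valid = {dev: set(codes) for dev, codes in learned.items()}

    def observe(self, device: str, request_code: int,
                response_code: int) -> Optional[str]:
        known = self.valid.setdefault(device, set())
        # An exception reply echoes the request code with the MSB set.
        if response_code == (request_code | 0x80):
            return "service_error" if request_code in known else None
        if response_code == request_code and request_code not in known:
            known.add(request_code)  # report the discovery only once
            return "new_service"
        return None


monitor = FunctionCodeMonitor({"plc": {1, 3}})
```

For example, `monitor.observe("plc", 8, 8)` reports a new service the first time the PLC answers function code 8, and `monitor.observe("plc", 1, 0x81)` reports an error on a code that was previously valid.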

  30. Complete Formal Model in PVS • PVS: Prototype Verification System • Expressive specification language (higher order logic) + powerful theorem prover • Other tools available in PVS: • Model checker • Compiler and execution environment for a subset of the PVS language • Model-based IDS in PVS: • Full specification of Modbus protocol in PVS • Customizable to the actual system (e.g., which functions/addresses are used). More complete and precise than SNORT-based model

  31. From PVS model to IDS • PVS Model: • Specifies correct Modbus requests and valid responses to requests • Defined by two PVS predicates with signatures acceptable_request: [packet -> bool] and valid_response: [request, packet -> bool] • These predicates are in the executable fragment of the PVS language • IDS: use the model online • Compile the predicates into executable code (uses the PVS compilation/evaluation tools) • Check for violations at runtime: intercept requests/responses and evaluate the predicates.
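
To show the shape of the two predicates, here is a Python sketch covering only the read coils function (code 1). It is not the PVS model: the `Packet` type is hypothetical, and the coil-quantity limit of 2000 and the response byte-count rule are taken from the public Modbus specification rather than from the slides.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    function: int   # Modbus function code
    data: bytes     # remainder of the PDU


def acceptable_request(p: Packet) -> bool:
    """Is this a well-formed request? Only read coils is modeled here."""
    if p.function == 0x01:
        if len(p.data) != 4:
            return False
        _address, quantity = struct.unpack(">HH", p.data)
        return 1 <= quantity <= 2000
    return False  # other functions would be added per monitored system


def valid_response(request: Packet, response: Packet) -> bool:
    """Is the response consistent with an acceptable request?"""
    if response.function == (request.function | 0x80):
        return len(response.data) == 1  # a single exception code byte
    if response.function != request.function:
        return False
    if request.function == 0x01:
        _address, quantity = struct.unpack(">HH", request.data)
        expected = (quantity + 7) // 8  # coils are packed eight per byte
        return (len(response.data) == 1 + expected
                and response.data[0] == expected)
    return False


req = Packet(0x01, struct.pack(">HH", 0, 10))
```

An online monitor would pair each intercepted response with its pending request and raise an alert whenever either predicate evaluates to false.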

  32. Testbed Architecture • PCS-Enabled NIDS/Mcorr Appliance • Alerts and Diagnostics • SHARP (PNNL) • SecSS (Tulsa) • APT (UIUC)

  33. Experimental Scenario (1) • Internet attacker achieves privileged access to the corporate network (Admin PC). • The attacker downloads hack tools to the compromised corporate network host, and sets up a tftp server for his tools. • The attacker scans the network (Admin LAN) and discovers the Admin Historian on the corporate network. • The attacker achieves elevated access to the Admin Historian, and learns of a data relationship to a PCS Historian on the other side of a firewall. The Admin Historian is subsequently pushed off the network, and the Admin PC assumes its IP. • The attacker scans the PCS Historian from the Admin PC.

  34. Scenario (2) • The attacker discovers a vulnerable authentication service on the PCS Historian host, visible because of a bad firewall configuration on the PCS FW. The service is subsequently exploited to connect with system privilege to the PCS Historian. • The attacker downloads a "rogue master" and other tools to the compromised PCS Historian via tftp from the Admin PC. The PCS Historian now serves as the launching point for subsequent attacks, directed from the Admin PC. • The attacker scans the PCS network (PCS LAN) and discovers a vulnerable PCS Master box. • The attacker launches an attack to take down the PCS Master. • The attacker initiates a Modbus device scan on the PCS LAN and discovers the PLC. Subsequently, a Modbus command is sent to the PLC to close a contact; a light/indicator illuminates.

  35. Detections • Scans: Bayes sensor, unusual comms • Aggregation presents thousands of probes as a single alert • Compromise exploits (UPNP): SNORT Bleeding Edge • tftp: Unusual comms • MODBUS exploits: New MODBUS services, spec-based detection, Digital Bond rule set

  36. Alert Summary

  37. Summary • Barrier protections are essential in PCS • DMZ • Switches, firewalls, VPN • IDS is an important orthogonal defense • Model based approach using protocol specs is a feasible complement to signature IDS in control systems • Multi-component, multi-approach detection provides complementary views of an attack • Alert correlation presents actionable situational awareness picture

  38. I3P Houston Workshop • Workshop will provide: • Overview of threats to PCS • Demonstration of vulnerabilities in PCS • Technology demonstration • Training in risk management, security tools, and mitigation strategies • Opportunity for dialog with industry leaders • Sheraton Brookhollow Houston • February 15-16 • www.thei3p.org

  39. Backup

  40. Similarity Function • Generalizes N(Intersection)/N(Union) • “Intersection” is the sum of the min probabilities where the patterns intersect • “Union” is the maximal probability where either pattern is non-zero
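
One reading of this definition is the ratio of summed feature-wise minima to summed feature-wise maxima, which reduces to N(Intersection)/N(Union) for binary patterns. The sketch below assumes that reading and represents a pattern as a dictionary from feature name to probability; both choices are assumptions, not the authors' code.

```python
def similarity(p: dict, q: dict) -> float:
    """Sum of min probabilities over sum of max probabilities."""
    features = set(p) | set(q)
    intersection = sum(min(p.get(f, 0.0), q.get(f, 0.0)) for f in features)
    union = sum(max(p.get(f, 0.0), q.get(f, 0.0)) for f in features)
    return intersection / union if union > 0 else 0.0


observation = {"A": 1.0, "D": 1.0, "X": 1.0}
p2 = {"A": 1.0, "D": 1.0}
score = similarity(observation, p2)
```

With binary patterns the observation {A, D, X} and the library pattern {A, D} share two of three features, so the score is 2/3.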

  41. Picking the Winner • Library patterns “compete” for new pattern • Winner is most similar as long as similarity is over a set threshold • Winner is slightly modified to include a little of the new pattern.
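
The competition and update step can be sketched as follows, reusing the min/max similarity reading from the previous slide. The threshold of 0.5 and the blending weight `alpha` of 0.1 are hypothetical values, and returning `None` when no pattern clears the threshold is an assumption about how the no-winner case is handled.

```python
def similarity(p: dict, q: dict) -> float:
    feats = set(p) | set(q)
    inter = sum(min(p.get(f, 0.0), q.get(f, 0.0)) for f in feats)
    union = sum(max(p.get(f, 0.0), q.get(f, 0.0)) for f in feats)
    return inter / union if union > 0 else 0.0


def pick_winner(library: list, obs: dict, threshold: float = 0.5):
    """Index of the most similar library pattern, or None if below threshold."""
    if not library:
        return None
    scores = [similarity(p, obs) for p in library]
    best = max(range(len(library)), key=scores.__getitem__)
    return best if scores[best] >= threshold else None


def blend(winner: dict, obs: dict, alpha: float = 0.1) -> dict:
    """Move the winner slightly toward the new observation."""
    feats = set(winner) | set(obs)
    return {f: (1 - alpha) * winner.get(f, 0.0) + alpha * obs.get(f, 0.0)
            for f in feats}


library = [{"D": 1.0, "E": 1.0, "X": 0.1}, {"A": 1.0, "D": 1.0}]
obs = {"A": 1.0, "D": 1.0, "X": 1.0}
idx = pick_winner(library, obs)
updated = blend(library[idx], obs)
```

In this toy run the second pattern wins and gains a small weight on feature X, mirroring the "new P2 has a little X" example from the pattern anomaly slide.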

  42. Determining “Rare” • If large number of patterns is learned, many may be rare • Alert on tail probability • Technique does not work for large number of patterns, but tail prob approach does no harm
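
One common way to compute a tail probability, assumed here since the slide does not define it, is to sum the empirical probabilities of all patterns that are no more frequent than the matched one. The counts and the alert threshold below are invented for illustration.

```python
def tail_probability(counts: dict, label: str) -> float:
    """Total probability mass of patterns at most as frequent as `label`."""
    total = sum(counts.values())
    c = counts[label]
    return sum(v for v in counts.values() if v <= c) / total


def is_rare(counts: dict, label: str, threshold: float = 0.05) -> bool:
    """Alert when the tail probability falls below a hypothetical threshold."""
    return tail_probability(counts, label) < threshold


# Hypothetical hit counts for four learned patterns.
counts = {"a": 90, "b": 6, "c": 3, "d": 1}
```

Unlike a per-pattern frequency cutoff, this measure accounts for how much combined mass sits in the rare patterns, so a large library of individually rare patterns does not make every match look anomalous.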
