Trustworthy Smart Grid Infrastructures Threats, Challenges, and Countermeasures Saman Zonouz University of Miami
Cyber Security and Forensics Research Group • www.4-n-6.org • Born in Jan 2012 • Research • Real-time Smart-Grid Situational Awareness and Intrusion Tolerance • Cyber-Physical Power Grid Malware Analysis and Intrusion Detection • Host-based Intrusion Diagnosis and Root-Cause Forensics Analysis • Collaborators • UIUC, PSU, WSU, Purdue, FIU, EPFL, Google, AT&T Research, IBM Research, Qualcomm Research • Sponsors ~$3M • NSF, ONR, ARPA-E, DOE, Fortinet, UM
Outline • Smart Grid Infrastructures • Potential Threats • Countermeasures • Conclusions
The Electric Grid Structure* [Figure labels: inter-connected regional transmission network operators; edge (distribution) networks; power sources; transmission network (backbone); distribution network (edge); distributed generation (DG); power consumers] *Jim Kurose, Networking Challenges for the Smart Grid, IIT Mumbai, 2013.
Traditional Power Grid: Generation, Transmission, and Distribution* *PowerWorld Simulator.
Background: Smart-Grid • Definition • Efficiently, reliably, flexibly and sustainably monitor and control the generation, distribution and use of electricity • Functional characteristics [DOE] • Self-healing from power disturbance events • Enabling active participation by consumers in demand response • Operating resiliently against physical and cyber attacks • Enabling new products, services, and markets • Optimizing assets and operating efficiently • Intelligent distributed measurement and control in smart grid introduces new security attack surfaces for complex cyber-physical intrusions! • More accessibility from remote sites • More remote system observability and controllability • Harder to detect, analyze, and respond to attacks!
Power Control Network* *K. Stouffer, J. Falco, K. Scarfone, Guide to Industrial Control Systems Security, NIST.
Power System Monitoring and Control • Observation/sensing • Current transformers, voltage transformers, PMUs, etc. • Measurement noise and incomplete sensor coverage → state estimation • Inputs: system topology, generation output, load • Output: state vector (voltage phasors) • Control (e.g., by operators/HMI servers) • Relays, generation set points, PLC controllers
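As a minimal sketch of the estimation step listed above, the following assumes a linearized (DC) model of a hypothetical 3-bus system with unit line reactances, bus 1 as the angle reference, and equally weighted measurements. The matrix `H`, the measurement set, the noise values, and the helper names `least_squares_2` and `predict` are illustrative assumptions and do not come from the slides.

```python
# Hypothetical 3-bus DC model (assumption): the states are the voltage angles
# (rad) at buses 2 and 3; bus 1 is the reference with angle 0.
H = [
    [-1.0, 0.0],   # line flow P12 = theta1 - theta2
    [0.0, -1.0],   # line flow P13 = theta1 - theta3
    [1.0, -1.0],   # line flow P23 = theta2 - theta3
    [2.0, -1.0],   # injection at bus 2 = P21 + P23
]


def least_squares_2(H, z):
    """Minimize ||z - H x||^2 for a two-state x via the 2x2 normal equations."""
    g11 = sum(r[0] * r[0] for r in H)
    g12 = sum(r[0] * r[1] for r in H)
    g22 = sum(r[1] * r[1] for r in H)
    b1 = sum(r[0] * zi for r, zi in zip(H, z))
    b2 = sum(r[1] * zi for r, zi in zip(H, z))
    det = g11 * g22 - g12 * g12
    if abs(det) < 1e-12:
        raise ValueError("system is not observable with these measurements")
    return [(g22 * b1 - g12 * b2) / det, (g11 * b2 - g12 * b1) / det]


def predict(H, x):
    """Measurements implied by state x under the linear model z = H x."""
    return [r[0] * x[0] + r[1] * x[1] for r in H]


true_state = [-0.02, -0.05]                # assumed true angles (rad)
noise = [0.001, -0.001, 0.0005, 0.0]       # assumed small sensor noise
z = [m + e for m, e in zip(predict(H, true_state), noise)]
estimate = least_squares_2(H, z)           # output: estimated state vector
```

Four measurements for two unknowns give the redundancy that lets the estimator average out noise; the same redundancy is what residual-based bad data detection relies on.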
Cyber-Physical System Security • Integrated cyber and physical components • Potentially more catastrophic security incidents • Security objectives: AIC (availability, integrity, confidentiality) • Particularly in safety-critical infrastructures • Control network security solutions • Intrusion detection sensors • Access control policy enforcement • Incident auditing, logging, and analysis [Figure: power control network; captions: “Cyber attack could blow up a generator”; “Targeting nuclear plants” [Symantec]]
Cyber-Physical Threat Attack Surfaces [Figure labels: Measurements; Control Center; Power Applications; Actuators/Apps/Operators]
Lack of security auditing solutions (traceability): Log-Free Instantaneous Post-Attack Forensics Analysis
Problem Formulation • Goal: once an attack occurs, determine how it happened • Detection point: a malicious file modification • e.g., password file change by Firefox • Existing solution • Heavy auditing and logging of individual incidents • Problem: large trusted zone required for log storage, parsing and analysis • High runtime/logging performance overhead • Storage requirements • Slow log parsing and analysis (not real-time) • Challenge • A system/TCB that provides the root-cause analysis capability by design?
Solution • Design/modify access control policies to facilitate post-intrusion root-cause analysis • Log-free and instantaneous root-cause analysis • Case study: automated modification of the SE-Linux modular targeted policy base • Access control policies, e.g.: allow A read socket; allow D write sensitive_file; [Figure: original SE-Linux policy, in which domain D is reachable from A through either B or C, versus the rewritten access policy, in which domain D is duplicated into D_B and D_C. Detection point: sensitive file modification by a process in domain D. Question: did the transition follow A→B→D or A→C→D?]
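The domain-duplication idea on this slide can be sketched as a small graph rewrite. This is a hypothetical illustration only: domains and transitions are modeled as a plain edge list rather than real SE-Linux policy syntax, and the function names `duplicate_domain` and `root_cause` are assumptions, not part of any SE-Linux tooling.

```python
# Hypothetical illustration: domain transitions as (source, destination) pairs.
# This is NOT SE-Linux policy syntax; it only mirrors the A/B/C/D example.

def duplicate_domain(transitions, target):
    """Split `target` into one copy per predecessor domain.

    Assumes `target` has no outgoing transitions; otherwise those edges
    would have to be copied for every duplicate as well.
    """
    return [
        (src, f"{target}_{src}") if dst == target else (src, dst)
        for src, dst in transitions
    ]


def root_cause(transitions, detected_domain, entry_domain):
    """Trace back from the detection point; fail if any step is ambiguous."""
    predecessors = {}
    for src, dst in transitions:
        predecessors.setdefault(dst, []).append(src)
    path = [detected_domain]
    while path[-1] != entry_domain:
        candidates = predecessors.get(path[-1], [])
        if len(candidates) != 1:
            raise ValueError(f"cannot resolve predecessor of {path[-1]}")
        path.append(candidates[0])
    return path[::-1]


original = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
rewritten = duplicate_domain(original, "D")
```

With the original edges, a detection in `D` cannot tell the two paths apart, so `root_cause(original, "D", "A")` raises. After the rewrite, the name of the detecting domain (`D_B` or `D_C`) identifies the path without consulting any log.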
False sensor measurement data injection: Cyber-Physical Security State Estimation in Power Grid
False Data Injection to Mislead the State Estimation Server • Attack design: specifically chosen “interacting” measurements that satisfy the power flow equations • All states at non-malicious buses are preserved! • Q: how to identify corrupted sensor data? [Figure: one-line diagram contrasting the reported values with the reality; buses are annotated with |V| (pu) and θ (deg), loads with P load (MW) and Q load (MVAr)]
Current Bad Data Detection Solutions: Residual-Based Approaches • Need to account for possibility of bad data • Bad data definition*: “measurements that are grossly in error” • Bad data can potentially result in incorrect power-state estimates • Measurement residuals: the typical bad data detection test for state estimation • if ||z − Hx|| ≤ τ, conclude that there are no bad measurements (z: measurement vector, H: measurement matrix, x: estimated state, τ: detection threshold) • Coordinated attacks modify “interacting bad-measurements” that satisfy the power flow solution equations • difficult or impossible to detect using conventional means • * A. Monticelli, State estimation in electric power systems: a generalized approach. Kluwer Academic Publishers, 1999.
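To make the residual test and its blind spot concrete, here is a minimal sketch under the same kind of assumption as a linearized (DC) model: a hypothetical 3-bus system, unit line reactances, equal weights. The threshold `tau`, the noise values, and the attack vector `c` are made up for illustration. In the linear model, an injection of the form a = Hc shifts the estimate by c and leaves the residual unchanged, which is one simple way to construct "interacting" measurements.

```python
from math import sqrt

# Hypothetical 3-bus DC model (assumption): states are angles at buses 2 and 3.
H = [
    [-1.0, 0.0],   # line flow P12
    [0.0, -1.0],   # line flow P13
    [1.0, -1.0],   # line flow P23
    [2.0, -1.0],   # injection at bus 2
]


def least_squares_2(H, z):
    """Minimize ||z - H x||^2 for a two-state x via the 2x2 normal equations."""
    g11 = sum(r[0] * r[0] for r in H)
    g12 = sum(r[0] * r[1] for r in H)
    g22 = sum(r[1] * r[1] for r in H)
    b1 = sum(r[0] * zi for r, zi in zip(H, z))
    b2 = sum(r[1] * zi for r, zi in zip(H, z))
    det = g11 * g22 - g12 * g12
    return [(g22 * b1 - g12 * b2) / det, (g11 * b2 - g12 * b1) / det]


def predict(H, x):
    """Measurements implied by state x under z = H x."""
    return [r[0] * x[0] + r[1] * x[1] for r in H]


def residual_norm(H, z):
    """||z - H x_hat||, the quantity compared against the threshold tau."""
    fitted = predict(H, least_squares_2(H, z))
    return sqrt(sum((zi - fi) ** 2 for zi, fi in zip(z, fitted)))


tau = 0.05                                   # assumed detection threshold
true_state = [-0.02, -0.05]
noise = [0.001, -0.001, 0.0005, 0.0]
clean = [m + e for m, e in zip(predict(H, true_state), noise)]

# A single gross error on one measurement: caught by the residual test.
gross = list(clean)
gross[0] += 0.5

# Coordinated injection a = H c: changes only the bus-2 angle estimate.
c = [0.03, 0.0]
stealthy = [zi + ai for zi, ai in zip(clean, predict(H, c))]
```

Here `residual_norm(H, gross)` exceeds `tau`, while `residual_norm(H, stealthy)` equals the clean residual even though the bus-2 angle estimate has moved by 0.03 rad and the bus-3 estimate is preserved.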
Residual-based Approaches: Insufficient against Security Compromises • Multiple interacting bad-measurements* • Case 1 • Case 2 (*) A. Monticelli, F. F. Wu, M. Yen, “Multiple bad data identification for state estimation by combinatorial optimization,” IEEE Transactions on Power Delivery, 1986.
Cyber-Physical State Estimation* • Co-utilize information from cyber and power network • to (more precisely) determine the state of the cyber-physical system • Use combined security state to provide a scalable approach to detecting bad data caused by a cyber event *S. A. Zonouz, K. M. Rogers, R. Berthier, R. B. Bobba, W. H. Sanders, T. J. Overbye, “CPSE: Security-Oriented Cyber-Physical State Estimation for Power-Grid Critical Infrastructures,” in review for IEEE Transactions on Smart Grid.
Terminology • System state notion • privilege domains • What the attacker can do (proactive response) • past consequences • What the attacker has done (needs recovery) • Example: state s_i = (attacker’s privileges: Root(A), User(B); past consequences: Opened(R1))
System Model Generation • Competitive Markov decision process (CMDP) • Stochastic attack graph/finite state machine • Network Access Policy • e.g., firewall rules • Network connectivity matrix and CMDP • automatically generated [Figure: example network with a remote host (Rmt.), hosts A and B, and relay R1, shown next to its network connectivity matrix over Rmt., A, B, and R1]
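The generation step can be sketched as follows. This is a simplified illustration, not the authors' generator: the firewall rules are hypothetical (the figure's actual connectivity is not recoverable from the transcript), a state is reduced to the set of hosts the attacker controls, and the transition probabilities of the CMDP are omitted. The names `connectivity_matrix` and `attack_graph` are assumptions.

```python
def connectivity_matrix(hosts, allow_rules):
    """matrix[i][j] == 1 when traffic from hosts[i] to hosts[j] is allowed."""
    index = {h: i for i, h in enumerate(hosts)}
    matrix = [[0] * len(hosts) for _ in hosts]
    for src, dst in allow_rules:
        matrix[index[src]][index[dst]] = 1
    return matrix


def attack_graph(hosts, matrix, start):
    """Enumerate reachable attacker states (sets of controlled hosts).

    Each transition compromises one new host that is reachable from a host
    the attacker already controls. Probabilities are left out of this sketch.
    """
    edges = {}
    frontier = [frozenset([start])]
    while frontier:
        state = frontier.pop()
        if state in edges:
            continue
        successors = set()
        for i, src in enumerate(hosts):
            if src not in state:
                continue
            for j, dst in enumerate(hosts):
                if matrix[i][j] and dst not in state:
                    successors.add(state | {dst})
        edges[state] = successors
        frontier.extend(successors)
    return edges


hosts = ["Rmt", "A", "B", "R1"]
# Hypothetical firewall rules: remote host reaches A and B; only B reaches R1.
rules = [("Rmt", "A"), ("Rmt", "B"), ("B", "R1")]
matrix = connectivity_matrix(hosts, rules)
graph = attack_graph(hosts, matrix, start="Rmt")
```

Because every transition must follow an allowed connection, strict access policies keep the generated state space small; with these three rules the graph has six states instead of the eight subsets of {A, B, R1}.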
IDSes: CMDP sensors [Figure: attack graph for hosts A, B and relay R1; each state pairs attacker privileges with past consequences (Ø/Ø, A/Ø, B/Ø, A,B/Ø, B/R1.O, A,B/R1.O), and transitions are labeled with the IDS sensors IDS-1 to IDS-4 that observe them]
Algorithm Step 1: Potentially-Bad Data Identification • IDS sensor reports (CMDP) • attacker’s current privileges (probabilistically) • Cyber-physical interconnection • the measurements that might have been modified by the adversary • Example • CP interconnection • i-th measurement (by PMUi): real power of the bus B2 • IDS alerts • PMUi is compromised → the i-th measurement might have been corrupted!
Algorithm Step 2: Power State Estimation & Verification • Throw the potentially-bad data away, and run a power state estimation using the remaining power measurements • Compute the estimated values of the discarded measurements, and identify the corrupted measurements • based on how much they differ from their estimates
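The two algorithm steps can be sketched together. This is a minimal illustration under stated assumptions, not the CPSE implementation: a hypothetical 3-bus linearized (DC) model, a made-up device-to-measurement map `DEVICE_MEASUREMENTS`, a deterministic set of compromised devices in place of the probabilistic privilege estimate, and an arbitrary deviation threshold.

```python
# Hypothetical 3-bus DC model (assumption): states are angles at buses 2 and 3.
H = [
    [-1.0, 0.0],   # index 0: line flow P12
    [0.0, -1.0],   # index 1: line flow P13
    [1.0, -1.0],   # index 2: line flow P23
    [2.0, -1.0],   # index 3: injection at bus 2
]

# Hypothetical cyber-physical interconnection: measurements reported per device.
DEVICE_MEASUREMENTS = {"PMU_a": [0], "PMU_b": [1], "PMU_c": [2, 3]}


def least_squares_2(H, z):
    """Minimize ||z - H x||^2 for a two-state x via the 2x2 normal equations."""
    g11 = sum(r[0] * r[0] for r in H)
    g12 = sum(r[0] * r[1] for r in H)
    g22 = sum(r[1] * r[1] for r in H)
    b1 = sum(r[0] * zi for r, zi in zip(H, z))
    b2 = sum(r[1] * zi for r, zi in zip(H, z))
    det = g11 * g22 - g12 * g12
    if abs(det) < 1e-12:
        raise ValueError("remaining measurements do not make the system observable")
    return [(g22 * b1 - g12 * b2) / det, (g11 * b2 - g12 * b1) / det]


def potentially_bad(compromised_devices):
    """Step 1: map IDS-reported compromised devices to measurement indices."""
    return sorted({i for dev in compromised_devices for i in DEVICE_MEASUREMENTS[dev]})


def estimate_and_verify(H, z, suspects, threshold):
    """Step 2: estimate without the suspects, then check each suspect."""
    keep = [i for i in range(len(z)) if i not in suspects]
    x = least_squares_2([H[i] for i in keep], [z[i] for i in keep])
    corrupted = [
        i for i in suspects
        if abs(z[i] - (H[i][0] * x[0] + H[i][1] * x[1])) > threshold
    ]
    return x, corrupted


true_state = [-0.02, -0.05]
z = [r[0] * true_state[0] + r[1] * true_state[1] for r in H]
z[3] += 0.2                                   # attacker alters one reading

suspects = potentially_bad(["PMU_c"])         # IDS alert: PMU_c compromised
state, corrupted = estimate_and_verify(H, z, suspects, threshold=0.05)
```

Both measurements from the compromised device are set aside, but only the one that disagrees with the estimate built from trusted data is reported as corrupted, so the untouched reading from the same device is not blamed.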
Cyber-Physical State Estimation Benefits • Improved bad-data detection • Accuracy and scalability • Improved state estimates
Dynamic topology reconfigurations: Security-Oriented Cyber-Physical Contingency Analysis
Contingency Analysis • Answers the question: “What happens if component X goes out of service?” • Runs every 2-5 min to determine potential problems • Runs on the current state estimate • The list of contingencies must be picked carefully • The “N-1” criterion is used to operate the system • no violations when any one element is taken offline • Future requirements are strengthening the security criteria • (“N-1-1”): many contingencies need to be solved* • Example: for 1000 lines • N-1 means solving 1000 single-line outages • N-2 means solving 499,500 double-line outages (1000 choose 2) *Charles Davis, Thomas Overbye: Linear Analysis of Multiple Outage Interaction. HICSS 2009: 1-8
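The counts in the example follow directly from the binomial coefficient. A small sketch, with hypothetical helper names `contingency_count` and `enumerate_contingencies`:

```python
from itertools import combinations
from math import comb


def contingency_count(num_elements, outages):
    """Number of distinct simultaneous-outage cases for N-k analysis."""
    return comb(num_elements, outages)


def enumerate_contingencies(elements, outages):
    """List every outage combination explicitly (only viable for small inputs)."""
    return list(combinations(elements, outages))


n_minus_1 = contingency_count(1000, 1)   # 1000
n_minus_2 = contingency_count(1000, 2)   # 499500
small_case = enumerate_contingencies(["L1", "L2", "L3", "L4"], 2)
```

The jump from 1000 to 499,500 cases for the same network is why the contingency list has to be chosen carefully when each case must be solved within a cycle of a few minutes.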
Power System Contingency Analysis [Figure: contingency analysis tool screenshot showing the list of contingencies, the violation summary, what happens during a contingency, and the violations caused by the contingency]
Security-Oriented Contingency Analysis • “What happens when X goes out of service?” • X could be either a critical power or cyber asset • Unlike traditional scenarios, the root cause is a cyber intrusion • Solution • To analyze the cyber network topology model (CMDP) • To measure how close the current security state is to serious power component contingencies • Benefit • Search space is significantly smaller • (limited by the cyber topology and strict access control policies)
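One simple way to express "how close the current security state is to a contingency" is the minimum number of attack steps needed to reach a state where a power component has been affected. The sketch below uses breadth-first search over a hand-written, hypothetical attack graph whose states are (privileges, consequences) pairs; the slides do not specify the actual distance measure, so the step count and the function name `steps_to_contingency` are assumptions.

```python
from collections import deque


def steps_to_contingency(graph, start, is_contingency):
    """Fewest attack steps from `start` to any contingency state, or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, dist = queue.popleft()
        if is_contingency(state):
            return dist
        for nxt in graph.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Hypothetical attack graph: (attacker privileges, past consequences).
graph = {
    ("", ""): [("A", ""), ("B", "")],
    ("A", ""): [("A,B", "")],
    ("B", ""): [("A,B", ""), ("B", "R1.O")],
    ("A,B", ""): [("A,B", "R1.O")],
    ("B", "R1.O"): [("A,B", "R1.O")],
    ("A,B", "R1.O"): [],
}


def relay_opened(state):
    """Contingency of interest in this sketch: relay R1 has been opened."""
    return "R1.O" in state[1]


from_clean = steps_to_contingency(graph, ("", ""), relay_opened)
from_b = steps_to_contingency(graph, ("B", ""), relay_opened)
```

Only contingencies that are reachable in the cyber model need to be handed to the power-side analysis, which is where the reduction in search space comes from.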
Slow response mechanisms against attacks: Game-Theoretic Intrusion Response and Recovery
Problem Formulation • Given • Power grid model (CMDP) • Contingency analysis results • Goal • Trustworthy (semi-)automated intrusion tolerance • Recover from past damages • Take proactive response actions (to avoid further damages)
Trusted Intrusion Tolerance Engine [Figure: intrusion response system that receives alerts and monitoring info from Monitor 1 … Monitor n and issues commands to Action 1 … Action m]
Good vs. Bad • The (RRE vs. attacker) battle, where RRE is the Response and Recovery Engine • modeled as a two-player game • Stackelberg game • Sequential game scheme • Leader and follower • s → (leader takes action) → s’ → (follower takes action) → s’’ • Leader’s goal • minimize the maximum possible damage by the follower! • RRE takes the action that minimizes the maximum damage the attacker can cause later
State-based Modeling • Based on Competitive Markov Decision Processes • Framework for decision making • CMDP (S, A, R, Pr, γ) • S: state space • A: action space • R(s,a): reward function • Pr(s,a,s’): transition PDF • γ: discounting factor • two (conflicting) players • RRE and the attacker [Figure: state-transition diagram over states s_i, s_j, s_k, s_m, s_n, with edges marked as responsive actions or adversarial actions]
Optimal Response Strategy Selection Blue states: secure. Red states: insecure.
Optimal Response Strategy • RRE solves the CMDP • Bellman’s equation • Stackelberg game • Value iteration
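A minimal sketch of value iteration for a sequential two-player game, assuming a turn-based simplification in which each state belongs to exactly one player: the RRE maximizes the discounted reward and the attacker minimizes it. The states, actions, probabilities, and rewards below are invented for illustration, and `solve_turn_based_game` is a hypothetical name, not the RRE's actual solver.

```python
def solve_turn_based_game(owner, transitions, gamma=0.9, tol=1e-9, max_sweeps=1000):
    """Value iteration with a max (RRE) or min (attacker) Bellman backup.

    owner[s] is "rre", "attacker", or None for terminal states.
    transitions[s][action] is a list of (probability, next_state, reward),
    with rewards expressed from the RRE's point of view.
    """
    values = {s: 0.0 for s in owner}
    policy = {}
    for _ in range(max_sweeps):
        delta = 0.0
        for s in owner:
            actions = transitions.get(s, {})
            if not actions:
                continue                      # terminal state keeps value 0
            q = {
                a: sum(p * (r + gamma * values[n]) for p, n, r in outcomes)
                for a, outcomes in actions.items()
            }
            pick = max if owner[s] == "rre" else min
            best = pick(q, key=q.get)
            delta = max(delta, abs(q[best] - values[s]))
            values[s] = q[best]
            policy[s] = best
        if delta < tol:
            break
    return values, policy


# Hypothetical example: host A is compromised and the RRE moves first.
owner = {
    "compromised_A": "rre",
    "attacker_turn": "attacker",
    "secure": None,
    "relay_open": None,
}
transitions = {
    "compromised_A": {
        # Patching costs 1 and fails to evict the attacker 20% of the time.
        "patch_A": [(0.8, "secure", -1.0), (0.2, "attacker_turn", -1.0)],
        "do_nothing": [(1.0, "attacker_turn", 0.0)],
    },
    "attacker_turn": {
        "open_relay": [(1.0, "relay_open", -10.0)],
        "idle": [(1.0, "secure", 0.0)],
    },
}
values, policy = solve_turn_based_game(owner, transitions)
```

The RRE evaluates each response against the attacker's most damaging reply: doing nothing is worth 0.9 × (−10) = −9, while patching is worth −0.8 + 0.2 × (−1 − 9) = −2.8, so the leader patches even though the patch can fail.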
Acknowledgements We appreciate • Our collaborators • Robin Berthier, Stephen McLaughlin, Devin Pohly, Himanshu Khurana, Tim Yardley, William Sanders, Rakesh Bobba, Matt Davis, Kate Davis, Kaustubh Joshi, Hari Ramasamy, Anurag Srivastava. • Our sponsors • ONR, NSF, DOE, DHS, IBM, ARPA-E, UM, Fortinet.
Conclusions • Cyber-Physical Security State Estimation • Security-Oriented CP Contingency Analysis • Trusted Intrusion Tolerance Thanks for your Attention! Questions? s.zonouz@miami.edu