300 likes | 501 Views
Survivability Analysis of Networked Systems. Jeannette M. Wing with Somesh Jha (Wisconsin) and Oleg Sheyner DARPA co-PI: Tom Longstaff (SEI). Computer Science Department Carnegie Mellon University Pittsburgh, PA DARPA OASIS PI Meeting, Norfolk, VA 14 February 2001. Survivability. What if
E N D
Survivability Analysis of Networked Systems Jeannette M. Wing with Somesh Jha (Wisconsin) and Oleg Sheyner DARPA co-PI: Tom Longstaff (SEI) Computer Science Department Carnegie Mellon University Pittsburgh, PA DARPA OASIS PI Meeting, Norfolk, VA 14 February 2001
Survivability • What if • a terrorist hacker brings down the nation’s power grid? • an act of Mother Nature causes the US banking network to fail? • Critical infrastructures • Utilities: gas, electricity, nuclear, water, … • Communications: telephone, networks, … • Transportation: airlines, railways, highways, … • Medical: emergency services, hospitals, … • Financial: banking, trading, … Jeannette M. Wing
Survivability • A system is survivable if it can continue to provide end services despite the presence of faults. • Faults • Accidental or malicious • Not necessarily independent • Finer-grained reliability analysis is required. • Service-oriented • Exploit semantics of application • Not all network nodes and links are treated equally. Jeannette M. Wing
Foundational Questions • What is the difference between models for survivability and those for • Fault-tolerant distributed systems? • Secure systems? • Our starting point: • Independence assumption goes out the window. • Cost must be included in the equation. Jeannette M. Wing
Determining Survivability Strategies Improved Requirements/ Architecture Survivable Network Analysis System Requirements/ Architecture Essential Services Intrusion Effects Mitigation Strategies SEI CERT/CC Intrusion Knowledge Jeannette M. Wing
Two Parts in Cooperation • The Survivable Network Analysis Method (SEI) • Measures existing systems for survivability • Focuses on user and intruder models • Applying Formal Methods to SNA • Applies model checking and other techniques to survivability • Allows systems that are formally specified to be submitted to survivability analysis Jeannette M. Wing
FRB 1 Bank A Bank B Bank C Simple Example: A Banking System FRB 2 FRB 3 MC 2 MC 3 MC 1 b1 b2 c1 a1 a2 Jeannette M. Wing
Network Model Survivability Property Phase 1 Checker Scenario Graph Reliability Query,Cost Query, etc. Analyzer Phase 2 Scenario Set Overview of Our Formal Method Jeannette M. Wing
Phase 1 Network Model = Survivability Property = A set of concurrently executing Finite State Machines. A predicate in CTL. Model Checker = (modified) NuSMV Scenario Graph = A set of related examples. Jeannette M. Wing
Network Model • Processes • Nodes and links are processes (i.e., FSMs) • banks, money centers, federal reserve banks, and links • Communication via shared variables (i.e., finite queues) • representing channels, and hence interconnections. • Failures • Faults represented by special state variable • fault:{normal, failed, intruded} • Links and banks can fail at any time • Failed link blocks all traffic. • Failed bank routes all checks to an arbitrarily chosen money center. • Money centers and federal reserve banks do not fail. Jeannette M. Wing
Survivability Properties • Fault-related • Money never deposited into wrong account. • AG(error) • Service-related • A check issued eventually clears. • AG(checkIssued AF(checkCleared)) Jeannette M. Wing
Each path is a scenario of how a fault can occur. Output: Fault Scenario Graph • Intuition: • Each “counterexample” spit out by the model checker is a scenario. • Survivability property gives a slice of the model. Jeannette M. Wing
Survivability Properties • Fault-related • Money never deposited into wrong account. • AG(error) • Service-related • A check issued eventually clears. • AG(checkIssued AF(checkCleared)) Jeannette M. Wing
up(a2) down(a2) & up(a1) up(c1) A Service Success Scenario Graph issueCheck(A, C) send(A, MC-2) send(A, MC-1) send(MC-2, FRB-1) send(MC-1, FRB-2) send(FRB-1, FRB-3) send(FRB-2, FRB-3) send(FRB-3, MC-3) send(MC-3, C) debitAccount Jeannette M. Wing
A Service Fail Scenario Graph issueCheck(A, C) down(A) up(A) pick(MC-2) pick(MC-1) down(c1) FAIL up(a2) down(a2) down(a1) up(a1) send(A, MC-2) send(A, MC-1) down(c1) FAIL down(c1) FAIL FAIL Jeannette M. Wing
Annotations(e.g., probabilities, cost) Overview of Method Network Model Survivability Property Phase 1 Checker Scenario Graph Reliability Query,Cost Query, etc. Analyzer Phase 2 Scenario Set Jeannette M. Wing
Phase 2: Reliability Analysis (in a Nutshell) • Annotations = Probabilities • Use Bayesian Networks to model dependence of events. • Symbolic • Use symbolic probabilities • high, medium, low • Use NDFA theory to compute scenario set. • Continuous • Use numeric probabilities • [0.0, 1.0] • Use Markov Decision Processes to model both nondeterministic and probabilistic transitions. Jeannette M. Wing
{medium, low} high Phase 2a: Symbolic Analysis Annotated Scenario Graph = Reliability Query = Bayesian Network + Scenario Graph Regular Expression (DFA) Composer = ASG + DFA Scenario Set = High-risk scenarios Jeannette M. Wing
M M M M M M L H Annotated Scenario Graph issueCheck(A, C) down(A) up(A) pick(MC-2) pick(MC-1) down(a1) up(a1) down(a2) down(a2) FAIL Jeannette M. Wing
Phase 2b: Continuous Analysis • Use realvalues forprobabilities. • May leave probabilities of some events unspecified. • Markov Decision Processes • Mix of nondeterministic and probabilistic transitions • Why? System is not closed. • Hard to assign probabilities to some faults (e.g., intrusions). • Environment makes choice (i.e., decision) and can be demonic! Jeannette M. Wing
Reliability Analysis Goal of (malicious) environment: Devise an optimal policy to minimize reliability. • Assign to each state, s, a value, V(s), computed using a standard policy iteration algorithm from MDP literature. • Let V* be the value function after convergence. Then, for initial state of scenario graph, s0, V*(s0) computes worst-case probability of service eventually finishing. Jeannette M. Wing
0.7 0.4 0.65 0.6 0.3 Good Bad A Typical Example 0.6 0.7 0.6 V(Bad) = 0.0 V(Good) = 1.0 Jeannette M. Wing
1/4 3/8 up(a2) down(a2) & up(a1) 1/2 up(c1) A Service Success Scenario Graph issueCheck(A, C) send(A, MC-2) send(A, MC-1) send(MC-2, FRB-1) send(MC-1, FRB-2) send(FRB-1, FRB-3) send(FRB-2, FRB-3) send(FRB-3, MC-3) The worst case probability that a check issued by Bank A on Bank C is (1/2 * 3/8) + (1/2 * 1/4) = 5/16 send(MC-3, C) debitAccount Jeannette M. Wing
Phase 2c: Latency and Cost Analysis • Latency Analysis • Associate with each edge in scenario graph an immediate cost (e.g., time it takes to execute event). • Q: What is the worst case latency scenario? • Cost Analysis • Identify new actions that correspond to decisions an architect needs to make. • Associate a cost with each action. • Define constraints on costs. • Q: Which set of links can I afford to upgrade to achieve higher reliability, given my cost constraints? Jeannette M. Wing
Constrained Markov Decision Processes <S, A, P, c, d> • S is a finite state space. • A is a finite set of actions. • P are transition probabilities. Psas’ is the probability of moving from state s to s’ if action a is chosen. • c: (S x A) is the immediate cost. c(s, a) is the cost of choosing action a at state s. • d:(S x A) is a k-dimensional vector of immediate costs, captures additional cost constraints. Jeannette M. Wing
Progress To Date: Tools • Trishul tool • Uses NuSMV model checker, done by Somesh Jha. • History variable explodes state space, leading to… • …New tool • Uses SPIN, ongoing by Oleg Sheyner. • No need for history variable. Jeannette M. Wing
Progress To Date: Case Studies • Trading floor model of major investment bank (being “sanitized”) • 10K lines of NuSMV • half-million nodes in scenario graph • 50 threat scenarios • 45 found by system • 5 new threat scenarios found • With independence assumption, too many misses. • B2B e-commerce NYC start-up (Jha) • 50K lines of Statecharts • 2 million NuSMV beyond capability of tool • Lincoln Labs example (Sheyner) • TBD Jeannette M. Wing
Next Steps • Show applicability of the CMDP model for other critical infrastructure examples. • Via Lincoln Labs connection • Combine with other tools to further automate the analysis. • Linear programming package, theorem provers, … • Integrate with informal SEI Survivability Network Analysis Method • Via case studies Jeannette M. Wing
References • Applying Formal Methods to SNA • Jha and Wing, “Survivability Analysis of Networked Systems,” to appear, Proceedings of the International Conference of Software Engineering, 2001. • Survivable Networking Analysis • Ellison, Linger, Longstaff, and Mead, “Survivable Network System Analysis: A Case Study,” IEEE Software, July/August 1999. • The Vigilant Healthcare System • Ellison, Fisher, Linger, Lipson, Longstaff, and Mead, “Survivability: Protection Your Critical Systems, IEEE Internet Computing, November/December 1999. • Web site: IEEE article and other reports www.sei.cmu.edu/organization/programs/nss/surv-net-tech.html Jeannette M. Wing
Other OASIS Connection • Recovery service for PASIS (Greg Ganger, Pradeep Khosla, Carnegie Mellon) • Anticipate intrusion • Proactive secret-sharing • Upon intrusion detection • Reactive secret-sharing Jeannette M. Wing