The Rare Glitch Project: Scenario Graph Generation and MDP-Based Analysis. Computer Science Department Carnegie Mellon University Pittsburgh, PA. Jeannette M. Wing.
The Rare Glitch Project: Scenario Graph Generation and MDP-Based Analysis
Jeannette M. Wing
Computer Science Department, Carnegie Mellon University, Pittsburgh, PA
The work on survivability analysis is funded by DARPA. It is done jointly with Somesh Jha (University of Wisconsin) and Oleg Sheyner (CMU graduate student).
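Overview of Our Method
[Diagram: Phase 1: a (network) model and a (survivability) property are fed to the Checker, which produces a scenario graph. Phase 2: the scenario graph and a reliability query, cost query, etc. are fed to the Analyzer, which produces a scenario set.]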
Relation to ARO Proposal
• Hypothesis: Our two-phase method and tool suite are applicable to analyzing embedded systems.
• Rationale
  • Network model is simpler (bus-based, not end-to-end).
  • Reliability and cost are important factors in embedded systems design.
• Our plan
  • Enrich and make current tool suite robust.
  • Apply method to embedded systems examples.
  • Pursue foundational issues with respect to models for reliability and cost analyses.
Survivable Systems
• What if
  • a terrorist hacker brings down the nation’s power grid?
  • an act of Mother Nature causes the international financial network to fail?
• Critical infrastructures
  • Utilities: gas, electricity, nuclear, water, …
  • Communications: telephone, networks, …
  • Transportation: airlines, railways, highways, …
  • Medical: emergency services, hospitals, …
  • Financial: banking, trading, …
Survivability
• A system is survivable if it can continue to provide end services despite the presence of faults.
Modeling for Survivability Analysis
• Our starting point
  • Handle both benign and malicious faults.
  • Throw out independence assumption.
  • Incorporate semantics of end service in model.
  • Do not necessarily treat nodes and links the same.
  • Include cost in the model from the start.
• Steps in our approach
  • Model general network topology (nodes and links).
  • Analyze in two phases for
    • Functional behavior
    • Reliability, cost, etc.
Phase 1
• Network Model = a set of concurrently executing finite state machines.
• Survivability Property = a predicate in CTL.
• Model Checker = (modified) NuSMV.
• Scenario Graph = a set of related examples.
Simple Example: A Banking System
[Diagram: Banks A, B, and C; money centers MC 1, MC 2, and MC 3; federal reserve banks FRB 1, FRB 2, and FRB 3; links a1, a2, b1, b2, and c1.]
Network Model
• Processes
  • Nodes and links are processes (i.e., FSMs)
    • banks, money centers, federal reserve banks, and links
  • Communication via shared variables (i.e., finite queues)
    • representing channels, and hence interconnections.
• Failures
  • Faults represented by special state variable
    • fault: {normal, failed, intruded}
  • Links and banks can fail at any time
    • Failed link blocks all traffic.
    • Failed bank routes all checks to an arbitrarily chosen money center.
  • Money centers and federal reserve banks do not fail.
Survivability Properties
• Fault-related
  • Money never deposited into wrong account.
  • AG(¬error)
• Service-related
  • A check issued eventually clears.
  • AG(checkIssued → AF(checkCleared))
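A minimal sketch of what checking the fault property AG(¬error) amounts to on an explicit state graph: the property holds exactly when no reachable state is an error state, and a path to an error state is a counterexample. The toy transition relation, its state names, and the `check_ag_not` helper are hypothetical; NuSMV works symbolically rather than by explicit search.

```python
from collections import deque

def check_ag_not(initial, successors, is_bad):
    """Return None if AG(not bad) holds, else a shortest trace to a bad state."""
    parent = {initial: None}          # also serves as the visited set
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if is_bad(state):
            # Walk parent pointers back to the initial state.
            trace = []
            while state is not None:
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for nxt in successors.get(state, ()):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None

# Hypothetical toy model of one check moving through the network.
TOY = {
    "issued": ["at_MC1", "at_MC2"],
    "at_MC1": ["cleared"],
    "at_MC2": ["cleared", "wrong_account"],
    "cleared": [],
    "wrong_account": [],
}

counterexample = check_ag_not("issued", TOY, lambda s: s == "wrong_account")
```

Here `counterexample` is a single scenario; a scenario graph collects all such traces rather than one.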
Scenario Graphs
• Given a state machine, M, and a property, P, a scenario graph is a concise representation of the set of traces of M with respect to P.
• P = fault property
  • A fault scenario graph represents all system traces that end in a state that does not satisfy P.
• P = service property
  • A service success (fail) scenario graph represents all system traces in which an issued service successfully finishes (fails to finish).
Fault Scenario Graph
• Each path is a scenario of how a fault can occur.
• Intuition:
  • Each “counterexample” spit out by the model checker is a scenario.
  • Survivability property gives a slice of the model.
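The "slice of the model" intuition can be sketched on an explicit graph: keep only the states that are reachable from the initial state and from which a property-violating state can still be reached. This is an illustrative simplification, not the modified NuSMV algorithm; the helper names and the toy model are hypothetical.

```python
def reachable(starts, edges):
    """All states reachable from `starts` by following `edges`."""
    seen = set(starts)
    stack = list(starts)
    while stack:
        state = stack.pop()
        for nxt in edges.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def fault_scenario_graph(initial, edges, bad_states):
    """Subgraph containing exactly the states on some path from initial to a bad state."""
    forward = reachable([initial], edges)
    # Reverse the edges so we can search backward from the bad states.
    reverse = {}
    for src, targets in edges.items():
        for dst in targets:
            reverse.setdefault(dst, []).append(src)
    backward = reachable([b for b in bad_states if b in forward], reverse)
    keep = forward & backward
    return {s: [t for t in edges.get(s, ()) if t in keep] for s in keep}

# Hypothetical toy model of one check moving through the network.
TOY = {
    "issued": ["at_MC1", "at_MC2"],
    "at_MC1": ["cleared"],
    "at_MC2": ["cleared", "wrong_account"],
    "cleared": [],
    "wrong_account": [],
}

graph = fault_scenario_graph("issued", TOY, {"wrong_account"})
```

States such as `at_MC1` and `cleared` drop out because no fault trace passes through them.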
A Service Success Scenario Graph
[Diagram: scenario graph from issueCheck(A, C) to debitAccount. Edge labels: up(a2); down(a2) & up(a1); up(c1). Nodes: send(A, MC-2), send(A, MC-1), send(MC-2, FRB-1), send(MC-1, FRB-2), send(FRB-1, FRB-3), send(FRB-2, FRB-3), send(FRB-3, MC-3), send(MC-3, C).]
A Service Fail Scenario Graph
[Diagram: scenario graph starting at issueCheck(A, C) and ending in FAIL states. Edge labels: down(A), up(A), up(a1), down(a1), up(a2), down(a2), down(c1). Nodes: pick(MC-1), pick(MC-2), send(A, MC-1), send(A, MC-2), FAIL.]
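Overview of Method
[Diagram: Phase 1: the network model and survivability property are fed to the Checker, which produces a scenario graph. Phase 2: the scenario graph, annotations (e.g., probabilities, cost), and a reliability query, cost query, etc. are fed to the Analyzer, which produces a scenario set.]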
Phase 2: Reliability Analysis (in a Nutshell)
• Annotations = Probabilities
• Use Bayesian Networks to model dependence of events.
• Symbolic
  • Use symbolic probabilities
    • high, medium, low
  • Use NDFA theory to compute scenario set.
• Continuous
  • Use numeric probabilities
    • [0.0, 1.0]
  • Use Markov Decision Processes to model both nondeterministic and probabilistic transitions.
Phase 2: Continuous Analysis
• Use real values for probabilities.
• May leave probabilities of some events unspecified.
• Markov Decision Processes
  • Mix of nondeterministic and probabilistic transitions
• Why? System is not closed.
  • Hard to assign probabilities to some faults (e.g., intrusions).
  • Environment makes choice (i.e., decision) and can be demonic!
Reliability Analysis
Goal of (malicious) environment: Devise an optimal policy to minimize reliability.
• Assign to each state, s, a value, V(s), computed using a standard policy iteration algorithm from the MDP literature.
• Let V* be the value function after convergence. Then, for the initial state of the scenario graph, s0, V*(s0) is the worst-case probability of the service eventually finishing.
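A minimal sketch of computing V for a demonic environment. The slide names policy iteration; this sketch swaps in value iteration because it is shorter to write down. The MDP encoding (state to action to list of (probability, next state) pairs), the function name, and the toy MDP are all assumptions made for illustration.

```python
def worst_case_reliability(mdp, good, bad, tol=1e-12, max_iters=10_000):
    """Value iteration where the environment picks the action minimizing P(reach good)."""
    values = {s: 0.0 for s in mdp}
    values[good] = 1.0   # V(Good) = 1.0
    values[bad] = 0.0    # V(Bad) = 0.0
    for _ in range(max_iters):
        delta = 0.0
        for state, actions in mdp.items():
            if state in (good, bad):
                continue
            # Demonic choice: minimum over actions of the expected value.
            new = min(
                sum(p * values[nxt] for p, nxt in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(new - values[state]))
            values[state] = new
        if delta < tol:
            break
    return values

# Hypothetical two-state MDP; probabilities are dyadic so the floats are exact.
TOY_MDP = {
    "s0": {
        "route_1": [(0.5, "Good"), (0.5, "Bad")],
        "route_2": [(0.25, "Good"), (0.75, "s1")],
    },
    "s1": {"only": [(0.25, "Good"), (0.75, "Bad")]},
}

values = worst_case_reliability(TOY_MDP, "Good", "Bad")
```

In this toy MDP the environment prefers `route_2` at `s0`, since 0.25 + 0.75 × 0.25 = 0.4375 is lower than the 0.5 offered by `route_1`.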
A Typical Example
[Figure: example MDP with terminal states Good and Bad, where V(Good) = 1.0 and V(Bad) = 0.0. Values shown in the figure: 0.7, 0.4, 0.65, 0.6, 0.3, 0.6, 0.7, 0.6.]
Bayesian Network for Bank Example
[Diagram: nodes a1, a2, b1, b2, c1, with an edge from a1 to a2.]
• P(a1) = 1/2
• P(a2 | a1) = 1/2
• P(a2 | ¬a1) = 1/4
• P(b1) = 1/2
• P(b2) = 1/2
• P(c1) = 1/2
A Service Success Scenario Graph
[Diagram: scenario graph from issueCheck(A, C) to debitAccount, annotated with probabilities. Edge labels: up(a2) with probability 3/8; down(a2) & up(a1) with probability 1/4; up(c1) with probability 1/2. Nodes: send(A, MC-2), send(A, MC-1), send(MC-2, FRB-1), send(MC-1, FRB-2), send(FRB-1, FRB-3), send(FRB-2, FRB-3), send(FRB-3, MC-3), send(MC-3, C).]
The worst-case probability that a check issued by Bank A on Bank C clears is (1/2 * 3/8) + (1/2 * 1/4) = 5/16.
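The edge probabilities and the 5/16 figure can be reproduced with exact fractions. This sketch assumes the second conditional in the Bayesian network reads P(a2 | ¬a1) = 1/4, which is the reading that yields the 3/8 label on up(a2); the variable names are illustrative.

```python
from fractions import Fraction as F

# Probabilities taken from the Bayesian network for the bank example.
P_A1 = F(1, 2)
P_A2_GIVEN_A1 = F(1, 2)
P_A2_GIVEN_NOT_A1 = F(1, 4)   # assumed reading of the second conditional
P_C1 = F(1, 2)

# Edge label up(a2): marginalize a2 over a1.
p_up_a2 = P_A2_GIVEN_A1 * P_A1 + P_A2_GIVEN_NOT_A1 * (1 - P_A1)

# Edge label down(a2) & up(a1): joint probability.
p_down_a2_up_a1 = (1 - P_A2_GIVEN_A1) * P_A1

# Both routes must then cross c1, so sum the two path probabilities.
reliability = P_C1 * p_up_a2 + P_C1 * p_down_a2_up_a1
```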
Cost-Benefit Analysis
Goal: Choose a set of links to upgrade to achieve higher reliability, given my cost constraints (e.g., fixed budget).
• Identify new actions that correspond to decisions an architect needs to make (e.g., upgrade a1).
• Associate a cost with each action.
• Define constraints on costs.
Upgrade Links in Banking System
[Diagram: the banking system network (Banks A, B, C; MC 1, MC 2, MC 3; FRB 1, FRB 2, FRB 3; links a1, a2, b1, b2, c1), with three links marked “?” as upgrade candidates.]
Cost Constraint Example
• Assume:
  • If we upgrade a1 and c1, then P(a1) and P(c1) both increase to 3/4.
  • If a2 is upgraded, then P(a2) is:
    • P(a2 | a1) = 3/4
    • P(a2 | ¬a1) = 3/8
• Aim: Maximize the worst-case reliability subject to the constraint that at most two links can be upgraded. Solve this non-linear integer programming problem:
  • x_a1 + x_a2 + x_c1 ≤ 2
• Worst-case reliability for each pair of upgrades:
  • x_a1 = 1 and x_a2 = 1: 7/16
  • x_a1 = 1 and x_c1 = 1: 39/64
  • x_a2 = 1 and x_c1 = 1: 9/16
• Best option: Upgrade a1 and c1.
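A brute-force sketch of the upgrade choice, assuming a simplified closed form in place of the MDP analysis: reliability = P(c1) × P(a1 or a2), with P(a1 or a2) = P(a1) + P(¬a1) × P(a2 | ¬a1). This closed form is an assumption read off the success scenario graph; it reproduces the baseline 5/16 and the 39/64 for upgrading a1 and c1, but its values for other combinations are not guaranteed to match figures obtained from the full analysis. All names are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations

# Link probabilities before and after an upgrade (values from the slides).
BASE = {"a1": F(1, 2), "a2|~a1": F(1, 4), "c1": F(1, 2)}
UPGRADED = {"a1": F(3, 4), "a2|~a1": F(3, 8), "c1": F(3, 4)}
LINKS = ("a1", "a2", "c1")

def reliability(upgraded):
    """Assumed closed form: P(c1) * P(a1 or a2) for a given set of upgraded links."""
    def p(key):
        link = key.split("|")[0]   # "a2|~a1" belongs to link a2
        return (UPGRADED if link in upgraded else BASE)[key]
    p_a1_or_a2 = p("a1") + (1 - p("a1")) * p("a2|~a1")
    return p("c1") * p_a1_or_a2

def best_upgrade(budget=2):
    """Enumerate every set of at most `budget` links and keep the most reliable one."""
    options = [set(c) for k in range(budget + 1) for c in combinations(LINKS, k)]
    return max(options, key=reliability)

best = best_upgrade()
```

With only three candidate links, exhaustive enumeration stands in for an integer programming solver.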
Constrained Markov Decision Processes
<S, A, P, c, d>
• S is a finite state space.
• A is a finite set of actions.
• P are transition probabilities. P_sas’ is the probability of moving from state s to s’ if action a is chosen.
• c: S × A → ℝ is the immediate cost. c(s, a) is the cost of choosing action a at state s.
• d: S × A → ℝ^k is a k-dimensional vector of immediate costs; it captures additional cost constraints.
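One way to hold the tuple <S, A, P, c, d> in code, as a sketch only: the class name, field names, and `validate` method are hypothetical, and the example instance is made up to exercise the checks.

```python
from dataclasses import dataclass

@dataclass
class ConstrainedMDP:
    states: set             # S
    actions: set            # A
    transitions: dict       # P: (s, a) -> {s_next: probability}
    cost: dict              # c: (s, a) -> float
    constraint_costs: dict  # d: (s, a) -> tuple of k floats

    def validate(self, k):
        """Check that P is a distribution per (s, a) and that d is k-dimensional."""
        for (s, a), dist in self.transitions.items():
            if s not in self.states or a not in self.actions:
                raise ValueError(f"unknown state or action in {(s, a)}")
            if any(nxt not in self.states for nxt in dist):
                raise ValueError(f"unknown successor state for {(s, a)}")
            if abs(sum(dist.values()) - 1.0) > 1e-9:
                raise ValueError(f"probabilities for {(s, a)} do not sum to 1")
            if (s, a) not in self.cost:
                raise ValueError(f"missing immediate cost for {(s, a)}")
            if len(self.constraint_costs[(s, a)]) != k:
                raise ValueError(f"constraint cost for {(s, a)} is not {k}-dimensional")
        return True

# Hypothetical instance: one upgrade decision with a single budget constraint (k = 1).
cmdp = ConstrainedMDP(
    states={"start", "done"},
    actions={"upgrade", "skip"},
    transitions={
        ("start", "upgrade"): {"done": 1.0},
        ("start", "skip"): {"done": 1.0},
    },
    cost={("start", "upgrade"): 1.0, ("start", "skip"): 0.0},
    constraint_costs={("start", "upgrade"): (1.0,), ("start", "skip"): (0.0,)},
)
is_valid = cmdp.validate(k=1)
```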
Survivability Case Studies
• Somesh Jha
  • Trading floor model of major investment bank (being “sanitized”)
    • 10K lines of NuSMV
    • half-million nodes in scenario graph
    • 50 threat scenarios
      • 45 found by system
      • 5 new threat scenarios found
    • With independence assumption, too many misses.
  • B2B e-commerce NYC start-up (Jha)
    • 50K lines of Statecharts
    • 2 million NuSMV beyond capability of tool
• Oleg Sheyner
  • Intrusion detection (ongoing)
Intrusion Detection System Case Study
• Done by Oleg Sheyner and Lincoln Labs.
• Motivated by hand-drawn poster of attack scenarios.
• Illustrates only first part of method.
Example of Attack Tree Developed by a Professional Red Team
• Sandia Red Team “White Board” attack tree from DARPA CC20008 Information battle space preparation experiment
Phase 1 Example: Multistage Network Penetration
Goal: Root access to host ip2
[Diagram: network with adversary host ipa, router, firewall, IDS, and target hosts ip1 and ip2; service labels ftp, sshd, and database; IDS output “Detected”.]
• Attack Arsenal
  • Sshd buffer overflow - remotely get root
  • Ftp .rhosts file - establish trust between hosts
  • Remote login - exploit trust between hosts
  • Local buffer overflow - locally get root
NuSMV Encoding
• Network
  • 1 attack host, 2 target hosts with services
  • 3x3 connectivity matrix
    • existence of routing path
    • ability to connect to ftp and ssh services
  • 3x3 trust matrix
• Adversary
  • Privilege levels for each host
• Attacks
  • 4 attacks
  • some have multiple flavors
• NuSMV Statistics
  • 82 bits of state (2^82 states)
  • <40K representation nodes
  • ~7000 reachable states
  • 2 sec runtime on 1GHz Pentium III
  • 8MB of memory used
Scenario-Generating Properties
• Don’t care about detection
  • AG (adversary.privilege[2] < root)
• Want stealth
  • AG ((adversary.privilege[2] < root) or (IDS.detected))
The Rare Glitch Tool Suite
[Diagram: tool suite organized into Specification and Modeling Languages, Checkers and Provers, Analysis Engines, and Reliability and Cost Analyzers. Labels shown: Checkmate, Specification, Prism, Scenario graph, XML, XML2nuSMV, Counter-examples, Explanation Generator, Model, PVS, Abstraction/Refinement Processor, Symp, nuSMV, …]
Plan in Relationship to The Rare Glitch Project
• Enrich and make current tool suite robust.
  • Integrate with other existing project tools.
• Apply method to embedded systems examples.
  • Work with Clarke to add reliability analysis to automotive example.
• Pursue foundational issues with respect to models for reliability and cost analyses.
  • Understand relationship to other probabilistic models, hybrid models, etc.