Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains
Hichem Boudali (1), Pepijn Crouzen (2), and Mariëlle Stoelinga (1)
(1) Formal Methods and Tools group, CS, University of Twente, NL
(2) Dependable Systems and Software group, CS, Saarland University, Germany
IPA Lentedagen, Rhenen
Introduction: Dependability
Dependability: The trustworthiness of a computer system such that reliance can justifiably be placed upon the service it delivers.
Reliability: The probability that a computer system does not fail within a given time bound.
Introduction: Formal dependability
• Continuous-time Markov chains (CTMC)
• States and Markovian transitions
• Probability of traversing a λ-transition within t time-units is: $1 - e^{-\lambda t}$
• Tools: Reachability analysis (among others)
[Figure: CTMC with transitions labeled λ and μ]
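The transition probability on this slide can be evaluated directly. The following is a minimal sketch; the function name and the sample rate and time values are illustrative assumptions, not taken from the slides.

```python
import math

def prob_transition_within(rate: float, t: float) -> float:
    """Probability that a Markovian transition with the given rate
    is traversed within t time units: 1 - e^(-rate * t)."""
    if rate < 0 or t < 0:
        raise ValueError("rate and t must be non-negative")
    return 1.0 - math.exp(-rate * t)

# Illustrative values (assumed): rate 0.5 per time unit, horizon of 2 time units.
print(prob_transition_within(0.5, 2.0))
```

The probability grows monotonically from 0 at t = 0 towards 1 as t increases, which is why a time bound is always part of a reliability statement.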
Introduction: CTMC characteristics
• CTMCs describe probability distributions (phase-type distributions)
• Phase-type distributions can approximate any distribution arbitrarily closely
• Goal: Find a CTMC which describes the probability of system failure within t time-units (i.e. the unreliability of the system)
• Problem: Difficult to find the CTMC that models a large system
[Figure: CTMC with transitions labeled λ and μ]
Introduction: Engineering dependability
• Fault Trees (1960s)
• Graphical
• Easy to use
• Syntax:
  • Basic events
  • Gates
• Semantics: logical formula
• Problem: Not expressive enough
[Figure: fault tree with top event "Workstation fails", an OR gate, an AND gate, and basic events "CPU fails", "Mem1 fails", "Mem1 fails"]
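To illustrate "Semantics: logical formula", here is a small sketch that treats a static fault tree as a Boolean function of its basic events. It assumes the workstation tree reads "CPU fails OR (both memory units fail)"; the two memory inputs are named `mem_a` and `mem_b` here as hypothetical labels, and the failure probabilities are made-up values. Basic events are assumed independent.

```python
from itertools import product

def workstation_fails(cpu: bool, mem_a: bool, mem_b: bool) -> bool:
    """Assumed reading of the figure: OR gate over the CPU and an AND gate over two memories."""
    return cpu or (mem_a and mem_b)

def top_event_probability(formula, failure_probs):
    """Sum the probability of every combination of basic-event outcomes
    for which the formula evaluates to True (independent basic events)."""
    total = 0.0
    for outcome in product([False, True], repeat=len(failure_probs)):
        weight = 1.0
        for failed, p in zip(outcome, failure_probs):
            weight *= p if failed else 1.0 - p
        if formula(*outcome):
            total += weight
    return total

# Hypothetical failure probabilities: CPU 0.1, each memory 0.2.
print(top_event_probability(workstation_fails, [0.1, 0.2, 0.2]))
```

Enumeration is exponential in the number of basic events, so it only serves to make the semantics concrete; it is not meant as an analysis technique for large trees.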
Introduction: Engineering dependability
• Dynamic Fault Trees (1992)
• Extension of classic fault trees
• Additions:
  • Use of spares
  • Dependencies
  • Order-based failure
• Tools:
  • Convert to CTMC
[Figure: DFT with top event "System failure", an OR gate, and a SPARE gate over S, P1 and P2]
But… DFT drawbacks
• Scalability
• Ambiguous syntax and semantics
• Lack of modularity:
  • Dynamic modules cannot be reused
  • Restrictions on spares and dependencies
• Existing analysis technique is hard to extend or modify
Outline
• Case study: FTPP system
• DFT approach
• Formalizing DFTs
• DFT semantics in I/O-IMCs
• Deep compositionality
• Extending the DFT formalism
• Conclusion
• Future work
Case study: FTPP
• 16 processors divided into 4 groups
• 4 network elements connect the processors
• Per group 2 processors must be operational
• Different configurations are possible
[Figure: FTPP layout with network elements NE1 to NE4 and processors labeled A, B, C, D]
Case study: FTPP
• 16 processors divided into 4 groups
• 4 network elements connect the processors
• Per group 2 processors must be operational
• Different configurations are possible
• Dynamic redundancy management is possible
How reliable is each configuration?
[Figure: FTPP configuration with network elements NE1 to NE4 and processors labeled A, B, C, D, S]
FTPP DFT
[Figure: DFT of the FTPP. Labels: "System Failure" (OR gate); "Group 1 Failure" to "Group 4 Failure" (2/3 gates over A, B, C, with spares S); FDEP gates for NE1 to NE4.]
Existing DFT analysis [Dugan et al. 1992]
Unreliability = Prob[reaching the failure state in time T]
• For static fault trees, binary decision diagrams can be used!
• Otherwise: Convert the DFT into a CTMC. Analyze the CTMC using standard solution techniques.
• But… State space explosion: the CTMC grows exponentially
• FTPP difficult to analyze
Example: AND-gate C over basic events A (failure rate: 0.2 f/h) and B (failure rate: 0.4 f/h)
• Starting state: A is operational, B is operational
• A has failed, B is operational (reached with rate 0.2)
• A is operational, B has failed (reached with rate 0.4)
• A has failed, B has failed
Pr(A fails in T hours) = $1 - e^{-0.2 \cdot T}$
A's mean time to failure = 1/0.2 = 5 hours
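The AND-gate example can be analyzed numerically. The sketch below builds the four-state CTMC with the rates from the slide (A: 0.2 f/h, B: 0.4 f/h) and computes the transient distribution by uniformization, one common "standard solution technique"; the slides do not say which technique the existing tools use, so this choice, the state numbering and the function name are assumptions. For independent basic events the result can be checked against the closed form $(1 - e^{-0.2T})(1 - e^{-0.4T})$.

```python
import math

RATE_A, RATE_B = 0.2, 0.4  # failure rates from the slide, in failures per hour

# State numbering (assumed): 0 = A up, B up; 1 = A failed; 2 = B failed; 3 = both failed.
RATES = [
    [0.0, RATE_A, RATE_B, 0.0],
    [0.0, 0.0, 0.0, RATE_B],
    [0.0, 0.0, 0.0, RATE_A],
    [0.0, 0.0, 0.0, 0.0],
]

def transient_distribution(rates, start, t, eps=1e-12, max_terms=10_000):
    """State probabilities at time t via uniformization (diagonal of `rates` must be 0)."""
    n = len(rates)
    exit_rates = [sum(row) for row in rates]
    q = max(exit_rates)  # uniformization rate
    dist = [1.0 if i == start else 0.0 for i in range(n)]
    if q == 0.0 or t == 0.0:
        return dist
    # One-step matrix of the uniformized discrete-time chain.
    step = [[rates[i][j] / q for j in range(n)] for i in range(n)]
    for i in range(n):
        step[i][i] += 1.0 - exit_rates[i] / q
    result = [0.0] * n
    weight = math.exp(-q * t)  # Poisson probability of zero jumps
    accumulated = 0.0
    for k in range(1, max_terms + 1):
        result = [r + weight * d for r, d in zip(result, dist)]
        accumulated += weight
        if accumulated >= 1.0 - eps:
            break
        dist = [sum(dist[i] * step[i][j] for i in range(n)) for j in range(n)]
        weight *= q * t / k  # Poisson probability of k jumps
    return result

T = 5.0
unreliability = transient_distribution(RATES, 0, T)[3]
closed_form = (1 - math.exp(-RATE_A * T)) * (1 - math.exp(-RATE_B * T))
print(unreliability, closed_form)
```

This simple version underflows for very large q·t and is only meant for small examples like this one.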
FTPP Results
[Figure: the FTPP DFT with "System Failure", "Group 1 Failure" to "Group 4 Failure", 2/3 gates, FDEP gates and NE1 to NE4]
What's behind it?
• Model local behavior
• We need compositional Markov chains: Input/Output Interactive Markov Chains (I/O-IMC)
• Combination of LTS and CTMC, with I/O automata features
  • Markovian transitions (CTMC)
  • Interactive transitions (LTS)
  • Action signature (IOA)
    • ? - Input actions
    • ! - Output actions
    • ; - Internal actions
[Figure: I/O-IMC for a basic event: a Markovian transition with rate λ followed by the output action failed!]
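One possible way to represent such a model in code is sketched below. The class name, field names and state names are hypothetical; only the ingredients (Markovian transitions, interactive transitions, and an action signature split into inputs, outputs and internal actions) come from the slide. The basic-event model follows the figure: a Markovian transition followed by a failure output.

```python
from dataclasses import dataclass, field

@dataclass
class IOIMC:
    """Hypothetical container for an I/O-IMC."""
    initial: str
    inputs: frozenset = frozenset()      # actions written a?
    outputs: frozenset = frozenset()     # actions written a!
    internals: frozenset = frozenset()   # actions written a;
    interactive: list = field(default_factory=list)  # (source, action, target)
    markovian: list = field(default_factory=list)    # (source, rate, target)

def basic_event(name: str, rate: float) -> IOIMC:
    """I/O-IMC of a basic event: wait for an exponential delay, then announce the failure."""
    fail = f"f({name})"
    return IOIMC(
        initial="operational",
        outputs=frozenset({fail}),
        markovian=[("operational", rate, "failing")],
        interactive=[("failing", fail, "failed")],
    )

event_a = basic_event("A", 0.2)
print(event_a)
```

Keeping the two transition relations in separate lists mirrors the idea that stochastic and interactive behavior are modeled side by side.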
Input/Output Interactive Markov Chains
• Properties of IMCs:
  • Combines stochastic behavior and interactive behavior orthogonally
  • CSP-style synchronization + interleaving semantics
  • Maximal progress for internal transitions
• Properties of I/O-IMCs:
  • Unique outputs
  • Input enabledness
  • Outputs cannot be blocked!
  • Maximal progress for output transitions
[Figure: state with a τ-transition and a λ-transition]
DFT semantics: DFT gate to I/O-IMC
[Figure: I/O-IMCs of DFT gates with input actions f(A)? and f(B)? and output action f(C)!]
What is deep compositionality?
• Semantics of a DFT arises naturally as the composition of the semantics of its building blocks
• But: This may lead to huge models.
[Figure: "Group 1 Failure" 2/3 gate over A, B, C with spare S, with signals f(G1) and f(NE1) … f(NE4)]
Why use deep compositionality?
• Formally define semantics
• Many useful techniques
  • Combining models: Composition
  • Refining models: Hiding
  • Minimizing models: Bisimulation
  • Reusing models: Renaming
• Well supported by CADP toolset (VASY/INRIA)
Combat state-space explosion
Compositional Aggregation
• Translation
• Composition + Abstraction
• Aggregation (minimization)
• Repeat
• Aggregated system CTMC (CTMDP)
• Analysis
• Result: System failure probability
Compositional Aggregation: Example
[Figure: AND-gate C with inputs f(A)? and f(B)? and output f(C)!; basic event A (failure rate: 0.2 f/h, output f(A)!) and basic event B (failure rate: 0.4 f/h, output f(B)!)]
Compositional Aggregation: Parallel Composition
• C: Inputs: f(A)? and f(B)?; Outputs: f(C)!
• A: Inputs: none; Outputs: f(A)!
• Synchronize on f(A)
[Figure: I/O-IMC C (states 1 to 5), I/O-IMC A (states 1 to 3, rate 0.2), and their composition C||A with states 1||1, 1||2, 2||3, 3||1, 3||2, 4||3, 5||3]
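The composition step can be sketched as a reachability search over pairs of states. In the sketch below the dictionary layout, the helper names and the state numbering of C and A are assumptions based on a reading of the figure, and internal actions are left out. Shared visible actions synchronize, an input with no explicit transition is treated as a self-loop (input enabledness), and Markovian transitions interleave.

```python
from collections import deque

def explicit_targets(model, state, action):
    """Targets of transitions labeled `action` that leave `state` explicitly."""
    return [t for (s, a, t) in model["interactive"] if s == state and a == action]

def enabled_targets(model, state, action):
    """Like explicit_targets, but a missing input transition counts as a self-loop."""
    targets = explicit_targets(model, state, action)
    if not targets and action in model["inputs"]:
        return [state]
    return targets

def compose(left, right):
    """Parallel composition restricted to the reachable state pairs."""
    shared = (left["inputs"] | left["outputs"]) & (right["inputs"] | right["outputs"])
    outputs = left["outputs"] | right["outputs"]
    inputs = (left["inputs"] | right["inputs"]) - outputs
    initial = (left["initial"], right["initial"])
    interactive, markovian = [], []
    seen, queue = {initial}, deque([initial])

    def add(relation, source, label, target):
        relation.append((source, label, target))
        if target not in seen:
            seen.add(target)
            queue.append(target)

    while queue:
        state = queue.popleft()
        sl, sr = state
        for (s, rate, t) in left["markovian"]:   # Markovian transitions interleave
            if s == sl:
                add(markovian, state, rate, (t, sr))
        for (s, rate, t) in right["markovian"]:
            if s == sr:
                add(markovian, state, rate, (sl, t))
        actions = {a for (s, a, _) in left["interactive"] if s == sl}
        actions |= {a for (s, a, _) in right["interactive"] if s == sr}
        for action in sorted(actions):
            if action in shared:                  # both components move together
                pairs = [(tl, tr) for tl in enabled_targets(left, sl, action)
                         for tr in enabled_targets(right, sr, action)]
            else:                                 # only the owner of the action moves
                pairs = [(tl, sr) for tl in explicit_targets(left, sl, action)]
                pairs += [(sl, tr) for tr in explicit_targets(right, sr, action)]
            for target in pairs:
                add(interactive, state, action, target)
    return {"initial": initial, "inputs": inputs, "outputs": outputs,
            "states": seen, "interactive": interactive, "markovian": markovian}

# AND-gate C and basic event A, numbered as read from the figure (assumed).
AND_GATE_C = {"initial": 1, "inputs": {"f(A)", "f(B)"}, "outputs": {"f(C)"},
              "interactive": [(1, "f(A)", 2), (1, "f(B)", 3), (2, "f(B)", 4),
                              (3, "f(A)", 4), (4, "f(C)", 5)],
              "markovian": []}
BASIC_EVENT_A = {"initial": 1, "inputs": set(), "outputs": {"f(A)"},
                 "interactive": [(2, "f(A)", 3)], "markovian": [(1, 0.2, 2)]}

C_A = compose(AND_GATE_C, BASIC_EVENT_A)
print(sorted(C_A["states"]))
```

In the composite, f(A) is an output and f(B) remains an input, so C||A can still be composed with basic event B in a later step.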
Compositional Aggregation: Abstraction (hiding)
Abstraction (hiding): Makes signal internal
[Figure: the composition C||A before and after hiding; the output f(A)! becomes the internal action f(A);]
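Hiding only changes the action signature: the transitions stay where they are, but the hidden signal moves from the outputs to the internal actions. The sketch below assumes a dictionary-based model layout and the restriction that only output actions may be hidden; both are assumptions made for this illustration.

```python
def hide(model, actions):
    """Return a copy of `model` in which the given output actions are internal."""
    hidden = set(actions)
    if not hidden <= model["outputs"]:
        raise ValueError("only output actions can be hidden in this sketch")
    return {
        **model,
        "outputs": model["outputs"] - hidden,
        "internals": model["internals"] | hidden,
    }

# Hypothetical fragment of C||A: f(A) is an output that no other component needs.
C_A = {
    "inputs": {"f(B)"},
    "outputs": {"f(A)", "f(C)"},
    "internals": set(),
    "interactive": [("1||2", "f(A)", "2||3"), ("4||3", "f(C)", "5||3")],
}

hidden = hide(C_A, {"f(A)"})
print(hidden["outputs"], hidden["internals"])
```

Once f(A) is internal, the aggregation step is free to disregard it, which is what makes the later minimization effective.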
Compositional Aggregation: Aggregation (weak bisimulation)
• Aggregation: Finding a smaller model that is behaviorally equivalent to the original
• Weak bisimulation: Disregard internal steps
[Figure: the composition C||A with internal action f(A); before aggregation]
Compositional Aggregation: Example (continued)
[Figure: the aggregated C||A composed with basic event B (states 1 to 3, rate 0.4, output f(B)!), giving C||A||B with rates 0.2 and 0.4 and output f(C)!]
Compositional Aggregation: Example (continued)
[Figure: the aggregated C||A||B with rates 0.2 and 0.4 and output f(C)!]
DFT extensions
• Extensions:
  • Inhibition
  • Repair-policies
  • Complex spares
  • Complex dependencies
  • …
• Adding extensions in the compositional framework is easy:
  • Modify translation of DFT building blocks
  • Compositional aggregation algorithm is unaltered
DSN07
Free!
Extension: Repair
[Figure: I/O-IMCs with repair. Basic event A: rates λ and µ, outputs f(A)! and r(A)!. AND-gate C: inputs f(A)?, f(B)?, r(A)?, r(B)?, outputs f(C)! and r(C)!]
Conclusion: How we tackled drawbacks
• State-space explosion: Compositional Aggregation
• Ambiguous syntax and semantics: DAG, I/O-IMC, formal translation
• Lack of modularity:
  • Dynamic modules cannot be reused: Renaming!
  • Restrictions on spares and dependencies: Lifted!
• Existing analysis technique is hard to extend and/or modify: Extensions at the lowest level
Future work
• Fully automated tool (CORAL)
• More aggressive state reduction
  • Recent work: specialized acyclic algorithm
• Apply deep compositionality to more advanced engineering formalisms! (see Boudali et al., DSN08)
• Extend DFT formalism
  • Repair
  • Failure modes
  • Non-exponential failure distributions
  • Sophisticated dependencies
The end! Questions?