Fault Tolerance: Basic Mechanisms mMIC-SFT September 2003 Anders P. Ravn Aalborg University
Fault Tolerance • Means to isolate component faults ... and mask them • Prevents system failures • May increase system dependability
Dependability - means • Fault prevention • Fault tolerance • Error Removal • Failure Forecasting BW p. 106, ...
FT - levels • Full tolerance • Graceful degradation • Fail safe BW p. 107
FT basis: Redundancy • Time: Try, then Retry ... • Space: several Try units side by side BW p. 109
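Redundancy in time can be sketched as a retry loop. This is a minimal illustration, not taken from the slides: `retry` and `FlakySensor` are hypothetical names, and the fault is assumed to be transient, so that repeating the same operation can succeed.

```python
def retry(operation, attempts=3):
    """Time redundancy: re-run the same operation until it succeeds."""
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except Exception as err:  # assumption: the fault is transient
            last_error = err
    raise RuntimeError(f"all {attempts} attempts failed") from last_error


class FlakySensor:
    """Hypothetical component that fails on its first two reads."""

    def __init__(self):
        self.calls = 0

    def read(self):
        self.calls += 1
        if self.calls < 3:
            raise IOError("transient fault")
        return 42


sensor = FlakySensor()
reading = retry(sensor.read)  # succeeds on the third try
```

A retry only helps against transient faults; a permanent fault fails every attempt, which is where redundancy in space comes in.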
N-version programming • Versions V1, V2, V3 • Driver (comparator) • Comparison vectors (votes) • Comparison status indicators • Comparison points BW p. 109
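The driver's role can be sketched as a majority voter. This is an illustrative sketch under simplifying assumptions: the versions run sequentially, there is a single comparison point, votes are hashable values compared for exact equality, and the names `n_version_driver`, `v1`, `v2`, `v3` are hypothetical.

```python
from collections import Counter


def n_version_driver(versions, inputs):
    """Run every version, compare the votes, return the majority value."""
    votes = [version(inputs) for version in versions]  # comparison vector
    value, count = Counter(votes).most_common(1)[0]
    if count <= len(votes) // 2:
        raise RuntimeError("no majority among the versions")
    # Comparison status indicators: which versions agreed with the majority.
    status = [vote == value for vote in votes]
    return value, status


def v1(x):
    return x * x


def v2(x):
    return x ** 2


def v3(x):
    return x * x + 1  # hypothetical design fault in one version


result, status = n_version_driver([v1, v2, v3], 4)
```

Here the faulty third version is outvoted and flagged in `status`. Exact equality is a strong assumption; votes based on inexact arithmetic would need a tolerance-based comparison.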
Fault classification (scope of N-VP) • Origin: physical (internal/external); logical (design/interaction) • Kind: omission; value; timing; byzantine • Property: duration (permanent, transient); consistency (determinate, nondeterminate); autonomy (spontaneous, event-dependent) [Table marks +, (+), ++ and / rating the N-VP scope per class: alignment not recoverable]
Dynamic Redundancy • Error detection • Damage confinement and assessment • Error recovery • Fault treatment and continued service BW p. 114
Error Detection f: State × Input → State × Output • Environment (exception) • Application • Assertion: precondition (input); postcondition (input, output); invariant (state, state') • Timing: WCET(f, input); Deadline(f, input) BW p. 115
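Application-level detection for a function f: State × Input → State × Output can be sketched as a wrapper that checks a precondition, a postcondition and a deadline. The names `checked`, `DetectedError` and `sqrt_step` are hypothetical, the invariant check is left out for brevity, and the deadline is only checked after the call returns (it detects a miss, it does not enforce the deadline).

```python
import time


class DetectedError(Exception):
    """Raised when an assertion or timing check fails."""


def checked(pre, post, deadline_s):
    """Wrap f(state, inp) -> (state, out) with error-detection checks."""
    def wrap(f):
        def run(state, inp):
            if not pre(inp):
                raise DetectedError("precondition violated")
            start = time.monotonic()
            new_state, out = f(state, inp)
            elapsed = time.monotonic() - start
            if elapsed > deadline_s:
                raise DetectedError("deadline missed")
            if not post(inp, out):
                raise DetectedError("postcondition violated")
            return new_state, out
        return run
    return wrap


@checked(pre=lambda x: x >= 0,
         post=lambda x, y: abs(y * y - x) < 1e-6,
         deadline_s=1.0)
def sqrt_step(state, x):
    # State is a call counter here, purely for illustration.
    return state + 1, x ** 0.5


new_state, root = sqrt_step(0, 9.0)
```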
Damage Confinement • Static structure • Dynamic structure BW p. 117
Error Recovery • Forward: repair the state, if you can! • Backward: define recovery points; checkpoint state at each recovery point; roll back; retry • Risk: domino effect BW p. 118
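The backward recovery steps (recovery point, checkpoint, roll back, retry) can be sketched for a single process as follows. `Recoverable`, `run_with_rollback` and `flaky_debit` are hypothetical names, and the checkpoint is assumed to be a deep copy held in memory. The domino effect does not arise here because no other process depends on the rolled-back state.

```python
import copy


class Recoverable:
    """Holds a state plus one checkpoint taken at a recovery point."""

    def __init__(self, state):
        self.state = state
        self._checkpoint = None

    def recovery_point(self):
        self._checkpoint = copy.deepcopy(self.state)

    def roll_back(self):
        self.state = copy.deepcopy(self._checkpoint)


def run_with_rollback(obj, action, retries=2):
    """Checkpoint, run the action, and on error roll back and retry."""
    obj.recovery_point()
    for _ in range(retries + 1):
        try:
            return action(obj.state)
        except Exception:
            obj.roll_back()  # discard any partial update
    raise RuntimeError("recovery failed")


attempts = {"n": 0}


def flaky_debit(state):
    """Hypothetical action that fails once, after a partial update."""
    state["balance"] -= 10
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise ValueError("fault after partial update")
    return state["balance"]


account = Recoverable({"balance": 100})
balance = run_with_rollback(account, flaky_debit)
```

Without the roll back, the failed first attempt would leave the balance debited twice after the retry.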
Recovery blocks ENSURE acceptance_test BY { module_1 } ELSE BY { module_2 } ... ELSE BY { module_m } ELSE ERROR BW p. 120
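The ENSURE ... BY ... ELSE BY scheme can be approximated in Python as below. This is a sketch under assumptions, not the notation's definition: each alternate works on a fresh deep copy of the state (standing in for the recovery point), the acceptance test is applied to the alternate's result, and `recovery_block`, `fast_but_faulty` and `simple_sort` are hypothetical names.

```python
import copy


def recovery_block(acceptance_test, modules, state):
    """Try each alternate in order until one passes the acceptance test."""
    for module in modules:
        candidate = copy.deepcopy(state)  # restore the recovery point
        try:
            result = module(candidate)
        except Exception:
            continue  # an exception counts as a failed alternate
        if acceptance_test(result):
            return result
    raise RuntimeError("ERROR: every alternate was rejected")


def is_sorted(xs):
    return all(a <= b for a, b in zip(xs, xs[1:]))


def fast_but_faulty(xs):
    xs.reverse()  # hypothetical buggy primary that damages its state
    return xs


def simple_sort(xs):
    return sorted(xs)


data = [3, 1, 2]
ordered = recovery_block(is_sorted, [fast_but_faulty, simple_sort], data)
```

The primary's output fails the acceptance test, so the second module runs from the untouched state and its result is accepted.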
The ideal FT-component • Normal mode • Exception handler • Interface to the user of the component: request/response, interface exception, failure exception • Interface to the components it uses: request/response, interface exception, failure exception BW p. 126
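The two kinds of exception can be sketched with a component that validates requests, runs in normal mode, and switches to its exception handler when a lower-level component signals a failure. All names here (`Store`, `Lookup`, `InterfaceException`, `FailureException`) are hypothetical, and masking by falling back to a backup component is only one possible handler strategy.

```python
class InterfaceException(Exception):
    """The request itself was illegal."""


class FailureException(Exception):
    """The component could not deliver its service."""


class Store:
    """Hypothetical lower-level component."""

    def __init__(self, data, broken=False):
        self.data = data
        self.broken = broken

    def get(self, key):
        if self.broken:
            raise FailureException("store unavailable")
        return self.data[key]


class Lookup:
    """Component with a normal mode and an exception handler."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup

    def request(self, key):
        if not isinstance(key, str):
            raise InterfaceException("key must be a string")
        try:
            return self.primary.get(key)  # normal mode
        except FailureException:
            # Exception handler: try to mask the fault with the backup.
            try:
                return self.backup.get(key)
            except FailureException as err:
                # Cannot mask it: signal our own failure to the caller.
                raise FailureException("lookup failed") from err


table = {"a": 1}
lookup = Lookup(Store(table, broken=True), Store(table))
value = lookup.request("a")
```

The caller sees a normal response although the primary store failed; only when both stores fail does a failure exception propagate upward.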