The ANSA project: Failures and Dependability in ANSA
System structure
• Component based: component behaviour can be observed by other components
• Independent components: own observations and reasoning about events
• No global observer
• No global ordering of events
• No global time
Expectations – I
An event with value v0 is expected in the time interval between t0 and t1.
[Figure: value V against time T; the expectation sits at value v0 over the interval t0 to t1]
Expectations – II
An event with a value between v0 and v1 is expected in the time interval between t0 and t1.
[Figure: value V against time T; the expectation covers values v0 to v1 over the interval t0 to t1]
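Expectations – III
An event with a value between v0 and v1 is expected in the time interval between t0 and t1. The event value is time dependent, so the expectation is a region of value and time: $E \subseteq V \times T$.
[Figure: value V against time T; the expected value range changes with time over the interval t0 to t1]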
Occurrences
An event can occur at most once in the ANSA model: $O \subseteq V \times T$, $|O| \in \{0, 1\}$.
[Figure: value V against time T; occurrences O0 and O1 plotted against the expectation at v0 over the interval t0 to t1]
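The three kinds of expectation and the single occurrence can be sketched as data structures. This is a minimal illustration, not part of the ANSA model itself: the names `Expectation`, `exact`, `value_range` and `drifting` are hypothetical, the intervals are assumed to be closed, and a linear drift is just one possible form of time dependence.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# An occurrence is one (value, time) point, or None if nothing occurred,
# mirroring the constraint that O holds at most one element.
Occurrence = Optional[Tuple[float, float]]


@dataclass(frozen=True)
class Expectation:
    """A region of V x T: times in [t0, t1], values in [lower(t), upper(t)]."""
    t0: float
    t1: float
    lower: Callable[[float], float]
    upper: Callable[[float], float]

    def contains(self, value: float, time: float) -> bool:
        # Closed intervals are an assumption; the slides do not say.
        if not (self.t0 <= time <= self.t1):
            return False
        return self.lower(time) <= value <= self.upper(time)


def exact(v0: float, t0: float, t1: float) -> Expectation:
    """Expectations I: a single value v0 within [t0, t1]."""
    return Expectation(t0, t1, lambda t: v0, lambda t: v0)


def value_range(v0: float, v1: float, t0: float, t1: float) -> Expectation:
    """Expectations II: any value in [v0, v1] within [t0, t1]."""
    return Expectation(t0, t1, lambda t: v0, lambda t: v1)


def drifting(v0: float, v1: float, slope: float,
             t0: float, t1: float) -> Expectation:
    """Expectations III: the value range shifts linearly with time."""
    return Expectation(
        t0, t1,
        lambda t: v0 + slope * (t - t0),
        lambda t: v1 + slope * (t - t0),
    )
```

For example, `value_range(1.0, 2.0, 0.0, 10.0).contains(1.5, 3.0)` asks whether an occurrence with value 1.5 at time 3.0 lies inside the expected region.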
Correctness
• Correct occurrence of an event: $O \cap E \neq \emptyset$
• Correct non-occurrence of an event: $O \cup E = \emptyset$
• Formal definition of correctness: $(O \cap E \neq \emptyset) \lor (O \cup E = \emptyset)$
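Failures
• Negation of correct event: $\lnot(O \cap E \neq \emptyset) \land \lnot(O \cup E = \emptyset)$
• Simplified: $(O \cup E \neq \emptyset) \land (O \cap E = \emptyset)$
• Unexpected occurrence: $O \neq \emptyset \land E = \emptyset$
• Omission failure: $E \neq \emptyset \land O = \emptyset$
• Incorrect occurrence: $O \neq \emptyset \land E \neq \emptyset \land (O \cap E = \emptyset)$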
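The classification can be sketched as a small function. It assumes one reading of the definitions: an occurrence is correct when O and E intersect, a non-occurrence is correct when both are empty, and every other combination is a failure. Modelling E as a finite set of acceptable (value, time) points is a simplification made here for illustration; `Verdict` and `classify` are hypothetical names.

```python
from enum import Enum
from typing import Set, Tuple

Point = Tuple[float, float]  # (value, time)


class Verdict(Enum):
    CORRECT_OCCURRENCE = "correct occurrence"
    CORRECT_NON_OCCURRENCE = "correct non-occurrence"
    UNEXPECTED_OCCURRENCE = "unexpected occurrence"
    OMISSION_FAILURE = "omission failure"
    INCORRECT_OCCURRENCE = "incorrect occurrence"


def classify(occurrence: Set[Point], expectation: Set[Point]) -> Verdict:
    """Classify an occurrence set O against an expectation set E."""
    if len(occurrence) > 1:
        raise ValueError("an event can occur at most once")
    if not occurrence and not expectation:
        return Verdict.CORRECT_NON_OCCURRENCE   # O and E both empty
    if occurrence and not expectation:
        return Verdict.UNEXPECTED_OCCURRENCE    # something occurred, nothing expected
    if expectation and not occurrence:
        return Verdict.OMISSION_FAILURE         # something expected, nothing occurred
    if occurrence & expectation:
        return Verdict.CORRECT_OCCURRENCE       # O and E intersect
    return Verdict.INCORRECT_OCCURRENCE         # both non-empty but disjoint
```

The three failure verdicts together cover exactly the cases in which O and E are disjoint while at least one of them is non-empty.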
Consistency between multiple events
• Events constrain the expectation of future events
• Local events: Observation by local mechanisms of a component
• Distributed events: Distributed consensus problem, collaboration of components required
• Consistency enforcement instead of distributed deviation detection
• Express global properties as a set of local ones
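Computability of next expectation
• Research questions:
  • Does a function f(O) exist to compute the next expectation?
  • How many such functions are needed for a simple protocol?
[Figure: two plots of value V against time T; an occurrence O0 inside the first expectation (values v0 to v1, times t0 to t1) leads to a next expectation (values v2 to v3, times t2 to t3); TO is marked on both time axes]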
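Computability of next expectation
• Research question:
  • Does a function g(O) exist to compute the next expectation in case of a failure?
[Figure: two plots of value V against time T; an occurrence O0 outside the first expectation (values v0 to v1, times t0 to t1) and the resulting next expectation (values v2 to v3, times t2 to t3); TO is marked on both time axes]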
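To make the two questions concrete, here is a toy sketch of what f and g might look like for a hypothetical protocol in which values are sequence numbers and each event is due within a fixed window. Everything in it is an assumption made for illustration: the protocol, the `TIMEOUT` constant, and the extra `failed` argument to g, which is added because O is empty in an omission and so cannot determine the next expectation on its own.

```python
from typing import NamedTuple, Optional


class Occurrence(NamedTuple):
    value: int    # hypothetical: a sequence number
    time: float


class Expectation(NamedTuple):
    v0: int       # lowest acceptable value
    v1: int       # highest acceptable value
    t0: float     # start of the time window
    t1: float     # end of the time window


TIMEOUT = 2.0     # hypothetical window length


def f(o: Occurrence) -> Expectation:
    """Next expectation after a correct occurrence: the following
    sequence number, due within TIMEOUT of the occurrence."""
    nxt = o.value + 1
    return Expectation(nxt, nxt, o.time, o.time + TIMEOUT)


def g(o: Optional[Occurrence], failed: Expectation) -> Expectation:
    """Next expectation after a failure: the same values are expected
    again, in a fresh window."""
    # Omission: restart at the end of the old window.
    # Incorrect occurrence: restart at the time of that occurrence.
    start = failed.t1 if o is None else o.time
    return Expectation(failed.v0, failed.v1, start, start + TIMEOUT)
```

Even this toy case needs two functions, and g needs more input than O alone, which is one way to read the second research question.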
Dependability Principles – I
• Separation: More (distributed) components reduce dependability
• Diversity: Designers need to be prepared and mechanisms need to allow for diversity
• Scaling: Mechanisms must be exchangeable to suit different scenarios
Dependability Principles – II
• Federation: heterogeneous authorities and dependability contracts
• Transparency: hide dependability mechanisms from the programmer
• Concurrency: conflicting, inconsistent changes to data
• Configuration: add and update parts of the system; adapt failure detectors
Management Model – I
• Fault confinement: limit propagation to other parts of the system
• Fault detection: compare time/value observation with expectation
• Fault diagnosis: needed if fault detection cannot identify the faulty component
• Reconfiguration: isolate the faulty component or replace it with a spare
• Recovery: remove the effect of the fault
Management Model – II
• Restart: after all damaged state has been removed
• Repair: restores the faulty component to an undamaged state
• Reintegration: reconfiguration of the system to reintroduce the repaired component
Open questions
• Is our list of principles complete?
  Separation, Diversity, Scaling, Federation, Transparency, Concurrency, Configuration
• Is our D2R3 strategy complete?
  Fault confinement, Fault detection, Fault diagnosis, Reconfiguration, Recovery, Restart, Repair, Reintegration
• Is our CFEF diagram correct?
• Do we detect faults, errors or failures?