Presentation of Failure-Oblivious Computing vs. Rx OS Seminar, winter 2005 by Lauge Wullf and Jacob Munk-Stander January 4th, 2006
Agenda • Introduction • Failure-Oblivious Computing • Rx: Treating Bugs As Allergies
Introduction • Problem • Reliability (deterministic and non-deterministic) • Cause • Software defects account for up to 40% of system failures • Memory- and concurrency-related bugs cause more than 60% of system vulnerabilities • Effect • Expensive
Introduction • Solutions • Safe languages, e.g. ML, Java or C# • Rebooting/restarting • Whole-program restart, micro-rebooting, etc. • Checkpointing and recovery • Checkpoint, roll back on failure, re-execute • Application-specific • Multi-process model, exception handling, etc. • Non-conventional approaches • E.g. failure-oblivious computing
Failure-Oblivious Computing • An instance of acceptability-oriented computing: • A flawed system must ensure that it respects basic acceptability properties, e.g.: • System must never accelerate the vehicle beyond a specific velocity • System should continue to execute even if it has a memory error • Makes the program oblivious to invalid memory accesses • Invalid reads return manufactured values • Invalid writes are discarded • Thus, processes are not terminated and no exceptions are raised
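The read and write rules above can be illustrated with a minimal sketch. It assumes a hypothetical bounds-carrying buffer type (`checked_buf`) and helper names (`foc_read`, `foc_write`) that are not part of the presented system, which instruments ordinary C pointers inside the compiler. The manufactured value is fixed at 0 here purely for simplicity.

```c
#include <stddef.h>

/* Hypothetical buffer that carries its own bounds (illustration only). */
typedef struct {
    unsigned char *data;
    size_t size;
} checked_buf;

/* An invalid write is silently discarded.
 * Returns 1 if the write was performed, 0 if it was discarded. */
int foc_write(checked_buf *b, size_t index, unsigned char value) {
    if (index >= b->size) {
        return 0;               /* out of bounds: drop the write */
    }
    b->data[index] = value;
    return 1;
}

/* An invalid read returns a manufactured value instead of failing.
 * Assumption: the manufactured value is always 0 in this sketch. */
unsigned char foc_read(const checked_buf *b, size_t index) {
    if (index >= b->size) {
        return 0;               /* out of bounds: manufacture a value */
    }
    return b->data[index];
}
```

In both out-of-bounds cases the caller simply continues, which is the behavior the slide describes: no termination and no exception.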
Failure-Oblivious Computing, cont. • Behavior • Standard Compilation • memory corruption, potential crash • Safe Compilation • process terminates without potentially contaminating global data • Failure-Oblivious Compilation • process continues execution along a speculative, possibly unsafe execution path
Failure-Oblivious Computing, cont. • Example, Pine 4.44 • Index uses From field of messages • Quotes certain characters • Bug when quoting certain values • Maximum length is miscalculated, so the buffer allocated for the quoted value is too small • Standard and Safe: Pine crashes on start • FOC: Pine operates “normally”
Failure-Oblivious Computing, cont. • Example, bug-server (fictional) • FOC uses malloc/free to monitor memory access • Memory deallocation takes up much time, so bug-server 2.0 uses memory pools: • pool *new_pool() creates a new pool for memory allocation • void *pool_alloc(pool *p, size_t size) allocates size bytes from the pool p • void free_pool(pool *p) frees all memory allocated to pool p • Pools internally use malloc to create new or extend pools, free to free pools • A security exploit is released, affects only 2.0, why?
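One plausible reading of the question, offered as an assumption rather than as the presenters' answer: if the checker learns object bounds only from malloc/free, then every object handed out by a pool lives inside a single malloc'ed block, and an overflow from one pool object into its neighbor stays within the bounds the checker knows about. The sketch below implements the three functions named on the slide in the simplest possible way (one fixed-size block, no pool extension, 16-byte rounding); the struct layout and `POOL_CAPACITY` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define POOL_CAPACITY 4096      /* assumption: fixed size, no extension */

typedef struct pool {
    unsigned char *block;       /* a single malloc'ed region */
    size_t capacity;
    size_t used;
} pool;

pool *new_pool(void) {
    pool *p = malloc(sizeof *p);
    if (p == NULL) {
        return NULL;
    }
    p->block = malloc(POOL_CAPACITY);
    if (p->block == NULL) {
        free(p);
        return NULL;
    }
    p->capacity = POOL_CAPACITY;
    p->used = 0;
    return p;
}

/* Bump allocation: objects are carved out of the same block, back to back. */
void *pool_alloc(pool *p, size_t size) {
    if (size > SIZE_MAX - 15) {
        return NULL;
    }
    size = (size + 15) & ~(size_t)15;   /* round up to keep alignment */
    if (size > p->capacity - p->used) {
        return NULL;
    }
    void *result = p->block + p->used;
    p->used += size;
    return result;
}

void free_pool(pool *p) {
    free(p->block);
    free(p);
}
```

Two consecutive 16-byte allocations from this pool are adjacent, so writing one byte past the first lands in the second while remaining inside the one block that malloc returned. A checker that tracks only malloc-level data units would see that write as in-bounds.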
Failure-Oblivious Computing, cont. • Extension to gcc • Implemented using checking code and continuation code • Checking code evaluates whether a memory access is valid or not • Continuation code executes when an invalid memory access occurs • Discards erroneous writes • Manufactures a sequence of results for erroneous reads, [0, 1, 2, 0, 1, 3, 0, 1, 4, …]
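The manufactured read sequence above follows a simple pattern: 0 and 1 are returned repeatedly, and every third value steps through the remaining small integers. A minimal sketch of such a generator follows. The type and function names are hypothetical, and the wrap-around limit `MAX_SMALL` is an assumption, since the slide does not say where the sequence restarts.

```c
#define MAX_SMALL 255   /* assumption: upper limit before restarting at 2 */

typedef struct {
    unsigned phase;       /* position within the current group of three */
    int next_small;       /* next "other" small integer to hand out */
} manufactured_seq;

#define MANUFACTURED_SEQ_INIT {0, 2}

/* Returns the next manufactured value: 0, 1, 2, 0, 1, 3, 0, 1, 4, ... */
int next_manufactured(manufactured_seq *s) {
    int value;
    if (s->phase == 0) {
        value = 0;
    } else if (s->phase == 1) {
        value = 1;
    } else {
        value = s->next_small;
        s->next_small++;
        if (s->next_small > MAX_SMALL) {
            s->next_small = 2;          /* start over after the limit */
        }
    }
    s->phase = (s->phase + 1) % 3;
    return value;
}
```

Returning 0 and 1 often is useful because many loops terminate on one of those values, while cycling through other integers avoids getting stuck when the program is waiting for some specific different value.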
Failure-Oblivious Computing, cont. • Checking code • based on Jones and Kelly’s scheme • enhanced by Ruwase and Lam • Jones and Kelly’s scheme • A table maps locations to data units • A data unit is e.g. a struct, array, variable • The table tracks intended data units and is used to distinguish in-bounds from out-of-bounds pointers
Failure-Oblivious Computing, cont. • Base Case – always in-bounds • Base pointer is the address of an array, struct or variable • Intended data unit is the corresponding data unit of the base pointer • Pointer Arithmetic • Starting pointer + offset • In-bounds if and only if starting pointer and derived pointer point to the same data unit • Intended data unit is the same for both • Does not work with “reverse” pointer arithmetic? • Pointer Variables • In-bounds if and only if it was assigned from an in-bounds pointer • Intended data unit is the same as that of the pointer from which it was assigned
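The pointer-arithmetic rule can be sketched as a table lookup. The code below is an illustration under stated assumptions, not the scheme's actual implementation: the table is a plain array scanned linearly, addresses are modeled as integers, and the names `data_unit`, `find_unit` and `derived_in_bounds` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One table entry: a data unit covers [start, start + size). */
typedef struct {
    uintptr_t start;
    size_t size;
} data_unit;

/* Returns the data unit containing addr, or NULL if there is none.
 * Linear scan for clarity; a real table would need a faster lookup. */
const data_unit *find_unit(const data_unit *table, size_t n, uintptr_t addr) {
    for (size_t i = 0; i < n; i++) {
        if (addr >= table[i].start && addr - table[i].start < table[i].size) {
            return &table[i];
        }
    }
    return NULL;
}

/* start + offset is in-bounds if and only if the starting pointer and the
 * derived pointer fall inside the same data unit. */
int derived_in_bounds(const data_unit *table, size_t n,
                      uintptr_t start, ptrdiff_t offset) {
    const data_unit *unit = find_unit(table, n, start);
    if (unit == NULL) {
        return 0;
    }
    uintptr_t derived = start + (uintptr_t)offset;
    return find_unit(table, n, derived) == unit;
}
```

Note that a derived pointer landing inside a different, adjacent data unit is still rejected, because the comparison is against the intended data unit of the starting pointer and not against "any valid memory".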
Failure-Oblivious Computing, cont. • Valid out-of-bounds pointer • Points to the next byte after the intended data unit • Obtained by padding each data item with an extra byte • Illegal out-of-bounds pointers have the value ILLEGAL (-2) • Used to support valid out-of-bounds pointers in terminating loops when using pointer arithmetic
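The loop case mentioned above is the common C idiom in which a pointer one past the last element serves as the termination bound. Standard C permits forming and comparing such a pointer but not dereferencing it, so a bounds checker has to accept it as well. A small example (the function name `sum_ints` is ours):

```c
#include <stddef.h>

/* Sums n ints using a one-past-the-end pointer as the loop bound. */
int sum_ints(const int *a, size_t n) {
    int total = 0;
    const int *end = a + n;     /* one past the last element: never read */
    for (const int *p = a; p != end; p++) {
        total += *p;            /* p is always in-bounds here */
    }
    return total;
}
```

Without the padding byte described on the slide, `end` could coincide with the start of the next data unit and be mistaken for a pointer into it.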
Failure-Oblivious Computing, cont. • Dereferencing pointer, checks table: • in-bounds pointer returns referent value • out-of-bounds pointer causes program to halt with error • Does not support pointer arithmetic used to obtain a pointer to a location past the end of the intended data unit, which is then used to calculate an in-bounds pointer
Failure-Oblivious Computing, cont. • Ruwase and Lam’s enhancement • Out-of-bounds pointers are set to point to an out-of-bounds (OOB) object • OOB object: • Start address of intended data unit • Offset from this address • Can track out-of-bounds pointers to their intended data unit
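A minimal sketch of the OOB object idea, using the two fields listed above. The struct and function names are hypothetical, addresses are modeled as integers, and the size of the intended data unit is passed in explicitly here, whereas a real implementation would presumably look it up in the data-unit table.

```c
#include <stddef.h>
#include <stdint.h>

/* Records where an out-of-bounds pointer "would" point. */
typedef struct {
    uintptr_t unit_start;   /* start address of the intended data unit */
    ptrdiff_t offset;       /* offset from unit_start, may lie outside it */
} oob_object;

/* Pointer arithmetic on an out-of-bounds pointer only adjusts the offset. */
oob_object oob_add(oob_object o, ptrdiff_t delta) {
    o.offset += delta;
    return o;
}

/* True if the tracked pointer has moved back inside its intended unit. */
int oob_is_in_bounds(oob_object o, size_t unit_size) {
    return o.offset >= 0 && (size_t)o.offset < unit_size;
}

/* Address the tracked pointer denotes (only meaningful to dereference
 * when oob_is_in_bounds() holds). */
uintptr_t oob_address(oob_object o) {
    return o.unit_start + (uintptr_t)o.offset;
}
```

This covers the case the earlier scheme rejects: a pointer that temporarily moves past the end of its data unit and is then adjusted back into it keeps its link to the intended data unit throughout.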
Failure-Oblivious Computing, cont. • Pros • Global state is not corrupted • Local data accessed in loops • Individual iteration failures can be handled • Servers without state • No propagation of errors beyond a single request • Interactive programs • Programs do not crash • Can show meaningful results • Tolerable slow-down
Failure-Oblivious Computing, cont. • Cons: • “safe compiler for C” • What if this introduces bugs? • Only C? • Programs must be recompiled • Always in use, not only when needed • Manufactured reads can lead to wrong execution path, i.e. not for correctness-critical applications • Only tested in the case of Midnight Commander
Failure-Oblivious Computing, cont. • Cons, cont. • “The key question is how (or even if) the incorrect or unexpected result may propagate through the remaining computation to affect the overall results of the program” • How to determine this is not answered • Vaguely mentions that FOC is less appropriate for such cases • Global change, thus might only be suited for isolated functionality, i.e. local
Failure-Oblivious Computing, cont. • Cons, cont. • Patch-management • Rather have a fixed system than one which seems to run fine, but might not • “Lucky” cases: • Pine – different method used elsewhere • Sendmail – length-check catches error • Midnight Commander – dangling link minimizes error • Mutt – server returns “does not exist”
Failure-Oblivious Computing, cont. • Performance • Programs that would otherwise have crashed continue execution • Slowdown ranges from 1.03 to 8.1 times the original running time