Recovering Device Drivers • Michael Swift, Muthukaruppan Annamalai, Brian Bershad, Henry Levy • Presented by Radu Teodorescu
Motivation • 85% of Windows XP crashes are driver related • Linux drivers have 7X more bugs than the rest of the kernel • Why do drivers fail? • There are so many of them! (70% of the Linux kernel, 35,000 in Windows XP) • Developed by many third-party suppliers • Privileged access inside your kernel!
Solution • Change kernel-driver interaction • Detect driver failure (NOOKS) • Isolate fault, avoid kernel corruption (NOOKS) • Conceal driver failure, service requests (shadow drivers) • Restart & initialize driver (shadow drivers)
Outline • Shadow drivers & NOOKS • Results • Limitations • Discussion
Shadow Drivers • Kernel agents attached to each device driver • Allow transparent restart of failed drivers • Implement both the kernel and the driver class interfaces
Shadow Drivers • Passive mode: normal operation • monitor driver-kernel communication • Active mode: fault detected • restart, initialize, transfer state • respond to calls from kernel
[Figure: shadow driver in passive mode and in active mode]
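The two modes can be illustrated with a small user-space toy. This is only a sketch under stated assumptions, not the authors' kernel implementation: `ToyDriver`, `ToyShadow`, `Tap`, and all method names are hypothetical, and the choice to queue requests in active mode is just one possible policy. In passive mode the tap forwards each call to the driver and lets the shadow record it; in active mode the tap routes the call to the shadow instead.

```python
from enum import Enum


class Mode(Enum):
    PASSIVE = "passive"
    ACTIVE = "active"


class ToyDriver:
    """Hypothetical stand-in for a real device driver."""

    def __init__(self):
        self.config = {}

    def configure(self, key, value):
        self.config[key] = value
        return "ok"

    def write(self, data):
        return len(data)


class ToyShadow:
    """Hypothetical shadow: observes calls while passive, answers them while active."""

    def __init__(self):
        self.mode = Mode.PASSIVE
        self.config_log = []   # state-changing calls seen in passive mode
        self.pending = []      # requests accepted on the driver's behalf

    def observe(self, name, args):
        # Only configuration calls are treated as state worth remembering here.
        if name == "configure":
            self.config_log.append(args)

    def handle(self, name, args):
        # Active mode: accept the request so the caller sees no failure.
        self.pending.append((name, args))
        return "queued"


class Tap:
    """Hypothetical tap: the single point where kernel-to-driver calls are redirected."""

    def __init__(self, driver, shadow):
        self.driver = driver
        self.shadow = shadow

    def call(self, name, *args):
        if self.shadow.mode is Mode.ACTIVE:
            return self.shadow.handle(name, args)
        result = getattr(self.driver, name)(*args)
        self.shadow.observe(name, args)
        return result


tap = Tap(ToyDriver(), ToyShadow())
tap.call("configure", "volume", 7)      # passive: driver handles, shadow records
tap.shadow.mode = Mode.ACTIVE           # pretend a fault was detected
reply = tap.call("write", b"abc")       # active: shadow answers for the driver
```

The caller uses the same `tap.call` entry point in both modes, which is what makes the switch transparent in this toy.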
Active Mode Recovery • Stop the failed driver • Reinitialize driver from clean state • Transfer relevant state to new driver • At the same time: service kernel requests!
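The recovery steps above might be sketched as follows. This is a sequential toy, assuming the shadow has already recorded the driver's configuration calls and has queued the requests that arrived during recovery; `ToyDriver`, `recover`, and their parameters are hypothetical names, and a real shadow driver would service requests concurrently with the restart rather than afterwards.

```python
class ToyDriver:
    """Hypothetical driver whose only state is a configuration dictionary."""

    def __init__(self):
        self.config = {}
        self.written = []
        self.running = True

    def configure(self, key, value):
        self.config[key] = value

    def write(self, data):
        self.written.append(data)

    def stop(self):
        self.running = False


def recover(failed, config_log, queued_writes, make_driver=ToyDriver):
    """Run the recovery steps in order and return the replacement driver."""
    failed.stop()                          # 1. stop the failed driver
    fresh = make_driver()                  # 2. reinitialize from a clean state
    for key, value in config_log:          # 3. transfer relevant state by replaying it
        fresh.configure(key, value)
    for data in queued_writes:             # 4. deliver requests accepted during recovery
        fresh.write(data)
    return fresh


old = ToyDriver()
old.configure("volume", 7)
log = [("volume", 7)]                      # what a shadow would have recorded
new = recover(old, log, queued_writes=[b"abc"])
```

Replaying a log of configuration calls is one simple way to transfer state; it only works when the recorded calls are enough to rebuild what the driver needs.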
Shadow Driver Needs • Coordination - management of shadow drivers - Shadow manager • Redirection mechanism - transparent monitoring and recovery - Taps • Isolation service - prevents driver errors from corrupting the kernel - NOOKS • Object tracking service - track kernel objects created by the driver - NOOKS
NOOKS* • Idea: isolate the OS from driver failures • NOOKS functions: • isolation • object tracking • fault detection *SOSP’03
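The three functions listed above can be mimicked in a few lines of user-space code. This is a loose analogy under stated assumptions, not how NOOKS works inside a kernel: `ObjectTracker`, `guarded_call`, and `buggy_driver` are hypothetical, a deep copy stands in for isolation, and a Python exception stands in for a detected driver fault.

```python
import copy


class ObjectTracker:
    """Hypothetical tracker for kernel objects a driver has allocated."""

    def __init__(self):
        self.live = set()

    def allocate(self, name):
        self.live.add(name)
        return name

    def release_all(self):
        released = sorted(self.live)
        self.live.clear()
        return released


def guarded_call(driver_fn, kernel_data, tracker):
    """Call into the driver and return (result, failed)."""
    private = copy.deepcopy(kernel_data)   # isolation: driver works on a copy
    try:
        return driver_fn(private, tracker), False
    except Exception:
        # Fault detection: the call failed, so reclaim the driver's objects.
        tracker.release_all()
        return None, True


def buggy_driver(data, tracker):
    tracker.allocate("irq-5")
    tracker.allocate("dma-buffer")
    data["queue"].clear()                  # would corrupt kernel state if it were shared
    raise RuntimeError("simulated driver bug")


kernel_state = {"queue": [1, 2, 3]}
tracker = ObjectTracker()
result, failed = guarded_call(buggy_driver, kernel_state, tracker)
```

After the failed call, `kernel_state` is unchanged and the tracker holds no live objects, which is the property a restart depends on.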
Limitations • Drivers that cannot be reloaded dynamically • Permanent faults • Ad-hoc driver-kernel communication • Irreversible side effects • Fault isolation is hard • Failure detection imperfect
Discussion • Kernel built-in transparent driver recovery? • How would the system be simplified? • Clear bounds between kernel/driver space • Standard communication, clean interface • More stateless drivers, easier to restart • More?