OS II: Dependability & Trust Testing Drivers Prof. Neeraj Suri Constantin Sârbu Dept. of Computer Science TU Darmstadt, Germany Dependable Embedded Systems & SW Group www.deeds.informatik.tu-darmstadt.de
Fault Removal: Software Testing
• So far:
  - Verification & Validation
  - Testing techniques: static vs. dynamic, black-box vs. white-box
  - Testing of dependable systems with: modeling, fault injection (FI / SWIFI)
  - Some existing tools for fault injection
• Last time: Testing (SWIFI) of operating systems
  - WHERE: Error propagation in OSs [Johansson’05]
  - WHAT: Error selection for testing [Johansson’07]
  - WHEN: Injection trigger selection [Johansson’07]
• Today: Profiling the OS extensions (drivers)
  - State definition
  - State changes at runtime
  - Behavior-driven test prioritization
Recap: The driver problem
• Device drivers
  - Numerous: 250 installed (100 active) drivers in XP/Vista
  - Large & complex: 70% of the Linux code base
  - Immature: every day, 25 new and 100 revised versions of Vista drivers
  - Access rights: kernel-mode operation in monolithic OSs
• Device drivers are the dominant cause of OS failures despite sustained testing efforts
[Figures: Causes of WinXP outages; Causes of Win2k outages]
Recap: The driver problem
• Problem statement: Driver failures lead to OS API failures
• Mitigation approaches
  - Harden OS robustness
  - Improve driver reliability
Recap: The driver problem
[Figures: the problem in terms of error propagation; the effect of robustness hardening in terms of error propagation; the effect of testing in terms of error propagation]
Main topic of today's lecture
What if we cannot remove defects (e.g. commercial OSs)?
• Goals:
  - Test case prioritization for black-box components
  - (Limited) means to remove them: avoid fault activation
OS Robustness Testing Efforts at DEEDS
• Our research topics presented today:
  - “Improving Robustness Testing of COTS OS Extensions” (ISAS’06)
  - What is the state of an OS (component)?
  - Efforts to define the state of drivers
  - How does a driver behave at runtime?
• Current research: Test prioritization!
  - Minimize testing effort based on behavior patterns
• Bachelor/Master/Diplom/PhD thesis opportunities! http://www.deeds.informatik.tu-darmstadt.de/aja/
On Robustness Testing of OS Device Drivers (Preliminaries)
Outlook • Introduction • System model • Windows Driver Model (WDM) Structures • Robustness Testing Approach • Benefits
About drivers
• What is a driver?
  - a collection of functions for controlling HW, generally provided by a HW supplier
  - generally implemented as a Dynamic Link Library (DLL, SYS file)
• What does a driver do?
  - Imports other libraries
  - Exports its own functions (like public methods)
  - Communicates with HW
Windows drivers!
Different Driver Communication Models
[Figure: Y.DLL, X.DLL and DRIVER.DLL inside the operating system's kernel memory space]
Robustness testing using SWIFI
• Limitations of previous techniques:
  - simplified error model (bit-flips, parameter errors)
  - huge number of test cases
• Need a new technique able to:
  - treat drivers as black boxes
  - reduce testing time
  - increase functional coverage
System & I/O Model
[Figure: layered system diagram with Applications 1 … n, System Services, the I/O Manager and other facilities, the drivers, and the physical hardware, split into user space, kernel space and hardware. The same request is shown at each layer:]
  - Application: “read file f”
  - System Services: “read(hnd, size, buf)”
  - I/O Manager to driver: “IRP_MJ_READ”
  - Driver to hardware: “spin HDD plates, move head h, start read…”
WDM (Windows Driver Model) Basics
• A framework for device drivers that operate under MS Windows 98/ME/2K/XP and Server 2003
• Designed for forward compatibility across Windows versions
• Feature-set partitioning: a WDM driver must implement some standard routines; the rest is application dependent
• Provides not only protocols, but also tools and documentation (Driver Development Kit, DDK)
Driver lifecycle
[Figure: Driver not loaded → OS loads driver code in virtual memory → OS creates an empty DRIVER_OBJECT → OS runs the DriverEntry function of the driver → Driver READY; an IRP takes the driver from READY to WORKING, and a Status is returned]
• Important WDM entities:
  - DRIVER_OBJECT
  - IRP (MJ, MN, IOCTL)
WDM Structures: 1. DRIVER_OBJECT
WDM.H:
typedef struct _DRIVER_OBJECT {
    CSHORT Type;
    CSHORT Size;
    ...
} DRIVER_OBJECT, *PDRIVER_OBJECT;
WDM Structures: 2. IRP & I/O Stack Location
[Figure: IRP structure, showing the status of an operation and the IRP-dependent I/O stack location]
• ~28 major function IRPs:
  - 3 have minor IRPs
  - 2 have HW-specific control codes
FSM Idea
[Figure: the driver lifecycle diagram (Driver not loaded, OS loads driver code in virtual memory, OS creates an empty DRIVER_OBJECT, OS runs the DriverEntry function of the driver, Driver READY, Driver WORKING) annotated with the request types CREATE, CLOSE, READ, WRITE and DEVICE_CONTROL and with minor (MN) codes]
• Serial.sys:
  - 9 MJ
  - 2 MN (POWER)
  - 37 (DEVICE_CONTROL)
  - 4 (INTERNAL_DEVICE_CONTROL)
  - Total: 52 “modes”
Current focus
• Finding a minimal graph to describe driver functionality
• Determining the sub-graph(s) with maximum impact on the system’s robustness
• Finding suitable fault model(s): sequences, etc.
• Is this approach scalable/portable?
Improvements over traditional testing techniques • Benefits: • error model based on actual communication level • good functionality coverage • can be used as an aid for fault-injection techniques • can be used for smart placing of EDMs & ECMs
Improving Robustness Testing of COTS OS Extensions
Constantin Sârbu, Andréas Johansson, Falk Fraikin and Neeraj Suri
Department of Computer Science, TU Darmstadt, Germany
Presented at ISSRE 2006
Outline • System Model and MS Windows Driver Model • Driver Mode and Operational Profile • Coverage Metrics for Testing • Case Study: The Serial Driver (Windows XP)
Why Driver Testing?
• OS extensions (drivers)
  - COTS components enhancing the OS’s adaptability
  - collection of subroutines for controlling HW
  - reside in kernel space
• Strong impact on OS robustness
  - rapidly developed software, hence high defect density
  - ~70% of Linux kernel code, > 35 000 different drivers for Windows XP*
  - often unrestricted interference with the OS
  - major cause of OS failures (~85% of SW-related failures in Windows XP*)
Test the drivers better!
* Improving the Reliability of Commodity Operating Systems, M. M. Swift et al., SOSP 2003
Driver Testing
• Testing of drivers (by developers and users): crucial but difficult
  - limited access: located in kernel space
  - driver users have limited access to source code
  - the set of loaded drivers differs across installations
• Common driver testing philosophies:
  - Microsoft approach: Driver Reliability Signature (DRS) program
      - DDK (Driver Development Kit) and HCT (Hardware Compatibility Test)
      - based on “fault checklists”
      - tests available to driver developers
  - SWIFI (SW Implemented Fault Injection)
      - inject artificial faults and observe the outcome
      - reboot the system to inject into the same “state”
      - does not consider the driver’s operational state
  - Multiple others at varied design/abstraction levels … (functional, behavioral…)
How good is a testing method?
State Space and Operational Profile
[Figure: the driver’s state space, showing the operational profile (dependent on workload), the region covered by Microsoft tests (fault checklists), and SWIFI faults (which tend to cluster*)]
We need testing methods matching the operational profile
* An Empirical Investigation of Software Fault Distribution, K. H. Möller and D. Paulish, SMS 1993
System Model
• (Currently) MS Windows XP SP2 as case study
• The set of applications is known
[Figure: layered system diagram with Applications 1 … p, System Services, the I/O Manager, Drivers 1 … m, other OS facilities and the hardware layer, split into user space, kernel space and HW space]
• Drivers interact with the rest of the system via the I/O Manager
• The Windows Driver Model (WDM) specifies the communication interface between the I/O Manager and the drivers
Windows Driver Model (WDM)
• Unified interface between the OS kernel and drivers
• I/O Request Packet (IRP)
  - communication medium between the I/O Manager and drivers
  - the I/O Manager builds an IRP and passes it to a driver
  - the driver executes the associated code and returns the result using the same IRP instance
• Each driver
  - contains a set of procedures, each executed when a particular request is received
  - publishes a list of entry points to the respective procedures
• A driver can execute several IRP requests concurrently
[Figure: an I/O request (IRP) enters the driver’s I/O request handler, which returns result & status]
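The request/entry-point mechanism described on this slide can be sketched in a few lines. This is only a toy model in Python, not WDM code: the `Irp` and `ToyDriver` classes, the `io_manager_call` function and the status strings are hypothetical names invented for illustration. Only the request names IRP_MJ_READ and IRP_MJ_WRITE come from WDM.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Irp:
    """Hypothetical stand-in for an I/O Request Packet."""
    major: str                      # request type, e.g. "IRP_MJ_READ"
    payload: bytes = b""            # data supplied by the caller
    status: Optional[str] = None    # filled in by the driver
    result: bytes = b""             # filled in by the driver


class ToyDriver:
    """Toy driver that publishes its entry points in a dispatch table."""

    def __init__(self) -> None:
        self._buffer = b""
        # The "published list of entry points": request type -> procedure.
        self.dispatch: Dict[str, Callable[[Irp], None]] = {
            "IRP_MJ_READ": self._read,
            "IRP_MJ_WRITE": self._write,
        }

    def _read(self, irp: Irp) -> None:
        irp.result = self._buffer
        irp.status = "SUCCESS"      # illustrative string, not an NTSTATUS

    def _write(self, irp: Irp) -> None:
        self._buffer = irp.payload
        irp.status = "SUCCESS"


def io_manager_call(driver: ToyDriver, irp: Irp) -> Irp:
    """Pass the IRP to the matching entry point; the same instance comes back."""
    handler = driver.dispatch.get(irp.major)
    if handler is None:
        irp.status = "NOT_SUPPORTED"
    else:
        handler(irp)
    return irp
```

Calling `io_manager_call(driver, Irp("IRP_MJ_WRITE", b"abc"))` followed by a read returns the written bytes in the same `Irp` object that was passed in, which mirrors the "returns the result using the same IRP instance" point above.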
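Driver Mode
• At time t, the mode of a driver is a tuple of predicates, each assigned to one of the n IRPs the driver supports:
  $M_D = \langle P(\mathrm{IRP}_1),\ P(\mathrm{IRP}_2),\ P(\mathrm{IRP}_3),\ \ldots,\ P(\mathrm{IRP}_n) \rangle$
  $P(\mathrm{IRP}_i) = \begin{cases} 1, & \text{if performing the functionality triggered by } \mathrm{IRP}_i \\ 0, & \text{otherwise} \end{cases}$
Example: a driver supporting 4 distinct IRPs (IRP1, IRP2, IRP3, IRP4), initially in mode MD: < 0 0 0 0 >
[Figure: IRP2 enters the driver’s I/O request handler, which returns result & status; an entry of MD changes to 1]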
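A minimal sketch of the mode tuple, assuming 0-based IRP indices and two hypothetical helper functions (`start_irp`, `finish_irp`) that are not part of the original slides. Each helper flips exactly one predicate, in line with the one-bit-per-IRP definition above.

```python
from typing import Tuple

Mode = Tuple[int, ...]   # one 0/1 predicate per supported IRP


def start_irp(mode: Mode, i: int) -> Mode:
    """Mode after the driver starts the functionality triggered by IRP i."""
    if mode[i] == 1:
        # One bit per IRP type: a second instance of the same IRP
        # cannot be represented in this model.
        raise ValueError(f"IRP {i} is already active")
    return mode[:i] + (1,) + mode[i + 1:]


def finish_irp(mode: Mode, i: int) -> Mode:
    """Mode after the driver finishes the functionality triggered by IRP i."""
    if mode[i] == 0:
        raise ValueError(f"IRP {i} is not active")
    return mode[:i] + (0,) + mode[i + 1:]


def active_count(mode: Mode) -> int:
    """Number of IRPs currently being processed."""
    return sum(mode)


idle = (0, 0, 0, 0)          # driver supporting 4 distinct IRPs, all idle
m1 = start_irp(idle, 1)      # second IRP starts   -> (0, 1, 0, 0)
m2 = start_irp(m1, 3)        # fourth IRP starts   -> (0, 1, 0, 1)
m3 = finish_irp(m2, 1)       # second IRP finishes -> (0, 0, 0, 1)
```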
Driver’s State Space
• The driver’s state space is represented by the set of all possible driver modes
• The operational profile is defined by the set of visited modes
• Total number of modes: $N = 2^n$
• Total number of transitions: $T = n \cdot N = n \cdot 2^n$
Assumption: at any instant of time, only one IRP can be received or finished by the driver
[Figure: mode graph for n = 4, with modes 0000 to 1111 arranged in levels by number of active IRPs (0 to 4) and connected by bidirectional edges]
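The two counts can be checked by brute-force enumeration. The sketch below assumes that a transition is a directed single-bit flip (one IRP received or finished), which is how the formula T = n·2^n counts the bidirectional edges; the function names are hypothetical.

```python
from collections import Counter
from itertools import product
from typing import List, Tuple

Mode = Tuple[int, ...]


def all_modes(n: int) -> List[Mode]:
    """Every possible mode of a driver supporting n IRPs."""
    return list(product((0, 1), repeat=n))


def all_transitions(n: int) -> List[Tuple[Mode, Mode]]:
    """Every directed transition: exactly one predicate flips."""
    result = []
    for mode in all_modes(n):
        for i in range(n):
            flipped = mode[:i] + (1 - mode[i],) + mode[i + 1:]
            result.append((mode, flipped))
    return result


def modes_per_level(n: int) -> Counter:
    """How many modes have 0, 1, ..., n active IRPs."""
    return Counter(sum(mode) for mode in all_modes(n))


n = 4
num_modes = len(all_modes(n))              # 2**4 = 16
num_transitions = len(all_transitions(n))  # 4 * 2**4 = 64
levels = modes_per_level(n)                # 1, 4, 6, 4, 1 modes on levels 0..4
```

For n = 10 the same functions give 1024 modes and 10240 transitions.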
Testing Coverage Metrics
• An ideal testing technique should test 100% of the operational profile!
1. Mode Coverage: every visited mode is tested
   $MC = |\text{tested modes} \cap \text{visited modes}| \,/\, |\text{operational-profile modes}|$
2. Transition Coverage: for every visited mode, all outgoing traversed transitions are tested
   $TC = |\text{tested transitions} \cap \text{traversed transitions}| \,/\, |\text{operational-profile transitions}|$
3. Path Coverage: traverse all the paths between two visited modes, over any number of hops
[Figure: mode graph for n = 4, modes 0000 to 1111]
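Mode and transition coverage are the same set computation applied to different element types, so one helper covers both. This is a sketch under that reading of the formulas; the function name `coverage` and the example sets are made up for illustration and are not measurements from the case study.

```python
from typing import Hashable, Iterable


def coverage(tested: Iterable[Hashable], profile: Iterable[Hashable]) -> float:
    """|tested ∩ profile| / |profile|, for modes (MC) or transitions (TC)."""
    profile_set = set(profile)
    if not profile_set:
        raise ValueError("the operational profile is empty")
    return len(set(tested) & profile_set) / len(profile_set)


# Illustrative data only: modes written as bit strings.
visited_modes = {"0000", "1000", "0100", "1100"}
tested_modes = {"0000", "1000", "0010"}      # "0010" lies outside the profile

traversed = {("0000", "1000"), ("1000", "0000"),
             ("1000", "1100"), ("0000", "0100")}
tested_transitions = {("0000", "1000"), ("1000", "0000")}

mc = coverage(tested_modes, visited_modes)        # 2 of 4 visited modes
tc = coverage(tested_transitions, traversed)      # 2 of 4 traversed transitions
```

Tests that hit modes outside the operational profile (such as "0010" here) do not raise the metric, which is the point of measuring against the profile rather than the full state space.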
Case Study: The Serial Driver (Windows XP)
• Experimental setup:
  - Pentium 4 @ 2.8 GHz
  - serial modem (external)
  - cables (serial, loopback)
  - various benchmark software
• How much of the mode graph is actually visited?
[Figure: a workload application in user space uses the serial driver through the I/O Manager in kernel space; IrpTracker produces logs; the serial port in HW space connects to a communication party: 56k modem, 2nd computer, or loopback cable]
• Serial driver:
  - serial.sys, provided together with Windows XP Professional SP2
  - digitally signed by Microsoft
  - passed the reliability and stress tests included in HCT (Hardware Compatibility Tests) and DDK (Driver Development Kit)
Experiment 1 – Driver-Usage Pattern
• Used a commercial modem benchmark as workload
  - get / set serial port settings
  - send / receive data
  - traffic is verified for completeness and correctness
• Assumed that:
  - the generated load is representative of the normal operational mode of the driver
  - the sequence of IRPs is repeatable
• Small operational profile: only 7% of modes and 1.8% of transitions were visited -> consistent!
Experiment 2 – Aggregated Workload
• Workload: a set of 7 applications that generated a total of 107456 requests (10 distinct)
• State space: a total of 1024 modes / 10240 transitions
• Operational profile: only 1.66% of modes and 0.34% of transitions were visited -> indicates where to focus testing
Observations:
  - some modes are visited much more frequently than others
  - only modes located on the first levels are visited (of 11 levels)
  - existence of loops!
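With 10 distinct requests, the percentages above correspond to roughly 17 of 1024 modes and 35 of 10240 transitions. How such a profile could be derived from a request log is sketched below. The log format (a list of `("start" | "finish", irp_index)` events) and the function name are assumptions made for this example; they are not the IrpTracker log format, and the trace is invented.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

Mode = Tuple[int, ...]
Event = Tuple[str, int]   # ("start" | "finish", 0-based IRP index)


def operational_profile(trace: Iterable[Event], n: int):
    """Replay a trace; return (visit counts per mode, traversed transitions)."""
    mode: Mode = (0,) * n
    visits = Counter({mode: 1})               # the idle mode is visited first
    traversed: Set[Tuple[Mode, Mode]] = set()
    for kind, i in trace:
        if kind not in ("start", "finish"):
            raise ValueError(f"unknown event kind: {kind!r}")
        bit = 1 if kind == "start" else 0
        if mode[i] == bit:
            # E.g. a second IRP of the same type: outside this model.
            raise ValueError(f"event {kind} on IRP {i} does not change the mode")
        nxt = mode[:i] + (bit,) + mode[i + 1:]
        traversed.add((mode, nxt))
        visits[nxt] += 1
        mode = nxt
    return visits, traversed


# Invented trace for a driver with n = 3 IRPs.
trace = [("start", 0), ("finish", 0), ("start", 0),
         ("start", 1), ("finish", 0), ("finish", 1)]
visits, traversed = operational_profile(trace, n=3)

mode_share = len(visits) / 2 ** 3                   # 4 of 8 modes
transition_share = len(traversed) / (3 * 2 ** 3)    # 5 of 24 transitions
```

The visit counts show which modes dominate the workload (here the idle mode), matching the observation that some modes are visited much more often than others.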
Discussion
• Operational profile
  - only a very small number of modes are actually visited under a given workload
  - it indicates the modes and transitions with a high likelihood of being reached in the field -> test those preferentially!
  - not many IRPs were executed concurrently
  - a short IRP sequence suffices to bring the driver into the desired mode
• IRP sequences
  - generating those can be problematic (receipt and return occur non-deterministically)
• Wave Testing: first test visited modes, then their one-hop neighbors by trying to traverse new edges
• Limitations
  - cannot deal with parallel processing of several IRPs of the same type
  - assumes sequential start/finish of IRPs (no jump over one level)
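The wave-testing order can be sketched as a frontier expansion over the mode graph. The slide only describes the visited modes followed by their one-hop neighbors; the `waves` parameter, which allows further hops, is an assumption added here, and the function names are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Mode = Tuple[int, ...]


def one_hop_neighbors(mode: Mode) -> Set[Mode]:
    """Modes reachable by starting or finishing exactly one IRP."""
    return {mode[:i] + (1 - bit,) + mode[i + 1:] for i, bit in enumerate(mode)}


def wave_plan(visited: Iterable[Mode], waves: int = 1) -> List[List[Mode]]:
    """Wave 0 = visited modes; wave k = new modes one hop beyond wave k-1."""
    planned = set(visited)
    plan = [sorted(planned)]
    frontier = set(planned)
    for _ in range(waves):
        candidates: Set[Mode] = set()
        for mode in frontier:
            candidates |= one_hop_neighbors(mode)
        candidates -= planned            # keep only modes not yet scheduled
        plan.append(sorted(candidates))
        planned |= candidates
        frontier = candidates
    return plan


# Invented profile for a driver with 3 IRPs.
plan = wave_plan({(0, 0, 0), (1, 0, 0)})
# plan[0]: visited modes, tested first
# plan[1]: their one-hop neighbors, tested next
```

Reaching a wave-1 mode still requires a concrete IRP sequence from a visited mode, which is where the non-determinism noted under "IRP sequences" comes in.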
Conclusions & Future Work
• Our contribution provides “means” to identify relevant locations for focused/effective testing (& for black-box SW!)
• Requires no modifications of the OS or driver source code
• Assists the debugging process (we have information about which subroutine is running at a given moment)
• Future work
  - Representative sets/classes of drivers, OSs
  - Build operational graphs complementing MS testing tools (is Microsoft testing enough?)
  - Application profiling (build behavioral patterns for driver usage)