Safe Programming of Asynchronous Interaction: Can we do it for real? Shaz Qadeer Research in Software Engineering Microsoft Research. Asynchronous interaction. Collection of state machines communicating asynchronously via message buffers distributed algorithms
Safe Programming of Asynchronous Interaction: Can we do it for real? Shaz Qadeer, Research in Software Engineering, Microsoft Research
Asynchronous interaction • Collection of state machines communicating asynchronously via message buffers • distributed algorithms • cloud infrastructure, services, and applications • event-driven JavaScript/AJAX programs • device drivers • …
Challenging characteristics • Decomposition of a logical task into pieces • Temporally overlapped execution of tasks • Failure tolerance is important • Coordination via protocols
Safety-critical is so 20th century • Software should just “work” • as cloud computing becomes common • as devices get embedded into everyday life • First-order concerns • software reliability • programming, testing, and debugging productivity • cost of achieving reliability and productivity • Need programming techniques to improve reliability and productivity
Outline • Formal design of USB device driver stack in Windows 8 • Challenges (or inspiration) for the future • Domain-specific language, compiler, and verifier for protocol programming
What is USB? • Universal Serial Bus • Primary mechanism for connecting peripherals to PCs • 2 billion USB devices sold every year (as of 2008) • Voted most important PC innovation of all time (PC Magazine) • Timeline: USB 1.0 (1996), USB 2.0 (2000), USB 3.0 (2008)
USB device driver stack in Win8 [Figure: stack diagram with layers labeled OS/drivers, HSM, PSM (three instances), DSM (three instances), and Hardware]
Design methodology (Aull-Gupta) [Figure: workflow diagram with boxes labeled: State Machine in Visio • Script (twice) • State Table, Transitions, and State Entry Functions in C • State Table, Transitions, and State Entry Functions in Zing • Document Operations, Rules, and Assumptions • Operations in C • Program Operations, Rules, and Assumptions in Zing • State Machine Engine in C • State Machine Engine in Zing]
Assumptions/Guarantees • Upon calling TimerStart(), machine could receive TimerFired event • S1, S2, and S3 need to handle TimerFired • Upon receiving TimerFired, machine will not receive TimerFired again • S4 does not need to handle TimerFired [Figure: state diagram with states S1, S2, S3, and S4, a TimerStart() call, and transitions labeled X, Y, and TimerFired]
Zing error trace
Check failed
*******************************************************************************
Send(chan='Microsoft.Zing.Application+___EVENT_CHAN(12)', data='___StartTimer')
Receive(chan='Microsoft.Zing.Application+___EVENT_CHAN(12', data='___StartTimer')
AttributeEvent: Handled Event ___StartTimer, Old State: ___WaitingForCommand, New State: ___StartingTimer
Send(chan='Microsoft.Zing.Application+___EVENT_CHAN(12)', data='___TimerFired')
Send(chan='Microsoft.Zing.Application+___EVENT_CHAN(12)', data='___StopTimer')
AttributeEvent: Handled Event ___OperationSuccess, Old State: ___StartingTimer, New State: ___WaitingForTimerToExpire
Receive(chan='Microsoft.Zing.Application+___EVENT_CHAN(12', data='___TimerFired')
AttributeEvent: Handled Event ___TimerFired, Old State: ___WaitingForTimerToExpire, New State: ___SignallingTimerCompletion
AttributeEvent: Handled Event ___OperationSuccess, Old State: ___SignallingTimerCompletion, New State: ___WaitingForCommand
Receive(chan='Microsoft.Zing.Application+___EVENT_CHAN(12', data='___StopTimer')
AttributeEvent: HSM-1: Unhandled Event ___StopTimer, State ___WaitingForCommand]
Error in state: Zing Assertion failed:
Expression: false
Comment: Unhandled Event
Depth on error 208
Impact • Unprecedented use of formal design in Windows • Model is the source • Over 200 rules to catch regression bugs even before C code is compiled • Over 300 bugs found and fixed • unhandled messages, property violations
Benefits • Model verification complements testing • validates states that are hard to reach with testing • debugging is significantly easier • Explicit specification of contracts • solid design • better documentation and maintenance
Difficulties faced by programmers • Visio inadequate container for state diagrams • Semantics of modeling language embedded inside scripts • No automation for managing properties, models, and lemmas
From modeling to programming • State machine models are programs in a domain-specific language (DSL) • Develop a modern programming environment for a DSL inspired by state machines • Simple syntax/semantics for programs and properties • Code generator and runtime library for execution • Verifier for property checking
Ping Pong
machine Ping receives pong {
  var x: Pong
  state
    ( start, x := new Pong(y = this); raise unit )
    ( ping1, send(x, ping); return )
  transition
    ( start, unit, ping1 )
    ( ping1, pong, ping1 )
}
machine Pong receives ping {
  var y: Ping
  state
    ( start, return )
    ( pong1, send(y, pong); raise unit )
  transition
    ( start, ping, pong1 )
    ( pong1, unit, start )
}
[Figure, animated over several slides: state diagrams of the Ping and Pong machines. Ping: entry statement "x := new Pong; raise unit", a unit transition to the state with "send(x, ping); return", and a pong self-loop on that state. Pong: entry statement "return", a ping transition to the state with "send(that, pong); raise unit", and a unit transition back. Successive frames show a ping message and then a pong message in transit.]
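The Ping Pong exchange can be traced with a small Python simulation. This is only a sketch under stated assumptions, not the DSL's actual runtime: the `Machine` base class, the `on_<state>` naming for entry statements, and the `run` driver are hypothetical names introduced for illustration. Entry statements return the event they raise, or `None` for `return`.

```python
from collections import deque


class Machine:
    """Hypothetical table-driven state machine with a FIFO input buffer."""

    def __init__(self, transitions):
        self.transitions = transitions  # (state, event) -> next state
        self.inbox = deque()
        self.state = None
        self.handled = []               # events processed so far, for inspection

    def enter(self, state):
        self.state = state
        raised = getattr(self, "on_" + state)()  # run the entry statement
        if raised is not None:                   # "raise e" fires e immediately
            self.fire(raised)

    def fire(self, event):
        key = (self.state, event)
        if key not in self.transitions:
            raise RuntimeError(f"unhandled event {event!r} in state {self.state!r}")
        self.handled.append(event)
        self.enter(self.transitions[key])

    def step(self):
        """Dequeue and process one event; return False if the buffer is empty."""
        if not self.inbox:
            return False
        self.fire(self.inbox.popleft())
        return True


class Pong(Machine):
    def __init__(self, y):
        super().__init__({("start", "ping"): "pong1", ("pong1", "unit"): "start"})
        self.y = y
        self.enter("start")

    def on_start(self):
        return None                     # return

    def on_pong1(self):
        self.y.inbox.append("pong")     # send(y, pong)
        return "unit"                   # raise unit


class Ping(Machine):
    def __init__(self):
        super().__init__({("start", "unit"): "ping1", ("ping1", "pong"): "ping1"})
        self.x = None

    def on_start(self):
        self.x = Pong(self)             # x := new Pong(y = this)
        return "unit"                   # raise unit

    def on_ping1(self):
        self.x.inbox.append("ping")     # send(x, ping)
        return None                     # return


def run(rounds):
    ping = Ping()
    ping.enter("start")
    for _ in range(rounds):
        ping.x.step()   # Pong consumes ping and replies with pong
        ping.step()     # Ping consumes pong and sends the next ping
    return ping
```

Calling `run(3)` leaves one `ping` in Pong's buffer, since Ping always sends a fresh `ping` after handling a `pong`.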
Unhandled events • Suppose state s only provides the transitions (s, e1, s1) and (s, e2, s2) • Retrieving e3 from input queue results in UnhandledEventException • Absence of UnhandledEventException must be verified
Deferred events • State (s, Stmt, {e1, e2}) • s is in the middle of critical processing waiting for e • Presence of e1 and e2 in the buffer does not cause UnhandledEventException • e1 and e2 are skipped over while retrieving e
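The interplay between unhandled and deferred events can be sketched as a dequeue routine. This is an illustrative Python sketch, not the runtime's code: the function name `dequeue` and its parameters are assumptions, and only `UnhandledEventException` comes from the slides.

```python
class UnhandledEventException(Exception):
    pass


def dequeue(buffer, handled, deferred):
    """Remove and return the first non-deferred event in `buffer`.

    `handled` is the set of events the current state has transitions for,
    `deferred` the set it defers. Deferred events are skipped and stay in
    the buffer. Returns None when only deferred events remain.
    """
    for i, event in enumerate(buffer):
        if event in deferred:
            continue                      # skip over, leave it queued
        if event not in handled:
            raise UnhandledEventException(event)
        del buffer[i]                     # safe: we return immediately
        return event
    return None
```

For example, with the buffer `["e1", "e2", "e"]` and deferred set `{"e1", "e2"}`, the call returns `"e"` and leaves `e1` and `e2` queued for a later state.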
Sub-state machines • Statement “call s” pushes state s on the machine stack • s will handle a sub-protocol • Sub-computation inherits deferred events from the caller • Caller given a chance to handle UnhandledEventException
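One way to read the call-stack rules above is the following Python sketch. It assumes that the effective deferred set is the union over all frames on the stack, and that an event the top frame cannot handle pops that frame so the caller can try; the slides do not spell out these details, and `Frame`, `deferred`, and `dispatch` are hypothetical names.

```python
class UnhandledEventException(Exception):
    pass


class Frame:
    """One entry on the machine stack, pushed by "call s"."""

    def __init__(self, state, handles, defers=()):
        self.state = state
        self.handles = set(handles)   # events this state has transitions for
        self.defers = set(defers)     # events this state defers


def deferred(stack):
    """Deferred set seen by the top frame: its own plus its callers'."""
    result = set()
    for frame in stack:
        result |= frame.defers
    return result


def dispatch(stack, event):
    """Return the state that handles `event`, popping frames that cannot."""
    while stack:
        top = stack[-1]
        if event in top.handles:
            return top.state
        stack.pop()                   # give the caller a chance
    raise UnhandledEventException(event)
```

With a caller frame that handles `e3` and a callee frame that handles only `e`, dispatching `e3` pops the callee and is handled by the caller.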
Memory management • When is it safe to free up the memory for a state machine? • Reference counting: Increment, Decrement • A machine is freed only when • its reference count is zero • it is quiescent • Accessing a freed machine causes IllegalAccessException whose absence must be verified
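The freeing condition can be sketched in Python as follows. This is a minimal illustration, assuming that "quiescent" means the input buffer is empty; the slides do not define the term, and the class `MachineHandle` and its methods are hypothetical. Only `IllegalAccessException` is a name from the slides.

```python
class IllegalAccessException(Exception):
    pass


class MachineHandle:
    """Hypothetical handle tracking references to one state machine."""

    def __init__(self):
        self.refcount = 1     # assumption: the creator holds the first reference
        self.inbox = []
        self.freed = False

    def _check(self):
        if self.freed:
            raise IllegalAccessException("machine already freed")

    def _maybe_free(self):
        # Freed only when unreferenced AND quiescent (assumed: empty buffer).
        if self.refcount == 0 and not self.inbox:
            self.freed = True

    def increment(self):
        self._check()
        self.refcount += 1

    def decrement(self):
        self._check()
        self.refcount -= 1
        self._maybe_free()

    def send(self, event):
        self._check()
        self.inbox.append(event)

    def drain(self):
        """Stand-in for processing all pending events."""
        self._check()
        self.inbox.clear()
        self._maybe_free()
```

A machine whose count drops to zero while events are still pending stays alive until `drain()` empties its buffer; any later `send` raises `IllegalAccessException`.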
Runtime library • Provides support for • machine creation and deletion • input buffer management • execution of transitions and entry functions • Reactive event-driven computation piggybacked on external threads • locking for coordination among multiple external threads executing within the runtime
Verification • How do we verify the absence of UnhandledEventException and IllegalAccessException? • How do we verify program-specific properties? • How do we specify interfaces?
Automata • Automata are used to model implementation and specification. [Figure: an example automaton annotated with its set of states, its transitions (labeled A and B), its alphabet { A, B }, and its initial state]
Parallel composition • Parallel composition is the synchronous product (trace intersection). [Figure: example automata with transitions labeled A, B, and C, marking a shared transition and a local transition]
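A synchronous product can be sketched in Python as below. The encoding is an assumption made for illustration: each automaton is a tuple `(alphabet, delta, initial)` with a deterministic transition map `delta[(state, event)] = next_state`. Events in both alphabets are shared and require both automata to move; events in only one alphabet are local and move that automaton alone. When the two alphabets coincide, the traces of the product are the intersection of the traces of the components.

```python
def compose(a, b):
    """Synchronous product of two deterministic automata (reachable part).

    State names must not be None, which is used here to mean "no move".
    """
    alpha_a, delta_a, init_a = a
    alpha_b, delta_b, init_b = b
    alphabet = alpha_a | alpha_b
    init = (init_a, init_b)
    delta, todo, seen = {}, [init], {init}
    while todo:
        sa, sb = todo.pop()
        for event in alphabet:
            # A component not owning the event stays where it is.
            na = delta_a.get((sa, event)) if event in alpha_a else sa
            nb = delta_b.get((sb, event)) if event in alpha_b else sb
            if na is None or nb is None:
                continue              # an owner cannot move: event is blocked
            delta[((sa, sb), event)] = (na, nb)
            if (na, nb) not in seen:
                seen.add((na, nb))
                todo.append((na, nb))
    return alphabet, delta, init


# Example automata (hypothetical): B is shared, A and C are local.
m1 = ({"A", "B"}, {("s0", "A"): "s1", ("s1", "B"): "s0"}, "s0")
m2 = ({"B", "C"}, {("t0", "B"): "t1", ("t1", "C"): "t0"}, "t0")
product = compose(m1, m2)
```

In this example the shared event `B` is blocked in the initial product state because `m1` cannot yet take it, while the local event `A` is enabled.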
Properties • Specifications are monitors that define the set of allowed traces. • An implementation is correct if it refines the specifications. • Refinement is trace inclusion. [Figure: example automata with transitions labeled A and B]
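For deterministic automata, trace inclusion can be checked by exploring implementation and specification in lockstep. The Python sketch below assumes the encoding `(alphabet, delta, initial)` with `delta[(state, event)] = next_state`, assumes the specification is deterministic, and treats implementation events outside the specification's alphabet as invisible to it; these choices are illustrative and not taken from the talk.

```python
def refines(impl, spec):
    """Return True if every trace of `impl`, restricted to the events in
    the alphabet of `spec`, is also a trace of `spec`."""
    _, delta_i, init_i = impl
    alpha_s, delta_s, init_s = spec
    start = (init_i, init_s)
    todo, seen = [start], {start}
    while todo:
        si, ss = todo.pop()
        for (state, event), ni in delta_i.items():
            if state != si:
                continue
            if event in alpha_s:
                if (ss, event) not in delta_s:
                    return False      # impl does something the spec forbids
                ns = delta_s[(ss, event)]
            else:
                ns = ss               # event is not monitored by the spec
            if (ni, ns) not in seen:
                seen.add((ni, ns))
                todo.append((ni, ns))
    return True


# Hypothetical monitor: A and B must alternate, starting with A.
spec = ({"A", "B"}, {("q0", "A"): "q1", ("q1", "B"): "q0"}, "q0")
good = ({"A", "B", "C"},
        {("s0", "A"): "s1", ("s1", "C"): "s2", ("s2", "B"): "s0"}, "s0")
bad = ({"A", "B"}, {("s0", "A"): "s1", ("s1", "A"): "s0"}, "s0")
```

Here `good` refines `spec` because its extra event `C` is not monitored, while `bad` does not because it can perform `A` twice in a row.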
Semantic gap • How do we connect a program to a finite collection of automata communicating via rendezvous over a finite alphabet? • Challenges • dynamic creation of machines • asynchronous message passing • unbounded input buffers
Solution • Dynamic machine creation • finite verification scenario • Asynchronous message passing • separate events for sending and receiving • events tagged by sender and receiver machine ids
Implementations (machines and channels) [Figure: automata for the Ping and Pong machines and for the Ping Buffer and Pong Buffer channels, with transitions labeled Send A, Receive A, Send B, and Receive B]
Solution • Dynamic machine creation • finite verification scenario • Asynchronous message passing • separate events for sending and receiving • events tagged by sender and receiver machine ids • Unbounded input buffers • compositional verification • finite-state buffer abstractions
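Compositional verification • $S$ is a set of specification automata. • $I$ is a set of implementation automata. • We want to prove that $I$ refines $S$, written $I \preceq S$ (difficult). • Compositional verification tells us how to do this with smaller refinement checks over $I_k$ and $S_k$, where the $I_k$ are subsets of $I$ and the $S_k$ are subsets of $S$.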
Simple hierarchical case Hierarchical compositional rule
Implementations (machines and channels) and specification [Figure: implementation automata and a specification automaton, with transitions labeled Send A, Receive A, Send B, and Receive B]
Decomposing by weakening • S = Weaken(S, A) || Weaken(S, B) [Figure: a specification S with transitions labeled A and B, and the automaton Weaken(S, A) obtained by weakening S by A]
Circular compositional rule Given a spec S, and a set of implementation machines I: If for all E in alphabet of S, there is such that Then .
[Figure: an automaton for Pong shown to refine a specification automaton; transitions are labeled Send A, Receive A, Send B, and Receive B]
Review • A domain-specific language for programming protocol aspects of asynchronous computations • operational semantics • compiler/runtime for device driver domain • verification
Work in progress • Deliver working prototype to Windows and third-party driver developers • Other applications • cloud infrastructure, services, and applications • networking software • asynchronous web programming • …
Opportunity • Transform protocol design and implementation across a variety of application domains • Target the greatest threat to software reliability in the era of pervasive devices and pervasive distributed computing