CS 294-8 Abstraction Functions http://www.cs.berkeley.edu/~yelick/294
Agenda • Administrivia • Review of abstraction functions for the memory example • History variables • Prophecy variables • General discussion
Administrivia • Dawson Engler speaking Thursday • OSDI paper online • Final projects: • Send mail to schedule meeting with me • Poster session: • Scheduled 12/7. Too early? With 262? • Papers due Friday 12/15 • Homework 3
History on Abstraction Functions • Used since the 1970s for reasoning about datatypes, e.g., Tony Hoare’s paper • Used with “representation invariants” by Liskov and others • Abadi and Lamport formalized their use in concurrent systems: • When do abstraction functions exist? • Formalized history variables • Example from Herlihy and Wing paper demonstrated need for prophecy variables
Execution Model Reminder • A Spec Module defines a state machine • An execution fragment is a sequence s0 -p0-> s1 -p1-> s2 -p2-> ... • An execution starts in an initial state • Steps are written as (si, p, si+1)
Some Executions of WBCache • [Figure: an abstract execution of the specification drawn above a concrete execution of the implementation; step labels shown include init, write(2,a), (read(2),a), write(4,c), (read(3),a)]
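Abstraction Function for WBCache FUNC AF() -> M = RET (LAMBDA (a) -> D = IF c!a => c(a) [*] m(a) FI) • Here c is the cache and m is main memory; c!a holds when the cache has an entry for address a • Note that the abstraction function maps each state in the previous WBCache execution to a Memory state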
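The abstraction function above can be sketched in Python. This is only an illustration: the dictionaries `cache` and `memory`, the helpers `write` and `flush`, and the sample addresses and values are assumptions, not part of the notes. The point it shows is that flushing a cache line changes the concrete state but not the abstract Memory state.

```python
from typing import Dict

Addr = int
Data = str

def abstraction(cache: Dict[Addr, Data], memory: Dict[Addr, Data]) -> Dict[Addr, Data]:
    """AF: for each address, the cached value wins if present, else main memory.

    Assumes every address of interest has an entry in `memory`.
    """
    return {a: (cache[a] if a in cache else memory[a]) for a in memory}

def write(cache: Dict[Addr, Data], memory: Dict[Addr, Data], a: Addr, d: Data) -> None:
    # Write-back policy: only the cache is updated.
    cache[a] = d

def flush(cache: Dict[Addr, Data], memory: Dict[Addr, Data], a: Addr) -> None:
    # Internal step: move the cached value to main memory and drop the entry.
    memory[a] = cache.pop(a)

# Hypothetical initial contents.
memory = {2: "-", 3: "a", 4: "-"}
cache: Dict[Addr, Data] = {}

write(cache, memory, 2, "a")
before = abstraction(cache, memory)
flush(cache, memory, 2)
after = abstraction(cache, memory)
```

The flush is a step of the implementation that corresponds to no visible change in the spec, since `before` and `after` are the same abstract memory.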
Abstraction Function (Def 1) • An abstraction function F: T -> S has: • If t is any initial state of T, then F(t) is an initial state of S • If t is a reachable state of T and (t, p, t’) is a step of T, then there is a step of S from F(t) to F(t’) having the same trace • “Same trace” means the same externally visible values.
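For finite machines, the two conditions of Def 1 can be checked mechanically. The sketch below is an assumption-laden illustration: machines are given as explicit sets of (state, label, next_state) triples, the label stands in for the trace of a step, and the example machines (a mod-4 counter implementing a parity bit) are made up for the demonstration.

```python
from typing import Callable, Hashable, Set, Tuple

Step = Tuple[Hashable, Hashable, Hashable]  # (state, label, next_state)

def reachable(inits: Set[Hashable], steps: Set[Step]) -> Set[Hashable]:
    """All states reachable from the initial states by following steps."""
    seen = set(inits)
    frontier = list(inits)
    while frontier:
        s = frontier.pop()
        for (a, _, b) in steps:
            if a == s and b not in seen:
                seen.add(b)
                frontier.append(b)
    return seen

def is_abstraction_function(F: Callable, t_inits: Set, t_steps: Set[Step],
                            s_inits: Set, s_steps: Set[Step]) -> bool:
    # Condition 1: initial states map to initial states.
    if any(F(t) not in s_inits for t in t_inits):
        return False
    # Condition 2: every step from a reachable state maps to a spec step
    # with the same label (the label plays the role of the trace here).
    live = reachable(t_inits, t_steps)
    return all((F(t), label, F(t2)) in s_steps
               for (t, label, t2) in t_steps if t in live)

# Hypothetical example: T counts mod 4 and reports parity; S keeps only parity.
T_STEPS = {(n, ("inc", (n + 1) % 2), (n + 1) % 4) for n in range(4)}
S_STEPS = {(b, ("inc", 1 - b), 1 - b) for b in (0, 1)}

ok = is_abstraction_function(lambda n: n % 2, {0}, T_STEPS, {0}, S_STEPS)
bad = is_abstraction_function(lambda n: 0, {0}, T_STEPS, {0}, S_STEPS)
```

The constant function fails condition 2 because it would require a spec step that leaves the parity unchanged.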
Representation Invariants • The abstraction function need not be defined on every value of the concrete state, only those reachable in the implementation • Example: implementations of a set • A sorted array • Or, an unsorted array without duplicates • Not every array is a legal set representation
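A minimal sketch of a representation invariant paired with an abstraction function, for the sorted-array variant. The function names and the choice to require a strictly increasing array (sorted and duplicate-free) are assumptions made for the example.

```python
from typing import FrozenSet, List

def rep_ok(items: List[int]) -> bool:
    """Representation invariant: strictly increasing, so sorted with no duplicates."""
    return all(a < b for a, b in zip(items, items[1:]))

def abstraction(items: List[int]) -> FrozenSet[int]:
    """AF: defined only on arrays that satisfy the representation invariant."""
    if not rep_ok(items):
        raise ValueError("not a legal set representation")
    return frozenset(items)

legal = [1, 3, 7]
illegal = [3, 1, 3]
abstract_value = abstraction(legal)
```

Calling `abstraction(illegal)` raises, which mirrors the slide's point that the abstraction function is partial.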
Modeling Failures in Spec • A “crash” can happen between any two atomic actions • Volatile state reset • Stable state unaffected • Add a Crash procedure to a module • Need not be atomic; invoked when there’s a crash • Does a “CRASH” command, which stops current (non-atomic) executions. Nothing else can be invoked until Crash returns. • Crash may do other things after CRASH cmd • Normal operation resumes after Crash returns
Disk Example • In the Disk example (H7, p4): • Write operations are • Ordered • Not atomic (although each block write is) • There is no global volatile state • Crash just executes “CRASH”
Agenda • Review of abstraction functions for the memory example • History variables • Prophecy variables • General discussion
Example 1: Statistical DB Type • Given a “statistical DB” Spec with the operations • Add(v): add a new number, v, to the DB • Size(): report the number of elements in the DB • Mean(): report the mean of all elements in the DB • Variance(): report the variance of all elements in the DB • Notes have a parameter for the DB element type (V); for simplicity, I’ll use “real.”
Example 1: Statistical DB Type • Implementation 1: • Keep set of all values in the database • Mean and variance are computed when needed • Implementation 2 (optimized): • Use only three values: • integer count, initially 0 // number of elements • float sum, initially 0 // sum of elements • float sumSquare, initially 0 // sum of squares of all elements
History Variables • Problem: the specification contains more info than the implementation • Specifically, one can’t recover the values in the db from the 3 state variables • Idea: add some phantom variables to the state of the implementation. • Only for the proof • The operations can update this “phantom” state, but cannot change their behavior based on it.
Proof of 2nd Implementation • Proof idea: • add a variable db to the representation state (for the proof only) • the implementation may update db • Augmented “implementation” has:
VAR count := 0
    sum := 0
    sumSquare := 0
    db : SEQ real := {}
APROC Add(v) = << count +:= 1; sum +:= v; sumSquare +:= v**2; db +:= {v}; RET >>
Proof of 2nd Implementation • Proof: The abstraction function for the augmented DB maps the db field to the abstract state • Need to prove the representation invariants: • count = db.size • sum = sum({x | x in db}) • sumSquare = sum({x² | x in db}) • Invariants prove that the operations behave correctly, e.g., Size returns the right value.
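The augmented implementation and its invariants can be sketched as follows. The class and method names are illustrative, and the variance formula used (population variance, mean of squares minus square of the mean) is an assumption, since the slides do not say which variance is intended. Note that `db` is written by `add` but never read by any operation that returns a value.

```python
from typing import List

class AugmentedStatDB:
    def __init__(self) -> None:
        self.count = 0
        self.sum = 0.0
        self.sum_square = 0.0
        self.db: List[float] = []   # history variable: exists for the proof only

    def add(self, v: float) -> None:
        self.count += 1
        self.sum += v
        self.sum_square += v * v
        self.db.append(v)           # updated, but never consulted below

    def size(self) -> int:
        return self.count

    def mean(self) -> float:
        return self.sum / self.count

    def variance(self) -> float:
        # Assumed definition: population variance E[x^2] - (E[x])^2.
        return self.sum_square / self.count - self.mean() ** 2

    def invariants_hold(self, tol: float = 1e-9) -> bool:
        """The three representation invariants from the slide."""
        return (self.count == len(self.db)
                and abs(self.sum - sum(self.db)) <= tol
                and abs(self.sum_square - sum(x * x for x in self.db)) <= tol)

stat = AugmentedStatDB()
for v in (1.0, 2.0, 3.0):
    stat.add(v)
```

Deleting the `db` field and the single line that appends to it gives back the optimized implementation with identical visible behavior, which is what makes `db` a legal history variable.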
History Variables • In general, we can augment an implementation with history variables such that: • Every initial state of the original machine has a corresponding state with some initial value for the history variables • No existing step is disabled by additional predicates on history variables • A value assigned to an existing component must not depend on the value of a history variable (e.g., return values). • Note: the statDB example is extreme, since the entire spec state is added
Examples of History Variables • Why do history variables arise? • To simplify the specifications • To optimize the implementations • More realistic examples: • Web search • Spec talks about the state of the web: “search” looks at arbitrary subset • Implementation cannot reproduce state of failed nodes, except by “storing” lost state (may be phantom “history” vars) • Others?
Abstraction Relations • An alternative to history variables is to use an “abstraction relation” AR • AR maps each concrete state to a set of possible spec states • For this example, AR maps the 3 values to the set of all db’s having the given size, sum, and sum-of-squares. • It’s a matter of proof style and taste.
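As a rough illustration, the abstraction relation for the statistical DB can be written as a predicate over a concrete state and a candidate spec state. The function name, the tuple encoding of the concrete state, and the floating-point tolerance are assumptions of this sketch.

```python
import math
from typing import Sequence, Tuple

Concrete = Tuple[int, float, float]   # (count, sum, sumSquare)

def in_relation(concrete: Concrete, db: Sequence[float], tol: float = 1e-9) -> bool:
    """True when db is one of the spec states that AR associates with `concrete`."""
    count, total, total_sq = concrete
    return (len(db) == count
            and math.isclose(sum(db), total, abs_tol=tol)
            and math.isclose(sum(x * x for x in db), total_sq, abs_tol=tol))

concrete_state = (2, 4.0, 10.0)
candidates = [[1.0, 3.0], [3.0, 1.0], [2.0, 2.0]]
related = [db for db in candidates if in_relation(concrete_state, db)]
```

Two different sequences are related to the same concrete state here, which is why a function from concrete to abstract states is not available without a history variable.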
Stuttering Transitions • Recall that Lamport and Abadi considered two executions equivalent if they differ only by “no-op” (stuttering) steps, i.e., steps of the form (s, p, s) • Intuition: a single high level operation (e.g., transaction) may be implemented by several smaller steps (atomic in the impl.) • A generalized abstraction function allows 1 step of T to correspond to 0, 1, or more steps of S
Abstraction Function (Def 2) • A generalized abstraction function F: T -> S has: • If t is any initial state of T, then F(t) is an initial state of S • If t is a reachable state of T and (t, p, t’) is a step of T, then there is an execution fragment (0 or more steps) of S from F(t) to F(t’) having the same trace
Agenda • Review of abstraction functions for the memory example • History variables • Prophecy variables • General discussion
Example 2: NonDet (Toy) • Specification:
VAR j := 0
APROC Out() -> Int = <<
  IF j = 0 => BEGIN j := 2 [] j := 3 END; RET 1
  [*] RET j FI >>
• Implementation:
VAR j := 0
APROC Out() -> Int = <<
  IF j = 0 => j := 1
  [*] j = 1 => BEGIN j := 2 [] j := 3 END
  [*] SKIP FI;
  RET j >>
• Both have traces: 1, 2, 2, 2, … and 1, 3, 3, 3, … • Do we have AF’s in both directions? Notes say: • The “spec” implements the “impl” using the identity abstraction function • The reverse AF can’t be defined
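The claim that both machines have the same traces can be checked by enumerating every nondeterministic choice. In this sketch each machine is modeled as a function from the state j to the list of possible (returned value, next j) outcomes of one Out() call; that encoding and the helper `traces` are assumptions for illustration.

```python
from typing import Callable, List, Set, Tuple

Outcome = Tuple[int, int]   # (value returned by Out(), next value of j)

def spec_out(j: int) -> List[Outcome]:
    if j == 0:
        return [(1, 2), (1, 3)]   # the spec chooses 2 or 3 now, reveals it later
    return [(j, j)]

def impl_out(j: int) -> List[Outcome]:
    if j == 0:
        return [(1, 1)]           # no choice yet
    if j == 1:
        return [(2, 2), (3, 3)]   # the choice is made at the second call
    return [(j, j)]

def traces(out: Callable[[int], List[Outcome]], n: int, j: int = 0) -> Set[Tuple[int, ...]]:
    """All sequences of n returned values, starting from state j."""
    if n == 0:
        return {()}
    result: Set[Tuple[int, ...]] = set()
    for value, nxt in out(j):
        for rest in traces(out, n - 1, nxt):
            result.add((value,) + rest)
    return result

spec_traces = traces(spec_out, 4)
impl_traces = traces(impl_out, 4)
```

After the first call the implementation is in the single state j = 1, while the spec is already in j = 2 or j = 3. No function of the implementation state alone can pick the right one, which is the gap a prophecy variable fills.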
Prophecy Variables • We can augment an implementation T with prophecy variables to produce TP such that: • Every state of T has a corresponding state with some value for the prophecy variables • No existing step is disabled in the backward direction by additional predicates on prophecy variables. • That is, writing the action label as π to keep it distinct from the prophecy value p: for each step (t, π, t’) of T and state (t’, p’) of TP, there must be a value p of the prophecy variable(s) such that ((t, p), π, (t’, p’)) is a step of TP. • A value assigned to an existing component must not depend on the value of a prophecy variable (e.g., return values). • If t is an initial state of T and (t, p) is a state of TP, then (t, p) must be an initial state of TP
Example 3: Reliable Messages
MODULE ReliableMsg [M] EXPORT Put, Get, Crash =
VAR q : SEQ M := {}
APROC Put(m) = << q +:= {m} >>
APROC Get() -> M = << VAR m := q.head | q := q.tail; RET m >>
APROC Crash() = << VAR q’ | subseq(q’, q) => q := q’ >>
• Problem: don’t know which messages will be lost at the time of a crash • ensure FIFO delivery • eliminate duplicates from retransmission
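A small Python model of the ReliableMsg spec, as a sketch. The nondeterministic choice in Crash is resolved by a caller-supplied function `choose`, which is a device invented for this example; the class and helper names are also illustrative.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Set, Tuple

def subsequences(q: Sequence[str]) -> Set[Tuple[str, ...]]:
    """All subsequences of q, with relative order preserved."""
    n = len(q)
    return {tuple(q[i] for i in idx)
            for r in range(n + 1)
            for idx in combinations(range(n), r)}

class ReliableMsg:
    def __init__(self) -> None:
        self.q: List[str] = []

    def put(self, m: str) -> None:
        self.q.append(m)

    def get(self) -> str:
        m = self.q[0]
        self.q = self.q[1:]
        return m

    def crash(self, choose: Callable[[List[str]], List[str]]) -> None:
        """Replace q by any subsequence of itself; `choose` picks which one."""
        q2 = list(choose(self.q))
        if tuple(q2) not in subsequences(self.q):
            raise ValueError("Crash may only drop messages, not reorder or add them")
        self.q = q2

chan = ReliableMsg()
for m in ("a", "b", "c"):
    chan.put(m)
chan.crash(lambda q: [q[0], q[2]])   # this particular crash loses "b"
survivors = [chan.get(), chan.get()]
```

The spec decides at crash time which messages are lost, whereas an implementation typically commits to that outcome at some other point in the execution, and that mismatch is what makes the proof interesting.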
Example 4: Queue • Given a queue with operations • Enq: add an element to the back of the queue • Deq: remove the element at the front of the queue and return it • (Abbreviated E and D on the next slide) • (Two processes, A and B)
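Some Queue Histories • [Figure: three timing diagrams of operations by processes A and B; the overlap of operations in time is not recoverable from this text] • History 1, acceptable: A performs q.E(x), q.D(y), q.E(z); B performs q.E(y), q.D(x) • History 2, not acceptable: A performs q.E(x), q.D(y); B performs q.E(y) • History 3, not acceptable: A performs q.E(x), q.D(y); B performs q.E(y), q.D(y)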
Example 4A: Queue with Locks • Given a queue implementation containing • integers, back, front • an array of values, items • a lock, l
Enq = proc (q: queue, x: item) // ignoring buffer overflow
  lock(l)
  i: int = q.back++ // allocate new slot
  q.items[i] = x // fill it
  unlock(l)
Deq = proc (q: queue) returns (item) signals empty
  lock(l)
  if (q.back == q.front)
    unlock(l) // release the lock before signaling
    signal empty
  else
    ret: item = q.items[q.front] // read the front slot, then advance
    q.front++
    unlock(l)
    return(ret)
Some Queue Histories • History 1, acceptable • Process A got the lock first during first Enq’s • Why didn’t A return immediately after releasing the lock? • [Figure: timing diagram of History 1; A performs q.E(x), q.D(y), q.E(z) and B performs q.E(y), q.D(x)]
Simple Abstraction Function • The abstraction function maps the elements in items[front…back] to the abstract queue value • Proof is straightforward • The lock prevents the “interesting” cases
Example 4B: Queue with Atomic Ops • Given a queue implementation containing • an integer, back • an array of values, items
Enq = proc (q: queue, x: item)
  i: int = INC(q.back) // allocate new slot, atomic
  STORE(q.items[i], x) // fill it
Deq = proc (q: queue) returns (item)
  while true do
    range: int = READ(q.back) - 1
    for i: int in 1..range do
      x: item = SWAP(q.items[i], null)
      if x != null then return x
Queue Example Notes • Several atomic operations are defined • STORE, SWAP, INC • These may or may not be supported on given hardware, which would change the proof • The deq operation starts at the front end of the queue • slots already dequeued will show up as nulls • slots not yet filled will also be nulls • picks the first non-empty slot • will repeat scan until it finds an element, waiting for an enqueue to happen if necessary • Many inefficiencies, such as the lack of a head pointer. Example is to illustrate proof technique.
Need for Prophecy Variables • Abstraction function • for this example, a prophecy variable is needed • Two processes, A and B (1 implicit queue) • Execution, in order: • Enq(x) is invoked on A • Enq(y) is invoked on B • INC(q.back) on A • INC(q.back) on B • STORE(q.items[2], y) on B • Enq(y) returns on B • For this execution, there is no way of defining an abstraction function without predicting the future, i.e., whether x or y will be dequeued first
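The execution above can be replayed step by step to see both futures. This sketch models the atomic operations as ordinary method calls on a single-threaded object; the class `StepQueue`, the method `scan_once` (one pass of Deq's inner loop rather than the full retry loop), and the use of a dictionary for `items` are assumptions made to keep the example deterministic.

```python
import copy
from typing import Dict, Optional

class StepQueue:
    def __init__(self) -> None:
        self.back = 1                      # next free slot; slots are 1-indexed as in the slide
        self.items: Dict[int, str] = {}    # a missing slot plays the role of null

    def inc_back(self) -> int:
        """INC: return the old value of back and increment it."""
        i = self.back
        self.back += 1
        return i

    def store(self, i: int, x: str) -> None:
        """STORE(items[i], x)."""
        self.items[i] = x

    def scan_once(self) -> Optional[str]:
        """One pass of Deq's for-loop; None means every slot scanned was null."""
        for i in range(1, self.back):      # slots 1 .. back-1
            x = self.items.pop(i, None)    # SWAP(items[i], null)
            if x is not None:
                return x
        return None

# Shared prefix from the slide.
q = StepQueue()
slot_a = q.inc_back()       # INC on A reserves slot 1
slot_b = q.inc_back()       # INC on B reserves slot 2
q.store(slot_b, "y")        # STORE on B; Enq(y) then returns

# Future 1: A finishes its STORE before anyone dequeues.
f1 = copy.deepcopy(q)
f1.store(slot_a, "x")
order1 = [f1.scan_once(), f1.scan_once()]

# Future 2: a Deq scans before A's STORE happens.
f2 = copy.deepcopy(q)
first = f2.scan_once()
f2.store(slot_a, "x")
order2 = [first, f2.scan_once()]
```

The same concrete state after Enq(y) returns leads to dequeue order x, y in one future and y, x in the other, so an abstraction function evaluated at that state would have to guess which abstract queue to report.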
Existence of Abstraction Functions • Three cases arise in trying to prove that an abstraction function exists in a concurrent system: • The function can be defined directly on the implementation state • A history variable needs to be added to the implementation to record a past event • A prophecy variable needs to be added to the implementation to record a future event • Alternatively, you may use an abstraction relation • [Figure: diagram relating an implementation step from t to t’ to the spec states A(t) and A(t’)]
General Discussion • Examples of distributed programs that are challenges to specify • Bayou consistency model • Oceanstore meta-consistency model • Others • Criteria: when is a spec good enough? • Examples of algorithms hard to verify • Examples of programs hard to verify