410 likes | 536 Views
CS51: References and Mutation. Greg Morrisett. Mutation?. Thus far…. We have considered the (almost) purely functional subset of Ocaml . We’ve had a few side effects: printing & raising exceptions. Two reasons for this emphasis: Reasoning about functional code is easier.
E N D
CS51: References and Mutation Greg Morrisett
Thus far… • We have considered the (almost) purely functional subset of Ocaml. • We’ve had a few side effects: printing & raising exceptions. • Two reasons for this emphasis: • Reasoning about functional code is easier. • Both formal reasoning (a la the substitution model) and informal reasoning. • In part, because anything you can prove stays true. • e.g., 3 is a member of set S. • This is because the data structures are persistent. • They don’t change – we build new ones and let the garbage collector reclaim the unused old ones. • To convince you that you don’t need side effects for many things where you previously thought you did. • e.g., there’s no need for a loop to have a mutable counter that we update each time. • You can implement functional data structures like 2-3 trees with reasonable space and time.
But alas… • Purely functional code is pointless. • The whole reason we write code is to have some effect on the world. • For example, the Ocaml top-level loop prints out your result. • Without that printing (a side effect), how would you know that your functions computed the right thing? • Some algorithms or data structures need mutable state. • Example: hash-tables have (essentially) constant-time access and update. • The best functional dictionaries have either: • logarithmic access & update • constant access & linear update • constant update & linear access • Don’t forget that we give up something for this: we can’t go back and look at previous versions of the dictionary. We can do this in a functional setting. • Another Example: Robinson’s unification algorithm • A critical part of the Ocaml type-inference engine. • Also used in other kinds of program analyses.
Still: • Reasoning about mutable state (and side effects more generally) is much harder. • Types don’t help much. • These effects don’t show up in interfaces. • If I insert x into S, call a function f, and then ask member(x,S) will it evaluate to true? • I need to go off and look at f to determine if it modifies S. • Worse, in a concurrent setting, I need to look at every other function that any other thread may be executing to see if it modifies S. • Sharing mutable state is fraught with peril. • The “aliasing” problem. • Moral: use mutable data structures only where necessary. • This will also be true when you use Java or C/C++ or Python or … • It’s harder to respect this abstraction boundary in non-functional languages. • But like all abstraction boundaries, it pays dividends down the road.
References • New type: t ref • Think of it as a pointer to a box that holds a t value. • The pointer can be shared. • The contents of the box can be read or written. • To create a fresh box: ref 42 • allocates a new box, initializes its contents to 42, and returns a pointer to that box. • so similar to: int *r = malloc(sizeofint) ; *r = 42 ; return r; • To read the contents: !r • if r points to a box containing 42, then returns 42. • similar to *r in C/C++ • To write the contents: r := 42 • updates the box that r points to so that it contains 42. • similar to *r = 42 in C/C++
Example let c = ref 0 ;; let x = !c ;; (* x will be 0 *) c := 42 ;; let y = !c ;; (* y will be 42. x will still be 0! *)
Another Example let c = ref 0 ;; let next() = let v = !cin (c := v+1 ; v)
Another Example A semicolon is used to sequence expressions. let c = ref 0 ;; let next() = let v = !cin (c := v+1 ; v)
Another Example So this updates c and then returns v. let c = ref 0 ;; let next() = let v = !cin (c := v+1 ; v)
Equivalent to: let c = ref 0 ;; let next() = let v = !cin let _ = c := v+1 in v
Equivalent to: let c = ref 0 ;; let next() = let v = !cin let dummy : unit = c := v+1 in v
Another Example let c = ref 0 ;; let next() = let v = !cin (c := v+1 ; v) next() ;; (* returns 0 *) next() ;; (* returns 1 *) (next()) + (next()) ;; (* returns ??? *)
Aliasing let c = ref 0 ;; let x = c ;; c := 42 ;; !x ;;
Aliasing 0 let c = ref 0 ;; let x = c ;; c := 42 ;; !x ;;
Aliasing 0 let c = ref 0 ;; let x = c ;; c := 42 ;; !x ;;
Aliasing 42 let c = ref 0 ;; let x = c ;; c := 42 ;; !x ;;
Aliasing 42 let c = ref 0 ;; let x = c ;; c := 42 ;; !x ;; (* evaluates to 42 *)
Imperative Stacks module type IMP_STACK = sig type ‘a stack valempty : unit -> ‘a stack valpush : ‘a -> ‘a stack -> unit valpop : ‘a stack -> ‘a option end
Imperative Stacks When you see “unit” as the return type, you know the function is being executed for its side effects. (Like void in C/C++/Java.) module type IMP_STACK = sig type ‘a stack valempty : unit -> ‘a stack valpush : ‘a -> ‘a stack -> unit valpop : ‘a stack -> ‘a option end
Imperative Stacks Unfortunately, we can’t always tell from the type that there are side-effects going on. It’s a good idea to document them explicitly. module type IMP_STACK = sig type ‘a stack valempty : unit -> ‘a stack valpush : ‘a -> ‘a stack -> unit valpop : ‘a stack -> ‘a option end
Imperative Stacks module ImpStack : IMP_STACK = struct type ‘a stack = (‘a list) ref let empty() : ‘a stack = ref [] let push(x:’a)(s:’a stack) : unit = s := x::(!s) let pop(s:’a stack) : ‘a option = match !swith | [] -> None | h::t -> (s := t ; Some h) end
Imperative Queues module type IMP_QUEUE = sig type ‘a queue valempty : unit -> ‘a queue valenq : ‘a -> ‘a queue -> unit valdeq : ‘a queue -> ‘a option end
Imperative Queues module ImpQueue : IMP_QUEUE = struct type ‘a queue = {front:‘a list ref ; rear :’a list ref} let empty() = {front=ref[] ; rear=ref[]} let enqxq = q.rear := x::(!(q.rear)) let deqq = match !(q.front) with | h::t -> (q.front := t; Some h) | [] -> (match rev (!(q.rear)) with | h::t -> (q.front := t; Some h) | [] -> None) end
Alternative? module ImpQueue2 : IMP_QUEUE = struct type ‘a mlist = Nil | Cons of ‘a * ((‘a mlist) ref) type ‘a queue = {front:’amlist ref ; rear :’a mlist ref} let empty() = {front=ref Nil ; rear=ref Nil} let enqxq = match !q.rearwith | Cons(h,t) -> (t := Cons(x,ref Nil) ; q.rear := !t) | Nil -> (q.front := Cons(x,ref Nil); q.rear:= !(q.front)) let deqq = match !q.frontwith | Cons(h,t) -> (q.front := !t ; Some h) | Nil -> None end
Better module ImpQueue2 : IMP_QUEUE = struct type ‘a mlist = Nil | Cons of ‘a * ((‘a mlist) ref) type ‘a queue = {front:’amlist ref ; rear :’a mlist ref} let empty() = {front=ref Nil ; rear=ref Nil} let enqxq = match !q.rearwith | Cons(h,t) -> (assert (!t = Nil); t := Cons(x,ref Nil) ; q.rear := !t) | Nil -> (assert (!(q.front) = Nil); q.front := Cons(x,ref Nil); q.rear = !(q.front)) let deqq = match !q.frontwith | Cons(h,t) -> (q.front := t ; (match !twith Nil -> q.rear := Nil | Cons(_,_) -> ()) ; Some h) | Nil -> None end
Mutable Lists type ‘a mlist = Nil | Cons of ‘a * ((‘a mlist) ref) let reclength(m:’amlist) : int = match mwith | Nil -> 0 | Cons(h,t) -> 1 + length(!t)
Fraught with Peril type ‘a mlist = Nil | Cons of ‘a * ((‘a mlist) ref) let recmlength(m:’amlist) : int = match mwith | Nil -> 0 | Cons(h,t) -> 1 + length(!t) let r = ref Nil ;; let m = Cons(3,r) ;; r := m ;; mlengthm ;;
Another Example: let recmappendxsys = match xswith | Nil -> () | Cons(h,t) -> (match !twith | Nil -> t := ys | Cons(_,_) asm -> mappendmys)
Mutable Append Example: 1 2 3 4 5 6 let recmappendxsys = match xswith | Nil -> () | Cons(h,t) -> (match !twith | Nil -> t := ys | Cons(_,_) asm -> mappendmys) ;; let xs = Cons(1,ref (Cons 2, ref (Cons 3, ref Nil))) ;; let ys = Cons(4,ref (Cons 5, ref (Cons 6, ref Nil))) ;; mappendxsys ;;
Mutable Append Example: xs 1 2 3 ys 4 5 6 let recmappendxsys = match xswith | Nil -> () | Cons(h,t) -> (match !twith | Nil -> t := ys | Cons(_,_) asm -> mappendmys) ;; let as = Cons(1,ref (Cons 2, ref (Cons 3, ref Nil))) ;; let bs = Cons(4,ref (Cons 5, ref (Cons 6, ref Nil))) ;; mappend as bs ;;
Mutable Append Example: xs 1 2 3 ys 4 5 6 let recmappendxsys = match xswith | Nil -> () | Cons(h,t) -> (match !twith | Nil -> t := ys | Cons(_,_) asm -> mappendmys) ;; let as = Cons(1,ref (Cons 2, ref (Cons 3, ref Nil))) ;; let bs = Cons(4,ref (Cons 5, ref (Cons 6, ref Nil))) ;; mappend as bs ;;
Mutable Append Example: xs 1 2 3 ys 4 5 6 let recmappendxsys = match xswith | Nil -> () | Cons(h,t) -> (match !twith | Nil -> t := ys | Cons(_,_) asm -> mappendmys) ;; let as = Cons(1,ref (Cons 2, ref (Cons 3, ref Nil))) ;; let bs = Cons(4,ref (Cons 5, ref (Cons 6, ref Nil))) ;; mappend as bs ;;
Mutable Append Example: xs 1 2 3 ys 4 5 6 let recmappendxsys = match xswith | Nil -> () | Cons(h,t) -> (match !twith | Nil -> t := ys | Cons(_,_) asm -> mappendmys) ;; let as = Cons(1,ref (Cons 2, ref (Cons 3, ref Nil))) ;; let bs = Cons(4,ref (Cons 5, ref (Cons 6, ref Nil))) ;; mappend as bs ;;
Another Example: let recmappendxsys = match xswith | Nil -> () | Cons(h,t) -> (match !twith | Nil -> t := y | Cons(_,_) asm -> mappendmys) let m = Cons(1,ref Nil);; mappendmm ;; mlengthm ;;
Mutable Append Example: xs ys 1 let recmappendxsys = match xswith | Nil -> () | Cons(h,t) -> (match !twith | Nil -> t := ys | Cons(_,_) asm -> mappendmys) ;; let m = Cons(1,ref Nil);; mappendmm ;;
Mutable Append Example: xs ys 1 let recmappendxsys = match xswith | Nil -> () | Cons(h,t) -> (match !twith | Nil -> t := ys | Cons(_,_) asm -> mappendmys) ;; let m = Cons(1,ref Nil);; mappendmm ;;
Morals • Mutable data structures can lead to asymptotic efficiency improvements. • e.g., our example queues with worst-case constant-time enq/deq • But they are much harder to get right. • mostly because we must think about aliasing. • updating in one place may actually have a big effect on other places. • writing and enforcing invariants becomes more important. • e.g., assertions we used in the queue example • cycles in data structures aren’t a problem until we introduce refs. • must write operations much more carefully to avoid looping • we haven’t even gotten to the multi-threaded part. • So use refs when you must, but try hard to avoid it.
Is it possible to avoid all state? • Yes! (in single-threaded programs) • We can pass in the old values of references and return their new values. • Consider the difference between our functional stacks and our imperative ones: • fnl_push : ‘a -> ‘a stack -> ‘a stack • imp_push : ‘a -> ‘a stack -> unit • In general, we can pass in to and out of each function a dictionary that records the current values of references. • But then accessing or updating a reference takes O(lgn) time. • In a world where we do updates in place, we get O(1) time. • (actually only true if you think memory access has constant time.) • In practice, doesn’t solve the problems of aliasing and local reasoning.
How/when to use state? • In general, I try to write the functional version first. • e.g., prototype • don’t have to worry about sharing and updates • don’t have to worry about race conditions • replacement is easy (substitution model!) • Sometimes you find you can’t afford it, for big-O reasons. • example: routing tables need to be fast in a switch • constant time lookup, update (hash-table) • When I do use state, I try to encapsulate it behind an interface. • must think explicitly about sharing and invariants • write these down, write assertions to test them • if encapsulated in a module, these tests can be localized
Arrays and Mutable Fields • In addition to refs, Ocaml provides arrays and mutable fields within records. • in fact a T ref is just a {mutable contents : T} record. • For arrays, we have: • A.(i) to read the ith element of the array A • A.(i) <- 42 to write the ith element of the array A • Array.make : int -> ‘a -> ‘a array • Array.make 42 ‘x’ creates an array of length 42 with all elements initialized to the character ‘x’. • See the reference manual for more operations.