Typing Local Control and State Using Flow Analysis. Arjun Guha, Claudiu Saftoiu, and Shriram Krishnamurthi
Why add types? JavaScript [Anderson ’05, Heidegger ’09], LISP [Reynolds ’68, Cartwright ’75], Ruby [Furr ’09], Scheme [Tobin-Hochstadt ’06], Smalltalk [Suzuki ’80], Thorn [Bloom ’09], etc.
Why we want types for JavaScript: documentation; verifying security-critical code (e.g., Caja, FBJS).
function fromOrigin(p) { if (typeof p === "object") { return Math.sqrt(p.x * p.x + p.y * p.y); } else { return Math.abs(p); } } The argument p is either an object with numeric x and y fields or a number: a union of the two.
T = Num | Bool | String | Undef | T ∪ T | Any | T … T -> T | {field: T, …} Explicit type annotations (primarily on functions).
function fromOrigin(p) { if (typeof p === "object") { return Math.sqrt(p.x * p.x + p.y * p.y); } else { return Math.abs(p); } } On entry, p: {x: Num, y: Num} ∪ Num. In the then-branch, p: {x: Num, y: Num}; in the else-branch, p: Num.
Control Operators: if, ?:; while, for, do, for in; switch, continue, break, label:; &&, ||; throw, try catch finally.
function serialize(val) { switch (typeof val) { case "undefined": case "function": return false; case "boolean": return val ? "true" : "false"; case "number": return "" + val; case "string": return val; } if (val === null) { return "null"; } var fields = [ ]; for (var p in val) { var v = serialize(val[p]); if (typeof v === "string") { fields.push(p + ": " + v); } } return "{ " + fields.join(", ") + " }"; } Type: Any -> String ∪ Bool. Where is case "object"?
function serialize(val) { switch (typeof val) { case "undefined": case "function": return false; case "boolean": return val ? "true" : "false"; case "number": return "" + val; case "string": return val; } if (val === null) { return "null"; } var fields = [ ]; for (var p in val) { var v = serialize(val[p]); if (typeof v === "string") { fields.push(p + ": " + v); } } return "{ " + fields.join(", ") + " }"; } Type: Any -> String ∪ Bool. The code after the switch is an implicit case "object".
var slice = function (arr, start, stop) { var result = []; for (var i = 0; i <= stop - start; i++) { result[i] = arr[start + i]; } return result; } slice([5, 7, 11, 13], 0, 2) ⇒ [5, 7, 11]; slice([5, 7, 11, 13], 2) ⇒ arity mismatch error?
var slice = function (arr, start, stop) { if (typeof stop === "undefined") { stop = arr.length - 1; } var result = []; for (var i = 0; i <= stop - start; i++) { result[i] = arr[start + i]; } return result; } Types: arr: Array<a>, start: Num, stop: Num ∪ Undef. Inside the if, stop: Undef; after the assignment and after the if, stop: Num. slice([5, 7, 11, 13], 0, 2) ⇒ [5, 7, 11]; slice([5, 7, 11, 13], 2) ⇒ [11, 13]
Moral: “Scripting language” programmers use state and non-trivial control flow to refine types.
Soundness. [Preservation] If e : T and e → e′, then e′ : T. [Progress] If e : T, then either e is a value, or ∃e′. e → e′. This is true… but not very useful! Proving this needs a semantics: "The Essence of JavaScript" (ECOOP 2010).
This is not typeable! function fromOrigin(p) { /* {x: Num, y: Num} ∪ Num -> Num */ if (typeof p === "object") { return Math.sqrt(p.x * p.x + p.y * p.y); } else { return Math.abs(p); } }
function fromOrigin(p) { /* {x: Num, y: Num} ∪ Num -> Num */ if (typeof p === "object") { var pt = cast p …; return Math.sqrt(pt.x * pt.x + pt.y * pt.y); } else { var pf = cast p …; return Math.abs(pf); } }
Casting. Casting is an operation between types, but JavaScript (and Scheme, Python, …) has only tags. Tag = "number" | "string" | …, where {field: T, …} has tag "object" and T … -> T has tag "function".
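To make the gap between types and tags concrete, here is a small sketch in plain JavaScript. The helper name tagof is only an illustration (it wraps the built-in typeof); the point is that two objects with different fields, and therefore different types, carry the same run-time tag.

```javascript
// Hypothetical helper for illustration only: it just wraps the built-in typeof.
function tagof(v) {
  return typeof v;
}

var point = { x: 3, y: 4 };      // would have type {x: Num, y: Num}
var person = { name: "Ada" };    // a different object type

// Both objects share the single tag "object"; their field types are not visible.
console.log(tagof(point));
console.log(tagof(person));

// Numbers and functions get their own tags.
console.log(tagof(5));
console.log(tagof(Math.abs));
```

A tag test such as typeof p === "object" therefore cannot, by itself, justify a cast to a specific object type; it only rules out the non-object parts of a union.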
function fromOrigin(p) { /* {x: Num, y: Num} ∪ Num -> Num */ if (typeof p === "object") { var pt = cast p …; return Math.sqrt(pt.x * pt.x + pt.y * pt.y); } else { var pf = cast p …; return Math.abs(pf); } } typeof should really be called tagof…
function fromOrigin(p) { /* {x: Num, y: Num} ∪ Num -> Num */ if (typeof p === "object") { var pt = tagcheck(set("object"), p); return Math.sqrt(pt.x * pt.x + pt.y * pt.y); } else { var pf = tagcheck(set("number"), p); return Math.abs(pf); } }
Introducing tagcheck. tagcheck R e, where R is a set of tags and e is an expression: (1) reduce e to a value v; (2) let t be the typeof (i.e., the tag of) v; (3) if t is in R, return v; else error (tagerr).
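The three steps above can be sketched as an ordinary JavaScript function. This is only an illustration of the run-time behaviour, not the authors' implementation: TagError is a hypothetical error class standing in for tagerr, and set is assumed to be a thin wrapper over the built-in Set. Because JavaScript evaluates arguments before a call, step (1) happens before tagcheck is entered.

```javascript
// Hypothetical error class standing in for the talk's tagerr.
class TagError extends Error {}

// Assumed helper: builds the tag set R from its arguments.
function set(...tags) {
  return new Set(tags);
}

// tagcheck(R, v): v is already a value here, since arguments are evaluated first.
function tagcheck(R, v) {
  const t = typeof v;            // the tag of v
  if (R.has(t)) {
    return v;                    // tag is allowed: return the value unchanged
  }
  throw new TagError('tagerr: tag "' + t + '" is not in the expected set');
}

function fromOrigin(p) {
  if (typeof p === "object") {
    const pt = tagcheck(set("object"), p);
    return Math.sqrt(pt.x * pt.x + pt.y * pt.y);
  } else {
    const pf = tagcheck(set("number"), p);
    return Math.abs(pf);
  }
}

console.log(fromOrigin({ x: 3, y: 4 })); // 5
console.log(fromOrigin(-7));             // 7
```

Calling fromOrigin("invalid argument") takes the else-branch and throws a TagError there, because the tag "string" is not in set("number").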
runtime : Type → 2^Tag; static : 2^Tag × Type → Type
static : 2^Tag × Type → Type. Given a set of tags and a type, pick out the parts of the type that correspond to the tags: static(set("string", "bool"), Str ∪ Num ∪ Bool) = Str ∪ Bool; static(set("object"), {x: Num, y: Num} ∪ Num) = {x: Num, y: Num}.
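A minimal sketch of the two functions follows, under several assumptions that are not in the talk: a union type is represented as an array of alternatives, base types are the strings "Num", "Str", "Bool", and "Undef", object types are built by a hypothetical obj helper, and function types and Any are left out. The function is named staticType because static is a reserved word in strict-mode JavaScript, and the tag for Bool is spelled "boolean", as typeof reports it.

```javascript
// Assumed mapping from base types to the tags that typeof reports.
const TAG_OF_BASE = { Num: "number", Str: "string", Bool: "boolean", Undef: "undefined" };

// Hypothetical constructor for object types such as {x: Num, y: Num}.
function obj(fields) {
  return { kind: "obj", fields: fields };
}

// Tag of one alternative of a union.
function tagOfAlternative(alt) {
  if (typeof alt === "string") return TAG_OF_BASE[alt];
  if (alt.kind === "obj") return "object";
  throw new Error("unsupported type alternative in this sketch");
}

// runtime : Type -> set of tags
function runtime(type) {
  return new Set(type.map(tagOfAlternative));
}

// static : set of tags x Type -> Type (keeps the alternatives whose tag is in the set)
function staticType(tags, type) {
  return type.filter((alt) => tags.has(tagOfAlternative(alt)));
}

// Pretty-printer for the sketch's type representation.
function show(type) {
  return type
    .map((alt) =>
      typeof alt === "string"
        ? alt
        : "{" + Object.entries(alt.fields).map(([k, v]) => k + ": " + v).join(", ") + "}")
    .join(" ∪ ");
}

const point = obj({ x: "Num", y: "Num" });
console.log(show(staticType(new Set(["string", "boolean"]), ["Str", "Num", "Bool"])));
console.log(show(staticType(new Set(["object"]), [point, "Num"])));
console.log([...runtime([point, "Num"])]);
```

The first two calls mirror the slide's examples and yield Str ∪ Bool and {x: Num, y: Num}; the last shows runtime mapping the union back to the tags "object" and "number".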
Determine the static type, identify the presumed run-time tags, and narrow the type based on these two; that’s the resulting static type.
static(set("string"), Num ∪ Str) = Str. Static type: Num ∪ Str; presumed run-time tags: set("string"); resulting static type: Str.
function fromOrigin(p) { /* {x: Num, y: Num} ∪ Num -> Num */ if (typeof p === "object") { var pt = tagcheck(set("object"), p); return Math.sqrt(pt.x * pt.x + pt.y * pt.y); } else { var pf = tagcheck(set("number"), p); return Math.abs(pf); } }
tagcheck Failure Modes: (1) the tag set R is inconsistent with the type S; (2) the actual run-time tag is simply not contained in R; (3) the resulting type T is inconsistent with the context’s needs.
Soundness. [Preservation] If e : T and e → e′, then e′ : T. [Progress] If e : T, then either e is a value, or ∃e′. e → e′, or e = E[tagerr].
How to prevent programs from resulting in the new run-time error (tagerr)? Who’s going to write tagchecks?
Flow Analysis: value sets are about tags, not values; only inside procedures (halts at calls); runs over an intermediate representation (CPS, ANF, SSA, …).
function fromOrigin(p) { /* {x: Num, y: Num} ∪ Num -> Num */ var t1 = typeof p; var t2 = (t1 === "object"); if (t2) { var t3 = p.x; var t4 = p.x; var t5 = t3 * t4; var t6 = p.y; var t7 = p.y; var t8 = t6 * t7; var t9 = t5 + t8; return Math.sqrt(t9); } else { return Math.abs(p); } } Flows on entry: p = runtime({x: Num, y: Num} ∪ Num) = {"object", "number"}, t1 = typeof p, t2 = (t1 === "object"). In the then-branch: p = {"object"}, t1 = typeof p, t2 = true. In the else-branch: p = {"number"}, t1 = typeof p, t2 = false.
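The branch refinement on this slide can be sketched as a set operation on tags. This is a deliberately reduced illustration, not the analysis from the talk: the hypothetical function refineOnTypeofTest handles only a test of the form typeof x === tag on a single variable, whereas the real analysis runs over an intermediate representation and also tracks temporaries such as t1 and t2.

```javascript
// Hypothetical helper: split a variable's tag set at a test `typeof x === testedTag`.
function refineOnTypeofTest(tags, testedTag) {
  const all = [...tags];
  return {
    // In the then-branch the test succeeded, so only the tested tag remains.
    thenTags: new Set(all.filter((t) => t === testedTag)),
    // In the else-branch the test failed, so the tested tag is removed.
    elseTags: new Set(all.filter((t) => t !== testedTag)),
  };
}

// Entry flow for p, i.e. runtime({x: Num, y: Num} ∪ Num).
const entryTags = new Set(["object", "number"]);

const branches = refineOnTypeofTest(entryTags, "object");
console.log([...branches.thenTags]); // only "object" remains
console.log([...branches.elseTags]); // only "number" remains
```

The refined sets are exactly what a tagcheck in each branch would need: set("object") in the then-branch and set("number") in the else-branch.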
function fromOrigin(p) { /* {x: Num, y: Num} ∪ Num -> Num */ var t1 = typeof p; var t2 = (t1 === "object"); if (t2) { var t3 = p.x; var t4 = p.x; var t5 = t3 * t4; var t6 = p.y; var t7 = p.y; var t8 = t6 * t7; var t9 = t5 + t8; return Math.sqrt(t9); } else { return Math.abs(p); } } In the then-branch: p = {"object"}, t1 = typeof p, t2 = true. In the else-branch: p = {"number"}, t1 = typeof p, t2 = false.
function fromOrigin(p) { /* {x: Num, y: Num} ∪ Num -> Num */ var t1 = typeof p; var t2 = (t1 === "object"); if (t2) { var pt = tagcheck(set("object"), p); var t3 = pt.x; var t4 = pt.x; var t5 = t3 * t4; var t6 = pt.y; var t7 = pt.y; var t8 = t6 * t7; var t9 = t5 + t8; return Math.sqrt(t9); } else { var pf = tagcheck(set("number"), p); return Math.abs(pf); } } In the then-branch: p = {"object"}, t1 = typeof p, t2 = true. In the else-branch: p = {"number"}, t1 = typeof p, t2 = false.
function fromOrigin(p) { /* {x: Num, y: Num} ∪ Num -> Num */ ... } fromOrigin(500); fromOrigin({ x: 20, y: 900 }); fromOrigin("invalid argument"); The flow for p starts from runtime({x: Num, y: Num} ∪ Num) = {"object", "number"}. Actual arguments are ignored by flows; therefore, invalid arguments are ignored too!
Combined Soundness for Types and Flows. [Type Pres.] If e : T and e → e′, then e′ : T. [Type Prog.] If e : T, then either e is a value, or ∃e′. e → e′, or e = E[tagerr]. [Flow Soundness] If e : ok and e → e′, then either e′ : ok, or e is a βv redex (func(x) : T … { return e })(v) with tagof(v) ∉ runtime(T).
Our Recipe: (1) simple type checker (not quite enough); (2) add tagcheck (breaks progress); (3) simple flow analysis (with preservation broken). “Types on the outside, flows on the inside.” The composition is sound. The performance is great (seconds on a netbook).
Verifying Web Sandboxes; Types as Documentation for JavaScript programs; Typing Objects (in progress); Typing Local Control and State (ESOP 2011); JavaScript Semantics (ECOOP 2010)