400 likes | 524 Views
logic, models & analysis. daniel jackson static analysis symposium ·santa barbara · june 2k. my green eggs and ham. two languages in any analysis first order relational logic models in their own right. plan of talk. Alloy, a RISC notation models of software analysis reduced to SAT
E N D
logic,models& analysis daniel jacksonstatic analysis symposium ·santa barbara · june 2k
my green eggs and ham • two languages in any analysis • first order relational logic • models in their own right 2
plan of talk • Alloy, a RISC notation • models of software • analysis reduced to SAT • finding bugs with constraints 3
an example • model CeilingsAndFloors {domain { Man, Platform }state { ceiling, floor : Man -> Platform! } • // one man’s ceiling is another man’s floorinv { all m: Man | some n: Man - m | m.ceiling = n.floor } • // one man’s floor is another man’s ceilingassert { all m: Man | some n: Man - m | m.floor = n.ceiling }} 4
scalars are singleton sets funcs are first-order(t1 => t2 -> t3)equiv to (t1 x t2 -> t3) missing:(t1 -> t2) -> t3 kernel: type decls • d decls, x typexps, t types • d ::= v : xx ::= t | t -> t | t => x • sample decls File, Dir, Root : Object dir : Object => Name -> Object entries : Object -> DirEntry name : DirEntry -> Name contents : DirEntry -> Object parent : Object -> Object 5
navigation kernel: expressions • f formulas, e exps, v vars • e ::= e + e | e & e | e - e set ops | ~ e | + e relational ops | e . e image | e [v] application | {v : t | f} comprehension | v • sample exprs Root.~parent & File d.entries.contents n.dir [d] 6
in used forsubset andmembership kernel: formulas • f ::=e in e subset | f && f | !f logic ops | all v : t | f quantification • sample formulas File+Dir-Root in Root.+~parent all d: DirEntry | ! d in d.contents.entries 7
shorthands • declarations • domain {d} declares d : _d • use sets on RHS • multiplicities: + 1, ?1, ! 1 • domain {Object, DirEntry, Name}state { partition File, Dir : Object Root: Dir ! entries: Dir ! -> DirEntry name: DirEntry -> Name ! contents: DirEntry -> Object ! parent (~children) : Object -> Dir ? } 8
more shorthands • quantifiers sole v: t | f some w: t | { v: t | f } in w all x | f all x : d | f where d is inferred domain Q e Q v | v in e • sample invariants // object has at most one parent all o | sole o.parent // root has no parents no Root.parent // all other directories have one parent all d: Dir - Root | one d.parent 9
sample model: intentional naming • INS • Balakrishnan et al, SOSP 1999 • naming scheme based on specs • why we picked INS • naming vital to infrastructure • INS more flexible than Jini, COM, etc • what we did • analyzed lookup operation • based model on SOSP paper & Java code 10
intentional naming • attribute/value pairs city: cambridge • hierarchical specs city: cambridge, building: ne43, room: 524 service: camera, resolution: hi service: printer, postscript: level2 • lookup • database maps spec to set of records • query is set of specs • lookup returns records meeting all specs 11
query database n0 building service building service ne43 camera ne43 camera printer n0 n1 n0 n0 n1 tree representation 12
strategy • model database & queries • characterize by constraints • generate samples • check properties • obvious • no record returned when no attributes match • claims • “wildcards are equivalent to omissions” • essential • additions to DB don’t reduce query results • discuss and refine … 13
alloy model: state • model INS {domain {Attribute, Value, Record}state { Root : fixed Value! valQ : Attribute? -> Value? attQ : Value? -> Attribute valDB : Attribute? -> Value attDB : Value? -> Attribute rec : Value + -> Record lookup : Value -> Record } 14
alloy model: constraints • // Root is not the value of an attributeinv Q1 {no Root.~valQ} • // if query and DB share a leaf value, lookup returns its recordsinv Lookup1 {all v | no v.attQ || no v.attDB -> v.lookup = v.rec} • // adding a record doesn’t reduce resultsassert LookupOK7 {AddRecord -> Root.lookup in Root.lookup'} 15
checking assertions 3 attrs,vals, recs selectscope fixmodel runcheck incrscope counter? Y N N N real? slow? Y Y propfails propholds 16
results • 12 assertions checked • when query is subtree, ok • found known bugs in paper • found bugs in fixes too • monotonicity violated 17
query database n1 service service printer printer size type size type A4 mono A4 mono n0 n1 counterexample 18
time & effort • costs 2 weeks modelling, ~70 + 50 lines Alloy • cf. 1400 + 900 lines code all bugs found in < 10 secs with scope of 4 • 2 records, 2 attrs, 3 values usually enough • cf. a year of use exhausts scope of 5 in 30 secs max • space of approx 10^20 cases 19
other modelling experiences • microsoft COM (Sullivan) • automated & simplified: 99 lines • no encapsulation • air traffic control (Zhang) • collaborative arrival planner • ghost planes at US/Canada border • PANS phone (Zave) • multiplexing + conferencing • light gets stuck 20
why modelling improves designs rapid experimentation articulating essence simplifying design catching showstopper bugs 21
how analyzer works • what you learned in CS 101 • 3-SAT: first NP-c problem • to show a problem is hard • reduce SAT to it • what we know now • SAT is usually easy • to show a problem is easy • reduce it to SAT • key to reduction • consider finite scope: type small scope hypothesis most interesting caseshave illustrationsin small scopes 22
architecture alloyproblem alloyresult scope mapping translateproblem translatesolution booleanformula booleansolution SATsolver 23
p S0 T0 a S1 T1 b example • problem a, b : S p : S -> T ! (a – b).p in (a.p – b.p) • a model in a scope of 2 S = {S0, S1} T = {T0, T1} p = {(S0, T0), (S1, T0)} a = {S0} b = {S1} 24
p S0 T0 a a0 , b1 , p00 , p10 S1 T1 b translation scheme • represent • set as vector of bool var • a [a0 a1] • b [b0 b1] • relation as matrix p [p00 p01 , p10 p11] • translate • set expr to vector of bool formula • XT [a - b]i = XT [a]i XT [b]i • XT [a . b]i = j. XT [a]j XT [b]ji • relational expr to matrix of bool formula • formula to bool formulas 25
translation a [a0 a1] b [b0 b1] p [p00 p01 , p10 p11] a – b [a0 b0 a1 b1] (a – b).p [(a0 b0 p00) (a1 b1 p10) …] a.p [(a0 p00) (a1 p10) (a0 p01) (a1 p11)] b.p [(b0 p00) (b1 p10) (b0 p01) (b1 p11)] a.p – b.p [((a0 p00) (a1 p10)) ((b0 p00) (b1 p10)) …] ! (a – b).p in (a.p – b.p)(((a0b0 p00) (a1b1 p10) ((a0 p00) (a1 p10)) ((b0 p00) (b1 p10)))) … 26
tricks • quantifiers • could expand into conjunctions • but how to make modular? • translate formula into tree indexed on var • avoiding blowup • solvers expect CNF • standard var intro tricks • symmetry • all our domains are uninterpreted • many equivalent assignments • add symmetry-breaking predicates 27
how (not) to delete • class List {List next; Val val;} • void static delete (List p, Val v) { List prev = null; while (p != NULL) if (p.val == v) { prev.next = p.next ; return; } else { prev = p ; p = p.next ; } 28
specifying delete • basic spec p.*next’ = p.*next – {c | c.val = v} • as Alloy model domain {List, Val} state { • next : List -> List? • val : List -> Val? • p : List? , v : Val? • } op MergeCode { … } op MergeSpec {p.*next’ = p.*next – {c | c.val = v}} assert {MergeCode -> MergeSpec} 29
p val v hacking delete (1) • counter #1: first cell has value v cond Mask {p.val != v} assert {MergeCode && Mask -> MergeSpec} 30
next next p val val val v hacking delete (2) • counter #2: two cells with value v cond RI {all x | sole c: p.*next | c.val = x} assert {MergeCode && Mask && RI -> MergeSpec} assert {MergeCode && RI -> RI’} 31
step 1: unroll control flow graph void static delete (List p, Val v) { List prev = null; while (p != NULL) if (p.val == v) {prev.next = p.next ; return; } else {prev = p ; p = p.next ; } 32
step 2: encode control flow • E01 -> E12 || E13E13 -> E34 || E36E34 -> E45E45 -> E52E36 -> E67E67 -> E78E78 -> E82 33
step 3: encode dataflow • E36 -> p3.val3 != v3 • E45 ->prev4.next5 = p4.next4 • E78 -> p8 = p7.next7 34
frame conditions • must say what doesn’t change • so add p6 = p7 • but • don’t need a different p at each node • share vars across paths • eliminates most frame conditions 35
sample results • on Sagiv & Dor’s suite of small list procedures • reverse, rotate, delete, insert, merge • wrote partial specs (eg, set containment on cells) • predefined specs for null deref, cyclic list creation • anomalies found • 1 unrolling • scope of 1 • < 1 second • specs checked • 3 unrollings • scope of 3 • < 12 seconds 36
promising? • nice features • expressive specs • counterexample traces • easily instrumented • compositionality • specs for missing code • summarize code with formula • analysis properties • code formula same for all specs • exploit advances in SAT 37
summary • Alloy, a tiny logic of sets & relations • declarative models, not abstract programs • analysis based on SAT • translating code to Alloy • challenge • checking key design properties • global object model invariants • looking at CTAS air-traffic control • abstraction, shape analysis …? 38
related work • checking against logic • Sagiv, Reps & Wilhelm’s PSA • Extended Static Checker • using constraints • Ernst, Kautz, Selman & co: planning • Biere et al: linear temporal logic • Podelski’s array bounds • extracting models from code • SLAM’s boolean programs • Bandera’s automata 39
You do not like them.So you say.Try them! Try them!And you may.Try them and you may, I say. sdg.lcs.mit.edu/alloy 40