
Ranjit Jhala Rupak Majumdar



Presentation Transcript


  1. Bit-level Types for High-level Reasoning
     Ranjit Jhala, Rupak Majumdar

  2. The Problem
       mget(u32 p) {
         if ((p & 0x1) == 0) { error("permission"); }
         pte = (p & 0xFFFFF000) >> 12;
         b = tab[pte] & 0xFFFFFFFC;
         o = p & 0xFFC;
         return m[(b + o) >> 2];
       }
     • Bit-level operators in low-level systems code
     • Why?
       • Interact with hardware
       • Reduce memory footprint

  3. The Problem
     • Bit-level operators in low-level systems code
     • Inscrutable to humans, optimizers, verifiers

  4. What's going on?
     [Figure: mget code with p (32 bits) split as 31 | 1]

  5. What's going on?
     [Figure: mget code with p split as 20 | 11 | 1; pte as 12 | 20]

  6. What's going on?
     [Figure: mget code with p split as 20 | 11 | 1; pte as 12 | 20; tab[pte] is 32 bits]

  7. What's going on?
     [Figure: mget code with p refined to 20 | 10 | 1 | 1; pte as 12 | 20; o and b introduced]

  8. What's going on?
     [Figure: mget code with p split as 20 | 10 | 1 | 1; pte as 12 | 20; b as 30 | 2; o as 20 | 10 | 2]

  9. Q: How to infer complex information flow to understand, optimize, verify code?
     [Figure: mget code with the same bit layout of p, pte, b, o]
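The layout that the figures build up can be checked on a concrete word. The following is a minimal Python sketch, not part of the original slides: the helper names `decode_p` and `mget_parts` are hypothetical, and the field names idx, addr, wr, rd are borrowed from the bit-level types introduced later in the deck.

```python
# Hypothetical helpers; the 20 | 10 | 1 | 1 layout of p follows the figures.
def decode_p(p):
    """Split a 32-bit word p into idx | addr | wr | rd (20 | 10 | 1 | 1 bits)."""
    return {
        "idx": (p >> 12) & 0xFFFFF,   # bits 31..12
        "addr": (p >> 2) & 0x3FF,     # bits 11..2
        "wr": (p >> 1) & 0x1,         # bit 1
        "rd": p & 0x1,                # bit 0
    }

def mget_parts(p):
    """The two values mget derives from p before any table or memory lookup."""
    pte = (p & 0xFFFFF000) >> 12      # top 20 bits, moved to the bottom
    o = p & 0xFFC                     # bits 11..2, left in place
    return pte, o

# An arbitrary example word: idx = 0xABCDE, addr = 0x155, wr = 0, rd = 1.
p = (0xABCDE << 12) | (0x155 << 2) | 0b01
fields = decode_p(p)
pte, o = mget_parts(p)
```

Here pte carries exactly the idx field of p, and o carries the addr field still shifted left by two bits, which is the information flow the figures depict.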

  10. Plan • Motivation • Approach

  11. Our approach: (1) Bit-level Types
      Bit-level types: sequences of {name,size} pairs
        p   : {idx,20}{addr,10}{wr,1}{rd,1}
        pte : {∅,12}{idx,20}
        b   : {addr,30}{∅,2}
        o   : {∅,20}{addr,10}{∅,2}
      [Figure: bit layouts p as 20 | 10 | 1 | 1; pte as 12 | 20; b as 30 | 2; o as 20 | 10 | 2]

  12-21. Our approach: (2) Translation
      Expressions → Records
      Bit-ops → Field accesses
        p   : {idx,20}{addr,10}{wr,1}{rd,1}
        pte : {∅,12}{idx,20}
        b   : {addr,30}{∅,2}
        o   : {∅,20}{addr,10}{∅,2}
      Translation, built up one statement per slide:
        if ((p & 0x1) == 0) {           →  if (p.rd == 0) {
        pte = (p & 0xFFFFF000) >> 12;   →  pte.idx = p.idx;
        b = tab[pte] & 0xFFFFFFFC;      →  b.addr = tab[pte.idx].addr;
        o = p & 0xFFC;                  →  o.addr = p.addr;
        return m[(b + o) >> 2];         →  return m[b.addr + o.addr];

  22. Our approach
      Bit-level types + translation: low-level operations eliminated
        mget(p) {
          if (p.rd == 0) { error("permission"); }
          pte.idx = p.idx;
          b.addr = tab[pte.idx].addr;
          o.addr = p.addr;
          return m[b.addr + o.addr];
        }
      Program can be understood, optimized, verified
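The claim that the translated program computes the same result can be spot-checked. The sketch below is an illustration under stated assumptions, not the authors' translator: records are modelled as Python dicts, `unpack_p` and `unpack_entry` are hypothetical helpers that build the records by hand, table entries are assumed to have the type given for b, and `m` is a `range` so that a lookup simply returns the index it was given.

```python
def mget_bits(p, tab, m):
    """mget with bit-level operators, as on the slides."""
    if (p & 0x1) == 0:
        raise PermissionError("permission")
    pte = (p & 0xFFFFF000) >> 12
    b = tab[pte] & 0xFFFFFFFC
    o = p & 0xFFC
    return m[(b + o) >> 2]

def unpack_p(word):
    """Hypothetical helper: view a 32-bit word as the record for p."""
    return {"idx": (word >> 12) & 0xFFFFF, "addr": (word >> 2) & 0x3FF,
            "wr": (word >> 1) & 1, "rd": word & 1}

def unpack_entry(word):
    """Hypothetical helper: table entries assumed to have type {addr,30}{∅,2}."""
    return {"addr": (word >> 2) & 0x3FFFFFFF}

def mget_fields(p, tab, m):
    """mget after translation: only field accesses remain."""
    if p["rd"] == 0:
        raise PermissionError("permission")
    pte = {"idx": p["idx"]}
    b = {"addr": tab[pte["idx"]]["addr"]}
    o = {"addr": p["addr"]}
    return m[b["addr"] + o["addr"]]

# Made-up test data; m returns the index itself so the two indices can be compared.
raw_tab = {0xABCDE: 0x1234567B}
rec_tab = {i: unpack_entry(w) for i, w in raw_tab.items()}
m = range(2**31)
word = (0xABCDE << 12) | (0x155 << 2) | 0b01
```

Both versions index `m` at the same position because b and o each have two zero low bits, so `(b + o) >> 2` equals `b.addr + o.addr`.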

  23. Plan • Motivation • Approach • Bit-level types + Translation • Key: Bit-level type Inference • Experiences • Related work

  24. Constraint-based Type Inference
      Remember these: If Alice doubles her age, she would still be 10 years younger than Bob, who was born in 1952. How old are Alice and Bob?
      Algorithm:
        0. Variables for unknowns: Alice's age a, Bob's age b
        1. Generate constraints on vars: 2a = b − 10, b = 2006 − 1952
        2. Solve constraints: a = 22, b = 54
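Spelled out, the solving step for the age puzzle is a two-line substitution (the year 2006 is the one used on the slide):

```latex
\begin{align*}
b  &= 2006 - 1952 = 54 \\
2a &= b - 10 = 44 \quad\Longrightarrow\quad a = 22
\end{align*}
```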

  25. Constraint-based Type Inference
      Algorithm:
        0. Variables for unknowns
           • bit-level types of all program expressions
        1. Generate constraints on vars
        2. Solve constraints

  26. Plan • Motivation • Approach • Bit-level types + Translation • Key: Bit-level type Inference • Constraint Generation • Constraint Solving • Experiences • Related work

  27. Constraint Generation
      Type variables for each expression:
        p ↦ τ_p,  p&0x1 ↦ τ_{p&0x1},  pte ↦ τ_pte,  …

  28. Generating Zero Constraints
      Mask: τ_{p&0xFFC}[31:12] = ∅
            τ_{p&0xFFC}[1:0] = ∅
      [Figure: 0xFFC has 20 zero bits at 31..12 and 2 zero bits at 1..0]

  29. Generating Zero Constraints
      Shift: τ_{e>>12}[31:20] = ∅, where e is p&0xFFFFF000
      [Figure: e>>12 has 12 zero bits at 31..20]
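The two zero-constraint rules can be sketched as small generators. This is an illustrative reading of the slides, assuming 32-bit words and a constant mask or shift amount; the function names and the tuple encoding (expression text, hi, lo), meaning "τ of that expression is ∅ on bits hi..lo", are choices made here.

```python
WIDTH = 32  # assumed word size

def zero_ranges(mask):
    """Inclusive (hi, lo) runs of 0 bits in mask, most significant run first."""
    ranges = []
    bit = WIDTH - 1
    while bit >= 0:
        if ((mask >> bit) & 1) == 0:
            hi = bit
            while bit >= 0 and ((mask >> bit) & 1) == 0:
                bit -= 1
            ranges.append((hi, bit + 1))
        else:
            bit -= 1
    return ranges

def mask_zero_constraints(expr, mask):
    """Bits cleared by `expr & mask` are known to be zero in the result."""
    masked = f"{expr}&0x{mask:X}"
    return [(masked, hi, lo) for hi, lo in zero_ranges(mask)]

def shift_zero_constraints(expr, k):
    """A right shift by k > 0 leaves the top k bits of the result zero."""
    if k == 0:
        return []
    return [(f"{expr}>>{k}", WIDTH - 1, WIDTH - k)]
```

For the running example, `mask_zero_constraints("p", 0xFFC)` yields the ranges [31:12] and [1:0], and `shift_zero_constraints("e", 12)` yields [31:20], matching the two slides.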

  30. Why are zeros special?
      Consider assignment x = e (value flows from e to x)
      Inequality constraint: x ≥ e
      Should x and e have the same bit-level type?
      [Figure: e holds a k-bit value; x holds a (k+)-bit value]
      Common idiom: k-bit values are a special case of (k+)-bit values
      • Equality results in unnecessary breaks
      • Zeros enable precise subtyping (≤)

  31. Generating Inequality Constraints
      Mask: τ_{p&0xFFC}[11:2] ≥ τ_p[11:2]
      [Figure: 0xFFC has one bits at 11..2, between 20 and 2 zero bits]

  32. Generating Inequality Constraints
      Shift: τ_{e>>12}[19:0] ≥ τ_e[31:12]
      [Figure: bits 31..12 of e land on bits 19..0 of e>>12, below 12 zero bits]

  33. Generating Inequality Constraints
      Assignment: τ_o ≥ τ_{p&0xFFC}
      that is, τ_o[31:0] ≥ τ_{p&0xFFC}[31:0]
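The three inequality rules (mask, shift, assignment) can be sketched the same way. The encoding is an assumption made for illustration: each constraint is a pair (x, e), read as "x ≥ e", where x and e are (expression text, hi, lo) triples over 32-bit words.

```python
WIDTH = 32  # assumed word size

def one_ranges(mask):
    """Inclusive (hi, lo) runs of 1 bits in mask, most significant run first."""
    ranges = []
    bit = WIDTH - 1
    while bit >= 0:
        if (mask >> bit) & 1:
            hi = bit
            while bit >= 0 and (mask >> bit) & 1:
                bit -= 1
            ranges.append((hi, bit + 1))
        else:
            bit -= 1
    return ranges

def mask_inequalities(expr, mask):
    """Bits kept by `expr & mask` flow unchanged from expr to the result."""
    masked = f"{expr}&0x{mask:X}"
    return [((masked, hi, lo), (expr, hi, lo)) for hi, lo in one_ranges(mask)]

def shift_inequalities(expr, k):
    """Bits 31..k of expr flow to bits 31-k..0 of `expr >> k`."""
    return [((f"{expr}>>{k}", WIDTH - 1 - k, 0), (expr, WIDTH - 1, k))]

def assign_inequality(lhs, rhs):
    """An assignment lhs = rhs relates the two types over the whole word."""
    return [((lhs, WIDTH - 1, 0), (rhs, WIDTH - 1, 0))]
```

Applied to the statements of mget, these produce the constraints shown on the three slides, for example `shift_inequalities("e", 12)` gives the pair relating `e>>12` on [19:0] to `e` on [31:12].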

  34. Plan • Motivation • Approach • Bit-level types + Translation • Key: Bit-level type Inference • Constraint Generation • Constraint Solving • Experiences • Related work

  35. 20 10 1 1 A(p)= {idx,20}{addr,10}{wr,1}{rd,1} Constraint Solutions Solution is an assignment • A: type variables ! bit-level types A()[i:j] = subsequence of A() from bit i through j 31 12 5 1 2 • A(p)[12:1] = {addr,10}{wr,1} • A(p)[31:2] = {idx,20}{addr,10} • A(p)[31:5] = undefined

  36. Constraint Solving Overview
      Solution is an assignment
      • A: type variables → bit-level types
      A(τ)[i:j] = subsequence from bit i through j
      A satisfies:
      • zero constraint τ[i:j] = ∅
        • if A(τ)[i:j] = {∅,i−j+1}
      • inequality constraint τ[i:j] ≤ τ'[i':j']
        • if A(τ)[i:j] ≤ A(τ')[i':j']
      • In both cases, A(τ)[i:j] must be defined

  37. Constraint Solving Algorithm
      Input: zero constraints {z1,…,zm}, inequality constraints {c1,…,cn}
      Output: assignment satisfying all constraints
        A0 = initial assignment satisfying zero constraints (details in paper)
        A = A0
        for i in [1…n]: A = refine(A, ci)
        return A
      • refine(A, ci) adjusts A such that:
        • ci becomes satisfied
        • earlier constraints stay satisfied
      • built using Split, Unify

  38. Refine: Split(A,τ,k)
      Throughout A, substitute: {p,12+δ} → {e,δ}{f,12}
      and substitute: {p,12−δ} → {f,12−δ}
      where e, f are fresh
      Example: A(τ) = {p,32}
               A' = Split(A,τ,12)
               A'(τ) = {e,20}{f,12}

  39. Refine: Split(A,τ,k)
      • Used to ensure A(τ)[i:j] is defined
      Example: ensure A(τ)[11:2] is defined
        A(τ) = {p,32}
        A' = Split(A,τ,12), with 12 = 11+1:  A'(τ) = {e,20}{f,12}
        A'' = Split(A',τ,2):                 A''(τ) = {e,20}{g,10}{h,2}
        A''(τ)[11:2] defined
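A simplified sketch of Split follows. It assumes, beyond what the slides state, that an assignment is a dict from type-variable names to lists of (name, size) pairs, that `None` is the zero block, and that a named block occurs with the same size wherever it appears; the helper `fresh` and the generated names are hypothetical.

```python
import itertools

_counter = itertools.count()

def fresh():
    """Hypothetical fresh-name supply for new blocks."""
    return f"v{next(_counter)}"

def split(A, tau, k):
    """Return a copy of A where A[tau] has a block boundary between bits k and k-1."""
    hi = sum(size for _, size in A[tau]) - 1
    for pos, (name, size) in enumerate(A[tau]):
        lo = hi - size + 1
        if lo < k <= hi:
            break                              # this block is cut by the boundary
        hi = lo - 1
    else:
        return {v: list(t) for v, t in A.items()}   # boundary already present
    upper, lower = hi - k + 1, k - lo
    if name is None:
        # A zero block stays zero, and only this occurrence is cut.
        new = {v: list(t) for v, t in A.items()}
        new[tau][pos:pos + 1] = [(None, upper), (None, lower)]
        return new
    pieces = [(fresh(), upper), (fresh(), lower)]
    # Substitute the cut block by the two fresh blocks throughout A.
    return {v: [q for b in t for q in (pieces if b == (name, size) else [b])]
            for v, t in A.items()}

A = {"tau": [("p", 32)], "sigma": [("p", 32)]}
A1 = split(A, "tau", 12)     # {e,20}{f,12}
A2 = split(A1, "tau", 2)     # {e,20}{g,10}{h,2}
```

Because the substitution is applied to every type variable, `sigma` is cut in the same places as `tau`, which is how earlier constraints relating the two stay satisfied.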

  40. Refine: Unify(A,p,q)
      Throughout A, substitute: {p,k} → {q,k}

  41. Refine(A, τ[31:12] ≤ τ'[19:0])
        A(τ')  = {p,32}
        A(τ)   = {∅,10}{q,10}{r,12}
        A(τ')[19:0] undefined
      A' = Split(A,τ',19+1)
        A'(τ') = {s,12}{t,20}
        A'(τ)  = {∅,10}{q,10}{r,12}
        compare A'(τ)[31:12] = {∅,10}{q,10} with A'(τ')[19:0] = {t,20}
      A'' = Unify(A',q,t)
        A''(τ') = {s,12}{t,20}
        A''(τ)  = {∅,10}{t,10}{r,12}
      A'' satisfies constraint
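Unify can be sketched as a plain renaming. The direction of the substitution (the first name is replaced by the second) is an assumption read off the example, where Unify(A',q,t) turns q into t; the dict-of-lists encoding and the key names are also choices made here.

```python
def unify(A, p, q):
    """Return a copy of A in which every block named p is renamed to q."""
    return {v: [(q if name == p else name, size) for name, size in t]
            for v, t in A.items()}

# The assignment A' from the example, with None standing for the zero block.
A1 = {
    "tau'": [("s", 12), ("t", 20)],
    "tau":  [(None, 10), ("q", 10), ("r", 12)],
}
A2 = unify(A1, "q", "t")
```

After the call, the low 20 bits of `tau'` are the block t and the top 20 bits of `tau` are ten zero bits followed by a 10-bit t, the situation the example reports as satisfying the constraint.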

  42. Constraint Solving
      Input: constraints
      Output: assignment satisfying all constraints
        A = A0
        for i in [1…n]: A = refine(A, ci)
        return A
      Substitution (in Split, Unify)
      • ensures earlier constraints stay satisfied
      • most general solution found
      • Efficiently implemented using graphs

  43. Plan • Motivation • Approach • Bit-level types + Translation • Key: Bit-level type Inference • Constraint Generation • Constraint Solving • Experiences • Related work

  44. Experiences
      Implemented bit-level type inference for C
      • pmap: a kernel virtual memory system
        • Implements the code for our running example
      • mondrian: a memory protection system
      • scull: a Linux device driver (1-3 KLOC)
      • Inference/Translation takes less than 1s

  45. Mondrian [Witchel et al.]
      • Bit packing for memory and permission bits
      • 2600 lines of code, generated 775 constraints
      • Translated to program without bit-operations
        • 18 different bit-packed structures
      • 10 assertions provided by programmer
      • After translation, assertions verified using BLAST
        • 6 safe: all require bit-level reasoning
          • Previously, verification was not possible
        • 4 false positives: imprecise modeling of arrays

  46. Cop outs (i.e. Future Work)
      • Truly binary bit-vector operations
        • x << y, x && y
        • Currently: value-flow analysis to infer constants flowing to y; break into a switch statement
      • Flow-sensitivity
        • Currently: SSA renaming
      • Arithmetic overflow
        • Does a k-bit value "spill over"?
        • Currently: assume no overflow
      • Path-sensitivity (value dependent types)
        • Type of suffix depends on value of first field
        • e.g. instruction decoder for architecture simulator
          • Number/type of operands depends on opcode

  47. Plan • Motivation • Approach • Bit-level types + Translation • Key: Bit-level type Inference • Constraint Generation • Constraint Solving • Experiences • Related work

  48. Related Work
      • O'Callahan and Jackson [ICSE 97]
        • Type inference
      • Gupta et al. [POPL 03, CC 02]
        • Dataflow analyses for packing bit-sections
      • Ramalingam et al. [POPL 99]
        • Aggregate structure inference for COBOL

  49. Conclusions
      • (Automatic) reasoning about bit-operations is hard
      • Structure: bit-operations pack data into one word
      • Structure inferred via bit-level type inference
      • Structure exploited via translation to fields
      • Precise, efficient reasoning about bit-operations

  50. Thank you
