Type-Based Verification of Assembly Language for Compiler Debugging
Bor-Yuh Evan Chang, Adam Chlipala, George C. Necula, Robert R. Schneck
University of California, Berkeley
TLDI 2005, Long Beach, California
January 10, 2005

Problem and Motivation
• Type check assembly code produced by existing compilers for a Java-like language
• Validate that certifying compilation aids compiler debugging
  • using an undergraduate compilers course as a test bed
• Introduce undergraduates to using compiler technology for software quality
Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

Problem and Motivation
• Starting point: bytecode verification
  • Effective, simple, and very accessible
  • But does not work for assembly code
• Why assembly code?
  • Native code compilers are more complex
  • Also debug JITs
• Why existing compilers?
  • Forces the verifier to be flexible
  • Better accessibility to students (i.e., requires few or no annotations)
  • More test cases to validate “certified compilation for debugging”

Debugging Compilers with Verification
[Figure: two workflows for a compilers student]
• Without verification: test case → compile → run the compiled program on test inputs → unreadable binary output ("Stressed Compilers Student")
• With verification: test case → compile → verify → run; the verifier reports, for example,
    Line 2147: Type error – argument -8(SP0) : unknown, which is not <= nonnull String.
  ("Relaxed Compilers Student")

Challenges
  Source:
    class P { int f; int m() { … } }
    class C extends P { int m() { … } }
    …
    P p = new P();
    P c = new C();
    c.m();
    …
  Bytecode:
    invokevirtual P.m()
  Assembly for invokevirtual P.m():
    branch (= rc 0) Labort
    rtmp := m[rc + 8]
    rtmp := m[rtmp + 12]
    rarg0 := rc
    rra := &Lret
    jump [rtmp]
  Lret:

Challenges
  Source:
    class P { int f; int m() { … } }
    class C extends P { int m() { … } }
    …
    P p = new P();
    P c = new C();
    c.m();
    …
  Bytecode:
    invokevirtual P.m()
  Assembly, with the type state between instructions:
                                ⟨rc : P, …⟩
    branch (= rc 0) Labort
                                ⟨rc : nonnull P, …⟩
    rtmp := m[rc + 8]
                                ⟨rtmp : disp(P), …⟩
    rtmp := m[rtmp + 12]
                                ⟨rtmp : meth(P,12), …⟩
    rarg0 := rc
                                ⟨rarg0 : P, …⟩
    rra := &Lret
    jump [rtmp]
  Lret:
                                ⟨rrv : int, …⟩

Challenges
  Source:
    class P { int f; int m() { … } }
    class C extends P { int m() { … } }
    …
    P p = new P();
    P c = new C();
    c.m();
    …
  Bytecode:
    invokevirtual P.m()
  Assembly, with the type state between instructions:
                                ⟨rc : P, …⟩
    branch (= rc 0) Labort
                                ⟨rc : nonnull P, …⟩
    rtmp := m[rc + 8]
                                ⟨rtmp : disp(P), …⟩
    rtmp := m[rtmp + 12]
                                ⟨rtmp : meth(P,12), …⟩
    rarg0 := rp                 ← unsound
                                ⟨rarg0 : P, …⟩
    rra := &Lret
    jump [rtmp]
  Lret:
                                ⟨rrv : int, …⟩

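The unsound variant above can be contrasted with a dependent check in a few lines of Python. This is only an illustrative sketch, not the verifier from the talk: the tuple encodings of `meth(...)`, the helper names, and the value names `val_c` and `val_p` are assumptions made for the example.

```python
PARENT = {"Object": None, "P": "Object", "C": "P"}   # class C extends P

def is_subclass(cls, ancestor):
    """Walk up the (assumed single-inheritance) hierarchy."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = PARENT[cls]
    return False

def receiver_ok_by_class(meth_type, arg_class):
    """Non-dependent rule: meth(P, 12) accepts any receiver whose class is <= P."""
    _tag, cls, _offset = meth_type
    return is_subclass(arg_class, cls)

def receiver_ok_by_value(meth_type, arg_value):
    """Dependent rule: meth(v, 12) accepts only the object v it was loaded from."""
    _tag, obj, _offset = meth_type
    return arg_value == obj

# The method pointer was loaded from c's dispatch table, but p is passed as receiver.
by_class = receiver_ok_by_class(("meth", "P", 12), "P")            # accepted
by_value = receiver_ok_by_value(("meth", "val_c", 12), "val_p")    # rejected
```

The class-based rule accepts `rarg0 := rp` because `p` has static type `P`, even though the method came from `c`'s table; tying the method type to the object value is what rules it out.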
Challenges
  Source:
    class P { int f; int m() { … } }
    class C extends P { int m() { … } }
    …
    P p = new P();
    P c = new C();
    int f = p.f;
    p.m();
    x = f + 1;
  Assembly for int f = p.f:
    branch (= rp 0) Labort
    rtmp := rp + 12
    rf := m[rtmp]
  Assembly for p.m():
    branch (= rp 0) Labort
    rtmp := m[rp + 8]
    rtmp := m[rtmp + 12]
    rarg0 := rp
    rra := &Lret
    jump [rtmp]
  Lret:
  Assembly for x = f + 1:
    rx := rf + 1
• reordering and optimization

Challenges
  Source:
    class P { int f; int m() { … } }
    class C extends P { int m() { … } }
    …
    P p = new P();
    P c = new C();
    int f = p.f;
    p.m();
    x = f + 1;
  Assembly after reordering and optimization:
    branch (= rp 0) Labort
    rtmp := rp + 12
    rf := m[rtmp]
    rx := rf + 1
    branch (= rp 0) Labort
    rtmp := m[rtmp - 4]
    rtmp := m[rtmp + 12]
    rarg0 := rp
    rra := &Lret
    jump [rtmp]
  Lret:
• ⟨rtmp : int ptr, …⟩
• “funny” pointer arithmetic

Outline
• Overview
• Abstract State
  • Types
  • Join algorithm
• Verification Procedure
• Educational Experience
• Concluding Remarks

Overview
• To maximize accessibility (i.e., minimize annotations)
  • infer types at intermediate program points using abstract interpretation
  • accept a limited amount of specialization to a compilation strategy
• To handle assembly code
  • use (simple) dependent types
• To be less sensitive to the compilation strategy
  • assign types to intermediate values lazily

Abstract State
• Keep intermediate expressions in symbolic form
• Symbolic values abstract away irrelevant expressions, keeping only their type information
• Intermediate expressions track many dependencies between registers

Why Symbolic Values?
• Used to track dependencies between registers
• Value-state form is convenient for abstract interpretation
    constraint form:    r1 = r2 ∧ r1 = r3 + 4
    value-state form:   r1 = α + 4, r2 = α + 4, r3 = α
    assignment r1 := r2 is a map update: [r1 ↦ value of r2]

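A minimal Python sketch of the value-state idea follows. It is not the authors' implementation: the `ValueState` class, the `fresh` helper, and the `(symbol, offset)` encoding of values are assumptions chosen for illustration. Each register maps to a symbolic value, so equalities between registers stay implicit and an assignment is a single map update.

```python
import itertools

# Assumed encoding: a value is (symbol, offset), e.g. ("s0", 4) stands for s0 + 4.
_counter = itertools.count()

def fresh():
    """Return a new symbolic name that stands for an unknown value."""
    return f"s{next(_counter)}"

class ValueState:
    def __init__(self):
        self.regs = {}  # register name -> (symbol, offset)

    def havoc(self, reg):
        """Forget everything about reg by binding it to a fresh symbol."""
        self.regs[reg] = (fresh(), 0)

    def assign(self, dst, src):
        """dst := src copies the value of src."""
        self.regs[dst] = self.regs[src]

    def add_const(self, dst, src, k):
        """dst := src + k keeps the symbol and shifts the offset."""
        sym, off = self.regs[src]
        self.regs[dst] = (sym, off + k)

    def equal(self, r1, r2):
        """Registers are known equal when they map to the same value."""
        return self.regs[r1] == self.regs[r2]

state = ValueState()
state.havoc("r3")                 # r3 = a
state.add_const("r1", "r3", 4)    # r1 = a + 4
state.assign("r2", "r1")          # r2 = a + 4, so r1 = r2 is implicit
```

In the constraint form, the same assignment would require rewriting every equality that mentions `r1`; here nothing else in the map changes.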
Values
• Define a normalization of expressions to values
• Values give the expressions of interest
  • in our case, address computations and comparisons with constants
  • would vary depending on the source language or compiler of interest
• Not all equal expressions normalize to the same value, so value forms must be chosen carefully
• Registers are equal if they map to the same value (after normalization)

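As a rough illustration of normalization, the Python sketch below reduces expression trees to a symbol-plus-constant value. The tuple encoding of expressions and the choice of `(symbol, offset)` as the only value form are assumptions for the example; the actual value forms in the talk also cover comparisons with constants.

```python
import itertools

_fresh = itertools.count()

def normalize(expr):
    """Normalize an expression tree to a value (symbol, offset).

    symbol is None for a pure constant. Anything outside the
    symbol-plus-constant form becomes a fresh, opaque symbol.
    """
    kind = expr[0]
    if kind == "sym":
        return (expr[1], 0)
    if kind == "const":
        return (None, expr[1])
    if kind in ("add", "sub"):
        sign = 1 if kind == "add" else -1
        lsym, loff = normalize(expr[1])
        rsym, roff = normalize(expr[2])
        if rsym is None:                      # e + k  or  e - k
            return (lsym, loff + sign * roff)
        if lsym is None and kind == "add":    # k + e
            return (rsym, loff + roff)
    return (f"opaque{next(_fresh)}", 0)

p = ("sym", "p")
field_then_back = ("sub", ("add", p, ("const", 12)), ("const", 4))   # (p + 12) - 4
direct = ("add", p, ("const", 8))                                    # p + 8
cancelling = ("sub", ("add", p, ("sym", "q")), ("sym", "q"))         # (p + q) - q
```

Here `field_then_back` and `direct` normalize to the same value, so registers holding them are known equal. `cancelling` is mathematically equal to `p` but normalizes to an opaque symbol, which is the sense in which not all equal expressions share a value.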
Types
• Primitive types and object references
• Types for run-time structures
  • e.g., a dispatch-table type tracks that the value is the dispatch table of a given object

Join
    ⟨r1 = α+4, r2 = α+4, r3 = β ▹ α : disp(β), β : nonnull exactly C⟩
  ⊔ ⟨r1 = γ+4, r2 = δ+8, r3 = δ ▹ γ : disp(δ), δ : nonnull C⟩
  = ⟨r1 = ε+4, r2 = η, r3 = ζ ▹ ε : disp(ζ), ζ : nonnull C, η : ⊤⟩
  where (α,γ) → ε and (β,δ) → ζ

Join
• The join of the type state is essentially given by subtyping
  • the subtyping lattice is still fairly simple, dictated mostly by the class inheritance hierarchy
• Values are (fancy) labels for equivalence classes of registers
  • lattice of conjunctions of equalities on registers
  • first normalize expressions to values for the join
  • each distinct pair of values in the input states gives an equivalence class in the result state

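The pairing idea can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the algorithm from the talk: values are `(symbol, offset)` pairs, types are limited to `("nonnull", class, exact)`, `("disp", symbol)` and a top type, the class hierarchy is made up, and the result symbols `j0`, `j1`, … are arbitrary names.

```python
import itertools

PARENT = {"Object": None, "P": "Object", "C": "P"}  # assumed class hierarchy
TOP = ("top",)

def ancestors(cls):
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = PARENT[cls]
    return chain

def join_class(c1, c2):
    """Least common ancestor in the single-inheritance hierarchy."""
    up = ancestors(c1)
    return next(c for c in ancestors(c2) if c in up)

def join_states(vals1, types1, vals2, types2):
    """Join two abstract states; returns (vals, types) of the result."""
    names = (f"j{i}" for i in itertools.count())
    pairs = {}   # (value in state 1, value in state 2) -> result symbol
    types = {}

    def sym_for(a, b):
        if (a, b) not in pairs:
            pairs[(a, b)] = s = next(names)
            types[s] = join_type(types1.get(a, TOP), types2.get(b, TOP))
        return pairs[(a, b)]

    def join_type(t1, t2):
        if t1[0] == t2[0] == "nonnull":
            (_, c1, exact1), (_, c2, exact2) = t1, t2
            return ("nonnull", join_class(c1, c2), exact1 and exact2 and c1 == c2)
        if t1[0] == t2[0] == "disp":
            return ("disp", sym_for(t1[1], t2[1]))
        return TOP

    vals = {}
    for reg in sorted(vals1.keys() & vals2.keys()):
        (s1, o1), (s2, o2) = vals1[reg], vals2[reg]
        if o1 == o2:
            vals[reg] = (sym_for(s1, s2), o1)
        else:   # no shared symbol-plus-offset shape: fresh value of top type
            vals[reg] = (sym_for((s1, o1), (s2, o2)), 0)
    return vals, types

vals, types = join_states(
    {"r1": ("a", 4), "r2": ("a", 4), "r3": ("b", 0)},
    {"a": ("disp", "b"), "b": ("nonnull", "C", True)},
    {"r1": ("g", 4), "r2": ("d", 8), "r3": ("d", 0)},
    {"g": ("disp", "d"), "d": ("nonnull", "C", False)},
)
```

On these inputs `r1` and `r2` are equal in the first state but not in the second, so they land in different classes of the result, and the exactness of `C` is dropped because only one side has it.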
Verification Example
    class P { int f; int m() { … } }
    class C extends P { … }
  [Typing rule figure]
  Assembly, with the abstract state between instructions:
                                ⟨rp = ρ ▹ ρ : P⟩
    branch (= rp 0) Labort
                                ⟨… ▹ ρ : nonnull P⟩
    rtmp := rp + 12
                                ⟨rtmp = ρ + 12 ▹ …⟩
    rf := m[rtmp]
                                ⟨rf = α ▹ α : int⟩
    rx := rf + 1
                                ⟨rx = α + 1 ▹ α : int⟩
    rtmp := m[rtmp - 4]
    rtmp := m[rtmp + 12]
    rarg0 := rp
    rra := &Lret
    jump [rtmp]
  Lret:

Verification Example
    class P { int f; int m() { … } }
    class C extends P { … }
  [Typing rule figure]
  Assembly, with the abstract state between instructions:
                                ⟨rp = ρ ▹ ρ : P⟩
    branch (= rp 0) Labort
                                ⟨… ▹ ρ : nonnull P⟩
    rtmp := rp + 12
                                ⟨rtmp = ρ + 12 ▹ …⟩
    rf := m[rtmp]
                                ⟨rf = α ▹ α : int⟩
    rx := rf + 1
                                ⟨rx = α + 1 ▹ α : int⟩
    rtmp := m[rtmp - 4]         (ρ + 12 − 4 = ρ + 8)
                                ⟨rtmp = β ▹ β : disp(ρ)⟩
    rtmp := m[rtmp + 12]
                                ⟨rtmp = σ ▹ σ : meth(ρ,12)⟩
    rarg0 := rp
                                ⟨rarg0 = ρ ▹ …⟩
    rra := &Lret
    jump [rtmp]                 Calling meth(ρ,12) checks rarg0 = ρ
  Lret:
                                ⟨rrv = μ ▹ μ : int⟩

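The walk-through above can be approximated by a small Python sketch. Everything here is an assumption made for illustration rather than Coolaid's actual design: the instruction tuples, the `Verifier` class, the layout tables (dispatch pointer at offset 8, field `f` at 12, method `m` at table offset 12, read off the slide), and the omission of `rra := &Lret` and of register preservation across the call.

```python
import itertools

DISP_OFFSET = 8                    # assumed object layout, read off the slide
FIELDS = {"P": {12: ("int",)}}     # object offset -> field type
METHODS = {"P": {12: ("int",)}}    # dispatch-table offset -> return type

class VerifyError(Exception):
    pass

class Verifier:
    def __init__(self):
        self.vals = {}             # register -> (symbol, offset)
        self.types = {}            # symbol -> type
        self._ids = itertools.count()

    def fresh(self, ty):
        sym = f"v{next(self._ids)}"
        self.types[sym] = ty
        return sym

    def load_type(self, sym, off):
        """Type the result of m[sym + off]; typing happens only at the load."""
        ty = self.types[sym]
        if ty[0] == "nonnull":
            if off == DISP_OFFSET:
                return ("disp", sym)
            if off in FIELDS.get(ty[1], {}):
                return FIELDS[ty[1]][off]
        if ty[0] == "disp":
            cls = self.types[ty[1]][1]
            if off in METHODS.get(cls, {}):
                return ("meth", ty[1], off)
        raise VerifyError(f"cannot type load from {sym}+{off}")

    def step(self, ins):
        op = ins[0]
        if op == "nullcheck":      # fall-through edge of: branch (= r 0) Labort
            sym, off = self.vals[ins[1]]
            ty = self.types[sym]
            if off != 0 or ty[0] not in ("ref", "nonnull"):
                raise VerifyError("null check on a non-reference")
            self.types[sym] = ("nonnull", ty[1])
        elif op == "add":          # dst := src + k, no type assigned yet
            sym, off = self.vals[ins[2]]
            self.vals[ins[1]] = (sym, off + ins[3])
        elif op == "mov":          # dst := src
            self.vals[ins[1]] = self.vals[ins[2]]
        elif op == "load":         # dst := m[src + k]
            sym, off = self.vals[ins[2]]
            self.vals[ins[1]] = (self.fresh(self.load_type(sym, off + ins[3])), 0)
        elif op == "call":         # jump [r], returning to Lret
            sym, off = self.vals[ins[1]]
            ty = self.types[sym]
            if off != 0 or ty[0] != "meth":
                raise VerifyError("not a method pointer")
            _, obj, slot = ty
            if self.vals.get("rarg0") != (obj, 0):
                raise VerifyError("receiver is not the object the method came from")
            self.vals["rrv"] = (self.fresh(METHODS[self.types[obj][1]][slot]), 0)

def verify(receiver_reg):
    v = Verifier()
    v.vals["rp"] = (v.fresh(("ref", "P")), 0)
    v.vals["rc"] = (v.fresh(("ref", "P")), 0)
    program = [
        ("nullcheck", "rp"),
        ("add", "rtmp", "rp", 12),
        ("load", "rf", "rtmp", 0),
        ("add", "rx", "rf", 1),
        ("load", "rtmp", "rtmp", -4),
        ("load", "rtmp", "rtmp", 12),
        ("mov", "rarg0", receiver_reg),
        ("call", "rtmp"),
    ]
    for ins in program:
        v.step(ins)
    return v
```

Calling `verify("rp")` accepts the sequence and types `rrv` as `int`, while `verify("rc")` raises `VerifyError` at the call. Because `rtmp` is kept as the value `rp + 12` until a load needs a type, the later `rtmp - 4` resolves to offset 8 without any special case.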
Comparing Compiler Development
[Chart: without Coolaid vs. with Coolaid]
• Good compilations increase: 53% to 75%

Comparing Compiler Development
[Chart: without Coolaid vs. with Coolaid]
• Type errors decrease: 44% to 19%
  • still 29% to 15% when counting only those that are visible in testing

Comparing Compiler Development
[Chart: without Coolaid vs. with Coolaid]
• False alarms: 6% to 1%
  • price for ease of use

Student Perspective
[Chart: student ratings on a scale from 0 (“counter-productive”) to 6 (“can’t imagine being without it”)]
A favorite comment: “I would be totally lost without Coolaid. I learn best when I am using it hands-on …. I was able to really understand stack conventions and optimizations and to appreciate them.”
• Traditional testing procedure
  • 49 test cases / inputs
  • Mean scores:
    • 33 (67%) in 2002
    • 34 (69%) in 2003
    • 39 (79%) in 2004

Conclusion
• Extended data-flow based type inference of intermediate languages to assembly code
  • Able to retrofit existing compilers
  • Minimal annotations for accessibility
• Provided experimental confirmation that certifying compilation helps early compiler debugging
• Introduced undergraduates to the idea of improving software quality with program analysis