Type-Based Flow Analysis: From Polymorphic Subtyping to CFL-Reachability Jakob Rehof and Manuel Fähndrich Microsoft Research
Type-Based Program Analysis • Common vocabulary and its type-based counterpart • Data access paths: Type structure • Function summary: Function type (->) • Context-sensitivity: Type instantiation, polymorphism (∀) • Directional flow: Subtyping (≤)
GOAL (+CS +DI): Scaleable Flow Analysis of H.O. Programs w. Polymorphic Subtyping • Type-based • Higher-order • Context-sensitive (CS) • Directional (DI) • Precision and Cost: +CS +DI (∀, ≤) • -CS +DI (=, ≤) • +CS -DI (∀, =) • -CS -DI (=, =)
Outline • Goals • Problems and Results • Current Flow Analysis w. ∀ + ≤ • Our Solution • Summary
Current Method • (∀) Polymorphism by copying types • (≤) Subtyping by constrained types • (∀ + ≤) Constraint copying
Problems w. Current Method • Constraint copying is expensive (memory) • Constraint simplification is hard • Previous algorithm (Mossin): O(n^8) • No on-demand algorithms • (n = size of type-annotated program)
Results • No constraint copying • On-demand queries • All flow in O(n^3)
Outline • Goals • Problems and Results • Current Flow Analysis w. ∀ + ≤ • Our Solution • Summary
Current Flow Analysis w. ∀ + ≤ (Mossin) • max(s,t) = if s<=t then t else s • standard type: real * real -> real
Current Flow Analysis w. ∀ + ≤ • max(s:a,t:b) = (if s<=t then t else s):c • analysis type: {a ≤ c, b ≤ c} => real:a * real:b -> real:c • {a ≤ c, b ≤ c} are the subtyping constraints; a, b, c are flow labels
Current Flow Analysis w. ∀ + ≤ • max(s:a,t:b) = (if s<=t then t else s):c • {a ≤ c, b ≤ c} => real:a * real:b -> real:c
Current Flow Analysis w. ∀ + ≤ • max(s:a,t:b) : {a ≤ c, b ≤ c} => real:a * real:b -> real:c • call sites: max(x0,y0) and max(x1,y1)
Current Flow Analysis w. ∀ + ≤ • max(s:a,t:b) : {a ≤ c, b ≤ c} => real:a * real:b -> real:c • call sites: max(x0:a0,y0:b0):c0 and max(x1:a1,y1:b1):c1 • copied constraints: {a0 ≤ c0, b0 ≤ c0} => c0
Current Flow Analysis w. ∀ + ≤ • max(s:a,t:b) : {a ≤ c, b ≤ c} => real:a * real:b -> real:c • call sites: max(x0:a0,y0:b0):c0 and max(x1:a1,y1:b1):c1 • copied constraints: {a0 ≤ c0, b0 ≤ c0} => c0 and {a1 ≤ c1, b1 ≤ c1} => c1
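The copying step on the slides above can be sketched in a few lines of Python. This is only an illustration, not Mossin's algorithm: the `Scheme` class, the `instantiate` helper and the numeric-suffix naming of fresh labels are assumptions made for the example. It shows that each call site of max receives its own renamed copy of the constraint set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    """Hypothetical constrained type scheme: flow labels plus constraints lo <= hi."""
    labels: tuple
    constraints: frozenset  # pairs (lo, hi), read as lo <= hi


def instantiate(scheme, suffix):
    """Copy-based instantiation: rename every label and copy every constraint."""
    rename = {label: f"{label}{suffix}" for label in scheme.labels}
    copied = {(rename[lo], rename[hi]) for lo, hi in scheme.constraints}
    return rename, copied


# max : {a <= c, b <= c} => real:a * real:b -> real:c
MAX = Scheme(("a", "b", "c"), frozenset({("a", "c"), ("b", "c")}))

subst0, copy0 = instantiate(MAX, 0)  # call site max(x0, y0)
subst1, copy1 = instantiate(MAX, 1)  # call site max(x1, y1)

# The constraint system grows with every instantiation.
all_constraints = set(MAX.constraints) | copy0 | copy1
print(len(all_constraints))
```

In this sketch every additional call site adds another full copy of the function's constraints, which is the memory cost the talk sets out to avoid.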
Current Flow Analysis w. ∀ + ≤ • [slide figure not recovered]
Without Subtyping: • norm(x,y) = let m = max(x,y) in scale(x,y,m) end; • scale(z,w,n) = (z/n,w/n) • max(s:a,t:a) = if s<=t then t else s • max : real:a * real:a -> real:a • x and y both receive the label a’
Without Subtyping: • norm(x:a’,y:a’) = let m = max(x,y) in scale(x,y,m) end; • scale(z,w,n) = (z/n,w/n) • max(s:a,t:a) = if s<=t then t else s • max : real:a * real:a -> real:a
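To see why dropping subtyping loses directionality, here is a minimal sketch using a union-find structure, a common way to model equality constraints between labels. The class, the position names `s`, `t`, `res` and the primed-label convention in `instantiate` are assumptions for illustration only. Both branches of the conditional flow to the result, so with equality all three positions of max collapse into one label, and the instance used in norm gives x, y and m the same label.

```python
class UnionFind:
    """Minimal union-find over label names (illustrative helper)."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # Path halving keeps the trees shallow.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)


uf = UnionFind()
# max(s, t) = if s <= t then t else s: each branch flows to the result.
# Without subtyping, "flows to" degenerates to "is equal to".
uf.union("s", "res")
uf.union("t", "res")


def instantiate(uf, label, mark="'"):
    """Instance label for a position of max: its class representative, primed."""
    return uf.find(label) + mark


# norm(x, y) = let m = max(x, y) in ...
x_label = instantiate(uf, "s")
y_label = instantiate(uf, "t")
m_label = instantiate(uf, "res")
print(x_label == y_label == m_label)
```

The analysis can then no longer distinguish flow from x to m from flow between x and y, which is the imprecision that subtyping constraints remove.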
Outline • Goals • Problems and Results • Current Flow Analysis w. ∀ + ≤ • Our Solution • Summary
Flow Analysis Overview • Source Code -> Type Inference -> Type Instantiation Graph -> Flow Graph • [diagram with nodes A and B]
Flow Analysis Overview • Source Code -> Type Inference (Polymorphic Subtyping) -> Type Instantiation Graph -> Flow Graph (CFL-Reachability) • [diagram with nodes A and B]
Eliminating constraint copies • max(s:a,t:b) : {a ≤ c, b ≤ c} => real:a * real:b -> real:c • max(x0:a0,y0:b0):c0 : {a0 ≤ c0, b0 ≤ c0} => real:a0 * real:b0 -> real:c0 • max(x1:a1,y1:b1):c1 : {a1 ≤ c1, b1 ≤ c1} => real:a1 * real:b1 -> real:c1
1. Get a graph • max(s:a,t:b) : real:a * real:b -> real:c • max(x0:a0,y0:b0):c0 : real:a0 * real:b0 -> real:c0 • max(x1:a1,y1:b1):c1 : real:a1 * real:b1 -> real:c1
2. Label instantiation sites • max(s:a,t:b) : real:a * real:b -> real:c • i: max(x0:a0,y0:b0):c0 : real:a0 * real:b0 -> real:c0 • j: max(x1:a1,y1:b1):c1 : real:a1 * real:b1 -> real:c1
3. Represent substitutions • max(s:a,t:b) : real:a * real:b -> real:c • i: max(x0:a0,y0:b0):c0 : real:a0 * real:b0 -> real:c0 • j: max(x1:a1,y1:b1):c1 : real:a1 * real:b1 -> real:c1 • substitutions: a ↦ a0, b ↦ b0, c ↦ c0 (site i); a ↦ a1, b ↦ b1, c ↦ c1 (site j)
3.a. … as a graph • max(s:a,t:b) : real:a * real:b -> real:c • i: max(x0:a0,y0:b0):c0 : real:a0 * real:b0 -> real:c0 • j: max(x1:a1,y1:b1):c1 : real:a1 * real:b1 -> real:c1 • edges labeled i: a -i-> a0, b -i-> b0, c -i-> c0
3.a. … as a graph • max(s:a,t:b) : real:a * real:b -> real:c • i: max(x0:a0,y0:b0):c0 : real:a0 * real:b0 -> real:c0 • j: max(x1:a1,y1:b1):c1 : real:a1 * real:b1 -> real:c1 • edges labeled i: a -i-> a0, b -i-> b0, c -i-> c0 • edges labeled j: a -j-> a1, b -j-> b1, c -j-> c1
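Steps 1 to 3.a can be mirrored in a small data structure. The sketch below is an assumption about one possible encoding, with a hypothetical `instantiation_edges` helper and edges stored as `(generic_label, site, instance_label)` triples. The point it illustrates is that the constraints of max are stored once, and each call site contributes only labeled substitution edges.

```python
def instantiation_edges(labels, site, suffix):
    """One edge per generic label, tagged with the instantiation site."""
    return [(label, site, f"{label}{suffix}") for label in labels]


GENERIC_LABELS = ("a", "b", "c")

# Subtyping constraints of max, kept once on the generic type: a <= c, b <= c.
flow_edges = [("a", "c"), ("b", "c")]

# Site i is max(x0, y0); site j is max(x1, y1).
inst_edges = (
    instantiation_edges(GENERIC_LABELS, "i", 0)
    + instantiation_edges(GENERIC_LABELS, "j", 1)
)

for src, site, dst in inst_edges:
    print(f"{src} -{site}-> {dst}")
```

Compared with copying, the number of flow edges stays at two no matter how many call sites exist; only the instantiation edges grow, by one per label per site.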
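4. Eliminate constraint copies! • max(s:a,t:b) : real:a * real:b -> real:c • i: max(x0:a0,y0:b0):c0 : real:a0 * real:b0 -> real:c0 • j: max(x1:a1,y1:b1):c1 : real:a1 * real:b1 -> real:c1 • edges labeled i: a -i-> a0, b -i-> b0, c -i-> c0 • edges labeled j: a -j-> a1, b -j-> b1, c -j-> c1
? ? ? (flow between the instance labels is not yet represented) • max(s:a,t:b) : real:a * real:b -> real:c • i: max(x0:a0,y0:b0):c0 : real:a0 * real:b0 -> real:c0 • j: max(x1:a1,y1:b1):c1 : real:a1 * real:b1 -> real:c1 • edges labeled i: a -i-> a0, b -i-> b0, c -i-> c0 • edges labeled j: a -j-> a1, b -j-> b1, c -j-> c1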
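Type Theory to the Rescue! • Polarity (+,-): in a function type, the argument position is negative (-) and the result position is positive (+) • [diagram of function types annotated with polarities not recovered]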
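5. Polarities (+,-) • max(s:a,t:b) : real:a * real:b -> real:c • i: max(x0:a0,y0:b0):c0 : real:a0 * real:b0 -> real:c0 • j: max(x1:a1,y1:b1):c1 : real:a1 * real:b1 -> real:c1 • edges labeled i: a -i-> a0 (-), b -i-> b0 (-), c -i-> c0 (+) • edges labeled j: a -j-> a1 (-), b -j-> b1 (-), c -j-> c1 (+)
6. Reverse negative edges • max(s:a,t:b) : real:a * real:b -> real:c • i: max(x0:a0,y0:b0):c0 : real:a0 * real:b0 -> real:c0 • j: max(x1:a1,y1:b1):c1 : real:a1 * real:b1 -> real:c1 • edges labeled i: a0 -i-> a (-), b0 -i-> b (-), c -i-> c0 (+) • edges labeled j: a1 -j-> a (-), b1 -j-> b (-), c -j-> c1 (+)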
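Steps 5 and 6 can be sketched as follows. The tuple encoding of types (`"base"`, `"prod"`, `"fun"`) and both helper names are assumptions for illustration. The polarity rule itself is the standard one: the argument of a function type flips polarity, while the result and the components of a product keep it.

```python
def polarities(ty, pol=+1, out=None):
    """Map each flow label in a type to +1 (positive) or -1 (negative)."""
    out = {} if out is None else out
    kind = ty[0]
    if kind == "base":            # ("base", label)
        out[ty[1]] = pol
    elif kind == "prod":          # ("prod", left, right): polarity unchanged
        polarities(ty[1], pol, out)
        polarities(ty[2], pol, out)
    elif kind == "fun":           # ("fun", arg, res): argument flips polarity
        polarities(ty[1], -pol, out)
        polarities(ty[2], pol, out)
    else:
        raise ValueError(f"unknown type constructor: {kind}")
    return out


def oriented_edges(pol, site, suffix):
    """Instantiation edges for one site, with negative edges reversed."""
    edges = []
    for label, p in pol.items():
        inst = f"{label}{suffix}"
        # Positive: generic -> instance. Negative: instance -> generic.
        edges.append((label, site, inst) if p > 0 else (inst, site, label))
    return edges


# max : real:a * real:b -> real:c
MAX_TYPE = ("fun", ("prod", ("base", "a"), ("base", "b")), ("base", "c"))

pol = polarities(MAX_TYPE)
edges_i = oriented_edges(pol, "i", 0)
print(pol)
print(edges_i)
```

After reversal the edges point in the direction values actually travel: arguments flow from the call site into the function, and the result flows back out to the call site.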
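7. Recover flow • max(s:a,t:b) : real:a * real:b -> real:c • i: max(x0:a0,y0:b0):c0 : real:a0 * real:b0 -> real:c0 • j: max(x1:a1,y1:b1):c1 : real:a1 * real:b1 -> real:c1 • edges labeled i: a0 -i-> a (-), b0 -i-> b (-), c -i-> c0 (+) • edges labeled j: a1 -j-> a (-), b1 -j-> b (-), c -j-> c1 (+) • recovered flow at site i: a0 -> a -> c -> c0 and b0 -> b -> c -> c0, and likewise at site j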
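8. Be careful! • max(s:a,t:b) : real:a * real:b -> real:c • i: max(x0:a0,y0:b0):c0 : real:a0 * real:b0 -> real:c0 • j: max(x1:a1,y1:b1):c1 : real:a1 * real:b1 -> real:c1 • edges labeled i: a0 -i-> a (-), b0 -i-> b (-), c -i-> c0 (+) • edges labeled j: a1 -j-> a (-), b1 -j-> b (-), c -j-> c1 (+) • Spurious!: a path that enters through one site and leaves through the other, e.g. a0 -> a -> c -> c1
9. Do CFL-reachability • max(s:a,t:b) : real:a * real:b -> real:c • i: max(x0:a0,y0:b0):c0 : real:a0 * real:b0 -> real:c0 • j: max(x1:a1,y1:b1):c1 : real:a1 * real:b1 -> real:c1 • flow edges labeled d: a -d-> c, b -d-> c • site i: a0 -[i-> a, b0 -[i-> b, c -]i-> c0 • site j: a1 -[j-> a, b1 -[j-> b, c -]j-> c1 • CFG: M -> [k M ]k | d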
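The effect of step 9 can be checked on the max example with a naive fixpoint computation. This is a sketch under stated assumptions, not the demand-driven algorithm from the talk: it covers only fully matched paths (the nonterminal M, extended with concatenation and the empty path), it assumes that reversed negative edges carry the open bracket `[k` and positive edges the close bracket `]k`, and the function name and edge encoding are hypothetical.

```python
def matched_reachability(nodes, d_edges, open_edges, close_edges):
    """Pairs (u, v) joined by a path whose labels form a matched-bracket word.

    Grammar used: M -> [k M ]k | M M | d | empty.
    open_edges and close_edges hold (source, site, target) triples.
    """
    m = {(n, n) for n in nodes} | set(d_edges)  # M -> empty | d
    changed = True
    while changed:
        new = set()
        # M -> M M: concatenate two matched paths.
        for (u, v) in m:
            for (v2, w) in m:
                if v == v2:
                    new.add((u, w))
        # M -> [k M ]k: wrap a matched path in brackets of the same site k.
        for (x, k, u) in open_edges:
            for (v, k2, y) in close_edges:
                if k == k2 and (u, v) in m:
                    new.add((x, y))
        changed = not new <= m
        m |= new
    return m


nodes = ["a", "b", "c", "a0", "b0", "c0", "a1", "b1", "c1"]
d_edges = [("a", "c"), ("b", "c")]
open_edges = [("a0", "i", "a"), ("b0", "i", "b"), ("a1", "j", "a"), ("b1", "j", "b")]
close_edges = [("c", "i", "c0"), ("c", "j", "c1")]

flow = matched_reachability(nodes, d_edges, open_edges, close_edges)
print(("a0", "c0") in flow, ("a0", "c1") in flow)
```

The pair (a0, c0) is found because its path spells `[i d ]i`, while (a0, c1) is rejected because `[i d ]j` does not match. That is the spurious path from step 8.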
Further Issues • Polymorphic type structure • Recursive type structure • context-sensitive data-dependence analysis is uncomputable [Reps 00] • our techniques require finite types • regular unbounded data types handled via finite approximations: recursive type expressions
One-level implementation • GOLF analysis system for C by Manuvir Das (MSR) and Ben Liblit (Berkeley) • Exhaustive points-to sets for MS Word 97, 1.4 Mloc, in 2 minutes
Outline • Goals • Problems and Results • Current Flow Analysis w. ∀ + ≤ • Our Solution • Summary
Summary • Elimination of constraint copying • Reformulation of polymorphic subtyping with instantiation constraints • Transfer of CFL-reachability techniques to type-based flow analysis
Scaleable Program Analysis Project (MSR, spt) • +CS +DI (∀, ≤): [RF, POPL 01] • -CS +DI (=, ≤): [Das, PLDI 00] • +CS -DI (∀, =): [FRD, PLDI 00] • -CS -DI (=, =) • research.microsoft.com/spa
Summary • Type-based flow analysis • all flow in O(n^3), n = typed pgm size • context-sensitive (polymorphism) • directional (subtyping) • demand-driven algorithm • incorporates label-polymorphic recursion • works directly on H.O. programs • structured data of finite type • unbounded data structures via approx.
CFL Formulation • S → P N • P → M P | [ P | ε • N → M N | ] N | ε • M → [k M ]k | M M | d | ε
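Type System • e = let max(s:a,t:b) = … in (max(x0:a0,y0:b0), max(x1:a1,y1:b1)) end • a ≤ c, b ≤ c • a ↦ a0, b ↦ b0, c ↦ c0 (site i); a ↦ a1, b ↦ b1, c ↦ c1 (site j) • … |- e : c0*c1
Type System • e = let max(s:a,t:b) = … in (max(x0:a0,y0:b0), max(x1:a1,y1:b1)) end • subtyping constraints: a ≤ c, b ≤ c • instantiation constraints: a ↦ a0, b ↦ b0, c ↦ c0 (site i); a ↦ a1, b ↦ b1, c ↦ c1 (site j) • judgment: (instantiation constraints); (subtyping constraints); (type environment) |- e : c0*c1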