Intraprocedural Dataflow Analysis for Software Product Lines
• Claus Brabrand, IT University of Copenhagen / Universidade Federal de Pernambuco [ brabrand@itu.dk ]
• Márcio Ribeiro, Universidade Federal de Alagoas / Universidade Federal de Pernambuco [ mmr3@cin.ufpe.br ]
• Társis Tolêdo, Universidade Federal de Pernambuco [ twt@cin.ufpe.br ]
• Paulo Borba, Universidade Federal de Pernambuco [ phmb@cin.ufpe.br ]
• Johnni Winther, Aarhus University [ jw@cs.au.dk ]
< Outline >
• Introduction (Software Product Lines)
• Dataflow Analyses for Software Product Lines:
  • A0 (brute force): (feature in-sensitive) [product-based]
  • A1 (consecutive): (feature sensitive) [family-based]
  • A2 (simultaneous): (feature sensitive) [family-based]
  • A3 (shared simul.): (feature sensitive) [family-based]
• Results:
  • A0 vs A1 vs A2 vs A3 (total time, incl. compilation)
  • A1 vs A2 vs A3 (analysis time, excl. compilation)
• How to combine the analyses: A*
• Conclusion(s)
Software Product Line
• SPLs based on Conditional Compilation: #ifdef (φ) ... #endif, where φ ::= f ∈ F | ¬φ | φ ∧ φ
• Example (SPL fragment):
    Logo logo;
    ...
    #ifdef (VIDEO)
    logo = new Logo();
    #endif
    ...
    logo.use();
• *** Null-pointer exception! in configurations {Ø, {COLOR}} (those without VIDEO)
• Similarly for, e.g.: ■ uninitialized vars ■ unused variables ■ ...
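The claim that the fragment fails exactly in Ø and {COLOR} can be checked by enumerating configurations. The sketch below is only an illustration in Python: the helper names are hypothetical, and it assumes the feature model ψFM ≡ VIDEO ⇒ COLOR from the deck's feature-model slide.

```python
from itertools import combinations

FEATURES = ["COLOR", "VIDEO"]

def all_configurations(features):
    """Enumerate every subset of the feature set (2^F configurations)."""
    return [frozenset(c) for r in range(len(features) + 1)
            for c in combinations(features, r)]

def feature_model(config):
    """Assumed feature model: VIDEO implies COLOR."""
    return "VIDEO" not in config or "COLOR" in config

def logo_initialized(config):
    """`logo = new Logo()` is only compiled in when VIDEO is enabled."""
    return "VIDEO" in config

valid = [c for c in all_configurations(FEATURES) if feature_model(c)]
# Valid configurations in which logo.use() dereferences an unassigned variable.
failing = [c for c in valid if not logo_initialized(c)]
print([sorted(c) for c in failing])
```

Enumerating products like this is what a feature-sensitive analysis is meant to avoid; here it only serves to confirm the two failing configurations.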
Analysis of SPLs
• The Compilation Process: compile the program, run it, get a result; to catch an ERROR! before running: ANALYZE!
• ...and for Software Product Lines: generate the products (up to 2^F), then compile, ANALYZE!, and run each one separately
• Feature-sensitive data-flow analysis!
Dataflow Analysis
• Dataflow Analysis:
  • 1) Control-flow graph
  • 2) Lattice L (finite height)
  • 3) Transfer functions (monotone)
• Example: "sign-of-x analysis"
Analyzing a Program
• 1) Program
• 2) Build CFG
• 3) Make equations (annotated with program points)
• 4) Solve equations: fixed-point computation (iteration)
• 5) SOLUTION (least fixed point)
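The five steps above can be sketched for a sign-of-x analysis on a tiny loop, `x = 0; while (...) x++;`. This is an illustrative worklist solver, not the deck's SOOT-based implementation; the node names and the encoding of the sign lattice as sets of signs (empty set = ⊥, {"0","+"} = "0/+") are assumptions.

```python
from collections import deque

BOTTOM = frozenset()            # least element of the (assumed) sign lattice

def join(a, b):
    """Least upper bound: union of the possible signs."""
    return a | b

def const_zero(_value):
    return frozenset("0")       # x = 0

def identity(value):
    return value                # loop head: no effect on x

INC = {"-": "-0", "0": "+", "+": "+"}   # possible signs after x++

def increment(value):
    return frozenset(s for sign in value for s in INC[sign])

def solve(nodes, edges, transfer):
    """Iterate the dataflow equations until nothing changes (least fixed point)."""
    preds = {n: [] for n in nodes}
    succs = {n: [] for n in nodes}
    for src, dst in edges:
        succs[src].append(dst)
        preds[dst].append(src)
    out = {n: BOTTOM for n in nodes}
    worklist = deque(nodes)
    while worklist:
        node = worklist.popleft()
        in_value = BOTTOM
        for p in preds[node]:
            in_value = join(in_value, out[p])
        new_out = transfer[node](in_value)
        if new_out != out[node]:
            out[node] = new_out
            worklist.extend(succs[node])   # successors must be revisited
    return out

NODES = ["init", "head", "body"]
EDGES = [("init", "head"), ("head", "body"), ("body", "head")]
TRANSFER = {"init": const_zero, "head": identity, "body": increment}

solution = solve(NODES, EDGES, TRANSFER)
for node in NODES:
    print(node, sorted(solution[node]))
```

Termination relies on the two properties listed on the previous slide: the lattice has finite height and the transfer functions are monotone.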
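A0 (brute force): feature in-sensitive!
• Program: void m() { int x=0; ifdef(A) x++; ifdef(B) x--; }
• Feature model: ψFM = A∨B
• N = O(2^F) compilations!
• One product per valid configuration, each analyzed separately (sign of x after each statement, starting from ⊥):
    c = {A}:    int x = 0; [0]   x++; [+]
    c = {B}:    int x = 0; [0]   x--; [-]
    c = {A,B}:  int x = 0; [0]   x++; [+]   x--; [0/+]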
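A minimal sketch of A0 on the slide's program: generate each valid product by preprocessing, then run an ordinary (feature-insensitive) sign analysis on it. The encoding is an assumption made for illustration: signs are modeled as sets (empty set = ⊥, {"0","+"} = "0/+"), and `preprocess`, `analyze`, and `brute_force` are hypothetical names.

```python
from itertools import combinations

INC = {"-": "-0", "0": "+", "+": "+"}   # possible signs after x++
DEC = {"-": "-", "0": "-", "+": "0+"}   # possible signs after x--

# The slide's program as (feature guard, statement) pairs; None = unguarded.
PROGRAM = [(None, "x=0"), ("A", "x++"), ("B", "x--")]
FEATURES = ["A", "B"]

def transfer(stmt, value):
    """Ordinary transfer function of the sign analysis."""
    if stmt == "x=0":
        return frozenset("0")
    table = INC if stmt == "x++" else DEC
    return frozenset(s for sign in value for s in table[sign])

def valid_configurations():
    """All subsets of FEATURES that satisfy the feature model A or B."""
    configs = [frozenset(c) for r in range(len(FEATURES) + 1)
               for c in combinations(FEATURES, r)]
    return [c for c in configs if "A" in c or "B" in c]

def preprocess(config):
    """Generate one product: keep only statements whose guard is enabled."""
    return [stmt for guard, stmt in PROGRAM if guard is None or guard in config]

def analyze(product):
    """Straight-line code, so the analysis is a single pass from bottom."""
    value = frozenset()
    for stmt in product:
        value = transfer(stmt, value)
    return value

def brute_force():
    return {c: analyze(preprocess(c)) for c in valid_configurations()}

for config, sign in brute_force().items():
    print(sorted(config), sorted(sign))
```

The cost the slide points at is visible in `brute_force`: one generated product (and, in a real toolchain, one compilation) per valid configuration.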
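A1 (consecutive): feature sensitive!
• Program: void m() { int x=0; ifdef(A) x++; ifdef(B) x--; }
• Feature model: ψFM = A∨B
• The annotated program is analyzed once per valid configuration, without generating products; a statement's transfer function is applied only if its feature condition holds (✓), otherwise the value passes through unchanged (✗):
    c = {A}:    int x = 0; ✓ [0]   A: x++; ✓ [+]   B: x--; ✗ [+]
    c = {B}:    int x = 0; ✓ [0]   A: x++; ✗ [0]   B: x--; ✓ [-]
    c = {A,B}:  int x = 0; ✓ [0]   A: x++; ✓ [+]   B: x--; ✓ [0/+]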
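A sketch of the consecutive strategy under the same illustrative encoding (sets of signs, empty set = ⊥; all names hypothetical): the annotated program is kept as is, and the transfer function consults the configuration instead of a preprocessor.

```python
INC = {"-": "-0", "0": "+", "+": "+"}   # possible signs after x++
DEC = {"-": "-", "0": "-", "+": "0+"}   # possible signs after x--

PROGRAM = [(None, "x=0"), ("A", "x++"), ("B", "x--")]
# Valid configurations of the feature model A or B.
VALID = [frozenset("A"), frozenset("B"), frozenset("AB")]

def transfer(stmt, value):
    if stmt == "x=0":
        return frozenset("0")
    table = INC if stmt == "x++" else DEC
    return frozenset(s for sign in value for s in table[sign])

def lifted_transfer(guard, stmt, config, value):
    """Apply the statement only if its guard holds in `config`; else identity."""
    if guard is None or guard in config:
        return transfer(stmt, value)
    return value

def analyze_for(config):
    """One run over the annotated program for a single configuration."""
    value = frozenset()
    trace = []
    for guard, stmt in PROGRAM:
        value = lifted_transfer(guard, stmt, config, value)
        trace.append(value)
    return trace

def consecutive():
    """A1: one analysis run per valid configuration, one after the other."""
    return {c: analyze_for(c) for c in VALID}

for config, trace in consecutive().items():
    print(sorted(config), [sorted(v) for v in trace])
```

Compared with the brute-force approach there is no product generation step; the per-configuration loop remains.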
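A2 (simultaneous): feature sensitive!
• Program: void m() { int x=0; ifdef(A) x++; ifdef(B) x--; }
• Feature model: ψFM = A∨B
• One analysis pass for all valid configurations, ∀c ∈ {{A},{B},{A,B}}; each program point holds one lattice value per configuration:
    ({A} = ⊥, {B} = ⊥, {A,B} = ⊥)
    int x = 0;    ({A} ✓, {B} ✓, {A,B} ✓)
    ({A} = 0, {B} = 0, {A,B} = 0)
    A: x++;       ({A} ✓, {B} ✗, {A,B} ✓)
    ({A} = +, {B} = 0, {A,B} = +)
    B: x--;       ({A} ✗, {B} ✓, {A,B} ✓)
    ({A} = +, {B} = -, {A,B} = 0/+)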
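A sketch of the simultaneous strategy, again with an assumed encoding (signs as sets, empty set = ⊥) and hypothetical names: the lattice value at each program point is a map from configuration to sign, and each statement is visited once for all configurations.

```python
INC = {"-": "-0", "0": "+", "+": "+"}   # possible signs after x++
DEC = {"-": "-", "0": "-", "+": "0+"}   # possible signs after x--

PROGRAM = [(None, "x=0"), ("A", "x++"), ("B", "x--")]
VALID = [frozenset("A"), frozenset("B"), frozenset("AB")]

def transfer(stmt, value):
    if stmt == "x=0":
        return frozenset("0")
    table = INC if stmt == "x++" else DEC
    return frozenset(s for sign in value for s in table[sign])

def lifted_transfer(guard, stmt, lifted):
    """Pointwise lifting: update only configurations that enable the guard."""
    return {config: (transfer(stmt, value)
                     if guard is None or guard in config else value)
            for config, value in lifted.items()}

def simultaneous():
    """A2: a single pass over the program carrying one value per configuration."""
    lifted = {config: frozenset() for config in VALID}
    trace = []
    for guard, stmt in PROGRAM:
        lifted = lifted_transfer(guard, stmt, lifted)
        trace.append(lifted)
    return trace

for step in simultaneous():
    print({"".join(sorted(c)): sorted(v) for c, v in step.items()})
```

The statement loop now runs once rather than once per configuration, which is the property the deck later credits for A2's better cache behavior.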
A3 (shared): feature sensitive!
• Program: void m() { int x=0; ifdef(A) x++; ifdef(B) x--; }
• Feature model: ψFM = A∨B
• Configurations with the same value share one entry, keyed by a formula over the features:
    ([[ψ]] = ⊥)
    int x = 0;
    ([[ψ]] = 0)
    A: x++;
    ([[ψ∧¬A]] = 0, [[ψ∧A]] = +)
    B: x--;
    ([[ψ∧¬A∧¬B]] = 0, [[ψ∧A∧¬B]] = +, [[ψ∧¬A∧B]] = -, [[ψ∧A∧B]] = 0/+)
• (A∨B)∧¬A∧¬B ≡ false, i.e., the first entry is invalid wrt. the feature model ψ and can be dropped
• Can use BDD representation! (compact + efficient), although our evaluation uses a bit vector representation
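A sketch of the sharing idea. As a stand-in for the bit-vector or BDD representation mentioned on the slide, each entry is keyed by an explicit Python set of configurations; that choice, the set-based sign encoding, and all names are assumptions made for illustration.

```python
INC = {"-": "-0", "0": "+", "+": "+"}   # possible signs after x++
DEC = {"-": "-", "0": "-", "+": "0+"}   # possible signs after x--

PROGRAM = [(None, "x=0"), ("A", "x++"), ("B", "x--")]
# [[psi_FM]] for psi_FM = A or B, as an explicit set of configurations.
VALID = frozenset({frozenset("A"), frozenset("B"), frozenset("AB")})

def transfer(stmt, value):
    if stmt == "x=0":
        return frozenset("0")
    table = INC if stmt == "x++" else DEC
    return frozenset(s for sign in value for s in table[sign])

def shared_transfer(guard, stmt, shared):
    """Split each entry by the guard; compute the new value once per entry."""
    result = {}
    for configs, value in shared.items():
        if guard is None:
            enabled, disabled = configs, frozenset()
        else:
            enabled = frozenset(c for c in configs if guard in c)
            disabled = configs - enabled
        if enabled:                       # empty set: contradicts psi_FM, dropped
            result[enabled] = transfer(stmt, value)
        if disabled:
            result[disabled] = value
    return result

def shared_analysis():
    shared = {VALID: frozenset()}         # a single entry: [[psi]] = bottom
    trace = []
    for guard, stmt in PROGRAM:
        shared = shared_transfer(guard, stmt, shared)
        trace.append(shared)
    return trace

for step in shared_analysis():
    print([(sorted("".join(sorted(c)) for c in k), sorted(v))
           for k, v in step.items()])
```

This sketch only splits entries and never re-merges entries that end up with equal values; a fuller implementation would presumably do so to keep the number of entries small.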
Summary
• Analyzing program: void m() { int x=0; ifdef(A) x++; ifdef(B) x--; } with ψFM = A∨B
• Side-by-side comparison of A0, A1, A2, and A3
Intraprocedural Evaluation
• Five (qualitatively different) SPL benchmarks
• Intraprocedural implementation based on SOOT and CIDE
Total Time (incl. compile)
• Tasks: (Reaching Definitions)
• In practice: feature sensitive (A1, A2, and A3) all faster than A0 (no re-compile!)
• Factors annotated on the chart: 4x, 7x, 1x, 1x, 3x
Analysis Time (excl. compile)
• Tasks: (Reaching Definitions)
• In practice: A2 faster than A1 (caching!); A3 faster than A2 (sharing!)
Beyond the Sum of all Methods
• For a method with N valid configurations, which of the analyses A1 vs A2 vs A3 is fastest?
• Statistically significant differences between A1, A2, and A3 for all N, except between A2 and A3 for N=4 (underlined on the slide).
Combo Analysis Strategy: A*
• Intraprocedurally combined analysis strategy, A*
• A* consistently fastest (combo!)
Overview
• From slowest to FASTER:
  • A0 (brute force)
  • A1 (consecutive): no re-compile!
  • A2 (simultaneous): caching!
  • A3 (shared): sharing!
  • A* (combo): combo! (intra-procedural)
  • A3+BDD: repr!
• IFDS (graph repr): graph encoding!
• IFDS ➞ IDE (lift), esp. inter-procedural:
  • Friday: "SPLLIFT: Transparent and Efficient Reuse of IFDS-based Static Program Analyses for Software Product Lines" (Bodden, Ribeiro, Tolêdo, Brabrand, Borba, Mezini) PLDI 2013
Conclusion(s)
• It is possible to analyze SPLs using DFAs
• We can automatically "lift" any dataflow analysis and make it feature sensitive:
  • A1, A2, A3 are all faster than A0 (no re-compile!)
  • A2 is faster than A1 (caching!)
  • A3 is faster than A2 (sharing!)
  • A* is fastest (combo!)
  • A3 saves lots of memory vs A2 (sharing!)
• A1 (consecutive) ➞ A2 (simultaneous) ➞ A3 (shared) ➞ A* (combined)
< Obrigado*> *)Thanks
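A0 vs IFDS and A2 vs SPLLIFT
• A0: the transfer function λS . (S – {x}) ∪ {y} maps {x} to {y}
• IFDS: the same function encoded as a graph over the nodes 0, x, y
• LIFT:
  • A2: for the statement guarded by #ifdef (A), the function is applied per configuration: ({A} = {x}, {B} = {x}, {A,B} = {x,y}) becomes ({A} = {y}, {B} = {x}, {A,B} = {y})
  • SPLLIFT (IFDS ➞ IDE): graph edges are labeled with feature constraints (A on the statement's edges, ¬A on the identity edges), e.g.: true∧¬A = ¬A and [(A∧B)∧¬A] ∨ [true∧A] = A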
INTRO: Software Product Lines Dataflow Analysis
Abstract
• Software product lines (SPLs) developed using annotative approaches such as conditional compilation come with an inherent risk of constructing erroneous products. For this reason, it is essential to be able to analyze such SPLs. However, as dataflow analysis techniques are not able to deal with SPLs, developers must generate and analyze all valid products individually, which is expensive for non-trivial SPLs.
• In this paper, we demonstrate how to take any standard intraprocedural dataflow analysis and automatically turn it into a feature-sensitive dataflow analysis in five different ways where the last is a combination of the other four. All analyses are capable of analyzing all valid products of an SPL without having to generate all of them explicitly.
• We have implemented all analyses using SOOT's intraprocedural dataflow analysis framework and experimentally evaluated four of them according to their performance and memory characteristics on five qualitatively different SPLs. On our benchmarks, the combined analysis strategy is up to almost eight times faster than the brute-force approach.
< Outline >
• Introduction
  • Software Product Lines
  • Dataflow Analysis (recap)
• Dataflow Analyses for Software Product Lines:
  • feature in-sensitive (A0) vs feature sensitive (A1, A2, A3)
• Results:
  • A0 vs A1 vs A2 vs A3 (in theory and practice)
• Related Work
• Conclusion
Introduction
• Traditional Software Development:
  • One program = One product (1x CAR, 1x CELL PHONE, 1x APPLICATION)
• Product Line:
  • A "family" of products (of N "similar" products): CARS, CELL PHONES, APPLICATIONS
  • SPL (Family of Programs): customize to obtain each product
Software Product Line
• SPL: Family of Programs; customize to obtain a product
• Set of Features: F = { COLOR, VIDEO }
• Configurations: 2^F = { Ø, {Color}, {Video}, {Color,Video} }
• Feature Model: (e.g.: ψFM ≡ VIDEO ⇒ COLOR)
• VALID configurations: Ø, {Color}, {Color,Video} ({Video} alone violates ψFM)
Software Product Line
• SPL: Family of Programs
• Conditional compilation: #ifdef (φ) ... #endif, where φ ::= f ∈ F | ¬φ | φ ∧ φ
• Alternatively, via Aspects (as in AOSD)
• Example (SPL fragment):
    Logo logo;
    ...
    #ifdef (VIDEO)
    logo = new Logo();
    #endif
    ...
    logo.use();
• *** Null-pointer exception! in configurations {Ø, {COLOR}} (those without VIDEO)
• Similarly for, e.g.: ■ uninitialized vars ■ unused variables ■ ...
Related Work (DFA)
• Path-sensitive DFA:
  • Idea of "conditionally executed statements"
  • Compute different analysis info along different paths (~ A1, A2, A3) to improve precision or to optimize "hot paths"
  • "Constant Propagation with Conditional Branches" (Wegman and Zadeck) TOPLAS 1991
• Predicated DFA:
  • Guard lattice values by propositional logic predicates (~ A3), yielding "optimistic dataflow values" that are kept distinct during analysis (~ A2 and A3)
  • "Predicated Array Data-Flow Analysis for Run-time Parallelization" (Moon, Hall, and Murphy) ICS 1998
• Our work: automatically lift any DFA to SPLs (with ψFM) ⇒ feature-sensitive analysis for analyzing the entire program family
Related Work (Lifting for SPLs)
• Model Checking (similar goal, diff techniques):
  • Model checks all products of an SPL at the same time (3.5x faster than one by one!)
  • "Model Checking Lots of Systems: Efficient Verification of Temporal Properties in Software Product Lines" (Classen, Heymans, Schobbens, Legay, and Raskin) ICSE 2010
• Type Checking (similar goal, diff techniques):
  • Type checking ↔ DFA; ours: auto lift any DFA (uninit vars, null pointers, ...)
  • "Type-Checking Software Product Lines - A Formal Approach" (Kästner and Apel) ASE 2008
  • "Type Safety for Feature-Oriented Product Lines" (Apel, Kästner, Grösslinger, and Lengauer) ASE 2010
• Parsing (similar techniques, diff goal):
  • Split and merge parsing (~ A3); also uses instrumentation
  • "Variability-Aware Parsing in the Presence of Lexical Macros & C.C." (Kästner, Giarrusso, Rendel, Erdweg, Ostermann, and Berger) OOPSLA 2011
• Testing:
  • Selects relevant feature combinations for a given test case
  • Uses (hardwired) DFA (w/o FM) to compute reachability
  • "Reducing Combinatorics in Testing Product Lines" (Kim, Batory, and Khurshid) AOSD 2011
Emerging Interfaces
• CBSoft 2011: *** Best Tool Award ***
• "A Tool for Improving Maintainability of Preprocessor-based Product Lines" (Márcio Ribeiro, Társis Tolêdo, Paulo Borba, Claus Brabrand)
Specification: A0, A1, A2, A3
Analysis Time (excl. compile)
• In theory: TIME(A3) depends on the degree of sharing in the SPL!
• In practice (Reaching Definitions): A2 faster than A1 (caching!); A3 faster than A2 (sharing!)
Memory Usage
• In theory: SPACE(A3) depends on the degree of sharing in the SPL!
• In practice: (Reaching Definitions)
Analysis Time (excl. compile)
• Nx1 ≠ 1xN ?!
• In practice (Reaching Definitions): A2 faster than A1. Caching!
Caching (A1 vs A2)
• Cache misses (A1 vs A2):
• Cache enabled:
  • This is the "normal condition" (for reference)
• Cache disabled*:
  • As hypothesized, this indeed affects A1 more than A2
  • i.e., A2 has better cache properties than A1
• *) We flush the L2 cache by traversing an 8MB "bogus array" to invalidate the cache!
IFDEF normalization
• Refactor "undisciplined" (lexical) ifdefs into "disciplined" (syntactic) ifdefs:
• Normalize "ifdef"s (by transformation):
Lexical #ifdef ➞ Syntactic ifdef
• Simple transformation:
• We do not handle non-syntactic '#ifdef's: fair assumption (also in CIDE)
• Nested ifdef's also give rise to a conj. of formulas
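The remark that nested ifdefs give rise to a conjunction of formulas can be illustrated with a small sketch. It is not the transformation used in the deck's tooling: it assumes disciplined, line-based `#ifdef (F)` / `#endif` pairs with a single feature name per guard, and `presence_conditions` is a hypothetical helper.

```python
def presence_conditions(lines):
    """Pair each statement with the conjunction of its enclosing ifdef guards.

    A conjunction is represented as a tuple of feature names; () means `true`.
    """
    stack = []       # guards of the currently open #ifdef blocks
    result = []
    for line in lines:
        text = line.strip()
        if text.startswith("#ifdef"):
            # Accept both "#ifdef (A)" and "#ifdef A".
            stack.append(text[len("#ifdef"):].strip(" ()"))
        elif text == "#endif":
            stack.pop()
        elif text:
            result.append((tuple(stack), text))
    return result

SOURCE = [
    "int x = 0;",
    "#ifdef (A)",
    "x++;",
    "#ifdef (B)",
    "x--;",
    "#endif",
    "#endif",
]

for guards, stmt in presence_conditions(SOURCE):
    print(" and ".join(guards) or "true", ":", stmt)
```

In this example `x--;` ends up guarded by the conjunction of A and B, which is the kind of formula a feature-sensitive transfer function would then test against each configuration.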
BDD (Binary Decision Diagram)
• Example: F(A,B,C) = A∧(B∨C), shown as a BDD and as a minimized BDD
• Compact and efficient representation for boolean functions (aka. set of sets of names)
• FAST: negation, conjunction, disjunction, equality!
Formula ~ Set of Configurations
• Definitions (given F, set of feature names):
  • f ∈ F: feature name
  • c ∈ 2^F: configuration (set of feature names), c ⊆ F
  • X ∈ 2^(2^F): set of config's (set of sets of feature names), X ⊆ 2^F
• Example ifdefs:
  • F = {A,B}: [[ B∨A ]] = { {A}, {B}, {A,B} }
  • F = {A,B,C}: [[ A∧(B∨C) ]] = { {A,B}, {A,C}, {A,B,C} }
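The two examples can be reproduced with a direct reading of the definitions. This is a minimal sketch: formulas are modeled as Python predicates over a configuration, and `denotation` is a hypothetical name for the [[·]] brackets.

```python
from itertools import combinations

def powerset(features):
    """All configurations c in 2^F, each as a frozenset of feature names."""
    return [frozenset(c) for r in range(len(features) + 1)
            for c in combinations(features, r)]

def denotation(formula, features):
    """[[formula]]: the set of configurations over `features` satisfying it."""
    return {c for c in powerset(features) if formula(c)}

def b_or_a(c):
    return "B" in c or "A" in c

def a_and_b_or_c(c):
    return "A" in c and ("B" in c or "C" in c)

print(sorted(sorted(c) for c in denotation(b_or_a, ["A", "B"])))
print(sorted(sorted(c) for c in denotation(a_and_b_or_c, ["A", "B", "C"])))
```

Explicit enumeration is exponential in |F|, which is why the deck points to BDDs or bit vectors as the practical representation.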
Feature Model (Example)
• Feature Model: Car with a mandatory Engine (either 1.0 or 1.4) and optional Air
• Feature set: F = {Car, Engine, 1.0, 1.4, Air}
• Formula: ψFM = Car ∧ Engine ∧ (1.0 ⊕ 1.4) ∧ (Air ⇒ 1.4), where ⊕ is exclusive or
• Set of configurations: [[ψFM]] = { {Car, Engine, 1.0}, {Car, Engine, 1.4}, {Car, Engine, 1.4, Air} }
• Note: |[[ψFM]]| = 3 < 32 = |2^F|
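The count in the note can be checked by enumeration. The sketch below assumes the reading of the formula in which the two engine sizes are mutually exclusive and Air requires the 1.4 engine; that reading is inferred from the three listed configurations.

```python
from itertools import combinations

FEATURES = ["Car", "Engine", "1.0", "1.4", "Air"]

def feature_model(c):
    """Assumed formula: Car and Engine and (1.0 xor 1.4) and (Air implies 1.4)."""
    return ("Car" in c and "Engine" in c
            and (("1.0" in c) != ("1.4" in c))
            and ("Air" not in c or "1.4" in c))

all_configs = [frozenset(c) for r in range(len(FEATURES) + 1)
               for c in combinations(FEATURES, r)]
valid = [c for c in all_configs if feature_model(c)]

print(len(valid), "valid of", len(all_configs))
for c in valid:
    print(sorted(c))
```

With an inclusive "or" between 1.0 and 1.4 the model would admit five configurations instead of three, so the exclusive reading is the one consistent with the slide.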