
Data-Flow Analysis Framework


marty

Presentation Transcript


  1. Data-Flow Analysis Framework • Domain • What kind of solution is the analysis looking for? • Ex. Variables that have not yet been defined • The algorithm assigns a set of assertions to each node/edge • Approximation • Useful data-flow properties cannot be computed exactly in general • Rice’s Theorem (1953): every non-trivial semantic property of programs is undecidable • A lower approximation is called a MUST analysis • The set of solutions found may be smaller than the set of actual solutions • An upper approximation is called a MAY analysis • The set of solutions found may be larger than the set of actual solutions

  2. Data-Flow Analysis Framework • Direction • Forwards: For each node/edge, computes information about past behavior • Backwards: For each node/edge, computes information about future behavior • Transfer Functions • JOIN: Specifies how information from adjacent nodes/edges is propagated • MAY: Union of the sets on adjacent edges • MUST: Intersection of the sets on adjacent edges • GEN: Specifies which possible solutions are generated at the node/edge • KILL: Specifies which possible solutions are removed at the node/edge
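The JOIN rule above can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the function name `join`, the `kind` strings, and the `universe` parameter (the set of all possible facts, returned for a MUST analysis when there are no adjacent edges) are assumptions made for the example.

```python
def join(adjacent_sets, kind, universe=frozenset()):
    """Combine the fact sets found on adjacent edges.

    kind == "may"  -> union of all adjacent sets
    kind == "must" -> intersection of all adjacent sets
    `universe` is a hypothetical "all facts" set, returned for a MUST
    analysis when no adjacent edges are given.
    """
    sets = [frozenset(s) for s in adjacent_sets]
    if kind == "may":
        return frozenset().union(*sets)
    if kind == "must":
        if not sets:
            return frozenset(universe)
        return frozenset.intersection(*sets)
    raise ValueError(f"unknown analysis kind: {kind!r}")


# Two incoming edges carrying different facts.
may_result = join([{"x"}, {"x", "y"}], "may")
must_result = join([{"x"}, {"x", "y"}], "must")
```

A MAY analysis keeps a fact if it holds on at least one adjacent edge, while a MUST analysis keeps it only if it holds on every adjacent edge.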

  3. Data-Flow Algorithm • Start at the top (bottom) of the CFG • Forwards: top • Backwards: bottom • At each node compute: (JOIN(node) - KILL(node)) ∪ GEN(node) • At each branch: • Follow all paths, in any order, up to the node where the paths merge • Once all paths up to the merge are complete, continue at the merge node • If not all JOIN edges have been computed yet, substitute for the missing ones • the empty set (MAY) • the universal set (MUST) • For loops: • Repeat until the solution for every node in the loop no longer changes • This is called the “fixed point”
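The algorithm above can be sketched as a simple round-robin iteration. This is a minimal sketch under stated assumptions, not the slides' own implementation: the name `solve`, its parameters, and the tiny example CFG (`d1`, `loop`, `d2`, `end`) are all hypothetical. Instead of following paths branch by branch, it revisits every node until nothing changes, which reaches the same fixed point.

```python
def solve(nodes, edges, gen, kill, direction="forward", kind="may",
          universe=frozenset()):
    """Iterate (JOIN - KILL) | GEN over all nodes until a fixed point.

    nodes: list of node names; edges: list of (src, dst) pairs.
    gen/kill: dicts mapping a node to a set (missing nodes mean { }).
    Returns the set computed at each node.
    """
    if direction == "backward":
        # A backward analysis is a forward analysis on the reversed CFG.
        edges = [(dst, src) for src, dst in edges]
    preds = {n: [s for s, d in edges if d == n] for n in nodes}
    # Not-yet-computed values: empty set for MAY, universal set for MUST.
    initial = frozenset() if kind == "may" else frozenset(universe)
    result = {n: initial for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            incoming = [result[p] for p in preds[n]]
            if not incoming:
                joined = frozenset()          # entry node: nothing flows in
            elif kind == "may":
                joined = frozenset().union(*incoming)
            else:
                joined = frozenset.intersection(*incoming)
            new = (joined - kill.get(n, set())) | gen.get(n, set())
            if new != result[n]:
                result[n] = new
                changed = True
    return result


# Hypothetical loop: d1: x = 0; loop: x < 10; d2: x = x + 1; end
nodes = ["d1", "loop", "d2", "end"]
edges = [("d1", "loop"), ("loop", "d2"), ("d2", "loop"), ("loop", "end")]
gen = {"d1": {"d1"}, "d2": {"d2"}}
kill = {"d1": {"d2"}, "d2": {"d1"}}
reaching = solve(nodes, edges, gen, kill, direction="forward", kind="may")
```

In this example the loop header needs a second pass before it sees the definition `d2` coming around the back edge, which is why the iteration repeats until the sets stop changing.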

  4. Liveness • A variable is live at a node if its current value can be read during the remaining execution of the program • Domain: program variables • Backwards MAY analysis

  5. Liveness Example

  6. Liveness Transfer Functions • Exit • GEN(exit) = { } • KILL(exit) = { } • Conditions and Output • GEN(stmt) = Set of all variables appearing in the statement • KILL(stmt) = { } • Assignment • GEN(assignment) = Set of all variables appearing on the right-hand side • KILL(assignment) = Set containing the variable being assigned to • Declaration • GEN(declaration) = { } • KILL(declaration) = Set of variables being declared • Other • GEN(other) = { } • KILL(other) = { }
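The transfer functions above can be written down directly. The sketch below is illustrative only: the tuple encoding of statements (`("assign", target, rhs_vars)`, `("cond", vars)`, and so on), the function names, and the five-statement example program are assumptions made for this example.

```python
def gen_kill(stmt):
    """Return (GEN, KILL) for one statement, following the table above."""
    kind = stmt[0]
    if kind in ("cond", "output"):
        return set(stmt[1]), set()
    if kind == "assign":
        _, target, rhs_vars = stmt
        return set(rhs_vars), {target}
    if kind == "decl":
        return set(), set(stmt[1])
    return set(), set()  # exit and every other statement


def liveness(stmts, succ):
    """Backwards MAY analysis: variables live just before each node.

    stmts: dict node -> statement tuple; succ: dict node -> successors.
    """
    live = {n: set() for n in stmts}
    changed = True
    while changed:
        changed = False
        for n in stmts:
            # JOIN for a MAY analysis: union over all successors.
            joined = set().union(*(live[s] for s in succ.get(n, [])))
            gen, kill = gen_kill(stmts[n])
            new = (joined - kill) | gen
            if new != live[n]:
                live[n] = new
                changed = True
    return live


# Hypothetical program: var x, y; x = input; y = x + 1; output y; exit
stmts = {
    1: ("decl", ["x", "y"]),
    2: ("assign", "x", []),
    3: ("assign", "y", ["x"]),
    4: ("output", ["y"]),
    5: ("exit",),
}
succ = {1: [2], 2: [3], 3: [4], 4: [5]}
live = liveness(stmts, succ)
```

Note that KILL is applied before GEN, so an assignment such as `z = z - 1` first removes `z` and then adds it back because it appears on the right-hand side.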

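  7. Liveness Example • [Figure: control-flow graph from START to END with true/false branches, annotated with the set of live variables at each point, e.g. {x, y}, {x}, {x, z}, and { }; the graph layout is not recoverable from the transcript]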

  8. Liveness Application • Memory Allocation • Since y and z are never live at the same time, they can share the same memory location • Performance Optimization • The value assigned by z = z - 1 is never used, so the assignment can be eliminated
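The memory-allocation argument can be checked mechanically once the live sets are known. The following is a simplified sketch: the function names and the sample live sets are assumptions for illustration, and it only tests whether two variables ever appear in the same live set, which is a coarser criterion than a full register-allocation interference graph.

```python
from itertools import combinations


def can_share(live_sets, a, b):
    """True if a and b never appear together in any live set."""
    return not any(a in s and b in s for s in live_sets)


def sharable_pairs(live_sets):
    """All variable pairs that are never live at the same time."""
    variables = sorted(set().union(*live_sets))
    return [(a, b) for a, b in combinations(variables, 2)
            if can_share(live_sets, a, b)]


# Illustrative live sets, one per program point.
live_sets = [{"x", "y"}, {"x"}, {"x", "z"}, set()]
pairs = sharable_pairs(live_sets)
```

Here `x` overlaps with both other variables, so only `y` and `z` are candidates for sharing a location.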

  9. Liveness Application • Bug Checking • The assignment z = z - 1 is dead: the assigned value is never read • FindBugs says: “This instruction assigns a value to a local variable, but the value is not read or used in any subsequent instruction. Often, this indicates an error, because the value computed is never used.”
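A dead-store check of this kind follows directly from liveness: an assignment is dead when its target is not live immediately after it. The sketch below is a simplified illustration for straight-line code only, not how FindBugs is implemented; the function name, the `(target, used_vars)` encoding, and the sample statements are assumptions.

```python
def dead_stores(stmts, live_at_exit=frozenset()):
    """Indices of assignments whose value is never read afterwards.

    stmts: list of (target, used_vars) in execution order; target is
    None for statements that only read variables (e.g. output).
    """
    live = set(live_at_exit)
    dead = []
    # Walk backwards, since liveness is a backwards analysis.
    for i in range(len(stmts) - 1, -1, -1):
        target, used = stmts[i]
        if target is not None and target not in live:
            dead.append(i)
        live.discard(target)  # KILL: the assigned variable
        live |= set(used)     # GEN: variables read by the statement
    return sorted(dead)


# Hypothetical straight-line fragment:
#   0: x = input   1: z = x - 4   2: output x   3: z = z - 1
program = [("x", []), ("z", ["x"]), (None, ["x"]), ("z", ["z"])]
dead = dead_stores(program)
```

Only the final `z = z - 1` is reported. The earlier `z = x - 4` is still read by that dead statement, so a single pass does not flag it; removing the dead store and re-running the check would.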

  10. Data-Flow Framework Summary • Generic framework for different analyses • Each analysis defines • Domain • Approximation • Direction • Transfer Functions • Used for optimization, verification, and testing

  11. Reaching Definitions • A reaching definition for a node is an assignment statement that may have defined the current value of a variable at that node • Domain: assignment statements • Forwards MAY analysis

  12. Reaching Definitions Transfer Functions • Assignments • GEN(assignment) = Set containing the statement itself • KILL(assignment) = Set of all other statements that assign to the same variable • Declaration • GEN(decl) = Set containing the statement itself • KILL(decl) = { } • Other • GEN(other) = { } • KILL(other) = { }
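The GEN and KILL sets for assignments can be derived from a table of which statement assigns which variable. The sketch below is illustrative: the function names, the `defs` and `preds` dictionaries, and the small if-then example are assumptions, and declarations are omitted for brevity.

```python
def gen_kill_sets(defs):
    """defs: dict statement label -> variable it assigns.

    GEN is the statement itself; KILL is every other statement that
    assigns to the same variable.
    """
    gen = {d: {d} for d in defs}
    kill = {d: {o for o, v in defs.items() if v == var and o != d}
            for d, var in defs.items()}
    return gen, kill


def reaching_definitions(nodes, preds, defs):
    """Forwards MAY analysis: definitions reaching the exit of each node."""
    gen, kill = gen_kill_sets(defs)
    out = {n: set() for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            # JOIN for a MAY analysis: union over all predecessors.
            reach_in = set().union(*(out[p] for p in preds.get(n, [])))
            new = (reach_in - kill.get(n, set())) | gen.get(n, set())
            if new != out[n]:
                out[n] = new
                changed = True
    return out


# Hypothetical CFG: d1: x = 1; c: if (...); d2: x = 2 (then branch);
# d3: y = x (merge point reached from both c and d2)
nodes = ["d1", "c", "d2", "d3"]
preds = {"c": ["d1"], "d2": ["c"], "d3": ["c", "d2"]}
defs = {"d1": "x", "d2": "x", "d3": "y"}
reach = reaching_definitions(nodes, preds, defs)
```

At the merge point both `d1` and `d2` reach `d3`, because the analysis is a MAY analysis and either branch could have been taken.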

  13. Reaching Definitions Example

  14. Reaching Definitions Example • [Figure: control-flow graph from START to END containing six assignments labeled a1 to a6 (x = input, y = x/2, x = x - y, z = x - 4, x = x/2, z = z - 1), the conditions x > 1, y > 3, and z > 0, and the statement output x, annotated with the set of reaching definitions at each point; the graph layout is not recoverable from the transcript]

  15. Reaching Definitions Applications • FindBugs: “NP: Possible null pointer dereference” • Debugging • “Slicing” tools • Following chains of reaching definitions backwards to track down bugs • Basis for information-flow security • Discussed in the lectures on security

  16. Exercise • Compute the reaching definitions for each node, using the iterative data-flow algorithm. • Show the solution after each loop iteration.

      function test(r1, r2, r3, r4, r5) {
        while (r1 < 10) {
          r1 = r1 + 1;
          r5 = r1 * 2;
          if ((r1 % 2) == 0)
            r2 = 0;
          else
            r2 = r2 + 1;
          r4 = r2 + r1;
        }
        return r4 + r5;
      }
