
Quantifying Uncertainty in Points-To Relations



Presentation Transcript


  1. Quantifying Uncertainty in Points-To Relations University of Edinburgh http://www.homepages.inf.ed.ac.uk/mc/Projects/VESPA Constantino Ribeiro and Marcelo Cintra

  2. Contributions • Scope • Measure and compare the sizes of static vs. dynamic points-to sets from a context- and flow-sensitive algorithm • Goal • Quantification of may-alias behavior that is intrinsic to applications • Classification of reasons for differences between static prediction and run-time behavior • Relevance • Important step toward future aggressive (speculative) optimizations • This work is not about a new pointer analysis algorithm LCPC 2006

  3. Outline • Motivation • Pointer Analysis • Evaluation Methodology • Experimental Setup and Results • Related Work • Conclusions

  4. Compiler Optimizations • To optimize well, a compiler must have accurate knowledge of: • Data flow: • Redundant variable elimination • Constant propagation • Register allocation • Control flow: • Dead code elimination • Instruction scheduling

  5. Data Flow Analysis • Data flow analysis: 100% precision is difficult to achieve because of: • Use of pointer variables • Same pointer may refer to different memory objects at different times • Same pointer may refer to many memory objects at some program point • Use of procedures • Side effects caused by call by reference and access to global data • Presence of control flow structures • Multiple def-use chains

  6. Real Points-to Behavior • So we want to • Understand the points-to behavior in real applications • Discover the causes of the ambiguities from static analysis • Facilitate more aggressive optimizations for ambiguous points-to relations

  7. Outline • Motivation • Pointer Analysis • Evaluation Methodology • Experimental Setup and Results • Related Work • Conclusions

  8. Points-to analysis • Data dependence analysis for pointer variables • At each point of the program: the set of pointer variables and the locations that they point to • Pointer variables may point to a single address or to many addresses • Pointer variables can even point to other pointers • Many possible points-to targets restrict optimizations in conservative compilers • Procedures and their calls increase the complexity and running time of the analysis

  9. Types of Algorithms • Sensitivity: • Flow-sensitive + Context-sensitive → more precise analysis • Granularity: • Fine: individual fields of complex data structures • Coarse: whole data structures and arrays • Naming of dynamically created memory objects: • Single name “heap” • Per memory allocation site • Per context

  10. Formal Representation • Location sets (locsets): individually named memory locations • Points-to relation R: a set of tuples (p, v), where p is a pointer and v is a location set • P and V: the sets of pointers and location sets, with R ⊆ P × V • Every tuple (p, v) ∈ R means that pointer p may point to location set v, written p → v • Points-to graph: G = (N, E) with nodes N = P ∪ V and edges E = R

  11. Formal Representation • Analysis: compute the points-to graph by applying basic dataflow equations for the pointer manipulation operations: • p1 = &p2; (Address-of assignment) • p1 = p2; (Copy assignment) • p1 = *p2; (Load assignment) • *p1 = p2; (Store assignment) • Result: a points-to graph covering all points-to relationships: • Definitely points-to • Possibly points-to
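
The four statement forms on this slide can be sketched as a small fixed-point computation over the points-to relation. The C sketch below is a flow-insensitive, inclusion-based (Andersen-style) propagation, which is simpler than the context- and flow-sensitive algorithm used in the talk. The names `PointsTo`, `Stmt`, `solve`, and `target_count`, and the `MAX_VARS` limit, are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_VARS 8  /* hypothetical limit for this sketch */

/* The four pointer-manipulating statement forms listed on the slide. */
typedef enum { ADDR_OF, COPY, LOAD, STORE } StmtKind;

typedef struct {
    StmtKind kind;
    int lhs;  /* index of p1 */
    int rhs;  /* index of p2 */
} Stmt;

/* pts[p][v] is true when the tuple (p, v) is in the points-to relation R. */
typedef struct {
    bool pts[MAX_VARS][MAX_VARS];
} PointsTo;

/* Add every target of src to dst; report whether dst gained a target. */
static bool merge_targets(PointsTo *g, int dst, int src) {
    bool changed = false;
    for (int v = 0; v < MAX_VARS; v++) {
        if (g->pts[src][v] && !g->pts[dst][v]) {
            g->pts[dst][v] = true;
            changed = true;
        }
    }
    return changed;
}

/* Apply all statements repeatedly until no new tuple is added. */
void solve(PointsTo *g, const Stmt *stmts, size_t n) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t i = 0; i < n; i++) {
            int p1 = stmts[i].lhs, p2 = stmts[i].rhs;
            assert(p1 >= 0 && p1 < MAX_VARS && p2 >= 0 && p2 < MAX_VARS);
            switch (stmts[i].kind) {
            case ADDR_OF:  /* p1 = &p2;  adds the tuple (p1, p2) */
                if (!g->pts[p1][p2]) {
                    g->pts[p1][p2] = true;
                    changed = true;
                }
                break;
            case COPY:     /* p1 = p2;   p1 gains every target of p2 */
                changed |= merge_targets(g, p1, p2);
                break;
            case LOAD:     /* p1 = *p2;  p1 gains the targets of each target of p2 */
                for (int t = 0; t < MAX_VARS; t++)
                    if (g->pts[p2][t])
                        changed |= merge_targets(g, p1, t);
                break;
            case STORE:    /* *p1 = p2;  each target of p1 gains the targets of p2 */
                for (int t = 0; t < MAX_VARS; t++)
                    if (g->pts[p1][t])
                        changed |= merge_targets(g, t, p2);
                break;
            }
        }
    }
}

/* Number of possible targets of p: 1 means definite, more means possible. */
int target_count(const PointsTo *g, int p) {
    int count = 0;
    for (int v = 0; v < MAX_VARS; v++)
        count += g->pts[p][v];
    return count;
}
```

With variables numbered a = 0, b = 1, p = 2, q = 3, the statements `p = &a; q = &b; p = q;` leave p with two possible targets and q with one.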

  12. Formal Representation • Definitely points-to: R = {(p, v)}, so only p = &v is possible • Possibly points-to: R = {(p, v), (p, z)}, so either p = &v or p = &z
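
A minimal C sketch of the two cases, assuming hypothetical function names `read_definite` and `read_possible`. In the second function a branch gives p two possible targets, so a static analysis has to report both, even though each execution de-references only one of them.

```c
#include <assert.h>
#include <stdbool.h>

/* Definitely points-to: p has one possible target, so R = {(p, v)}. */
int read_definite(void) {
    int v = 10;
    int *p = &v;
    return *p;  /* can only read v */
}

/* Possibly points-to: the branch makes R = {(p, v), (p, z)}. */
int read_possible(bool take_v) {
    int v = 10, z = 20;
    int *p;
    if (take_v)
        p = &v;
    else
        p = &z;
    /* The run-time target is always one of the two static targets. */
    assert(p == &v || p == &z);
    return *p;  /* a static analysis must assume either v or z */
}
```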

  13. Causes of Uncertainty in Pointer Analysis • Control flow • Pointer arithmetic • Unavailable procedure code • Recursive data structures • Aggregate data structures • Dynamically allocated objects

  14. Outline • Motivation • Pointer Analysis • Evaluation Methodology • Experimental Setup and Results • Related Work • Conclusions

  15. Static Source Code Analysis • An extension of Rugina and Rinard’s context- and flow-sensitive pointer analysis algorithm with the following new features: • Number of accesses through pointer de-references • Number of used and modified locsets, taken just before each: • Indirect use of a variable: ... = *p; • Indirect modification of a variable: *p = ...; • Multi-level indirect use of a variable: ... = * * p; • Multi-level indirect modification of a variable: * * p = ...; • Procedure call: foo(..., *p, ...); • Loops: one instance of the cases above per pointer de-reference • Procedures: one instance of each pointer de-reference per calling context

  16. Run-time Statistics Collection • Our tool inserts additional profiling code that: • Records all different run-time memory addresses • Counts the number of accesses to each different address • Each run-time access has a unique identifier (source code line number) that matches the run-time access to the static access • Problem: • Possible mismatches between static and dynamic: • Multiple static accesses may map to the same source code line with the same run-time counter: • The pointer analysis algorithm separates static accesses according to their context • Not all static accesses may appear at run time: • Portion of the code not executed due to input data
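
The profiling described on this slide can be sketched as a per-site table of distinct addresses and access counts. This is a minimal sketch under stated assumptions, not the tool's actual instrumentation: the names `record_access`, `distinct_targets`, `total_accesses`, and `sum_through`, the integer site identifier, and the fixed table sizes are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SITES 16  /* hypothetical table sizes for this sketch */
#define MAX_ADDRS 32

typedef struct {
    const void *addr[MAX_ADDRS];    /* distinct addresses seen at this site */
    unsigned long hits[MAX_ADDRS];  /* number of accesses to each address */
    size_t n_addrs;
} SiteProfile;

static SiteProfile profile[MAX_SITES];

/* Called just before an instrumented de-reference identified by site_id. */
void record_access(int site_id, const void *addr) {
    assert(site_id >= 0 && site_id < MAX_SITES);
    SiteProfile *s = &profile[site_id];
    for (size_t i = 0; i < s->n_addrs; i++) {
        if (s->addr[i] == addr) {
            s->hits[i]++;           /* address already known: bump its counter */
            return;
        }
    }
    assert(s->n_addrs < MAX_ADDRS);
    s->addr[s->n_addrs] = addr;     /* first access to this address */
    s->hits[s->n_addrs] = 1;
    s->n_addrs++;
}

/* Size of the dynamic points-to set observed at a site. */
size_t distinct_targets(int site_id) {
    return profile[site_id].n_addrs;
}

/* Total number of accesses observed at a site. */
unsigned long total_accesses(int site_id) {
    unsigned long total = 0;
    for (size_t i = 0; i < profile[site_id].n_addrs; i++)
        total += profile[site_id].hits[i];
    return total;
}

/* Instrumented version of a loop containing "sum += *p" at site 0. */
int sum_through(int *ptrs[], size_t n) {
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        record_access(0, ptrs[i]);  /* inserted profiling call */
        sum += *ptrs[i];            /* original de-reference */
    }
    return sum;
}
```

Comparing `distinct_targets` for a site against the number of static targets reported for the same site gives the static vs. dynamic set-size comparison that the talk measures.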

  17. Outline • Motivation • Pointer Analysis • Evaluation Methodology • Experimental Setup and Results • Related Work • Conclusions

  18. Experimental Setup • Applications: • SPEC2000 integer • Except gcc, gap, vortex and eon • MediaBench • SPEC2000 fp was tried but found not to be interesting as a pointer analysis problem • Standard input set used for the run-time experiments

  19. Applications Characteristics

  20. Applications Characteristics

  21. Static Analysis Tool • Extension of the SPAN package that: • Records all instances of pointer de-references + number of possible targets + source code line number • Uses and modifications via pointer de-references are counted separately • Static de-references to potentially uninitialized pointers use a special location set (unk) and are counted separately • Static de-references to dynamically allocated memory use a special location set (heap.X, where X is the context id) and are counted separately

  22. Static Analysis Results

  23. Static Analysis Results

  24. Profiling Environment • Monitor the actual run-time behaviour of static pointer de-references with multiple possible targets • The SPAN extension inserts profiling code wherever a static de-reference has multiple targets, recording the actual address accessed + a counter per address • Instrumented code is converted from SUIF format (.spd) to C code • Compiled on an Intel x86 platform with gcc 3.4.4 at the -O2 optimization level

  25. Run-time Uncertainty • [Figure; annotations: 59 + 1 + 24 = 84 and 59 + 1 + 1 + 23 = 84]

  26. Causes of Uncertainty

  27. Outline • Motivation • Pointer Analysis • Evaluation Methodology • Experimental Setup and Results • Related Work • Conclusions

  28. Related Work • Algorithms: • The basic SUIF1 package used in our study (SPAN) was introduced by R. Rugina and M. Rinard (PLDI ‘99); • E. M. Nystrom et al. proposed a fast and efficient summary-based pointer analysis algorithm (SAS ‘04); • M. Hind surveyed the main pointer analysis research and discussed unsolved questions (PASTE ‘01); • Quantification of run-time behavior: • Few works have investigated the impact of pointer analysis on overall compiler optimization; examples are B. Cheng and W. M. Hwu, M. Das et al., and R. Ghiya et al. (SIGPLAN ‘00 - PLDI, SAS ‘04, SIGPLAN ‘01 - PLDI); • An attempt to quantify the run-time behavior of points-to sets was made by M. Mock et al. (PASTE ‘01); • D. Liang et al. did similar work using Java programs (ISSTA ‘02)

  29. Related Work • Speculative probabilistic analysis: • A quantitative comparison of static points-to results against run-time behavior in a probabilistic framework was proposed by Y. S. Hwang et al. (LCPC ‘01) • Support for speculative analysis of points-to relations was proposed by J. Lin, T. Chen et al. (PLDI ‘03) • G. Ramalingam proposed extending static analysis with probabilistic information reflecting the actual run-time behavior (SIGPLAN ‘01 - PLDI)

  30. Outline • Motivation • Pointer Analysis • Evaluation Methodology • Experimental Setup and Results • Related Work • Conclusions

  31. Conclusions • For most of the benchmarks, static pointer analysis is very accurate • For some benchmarks, up to 25% of the de-references cannot be fully disambiguated statically • 27% of these de-references access a single memory location at run time, but many do access several different memory locations • Results suggest further compiler optimizations exploiting cases where the uncertainty does not appear at run time • We need to improve the handling of pointer arithmetic • We need new probabilistic approaches that capture actual control flow behavior

  32. Quantifying Uncertainty in Points-To Relations University of Edinburgh http://www.homepages.inf.ed.ac.uk/mc/Projects/VESPA Constantino Ribeiro and Marcelo Cintra
