Implementing Next Generation Points-To in Open64

Rick Hank, Loreena Lee, Rajiv Ravindran, Hui Shi
Java, Compilers & Tools Lab, Hewlett Packard, Cupertino, California

Requirements
• Points-To Algorithm
  • Scalable – we must be able to apply this algorithm to large applications
  • Flow-insensitive – state-of-the-art analysis employed in production compilers is flow-insensitive (see scalability)
  • Context-sensitive – must provide at least context sensitivity for heap allocations
  • Field-sensitive – must be able to disambiguate structure field references
• Implementation
  • Preserve existing interfaces
    • ALIAS_MANAGER
    • ALIAS_RULE
  • Preserve existing flow-sensitive alias analysis (WOPT)
  • Provide a mechanism for extensibility

Approach
• Andersen-style (inclusion-based) points-to analysis
  • The Intel and gcc compilers employ inclusion-style points-to analysis
  • Anything less precise does not place Open64 ahead of the curve
• Modeled after work done by Erik Nystrom
  • Only work we found that studied the combination of inclusion, field sensitivity and context sensitivity
  • Fulcra Pointer Analysis Framework
    http://impact.crhc.illinois.edu/ftp/report/phd-thesis-erik-nystrom.pdf
  • Implementation heavily influenced by Erik’s IMPACT implementation
  • Several changes made to the core algorithm to reduce complexity and improve scalability
• Constraint graph solution method heavily influenced by:
  • Wave Propagation and Deep Propagation for Pointer Analysis, CGO 2009

What is a Constraint Graph?
• Nodes correspond to symbols
  • In our implementation a node represents an <ST, offset> pair
• Edges provide the constraints
  • Traditionally there are four types of constraints
  • Field sensitivity adds a fifth (Skew)
  • Each edge represents a subset relation
• Example 1: A = B
  • The points-to set of B is a subset of the points-to set of A
• Example 2: A = *B
  • If x is contained in the points-to set of B, then the points-to set of x is a subset of the points-to set of A
• Solving: computing the transitive closure
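The subset rules above can be sketched as a naive fixed-point solver. This is an illustration only, not the Open64 implementation: the names `Kind`, `Constraint`, `PointsTo`, `addAll` and `solve` are hypothetical, nodes are plain strings rather than <ST, offset> pairs, and the four constraint forms are assumed to be the classic address-of, copy, load and store (no Skew).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical names throughout; this is not the Open64 ConstraintGraph API.
enum class Kind { AddrOf, Copy, Load, Store };

struct Constraint {
  Kind kind;
  std::string dst;  // left-hand side
  std::string src;  // right-hand side
};

using PointsTo = std::map<std::string, std::set<std::string>>;

// Merge 'from' into pts[to]; report whether pts[to] grew.
// 'from' is taken by value so it never aliases the set being modified.
static bool addAll(PointsTo &pts, const std::string &to,
                   std::set<std::string> from) {
  std::set<std::string> &target = pts[to];
  std::size_t before = target.size();
  target.insert(from.begin(), from.end());
  return target.size() != before;
}

// Re-apply every constraint until no points-to set changes.
PointsTo solve(const std::vector<Constraint> &cs) {
  PointsTo pts;
  bool changed = true;
  while (changed) {
    changed = false;
    for (const Constraint &c : cs) {
      switch (c.kind) {
        case Kind::AddrOf:  // dst = &src
          changed |= addAll(pts, c.dst, {c.src});
          break;
        case Kind::Copy:  // dst = src: pts(src) is a subset of pts(dst)
          changed |= addAll(pts, c.dst, pts[c.src]);
          break;
        case Kind::Load: {  // dst = *src
          const std::set<std::string> targets = pts[c.src];
          for (const std::string &x : targets)
            changed |= addAll(pts, c.dst, pts[x]);
          break;
        }
        case Kind::Store: {  // *dst = src
          const std::set<std::string> targets = pts[c.dst];
          for (const std::string &x : targets)
            changed |= addAll(pts, x, pts[c.src]);
          break;
        }
      }
    }
  }
  return pts;
}
```

A production solver would use worklists and cycle merging instead of rescanning every constraint, but the subset relations being computed are the same.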

Constraint Graph Solver (simplified)

ConstraintGraphSolve::solveConstraints()
{
  do {
    findAndMergeSCCs();  // provide topological order
    while (!copySkewList.empty()) {
      ConstraintGraphEdge *e = copySkewList.pop();
      e->process();
    }
    while (!loadStoreList.empty()) {
      ConstraintGraphEdge *e = loadStoreList.pop();
      e->process(copySkewList);
    }
  } while (!copySkewList.empty());
}
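One common way to find the cycles that a step like findAndMergeSCCs() collapses is Tarjan's algorithm. The sketch below is an assumption about the general technique, not the Open64 code: `SccFinder` and `sccIds` are hypothetical names, nodes are integer indices, and `g[v]` lists the successors of node `v` along copy edges. Tarjan's algorithm numbers components in reverse topological order, so component 0 is always a sink.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper, not Open64's findAndMergeSCCs().
struct SccFinder {
  const std::vector<std::vector<int>> &succ;
  std::vector<int> index, low, comp, stack;
  std::vector<bool> onStack;
  int nextIndex = 0, nextComp = 0;

  explicit SccFinder(const std::vector<std::vector<int>> &g)
      : succ(g), index(g.size(), -1), low(g.size(), 0),
        comp(g.size(), -1), onStack(g.size(), false) {}

  void visit(int v) {
    index[v] = low[v] = nextIndex++;
    stack.push_back(v);
    onStack[v] = true;
    for (int w : succ[v]) {
      if (index[w] < 0) {  // tree edge: recurse, then inherit low-link
        visit(w);
        low[v] = std::min(low[v], low[w]);
      } else if (onStack[w]) {  // back edge into the current component
        low[v] = std::min(low[v], index[w]);
      }
    }
    if (low[v] == index[v]) {  // v is the root of a component
      int w;
      do {
        w = stack.back();
        stack.pop_back();
        onStack[w] = false;
        comp[w] = nextComp;
      } while (w != v);
      ++nextComp;
    }
  }
};

// Returns one component id per node; nodes in a cycle share an id.
std::vector<int> sccIds(const std::vector<std::vector<int>> &g) {
  SccFinder f(g);
  for (int v = 0; v < static_cast<int>(g.size()); ++v)
    if (f.index[v] < 0) f.visit(v);
  return f.comp;
}
```

All nodes sharing a component id have identical points-to sets under inclusion constraints, which is why they can be merged into a single node before propagation.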

Constraint graph: Example

int a, b;
struct foo { int x; int y; };
typedef struct foo FOO;

FOO *ex() {
  int *q;
  FOO *p;
  p = (FOO *) alloc(sizeof(FOO));
  p->x = a;
  p->y = b;
  q = &a;
  *q = b;
  return p;
}

void *alloc(int n) { return malloc(n); }

[Figure: constraint graph. Labels: p,0; *=; a,0; t2; h,0 {h,0}]

Constraint graph: Example (continued, same source)
[Figure: constraint graph. Labels: b,0; p,0; *=; *=; +4; t1; a,0; t2; h,0 {h,0}]

Constraint graph: Example (continued, same source)
[Figure: constraint graph. Labels: b,0; p,0; *=; *=; +4; *=; q,0; t1; a,0 {a,0}; t2; h,0 {h,0}]

Constraint graph: Example (continued, same source)
[Figure: constraint graph. Labels: b,0; p,0; *=; *=; =; +4; *=; q,0; t1; a,0 {a,0}; t2; h,0 {h,0}]

IPA Phase ordering

Alias Analysis Interface
• Complex analysis cannot be done repeatedly, à la alias classification
  • Motivates WHIRL annotations to facilitate access to alias information
• AliasTag
  • Provides mapping from WN to alias solution
  • Existing ALIAS_MANAGER, ALIAS_RULE interfaces use AliasTag to perform alias queries
  • Challenge:
    • How to preserve the WN → AliasTag association across BE?
    • Each IR lowering phase provides an opportunity to drop the association
    • Each WHIRL → CODEREP → WHIRL translation provides an opportunity to drop the association
    • AliasTags are associated with existing POINTS_TO and mapped into AUX_STAB, OCC during WOPT
• AliasAnalyzer
  • Provides “generic” alias analysis interface
  • Theoretically, alias classification could be abstracted behind this interface
  • Derived class NystromAliasAnalyzer implements our inclusion-based algorithm at –O2
  • Derived class IPA_NystromAliasAnalyzer implements our inclusion-based algorithm at -ipa
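A tag-based alias query could look roughly like the sketch below. It is not the Open64 AliasTag or AliasAnalyzer API: `AliasTable`, `newTag`, `mayAlias` and `kInvalidTag` are hypothetical, a tag is assumed to be an index into a table of points-to sets, and treating an unknown tag as "may alias" is an assumed conservative policy for tags dropped during lowering.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical: a tag is an index into a table of points-to sets.
using AliasTag = std::size_t;
const AliasTag kInvalidTag = static_cast<AliasTag>(-1);

class AliasTable {
 public:
  // Register a points-to set (node ids) and hand back its tag.
  AliasTag newTag(std::set<int> pointsTo) {
    sets_.push_back(std::move(pointsTo));
    return sets_.size() - 1;
  }

  // Two references may alias when their points-to sets share a node.
  bool mayAlias(AliasTag a, AliasTag b) const {
    // Assumed policy: a lost or unknown tag must be answered conservatively.
    if (a >= sets_.size() || b >= sets_.size()) return true;
    const std::set<int> &x = sets_[a];
    const std::set<int> &y = sets_[b];
    // Walk both sorted sets in step, looking for a common element.
    std::set<int>::const_iterator i = x.begin(), j = y.begin();
    while (i != x.end() && j != y.end()) {
      if (*i < *j) ++i;
      else if (*j < *i) ++j;
      else return true;
    }
    return false;
  }

 private:
  std::vector<std::set<int>> sets_;
};
```

Because a query needs only the two sets, the constraints themselves are not required once the solution has been computed.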

Summary Details
• Alias analysis spans IPL, IPA and BE
  • Initial constraint graphs are constructed/solved during IPL
  • Conveyed to IPA via standard summary mechanism
• Summary information
  • SUMMARY_CONSTRAINT_GRAPH_NODE
  • SUMMARY_CONSTRAINT_GRAPH_EDGE
  • SUMMARY_CONSTRAINT_GRAPH_STINFO
  • SUMMARY_CONSTRAINT_GRAPH_CALLSITE
  • SUMMARY_CONSTRAINT_GRAPH_MODRANGE
• Added support for passing summary from IPA to BE
  • New .ipa_summary section in .I files
  • Local summary (per PU) only
  • Convey only what is necessary to answer alias queries from backend components
  • Provide nodes and points-to sets, but no constraints (edges)

IPA Solver (simplified)

IPA_NystromAnalyzer::solve()
{
  do {
    // Top down, context insensitive
    do {
      prepare();   // connect actuals/formals, SCC detection
      solve();
      update();    // find resolved icalls
    } while (change);

    // Bottom up, context sensitive
    for (PU in reverse topological order) {
      apply-summaries();   // inline callee summary
      solve();
    }
    update();      // find resolved icalls
  } while (change);
}

Constraint graph: IPA Connect (same source as the example above)
[Figure: constraint graph. Labels: b,0; p,0; *=; *=; =; +4; *=; q,0; t1; a,0 {a,0}; =; t2; h,0 {h,0}]

Constraint graph: IPA Copy/Skew (same source as the example above)
[Figure: constraint graph. Labels: b,0; p,0 {h,0}; *=; *=; =; +4; *=; q,0; t1 {h,4}; a,0 {a,0}; =; t2; h,0; h,4 {h,0}]

Constraint graph: IPA Load/Store (same source as the example above)
[Figure: constraint graph. Labels: b,0; p,0 {h,0}; *=; *=; =; +4; *=; q,0; t1 {h,4}; a,0 {a,0}; =; =; =; t2; h,0; h,4 {h,0}]
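The Copy/Skew and Load/Store steps of the running example can be sketched as follows. This is a simplified illustration, not the Open64 edge-processing code: `Node`, `NodeSet`, `applySkew` and `resolveStore` are hypothetical names, and the sketch assumes offsets are never bounded or collapsed. In the example, p points to {h,0}, the +4 skew edge gives t1 the set {h,4}, and the store through t1 then becomes a copy edge from b,0 to h,4.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// A node is a <symbol, byte offset> pair, like the <ST, offset> nodes above.
using Node = std::pair<std::string, int>;
using NodeSet = std::set<Node>;

// Skew edge "dst = src + k": each target <s, o> of src contributes
// <s, o + k> to dst. Assumption: offsets are not bounded here; a real
// solver has to cap or collapse them to stay scalable.
NodeSet applySkew(const NodeSet &srcTargets, int skew) {
  NodeSet out;
  for (const Node &n : srcTargets)
    out.insert(Node(n.first, n.second + skew));
  return out;
}

// Store "*ptr = value": for each target t of ptr, add a copy edge
// value -> t. Edges are returned as (from, to) pairs.
std::set<std::pair<Node, Node>> resolveStore(const NodeSet &ptrTargets,
                                             const Node &value) {
  std::set<std::pair<Node, Node>> copies;
  for (const Node &t : ptrTargets)
    copies.insert(std::make_pair(value, t));
  return copies;
}
```

Resolving a store only adds copy edges; the new edges then go back on the copy/skew worklist, which is why the solver alternates between the two lists.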

Results: solver time

Results: percentage of total queries that return “no alias”; performance improvement

Challenges/Opportunities
• WHIRL IR
  • Not strongly typed – field annotations are informational
  • Flattened – lack of index operators makes it difficult to determine the field being referenced
  • Preserving WN → AliasTag mapping across lowering, IR translation
• Improve escape analysis to reduce effects of pointers escaping through external calls
• WOPT memory SSA is not field sensitive
  • Use of a single virtual symbol negates most (any?) benefit
  • Opportunity to leverage points-to sets to partition OCCs into multiple virtual symbols
• Convergence
  • Number of offsets being modeled – large points-to sets
  • Number of symbols being modeled – large points-to sets
• Call graph is not mutable
  • Alias analysis cannot update the call graph as indirect/virtual calls are resolved
  • Opportunity for downstream transformations to make use of a more precise call graph
• Inlining
  • Standalone inliner may not be sufficient
  • Code re-use in C++ seems to be problematic

Status
• Done:
  • Public branch on top-of-trunk (open64.net)
    http://svn.open64.net/svnroot/open64/branches/nextgenalias
  • Implementation at –O2/O3/-Ofast
  • Tested SPEC 2006/2000 fp/int
• To Do:
  • Context-sensitive solution
    • Implement summary creation, inlining
  • Address some scalability issues
    • Points-to sets are large due to the large number of modeled offsets and conservative collapsing
    • Solution:
      • Control the number of offsets we are willing to model
      • Collapse symbols where modeling both is unnecessary
  • Address remaining issues with “invalid AliasTag”
    • Tags lost during WHIRL transformations
  • More extensive correctness validation
  • Improve vsyms
  • Merge to ToT

Thanks! Questions?

BACKUP

Constraint Graph Properties

Context Sensitivity
• Heap sensitivity
  • Calls to known heap allocation routines produce a local “heap” symbol
    • E.g. malloc, calloc, new, etc.
    • Leverage new call side-effect table
  • The “heap” symbol will be cloned when the summary constraint graph is inlined during the context-sensitive walk of the call graph
• Unfortunately, full context sensitivity is not yet implemented
  • May provide a solution to some of the convergence issues we are seeing by reducing points-to set sizes
  • Will likely pose additional challenges, as we will significantly increase the number of nodes
• Interim solution – identify heap allocation wrappers
  • Wrappers are identified and marked during IPL
  • Provides one heap symbol per wrapper call site
    • E.g. MallocOrDie
  • Additional “heap” symbols introduced when connecting actual/formal parameters
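The "one heap symbol per wrapper call site" idea can be sketched as a naming scheme. Everything here is hypothetical and only for illustration: the `CallSite` record, the `heapSymbolFor` helper, the "heap@caller#id" name format, and the assumption that the set of allocators and marked wrappers is already available as a set of function names.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical call-site record; field names are illustrative only.
struct CallSite {
  std::string caller;
  std::string callee;
  int id;  // unique per call site within the caller
};

// Returns the name of the heap symbol for this call, or "" when the callee
// is neither a known allocation routine nor a marked wrapper.
std::string heapSymbolFor(const CallSite &cs,
                          const std::set<std::string> &allocators) {
  if (allocators.count(cs.callee) == 0) return "";
  return "heap@" + cs.caller + "#" + std::to_string(cs.id);
}
```

Two calls to the same wrapper from different sites get distinct symbols, so their results no longer share one points-to target as they would if only the malloc inside the wrapper produced a heap symbol.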
