Pointer Analysis – A Survey Vishwanath Raman (call me vishwa please) vishwa@soe.ucsc.edu Dec. 1, 2004
I did say Pointer Analysis • He certainly is a pointer.. • By first impressions he seems earnest. • But also wickedly gleeful. • The puffing out of his chest is suggestive.. Seriously…
?/! (clearly borrowed)
• What is it?
  • For variables of pointer type, which objects may they point to at runtime?
• Where is it used?
  • Compiler optimizations: register allocation, constant propagation.
  • Bug detection: NULL pointer dereference.
  • Security violations: buffer overruns.
  • Tracking resource usage in static schedulers.
Example
Consider the following C snippet:
int x, y, *p, **q;
p = &x;
q = &p;
*q = &y;
Points-to sets: [q -> {p}, p -> {x, y}]
The survey covers? • Analysis based on a type system by Bjarne Steensgaard (Microsoft Research). • Analysis based on BDDs from the SABLE group at McGill. • An application of pointer analysis for bug detection from the SUIF group at Stanford.
In the interest of time… BDD-based approach
A BDD (binary decision diagram) is a directed acyclic graph used to represent boolean functions and state spaces.
Interpreted as sets (the two example diagrams are not reproduced here):
1. S = {11}
2. S = {01, 10, 11}
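To make "a boolean function interpreted as a set" concrete, here is a minimal sketch. The diagrams from the slide are not available, so the two functions below are an assumption: AND and OR over two bits are simply functions whose satisfying assignments are exactly the two sets listed. The helper name `as_set` is hypothetical, and a real BDD package would store a shared decision graph rather than enumerate assignments.

```python
from itertools import product

def as_set(f, nbits):
    """Return the set of bit strings on which boolean function f is true."""
    return {
        "".join(str(b) for b in bits)
        for bits in product((0, 1), repeat=nbits)
        if f(*bits)
    }

# Assumed functions: chosen only because they match the sets on the slide.
s1 = as_set(lambda x1, x2: x1 and x2, 2)  # satisfying assignments of x1 AND x2
s2 = as_set(lambda x1, x2: x1 or x2, 2)   # satisfying assignments of x1 OR x2
```

Each bit position corresponds to one BDD variable, so a set of n-bit strings is just the characteristic function of that set.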
Analyze this...
a = allocate; // encode a = 00, location A = 00
b = allocate; // encode b = 01, location B = 01
c = allocate; // encode c = 10, location C = 10
a = b;
c = b;
Points-to for allocate, Y = {(a, A), (b, B), (c, C)}
Points-to for assignments, X = {(b, a), (b, c)}
In terms of bit strings (each bit is a BDD variable): Y = {(0000), (0101), (1010)} and X = {(0100), (0110)}
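The encoding step can be checked mechanically. The sketch below concatenates the two-bit codes from the slide for each pair; the dictionary and function names (`VAR_BITS`, `LOC_BITS`, `encode`) are hypothetical and only serve the illustration.

```python
# Two-bit codes taken from the slide's comments.
VAR_BITS = {"a": "00", "b": "01", "c": "10"}   # variables
LOC_BITS = {"A": "00", "B": "01", "C": "10"}   # allocation sites

def encode(pairs, left_codes, right_codes):
    """Encode each (left, right) pair as one concatenated bit string."""
    return {left_codes[l] + right_codes[r] for l, r in pairs}

Y = {("a", "A"), ("b", "B"), ("c", "C")}   # variable -> allocation site
X = {("b", "a"), ("b", "c")}               # source variable -> destination variable

Y_bits = encode(Y, VAR_BITS, LOC_BITS)
X_bits = encode(X, VAR_BITS, VAR_BITS)
```

With these codes `Y_bits` and `X_bits` come out as the two sets of four-bit strings given on the slide.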
BDD operations to die for
RelProd(X, Y, V1) = {(v2, h) | ∃ v1. ((v1, v2) ∈ X and (v1, h) ∈ Y)}
Points-to for allocate, Y = {(a, A), (b, B), (c, C)}
Points-to for assignments, X = {(b, a), (b, c)}
RelProd(X, Y, V1) = {(a, B), (c, B)}
To get well-formed BDDs, there are two variable domains (V1 and V2) with the same encoding.
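On explicit sets of pairs, the relational product is a join on the shared first component followed by dropping that component. The sketch below uses plain Python sets in place of BDDs, so it illustrates the meaning of the operation and not how a BDD library computes it; the name `rel_prod` is hypothetical.

```python
def rel_prod(X, Y):
    """Join X and Y on their first component v1, then quantify v1 away.

    X holds (v1, v2) assignment edges, Y holds (v1, h) points-to pairs.
    """
    return {(v2, h) for (v1, v2) in X for (u1, h) in Y if v1 == u1}

Y = {("a", "A"), ("b", "B"), ("c", "C")}
X = {("b", "a"), ("b", "c")}
result = rel_prod(X, Y)
```

Only `b` appears as a source in `X`, so only the pair `(b, B)` from `Y` contributes, and both destinations `a` and `c` pick up location `B`.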
More operations to die for
Replace swaps the variables of one domain for the corresponding variables of another domain.
Replace(RelProd(X, Y, V1)) = {(a, B), (c, B)}
Now a and c are from the V1 domain rather than the V2 domain.
Union has the usual meaning
Union(Replace(RelProd(X, Y, V1)), Y) = {(a, A), (b, B), (c, C), (a, B), (c, B)}, as desired.
Remember the program:
a = allocate;
b = allocate;
c = allocate;
a = b;
c = b;
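Putting the three operations together gives one propagation step. The sketch below tags each variable with its domain so that Replace has something visible to do, and it wraps the step in a loop that repeats until nothing changes. The loop is an assumption on my part (the slides show a single step, which is already enough for this program), and all function names are hypothetical.

```python
def rel_prod(X, Y):
    """Join on the shared V1 variable and drop it."""
    return {(v2, h) for (v1, v2) in X for (u1, h) in Y if v1 == u1}

def replace(rel, src="V2", dst="V1"):
    """Move variables from domain src to domain dst, keeping their names."""
    return {((dst, name) if dom == src else (dom, name), h)
            for ((dom, name), h) in rel}

def solve(X, Y):
    """Repeat Union(Replace(RelProd(X, pts)), pts) until a fixed point."""
    pts = set(Y)
    while True:
        new = pts | replace(rel_prod(X, pts))
        if new == pts:
            return pts
        pts = new

# Assignment edges run from a V1 variable (source) to a V2 variable (destination).
X = {(("V1", "b"), ("V2", "a")), (("V1", "b"), ("V2", "c"))}
Y = {(("V1", "a"), "A"), (("V1", "b"), "B"), (("V1", "c"), "C")}

points_to = {(name, h) for ((_, name), h) in solve(X, Y)}
```

For a chain of copies such as `a = b; b = c;` a single step would not be enough, which is why the sketch iterates.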
An epitome of elegance • The type-based approach defines a type system over a storage model and assigns types to locations. • Any two locations (variables) are given distinct types unless they HAVE to have the same type for ALL statements to be well-formed. • Types are joined through unification. • The algorithm produces a storage shape graph from which points-to sets and alias sets can be read off.
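The unification idea can be sketched with a union-find structure in which every equivalence class of locations has at most one pointee class. This is a heavily simplified illustration in the spirit of the type-based approach, not Steensgaard's algorithm as published: it joins unconditionally, ignores functions, and handles only three statement forms. The names `Node`, `find`, `join`, `target` and the statement tuples are all hypothetical. The input is the C snippet from the Example slide.

```python
class Node:
    """One location, or a placeholder for a not-yet-known pointee class."""
    def __init__(self, name=None):
        self.parent = self
        self.pointee = None                      # class this class points to
        self.names = {name} if name else set()   # variables in this class

def find(n):
    while n.parent is not n:
        n.parent = n.parent.parent   # path halving
        n = n.parent
    return n

def join(a, b):
    """Unify two classes; their pointee classes must then be unified too."""
    a, b = find(a), find(b)
    if a is b:
        return a
    b.parent = a
    a.names |= b.names
    if a.pointee is None:
        a.pointee = b.pointee
    elif b.pointee is not None:
        join(a.pointee, b.pointee)
    return a

def target(n):
    """Return the class n points to, creating a placeholder if needed."""
    r = find(n)
    if r.pointee is None:
        r.pointee = Node()
    return find(r.pointee)

def analyze(stmts):
    nodes = {}
    def var(v):
        if v not in nodes:
            nodes[v] = Node(v)
        return nodes[v]
    for kind, lhs, rhs in stmts:
        if kind == "addr":            # lhs = &rhs
            join(target(var(lhs)), var(rhs))
        elif kind == "copy":          # lhs = rhs
            join(target(var(lhs)), target(var(rhs)))
        elif kind == "store_addr":    # *lhs = &rhs
            join(target(target(var(lhs))), var(rhs))
        else:
            raise ValueError(f"unknown statement kind: {kind}")
    return {v: sorted(target(n).names) for v, n in nodes.items()}

# p = &x; q = &p; *q = &y;
shape = analyze([("addr", "p", "x"), ("addr", "q", "p"), ("store_addr", "q", "y")])
```

Because `x` and `y` are both reachable through `p`, they are forced into one class, which is where unification trades precision for near-linear running time.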
Claim to fame • The bug detector from the SUIF group is more orthodox. • It uses a variant of static single assignment (SSA) form to compute def-use chains. • The def-use chains are analyzed for potential violations such as buffer overruns. • The technique is inter-procedural, flow-sensitive and context-sensitive.
Thanks. If you are still interested and just can’t wait to get your hands on the survey - www.soe.ucsc.edu/~vishwa/publications/Pointers.pdf