This shape analysis algorithm statically determines information about dynamically allocated storage in recursive programs, such as whether a pointer variable is not NULL or whether two data structures are disjoint. It is applied to several recursive list-manipulating procedures to verify that they cause no memory leaks and no NULL dereferences.
Interprocedural Shape Analysis for Recursive Programs Noam Rinetzky Mooly Sagiv
Shape Analysis • Static program analysis • Determines information about dynamically allocated storage • A pointer variable is not NULL • Two data structures are disjoint • The algorithm is Conservative
Applications of Shape Analysis • Cleanness • Dor, Rodeh, Sagiv [SAS2000] • Parallelization • Assmann, Weinhardt [PMMPC93] • Hendren, Nicolau [TPDS90] • Larus, Hilfinger [PLDI88]
Current State • Good intraprocedural analyses • Sagiv, Reps, Wilhelm [TOPLAS 1998] • Analyze the bodies of list-manipulating procedures: • reverse, insert, delete • Expensive, imprecise interprocedural analyses of recursive procedures
Main Results • Interprocedural shape analysis algorithm for programs manipulating linked lists • Handles recursive procedures • Prototype implementation • Successfully analyzed several list-manipulating procedures • insert, delete, reverse, reverse_append • Properties verified • An acyclic list remains acyclic • No memory leaks • No NULL dereference
Running Example

#include <stdlib.h>

typedef struct List { int data; struct List *n; } *L;

L create(int s) {
    L t = NULL;
    if (s <= 0) return NULL;
    t = (L) malloc(sizeof(*t));   /* the slide has sizeof(*L), but L is a type name */
    t->data = s;
l2: t->n = create(s - 1);
    return t;
}

int main(void) {
    L r = NULL;
    int k;
    …
l1: r = create(k);
}
Selected Memory States (figures for k=3; stack records are listed from main's record to the top, heap cells are named by their data value) • Before the call at l1: stack: exit (k=3, r = NULL) • Innermost invocation: stack: exit (k=3, r = NULL), l1 (s=3, t), l2 (s=2, t), l2 (s=1, t), l2 (s=0, t = NULL); heap: cells 3, 2 and 1, each with n = NULL • After create(0) returns: top record is l2 (s=1, t); heap: cells 3, 2 and 1, each with n = NULL • After create(1) returns: top record is l2 (s=2, t); cell 2 is linked to cell 1 • After create(2) returns: top record is l1 (s=3, t); heap: 3 → 2 → 1 → NULL • After create(3) returns to main: stack: exit (k=3, r); r points to 3 → 2 → 1 → NULL
Where is the Challenge? • Dynamic allocation • Unbounded number of objects • Recursion • Unbounded number of activation records • Properties of: • Invisible instances of local variables • Dynamically allocated objects
Our Approach • Reduce the interprocedural shape analysis problem to an intraprocedural problem (program with procedures → program without procedures) • Explicit manipulation of the stack • Represent the activation record stack as a linked list: • Control information • Invisible instances of local variables
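The reduction can be illustrated on the running example. The following is a minimal sketch, not the authors' transformation: it assumes a hypothetical `Frame` record that stores the call-site (`cs`), the parameter `s`, the local `t`, and a `pr` link to the invoking record, and it rewrites `create` so that recursion becomes explicit pushes and pops on a linked list of records. The names `Frame`, `push`, `create_explicit`, `CS_L1` and `CS_L2` are illustrative only.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct List { int data; struct List *n; } *L;

/* Call-site labels of the running example: l1 in main, l2 in create. */
enum CallSite { CS_L1, CS_L2 };

/* Hypothetical activation record of create(). */
typedef struct Frame {
    enum CallSite cs;   /* call-site this invocation was called from */
    int s;              /* parameter s */
    L t;                /* local variable t */
    struct Frame *pr;   /* record of the invoking procedure */
} Frame;

static Frame *push(Frame *top, enum CallSite cs, int s) {
    Frame *f = malloc(sizeof *f);
    assert(f != NULL);
    f->cs = cs;
    f->s = s;
    f->t = NULL;
    f->pr = top;
    return f;
}

/* create(k) without procedure calls: the stack is an explicit linked list. */
L create_explicit(int k) {
    Frame *top = push(NULL, CS_L1, k);        /* the call at l1 */
    L ret;

    /* Call phase: each iteration runs one invocation up to the call at l2. */
    while (top->s > 0) {
        top->t = malloc(sizeof *top->t);
        assert(top->t != NULL);
        top->t->data = top->s;
        top->t->n = NULL;
        top = push(top, CS_L2, top->s - 1);   /* the call at l2 */
    }
    ret = NULL;                               /* s <= 0: return NULL */

    /* Return phase: pop a record, then resume the caller at its return-site. */
    for (;;) {
        Frame *done = top;
        enum CallSite cs = done->cs;
        top = done->pr;
        free(done);
        if (cs == CS_L1)
            return ret;                       /* back in main: r = result */
        top->t->n = ret;                      /* return-site of l2: t->n = result */
        ret = top->t;                         /* return t */
    }
}
```

In this sketch only the top record is directly accessible; the instances of `t` in the records reachable through `pr` play the role of the invisible instances of local variables.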
Our Algorithm • Abstract Interpretation • Concrete Semantics: • Concrete representation of memory states • Effect of program statements • Abstract Semantics: • Abstract representation of memory states • Transfer functions • Finds abstract representation of memory states at every program point
Concrete Memory Descriptors [Figure: the memory state at the innermost invocation (k=3; records exit, l1 s=3, l2 s=2, l2 s=1, l2 s=0) drawn as a graph: stack nodes labeled csexit, csl1, csl2, csl2, and csl2 marked top, linked by pr edges, with t edges from stack nodes to heap cells 3, 2 and 1]
Concrete Memory Descriptors • Properties of memory elements: • “type”: stack, heap • “visibility”: top • “call-site”: exit, csl1, csl2 • Relationships between memory elements: • value of local variables: t, r • n-successor: n • invoked by: pr [Figure: the descriptor graph of the previous slide]
Bounding the Representation • Concrete Memory Descriptors represent memory states • Every object is represented uniquely • Abstract Memory Descriptors • Conservatively represent Concrete Memory Descriptors • A bounded representation
3-Valued Properties • True • False • Don’t know (1/2) [Figure: example nodes, one with top = 1/2]
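The three truth values can be sketched in C as follows. This assumes Kleene's 3-valued logic, which is what the framework of Sagiv, Reps and Wilhelm builds on; the integer encoding and the names `Kleene`, `k_and`, `k_or`, `k_not` and `k_join` are choices made for this sketch.

```c
#include <assert.h>

/* Encoding assumed by this sketch: False < Half ("don't know") < True. */
typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

/* Conjunction is the minimum in the order False < Half < True. */
Kleene k_and(Kleene a, Kleene b) { return a < b ? a : b; }

/* Disjunction is the maximum in the same order. */
Kleene k_or(Kleene a, Kleene b) { return a > b ? a : b; }

/* Negation swaps True and False and keeps Half. */
Kleene k_not(Kleene a) {
    assert(a <= K_TRUE);
    return (Kleene)(K_TRUE - a);
}

/* Join of two observations: equal values are kept, different values
   become "don't know". */
Kleene k_join(Kleene a, Kleene b) { return a == b ? a : K_HALF; }
```

Here `k_join(K_TRUE, K_FALSE)` yields `K_HALF`: a property that holds for one merged element and fails for another is recorded as unknown.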
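Abstraction [Figure: a concrete memory descriptor (stack nodes csexit, csl1, csl2, csl2, and csl2 marked top, linked by pr edges, with t edges to heap cells) next to the smaller abstract memory descriptor obtained from it]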
Bounding the Representation • Summarize nodes according to their unary properties • Join values of relationships • Convert a Concrete Memory Descriptor of arbitrary size into an Abstract Memory Descriptor of bounded size • Does the Abstract Memory Descriptor contain enough information?
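The two summarization steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the TVLA implementation: unary properties are packed into a bitmask, only one binary relationship (`n`) is tracked, and the types `Concrete` and `Abstract`, the bound `MAX_NODES` and the function `summarize` are hypothetical names.

```c
#include <assert.h>

#define MAX_NODES 16

typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

/* Concrete descriptor: unary[i] is a bitmask of node i's unary properties;
   n[i][j] != 0 means node j is the n-successor of node i. */
typedef struct {
    int count;
    unsigned unary[MAX_NODES];
    int n[MAX_NODES][MAX_NODES];
} Concrete;

/* Abstract descriptor: one node per distinct unary bitmask. */
typedef struct {
    int count;
    unsigned unary[MAX_NODES];
    int summary[MAX_NODES];          /* 1 if the node stands for several nodes */
    Kleene n[MAX_NODES][MAX_NODES];
} Abstract;

static Kleene k_join(Kleene a, Kleene b) { return a == b ? a : K_HALF; }

void summarize(const Concrete *c, Abstract *a) {
    int rep[MAX_NODES];                       /* abstract node of each concrete node */
    int seen[MAX_NODES][MAX_NODES] = {{0}};   /* has a->n[p][q] been set yet? */

    assert(c->count >= 0 && c->count <= MAX_NODES);
    a->count = 0;

    /* Step 1: merge nodes that agree on all unary properties. */
    for (int i = 0; i < c->count; i++) {
        int k;
        for (k = 0; k < a->count; k++)
            if (a->unary[k] == c->unary[i])
                break;
        if (k == a->count) {                  /* first node with this bitmask */
            a->unary[k] = c->unary[i];
            a->summary[k] = 0;
            a->count++;
        } else {
            a->summary[k] = 1;                /* a second node was merged in */
        }
        rep[i] = k;
    }

    /* Step 2: join the relationship values of all merged pairs. */
    for (int i = 0; i < c->count; i++)
        for (int j = 0; j < c->count; j++) {
            Kleene v = c->n[i][j] ? K_TRUE : K_FALSE;
            int p = rep[i], q = rep[j];
            a->n[p][q] = seen[p][q] ? k_join(a->n[p][q], v) : v;
            seen[p][q] = 1;
        }
}
```

Because the number of distinct bitmasks is bounded by the number of unary properties, the abstract descriptor has bounded size regardless of how many concrete nodes there are.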
Problem [Figure: two memory descriptors over stack nodes exit, csl1, csl2 and csl2 marked top, with pr and t edges]
Observing Properties of Invisible Variables • Explicitly track universal properties of invisible-variables • Different invisible instances of t cannot point to the same heap cell • Instrumentation properties • Track derived properties of memory elements
Some Instrumentation Properties • Pointed-to by an invisible instance of t • Pointed-to by more than one invisible instance of t • t is not NULL
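On a concrete memory state, the first two properties can be computed by walking the activation records below the top one. The sketch below assumes a hypothetical `Frame` record holding the local `t` and a `pr` link to the invoking record; the function names are illustrative and this is not how the analysis itself evaluates the properties on abstract descriptors.

```c
#include <assert.h>
#include <stddef.h>

typedef struct List { int data; struct List *n; } *L;

/* Hypothetical activation record: local variable t and link to the caller. */
typedef struct Frame {
    L t;
    struct Frame *pr;
} Frame;

/* Number of invisible instances of t (those in records below the top one)
   that point to heap cell c. */
static int invisible_t_count(const Frame *top, const struct List *c) {
    int count = 0;
    assert(top != NULL && c != NULL);
    for (const Frame *f = top->pr; f != NULL; f = f->pr)
        if (f->t == c)
            count++;
    return count;
}

/* Pointed-to by an invisible instance of t. */
int pointed_by_invisible_t(const Frame *top, const struct List *c) {
    return invisible_t_count(top, c) >= 1;
}

/* Pointed-to by more than one invisible instance of t. */
int shared_by_invisible_t(const Frame *top, const struct List *c) {
    return invisible_t_count(top, c) > 1;
}
```

In the running example every invocation allocates its own cell, so `shared_by_invisible_t` would be false for every heap cell.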
Memory Descriptors with Instrumentation [Figure: two memory descriptors over stack nodes exit, csl1, csl2 and csl2 marked top, with pr and t edges, annotated with instrumentation properties]
Problem - solved [Figure: memory descriptors over stack nodes exit, csl1, csl2 and csl2 marked top, with pr and t edges]
Why Does It Work • Shape analysis handles linked lists quite precisely (Sagiv, Reps, Wilhelm [TOPLAS98]) • The (intraprocedural) 3-valued logic framework of Sagiv, Reps and Wilhelm [POPL99] is used to analyze the resulting intraprocedural problem
Prototype Implementation • Implemented in TVLA [Lev-Ami, Sagiv SAS 2000] • Analyzed some recursive list manipulating programs • Verified cleanness properties: • No memory leaks • No NULL dereferences
Prototype Implementation
Procedure          Number of (3VL) Structures   Time (sec)
create             219                          7.31
delAll             139                          12.74
insert             344                          34.61
delete             423                          38.29
search             303                          8.07
append             326                          40.64
reverse            414                          47.56
reverse_append     797                          95.35
reverse_append_r   2285                         1204.13
Running example    208                          16.50
Conclusion • Need to know more than the potential values of invisible variables • Tracking properties of invisible variables helps to overcome the (necessary) imprecision introduced by summarizing their values • Instrumentation • Generic • Sharing by different instances of a local variable • List specific
Conclusion • Storing the call-site improves information propagation to return-sites • Shows how the intraprocedural framework of Sagiv, Reps and Wilhelm can be used for interprocedural analyses • Analysis of a complex data structure
Limitations • Small programs • No mutual recursion (Implementation) • Predefined instrumentation library • Easy to use, no need for user intervention • Might not be good for all programs
Further Work • Scaling the algorithm • Distinguishing between “relevant context” and “irrelevant” context • Analysis of programs manipulating Abstract Data Types
The End • Interprocedural Shape Analysis for Recursive Programs • Noam Rinetzky and Mooly Sagiv • Compiler Construction 2001 • www.cs.tau.ac.il/~maon