Efficient Field-Sensitive Pointer Analysis for C
David J. Pearce, Paul H.J. Kelly and Chris Hankin
Imperial College, London, UK
d.pearce@doc.ic.ac.uk
www.doc.ic.ac.uk/~djp1/

What is Pointer Analysis?
• Determine pointer targets without running the program
• What is flow-insensitive pointer analysis?
• One solution for all statements, so precision is lost
• This trades precision for efficiency
• This work considers flow-insensitive pointer analysis only

    int a,b,*p,*q = NULL;
    p = &a;
    if(…) q = p;    // p → {a,b}, q → {a,NULL}
    p = &b;

Pointer analysis via set-constraints
• Generate set-constraints from the program and solve them
• Use a constraint graph for efficient solving

    (program)                (constraints)
    int a,b,c,*p,*q,*r;
    p = &a;                  // p ⊇ { a }
    r = &b;                  // r ⊇ { b }
    q = &c;                  // q ⊇ { c }
    if(...) q = p;           // q ⊇ p
    else q = r;              // q ⊇ r

(constraint graph) Nodes p, q, r with edges p → q and r → q. Initial solutions: p = {a}, r = {b}, q = {c}. After propagation: p = {a}, r = {b}, q = {a,b,c}.

Field-Sensitivity
• How to deal with aggregate types?
• The standard approach treats them as single variables

    typedef struct { int *f1; int *f2; } t1;
    int a,b,*p,*q,*r;
    t1 x;
    p = &a;       // p ⊇ { a }
    q = &b;       // q ⊇ { b }
    x.f1 = p;     // x ⊇ p
    x.f2 = q;     // x ⊇ q
    r = x.f1;     // r ⊇ x

(constraint graph) Nodes p, q, x, r with edges p → x, q → x and x → r. Initial solutions: p = {a}, q = {b}, x = {}, r = {}. After propagation: x = {a,b}, r = {a,b}.

Field-Sensitivity – A simple solution
• Use a separate node per field for each aggregate
• Node “x” is split in two

    typedef struct { int *f1; int *f2; } t1;
    int a,b,*p,*q,*r;
    t1 x;
    p = &a;       // p ⊇ { a }
    q = &b;       // q ⊇ { b }
    x.f1 = p;     // xf1 ⊇ p
    x.f2 = q;     // xf2 ⊇ q
    r = x.f1;     // r ⊇ xf1

(constraint graph) Nodes p, q, xf1, xf2, r with edges p → xf1, q → xf2 and xf1 → r. Initial solutions: p = {a}, q = {b}, all others empty. After propagation: xf1 = {a}, xf2 = {b}, r = {a}.

Problem – can take address of field in C
• The system so far has no mechanism for this
• First idea – use the string concatenation operator ||
• Works well for this example

    typedef struct { int *f1; int *f2; } t1;
    int **p;
    t1 x,*s;
    s = &x;           // s ⊇ { x }
    p = &(s->f2);     // p ⊇ (*s) || f2
                      // p ⊇ { x } || f2
                      // p ⊇ { xf2 }

(constraint graph) Nodes xf1 and xf2, each with solution {..}.

Problem – compatible types
• First idea – use the string concatenation operator ||
• Cast between types that are identical except for field names
• The derivation is the same as before, but node xf2 no longer exists!

    typedef struct { int *f1; int *f2; } t1;
    typedef struct { int *f3; int *f4; } t2;
    int **p;
    t1 *s;
    t2 x;
    s = (t1*) &x;     // s ⊇ { x }
    p = &(s->f2);     // p ⊇ (*s) || f2
                      // p ⊇ { x } || f2
                      // p ⊇ { xf2 }

(constraint graph) Nodes xf3 and xf4, each with solution {..}.

Field-Sensitivity – Our Solution
• Our solution – map variables to integers
• Solution sets become integer sets
• Use integer addition to model taking the address of a field
• Address of an aggregate is modelled by the address of its first field

    typedef struct { int *f1; int *f2; } t1;
    typedef struct { int *f3; int *f4; } t2;
    int **p;
    t1 *s;
    t2 x;
    s = (t1*) &x;     // s ⊇ { xf3 }
                      // s ⊇ { 2 }
    p = &(s->f2);     // p ⊇ s + 1
                      // p ⊇ { 2 } + 1
                      // p ⊇ { 3 }

(constraint graph) Nodes p, s, xf3 and xf4 are numbered 0 to 3; the derivation uses xf3 = 2 and xf4 = 3.

Conclusion
• Field-sensitive pointer analysis
• Presented a new technique for the C language
• Elegantly copes with language features
    • Taking the address of a field
    • Compatible types and casting
• Technique also handles function pointers without modification
• Experimental evaluation over 7 common C programs
    • Considerable improvements in precision obtained
    • But much higher solving times
    • And relative gains appear to diminish with larger benchmarks

Constraint Graphs (continued)
• What about statements involving a pointer dereference?
• They cannot be represented directly in the constraint graph
• Instead, add edges as the solution of q becomes known
• Thus, the computation is similar to dynamic transitive closure

    (program)                (constraints)
    int a,*r,*s,**p,**q;
    p = &r;                  // p ⊇ { r }
    s = &a;                  // s ⊇ { a }
    q = p;                   // q ⊇ p
    *q = s;                  // *q ⊇ s

(constraint graph) Nodes p, s, q, r with edge p → q. Initial solutions: p = {r}, s = {a}, q = {}, r = {}. Propagating along p → q gives q = {r}. Since r is now in the solution of q, the edge s → r is added for *q ⊇ s. Propagating along s → r gives r = {a}.
