1 / 20

Run-Time Type Checking for Pointers and Arrays in C

Run-Time Type Checking for Pointers and Arrays in C. Wes Weimer, George Necula Scott McPeak, S.P. Rahul, Raymond To. What are we doing?. Add run-time checks to C programs Catch pointer and array errors Minimal user effort More effort yields more performance Make C “feel” as safe as Java.

whitley
Download Presentation

Run-Time Type Checking for Pointers and Arrays in C

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Run-Time Type Checkingfor Pointers and Arrays in C Wes Weimer, George Necula Scott McPeak, S.P. Rahul, Raymond To OSQ Retreat

  2. What are we doing? • Add run-time checks to C programs • Catch pointer and array errors • Minimal user effort • More effort yields more performance • Make C “feel” as safe as Java OSQ Retreat

  3. Motivation • 50% of software errors are due to pointers • 50% of security errors due to buffer overruns • Such errors are often hard to reproduce • Difficult to locate true source of errors OSQ Retreat

  4. Overview • Motivation and System Goals • Checkable Errors • Run-Time Representation • Static Analysis • Preliminary Results • Future Work OSQ Retreat

  5. Goals • Support existing C code • Compatibility with external libraries • Handle GCC/MSVC source, Makefiles • Efficiency: 50% overhead rather than 1000% • Research: 5x, Purify 10x, BoundsChecker 150x • Default: many checks • Reduce by static analysis and/or user annotations OSQ Retreat

  6. Checkable Errors • Array and pointer bounds checks • Well-understood • Dereferencing a non-pointer (or NULL) • Complicated by casts and unions • Pointer arithmetic outside of object bounds • Not always caught by Purify, etc. • Freeing non-pointers, using freed memory OSQ Retreat

  7. Required Information • Checks require information about pointers • Length, base, capabilities, etc. • Can be stored in a global table • High table-lookup overhead: 500% • Can be stored with each pointer • struct { Foo *p; Foo *base; Foo *end; } SafeFoo • Library compatibility is tricky OSQ Retreat

  8. Must keep track of which locations are valid pointers Use per-object tags (like in GC) int **X; int *Y; *Y = 55; // OK X = Y; printf(“%d”,**X); // CRASH! More is Needed: Tags OSQ Retreat

  9. Run-Time Representation • Associate with each object in memory: • Base (lower bound), End (upper bound) • Tags (bitfield: 1 bit per word: is it a valid pointer?) • Checks bounds on every access, check tags on pointer reads, set tags on every write • Example: struct { int x; int *y; } *p; base p 0 1 end x y tags OSQ Retreat

  10. Kinds of Pointers • Many pointers only move forward (no casts) • Notably C strings: for (; *p; p++) if *p==‘c’ … • Such “forward” pointers need only an end bound • Many pointers are not involved in evil casts • But may use pointer arithmetic: arrays • Such “index” pointers need not carry tags OSQ Retreat

  11. Kinds of Pointers • Many pointers are completely “safe” • No evil casts, no arithmetic, etc. • e.g., FILE * fin = fopen(“input”, “r”); • These can be represented without any extra information (just a NULL check when used) • These cases yield better performance! OSQ Retreat

  12. Physical Subtyping • Define a formal notion of representation equality and subtyping for casts • Keep pointers and scalars separate! • Intuition: struct {char a[4];} = struct {int x;} struct {char a[4];} ¹ struct {int *x;} struct {int a; int b;} £ struct {int a;} OSQ Retreat

  13. Extended Type System • Simplified C types: • t ::= int | t ref q | t1t2 | t1+t2 -- Types • q ::= safe | string | seq | wild -- Qualifiers • safe = one word: standard C pointer • seq = three words: pointer, base, end • wild = two words: pointer, base, end, tags OSQ Retreat

  14. Type System (continued) • t ref wild, t must contain only wild pointers • May cast between safe and seq • wild may only be cast to or from wild • Physical equivalence: • shortshort = int a(b+c)= (ab)+(ac) • Width subtyping: t1t2£t1 OSQ Retreat

  15. Some Typing Rules address of field pointer arithmetic OSQ Retreat

  16. Handling Casts cast between pointers When is t1 ref q1£t2 ref q2 ? OSQ Retreat

  17. Static Analysis & Inference • For every pointer in the program • Try to infer the fastest safe representation • This is like eliminating classes of run-time checks we know will never fail • Can be formulated as constraint-solving • Apply subtyping rules to casts to get constraints • O(E) where E is number of casts/assignments (flow insensitive) OSQ Retreat

  18. Preliminary Results OSQ Retreat

  19. Future Work • Encode type information at run-time • More expressive casts with low overhead • More complete handling of function pointers • Handle C polymorphism • Uses void*, requires vast overhead • Efficient memory management • GC (or something else) takes free() as a hint OSQ Retreat

  20. Conclusion • Can add efficient run-time checks to C • Check bounds, valid pointers, frees, etc. • Static analysis is fast and useful • Can support existing C code • Whole programs are considered • safe pointers and wrappers for libraries • Default to many checks, infer them away OSQ Retreat

More Related