On the C integer types: Defining the problem and offering a solution Kevin Krause
Question
x = 10;
if (x > -10) {
    // do something important …
}
• True • False • Don’t know
The answer depends on the context • The numeral 10 denotes a different value in each base • binary (2) • decimal (10) • octal (8) • hexadecimal (16)
The answer depends on the context • If x is declared in C as unsigned int x;, then false • -10 is converted to an unsigned int • -10 becomes 4294967286 (with a 32-bit unsigned int)
The answer depends on the context • However, if x is declared in C as unsigned char x;, then true • x is promoted to int before the comparison, so the test is 10 > -10
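The two declarations can be compared side by side. The following is a minimal sketch; the function names are hypothetical and chosen only for illustration, and the second function assumes the usual case where int can represent every unsigned char value.

```c
/* Hypothetical helper: x is an unsigned int, so -10 is converted to
   unsigned int (UINT_MAX - 9) before the comparison takes place. */
int greater_than_minus_ten_uint(unsigned int x)
{
    return x > -10;
}

/* Hypothetical helper: x is an unsigned char, so it is promoted to int
   and the comparison is carried out on signed values. */
int greater_than_minus_ten_uchar(unsigned char x)
{
    return x > -10;
}
```

Called with 10, the first function yields 0 (false) and the second yields 1 (true), even though the source expression is identical.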
C integers are prone to error • Overflow/underflow • Sign error • Truncation error • All 3 can produce unexpected results!
Integer overflow/underflow • Occurs when the result of an operation is greater than the type’s maximum (e.g., INT_MAX) or less than its minimum (e.g., INT_MIN) • According to the standard (C99) • For signed integers, the behavior is undefined • The result is unpredictable • Usually, the situation is ignored by compilers • For unsigned integers, the result silently wraps (it is reduced modulo 2^N for an N-bit type)
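Because signed overflow is undefined, any check has to happen before the operation rather than after it. A minimal sketch, assuming plain int operands; the names add_would_overflow and wrapping_add are hypothetical, not part of any library:

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true if a + b would fall outside [INT_MIN, INT_MAX].
   The test itself never overflows, because it only subtracts b from
   the limit on the side where that is safe. */
bool add_would_overflow(int a, int b)
{
    if (b > 0) {
        return a > INT_MAX - b;
    }
    if (b < 0) {
        return a < INT_MIN - b;
    }
    return false;
}

/* Unsigned addition is well defined: the result is reduced modulo
   UINT_MAX + 1, so it wraps instead of overflowing. */
unsigned int wrapping_add(unsigned int a, unsigned int b)
{
    return a + b;
}
```

For example, add_would_overflow(INT_MAX, 1) reports true, while wrapping_add(UINT_MAX, 1u) quietly returns 0.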
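Integer overflow/underflow examples
// From <limits.h> with an 8-bit char: SCHAR_MAX = 127, SCHAR_MIN = -128
// The arithmetic is done in int; the out-of-range result is converted when stored
signed char over, under;
over = SCHAR_MAX + 1; // typically -128
under = SCHAR_MIN - 1; // typically 127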
Integer sign error • An unexpected sign change • Usually occurs when converting (implicitly or by cast) between unsigned and signed integer types
Integer sign error example
// From <limits.h> with a 32-bit int: INT_MIN = -2147483648, UINT_MAX = 4294967295
int signed_error;
unsigned int unsigned_error;
signed_error = UINT_MAX; // typically -1
unsigned_error = INT_MIN; // 2147483648
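One way to avoid the sign errors in the example is to test the value before converting it. A minimal sketch; both function names are hypothetical:

```c
#include <limits.h>
#include <stdbool.h>

/* True if u can be stored in an int without changing its value. */
bool uint_fits_in_int(unsigned int u)
{
    return u <= (unsigned int)INT_MAX;
}

/* True if i can be stored in an unsigned int without changing its value.
   Every non-negative int is representable as an unsigned int. */
bool int_fits_in_uint(int i)
{
    return i >= 0;
}
```

A caller would convert only when the check succeeds and otherwise report an error, so UINT_MAX never turns into a negative int and INT_MIN never turns into a large unsigned value.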
Integer truncation error • Occurs when a value is assigned to an integer type that is too narrow to hold it • The high-order bits of the value are lost
Integer truncation error
1111 1111 1111 1111 1111 1111 1111 1111 = 4294967295 // UINT_MAX
1111 1111 = 255 // UCHAR_MAX
Integer truncation error
↓↓↓↓ ↓↓↓↓ ↓↓↓↓ ↓↓↓↓ ↓↓↓↓ ↓↓↓↓ ↓↓↓↓ ↓↓↓↓
1111 1111 1111 1111 1111 1111 1111 1111
0000 0000 1111 1111 1111 1111 1111 1111 1111 1111
Integer truncation error example
// From <limits.h> with a 32-bit int: UINT_MAX = 4294967295
unsigned char uchar_trunc_error;
uchar_trunc_error = UINT_MAX; // 255
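The truncation in the example can be made visible, and guarded against, with two small helpers. This is a sketch only; fits_in_uchar and truncate_to_uchar are hypothetical names.

```c
#include <limits.h>
#include <stdbool.h>

/* True if v survives a conversion to unsigned char unchanged. */
bool fits_in_uchar(unsigned int v)
{
    return v <= UCHAR_MAX;
}

/* Conversion to an unsigned type keeps only the low-order bits:
   the result is v modulo (UCHAR_MAX + 1). */
unsigned char truncate_to_uchar(unsigned int v)
{
    return (unsigned char)v;
}
```

With these, truncate_to_uchar(UINT_MAX) gives UCHAR_MAX, and fits_in_uchar(UINT_MAX) is false, which is the signal a caller can use to reject the assignment.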
Integer bugs open up vulnerabilities • Denial of service (DoS) • Arbitrary code execution • Privilege escalation • A vulnerability is defined as a set of conditions that allows violation of an explicit or implicit security policy
Common Vulnerabilities and Exposures (CVE): reported integer bugs • Prior to 2001, integer bugs were rare • 1 in 1999, 1 in 2000, 3 in 2001 • 39 reported through June 2012 • Steady growth in the 10 yrs preceding 2012 • 1208 reported between 2002 and 2011 (inclusive) • 342 .c files were at fault
Vulnerable platforms • Windows • Linux • Mac OS X • Apple iOS • Applications: Google Chrome, Wireshark, Mozilla Firefox, OpenOffice, LaTeX, …
w.r.t. C, how did we get there? • Developed between 1969 and 1973 • Coincided with development of Unix • Strengths • Expressiveness • Provides for both low (bitwise) level and high level operations • Robust set of operators and data types • Generally, human readable • Portability • Partially achieved by wholesale casting (both implicit and explicit) between data types • Systems programming language of choice
C’s weaknesses • Lack of bounds checking • Stacks • Array boundaries • Integer errors • Difficult to detect after they’ve happened • Compilers generally ignore them • Difficult to avoid • Subtle bugs can result in integer overflows • C is weakly typed and not type safe
Type strength vs. Type safety Type strength – a language characteristic determined by the amount of coercion (casting) permitted between the types Type safety – a requirement that a program has no unspecified behaviors
Type strength • Strongly typed (Haskell) • Prohibits all operations of mixed types • Nearly strongly typed (Ada) • Generally prohibits operations of mixed types, but provides overriding functions • Weakly typed (C, C++) • Offer little, if any, type-checking mechanisms • Untyped (assembly)
Type safety • Progress implies a well-typed program never enters a stuck state; it either enters the next state or terminates • If e : τ, then either (i) e → e′ for some e′ or (ii) e is a terminal value • Preservation implies that the type of an expression remains unchanged after execution • If e : τ and e → e′, then e′ : τ
C casting • The only safe cast is the upcast of a smaller-precision type to a larger-precision type of the same signedness • uchar → ushort → uint → ulong → ullong
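C conversion rules • Integer conversion rank • e.g., _Bool < uchar < ushort < uint < ulong < ullong • Integer promotions • Any integer type of lower rank than int is promoted to int (or to unsigned int if int cannot represent all of its values) • Usual arithmetic conversions • The operand of lower rank is converted to the type of the operand of higher rank; at equal rank, the signed operand is converted to unsigned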
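The effect of these rules shows up in very small expressions. A minimal sketch, assuming an 8-bit unsigned char and an int wider than 8 bits; the function names are hypothetical:

```c
/* Integer promotions: both operands become int before the addition,
   so 200 + 100 is computed as 300 and does not wrap at 255. */
int sum_after_promotion(unsigned char a, unsigned char b)
{
    return a + b;
}

/* Storing the promoted sum back into an unsigned char keeps only the
   low-order 8 bits (300 becomes 44 with an 8-bit char). */
unsigned char sum_stored_in_uchar(unsigned char a, unsigned char b)
{
    return (unsigned char)(a + b);
}

/* Usual arithmetic conversions: int and unsigned int have the same
   rank, so s is converted to unsigned int before the comparison. */
int is_less_than(int s, unsigned int u)
{
    return s < u;
}
```

The last function is the surprising one: is_less_than(-1, 1u) returns 0, because -1 is first converted to UINT_MAX.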
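Integers are subject to undefined behaviors • The standard distinguishes 3 kinds of behavior that a portable program cannot rely on • Implementation-defined behavior: the implementation chooses the behavior and documents it • Byte ordering (big-endian or little-endian) • Unspecified behavior: the standard allows several possibilities and places no restriction on which one is chosen • Order of evaluation • Undefined behavior: the program attempts something semantically invalid • Divide by 0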
Type safe C programs Responsibility falls squarely on the programmers’ backs
Mitigation approaches • Safe C dialects (subsets such as Clight) • Generally compiler based • Safe integer libraries • Range checking • Use of safe coding practices • Proposed type changes • Annotated types • Static analysis tools • All approaches rely on typing semantics
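As an illustration of the range-checking approach that safe integer libraries take, a multiplication can be validated before it is performed. This is a sketch under stated assumptions: checked_mul_size is a hypothetical name, not a function from any particular library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stores a * b in *out and returns true if the product fits in size_t.
   Returns false, leaving *out untouched, if the product would wrap. */
bool checked_mul_size(size_t a, size_t b, size_t *out)
{
    /* a * b exceeds SIZE_MAX exactly when a > SIZE_MAX / b (for b != 0). */
    if (b != 0 && a > SIZE_MAX / b) {
        return false;
    }
    *out = a * b;
    return true;
}
```

A typical use is computing an allocation size such as count * sizeof(element): the caller allocates only when the function returns true.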
Syntax of type • The standard defines type as follows: the meaning of a value stored in an object or returned by a function is determined by the type of the expression used to access it.
<c_type> := <object_type> | <function_type> | <incomplete_type>
Syntax of type <object_type> := <scalar_type> | <aggregate_type> | <union_type> *In addition to the standard integer types, C also supports both implementation defined extended and platform-specific integer types.
Static typing semantics • Type inference rule: given a syntactically well-formed phrase M and a type assignment Γ, find a phrase type Θ such that Γ ⊢ M : Θ is true • From the constraints for all operators (expressions and statements), formulate the inference rules in the general form:
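Static typing semantics, e.g., %
The operands of the % operator shall have integer type

$$
\frac{\Gamma \vdash e_1 : \mathrm{exp}[\tau_1] \quad \Gamma \vdash e_2 : \mathrm{exp}[\tau_2] \quad \mathrm{isIntegral}(\tau_1) \quad \mathrm{isIntegral}(\tau_2) \quad \mathrm{arithConv}\langle\tau_1, \tau_2\rangle ::= \tau' \quad e_2 \neq 0}{\Gamma \vdash e_1 \mathbin{\%} e_2 : \mathrm{exp}[\tau']}
$$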
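The premise e2 ≠ 0 generally cannot be decided from the types alone, so in ordinary C code it is often enforced at run time. A minimal sketch for int operands; checked_mod is a hypothetical name, and the INT_MIN % -1 guard is an extra assumption beyond the rule as stated (C11 treats that case as undefined because the corresponding quotient is not representable).

```c
#include <limits.h>
#include <stdbool.h>

/* Stores e1 % e2 in *out and returns true when the operation is defined.
   Returns false, leaving *out untouched, otherwise. */
bool checked_mod(int e1, int e2, int *out)
{
    if (e2 == 0) {
        return false;            /* violates the premise e2 != 0 */
    }
    if (e1 == INT_MIN && e2 == -1) {
        return false;            /* quotient INT_MIN / -1 is not representable */
    }
    *out = e1 % e2;
    return true;
}
```

Since C99 the quotient truncates toward zero, so the remainder takes the sign of e1: checked_mod(-7, 3, &r) stores -1.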
C type safety analysis tool • ACL2 • A Computational Logic for Applicative Common Lisp • Reasons for language choice • Applicative • Data types not prone to same error conditions as C • e.g., bignums • Proof generating capacity • c2acl2 translation tool
Analysis flow chart [Figure: flow chart with the elements .c source, c2acl2, Lisp model, symtab, type safety analyzer, final report]
c2acl2 input/output: Lisp model

.c source:
    int main() {
        int x = 3 + 1 + 1;
        int y = 3 * 1 + 1;
    }

Lisp model:
    (C2ACL2
      (FILE "init_test")
      (
        ";************************************************"
        "; Function Definition for function: main"
        ";************************************************"
        (FUNC (INT ) (ID "main" 1 ) NIL
          (BLOCK
            (DECL (INT ) (ID "x" 2 ) (INIT (ADD (ADD (LIT 3) (LIT 1)) (LIT 1))))
            (DECL (INT ) (ID "y" 3 ) (INIT (ADD (MULT (LIT 3) (LIT 1)) (LIT 1))))
          )
        )
        "; End of function main"
      )
    )
c2acl2 output: symtab
    (C2ACL2_SYMTAB
      (FILE "init_test")
      ((ID 3) (NAME "y") (TYPE (INT )) (SCOPE LOCAL))
      ((ID 2) (NAME "x") (TYPE (INT )) (SCOPE LOCAL))
      ((ID 1) (NAME "main") (TYPE (FUNC (INT ) (ID "main" 1 ) NIL)) (SCOPE GLOBAL))
    )
Analysis procedure
• Validate all declarations
    int x = -10; // ok
    unsigned int = -10; // gcc error
    unsigned int = x; // gcc ok
• Validate expressions • Validate control statements • Validate functions
Thank you Questions?