Source Analysis for Security Trent Jaeger March 29, 2004
Example 2
get_free_buffer(struct stripe_head *sh, …)
{
    struct buffer_head *bh;
    unsigned long flags;

    save_flags(flags);
    cli();
    if ((bh = sh->buffer_pool) == NULL)
        return NULL;
    sh->buffer_pool = bh->b_next;
    bh->b_size = b_size;
    restore_flags(flags);
    return bh;
}
Example 4
int notify_change(struct dentry * dentry, struct iattr * attr)
{
    struct inode *inode = dentry->d_inode;
    …
    if (inode->i_op && inode->i_op->setattr) {
        error = security_inode_setattr(dentry, attr);
        if (!error)
            error = inode->i_op->setattr(dentry, attr);
    …
}
Find Software Bugs • Education • Difficult to know how code will be used • Testing • Misses many code paths, time consuming • Manual Inspection • Tedious and error prone • Compiler checking • Context independent • 4GL • Incomplete and don’t know how source code will be used • Assurance • Extremely costly and complex – what do we do about existing code?
Limited Source Code Analysis • Source code is the level at which security is defined • Problems manifest as errors in code (although design can be a problem too) • Compilers can check for various properties • Rules on program source • Programmers can express some properties • Semantic properties • Must be specified correctly (no/few false negatives) • Must not be too conservative (few false positives) • Should be robust to code changes
Source Code Analysis • Convert source code into a model • Convert property into a computation on the model • Report positive cases (violate/meet property) • Determine if cases are true or false • Resolve true cases • Refine model or property and repeat
Some Properties • Never/always do X • Never use floating point in kernel • Do X rather than Y • Always do X before/after Y • LSM mediation (Example 1) • Never do X before/after Y • In situation X, do (not) Y • Re-enable disabled interrupts (Example 2) • In situation X, do Y rather than Z
Program Models • Abstract Syntax Tree • Control flow • Data flow • Def-use chain • Aliases • Type constraints • …
Abstract Syntax Tree [Figure: syntax tree for sys_fcntl, do_fcntl, and fcntl_setlk. Node labels: func_decl, var_decl (struct file *filp, err), expr_stmt (=), call_decl (fget(fd), do_fcntl), call_stmt (fcntl_setlk(fd)), cmpd_stmt (security_op, use filp)]
Control Flow (Interprocedural) [Figure: the same syntax tree, annotated with interprocedural control flow]
Control Flow (Intraprocedural) [Figure: the same syntax tree, annotated with intraprocedural control flow]
Data Flow [Figure: the same syntax tree, annotated with data flow]
Def-Use [Figure: the same syntax tree, annotated with def-use chains]
Property Models • Finite State Automata • Start Operation • Disable Interrupts • Enable Interrupts • End Operation • Type Constraints • Unchecked type • Checked type • Expect checked type
[Figure: interrupt FSA. Transitions: enable, disable, End Op. Error states: Exit w/ disabled, double_disable, double_enable]
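The interrupt automaton on this slide can be written down directly as a transition function. The following is a minimal sketch in C, not code from any of the tools discussed; the names `intr_state`, `intr_event`, `intr_step`, and `intr_check_path` are hypothetical, and the end-of-operation rule is applied once after the last event of a path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout: a sketch of the interrupt FSA. */
enum intr_state {
    IS_ENABLED,            /* initial state */
    IS_DISABLED,
    ERR_DOUBLE_ENABLE,
    ERR_DOUBLE_DISABLE,
    ERR_EXIT_DISABLED
};

enum intr_event { EV_DISABLE, EV_ENABLE };

/* One FSA transition. Error states are absorbing. */
enum intr_state intr_step(enum intr_state s, enum intr_event e)
{
    switch (s) {
    case IS_ENABLED:
        return e == EV_DISABLE ? IS_DISABLED : ERR_DOUBLE_ENABLE;
    case IS_DISABLED:
        return e == EV_ENABLE ? IS_ENABLED : ERR_DOUBLE_DISABLE;
    default:
        return s;
    }
}

/* Run the FSA over the events seen on one path, then apply the
 * end-of-operation rule: ending while disabled is an error. */
enum intr_state intr_check_path(const enum intr_event *ev, size_t n)
{
    enum intr_state s = IS_ENABLED;

    assert(ev != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        s = intr_step(s, ev[i]);
    return s == IS_DISABLED ? ERR_EXIT_DISABLED : s;
}
```

Feeding it the single event `EV_DISABLE` (a path that disables interrupts and then returns) yields `ERR_EXIT_DISABLED`, while `EV_DISABLE, EV_ENABLE` ends in `IS_ENABLED`.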
CQUAL Static Analysis • CQUAL is a type-based static analysis tool from UC Berkeley • Enables qualification of types, analogous to const • Enables verification that the type passed to a function is the type expected • Used previously for verification of format string vulnerabilities • Wagner’s group at UC Berkeley in USENIX Security 2001
CQUAL Principles • Interprocedural control flow • do_fcntl calls fcntl_getlk • Def-Use data flow • Assignments tracked back to def where type is declared • Type inference • Variables have type restrictions • Cannot assign a variable to another of an incompatible type • Cannot send a variable as a parameter to a function unless its type is compatible
Sensitivity: Flow and Context • Flow-sensitivity • The order of statements in a function matters • CQUAL is not flow-sensitive • Must create new ‘checked’ variable • Must use GCC to verify intraprocedural paths • Must use GCC to find reassignments after ‘checked’ • Context-sensitivity • A function is treated differently depending on calling site • CQUAL is not context-sensitive • If two functions call the same descendant, both callers must satisfy the same requirements in CQUAL
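Why a flow-insensitive analysis needs a separate 'checked' variable can be seen in a toy qualifier propagation. This is a simplified sketch under stated assumptions, not CQUAL's inference algorithm: variables are small integers, every assignment or parameter pass is a `flow` edge, and the function name `has_qualifier_error` and its parameters are hypothetical. Because statement order is ignored, an unchecked value assigned anywhere in the function reaches every use of the target variable.

```c
#include <assert.h>
#include <string.h>

#define MAX_VARS 16

/* Assignment or parameter passing: the value of `from` reaches `to`. */
struct flow { int from, to; };

/* Returns 1 if a variable that must be 'checked' can receive a value
 * from an 'unchecked' source, 0 otherwise. The flows are iterated to
 * a fixed point, so the order in which they are listed is irrelevant. */
int has_qualifier_error(int n_vars,
                        const unsigned char *is_unchecked_src,
                        const unsigned char *must_be_checked,
                        const struct flow *flows, int n_flows)
{
    unsigned char unchecked[MAX_VARS];
    int changed = 1;

    assert(n_vars > 0 && n_vars <= MAX_VARS);
    memcpy(unchecked, is_unchecked_src, (size_t)n_vars);

    while (changed) {
        changed = 0;
        for (int i = 0; i < n_flows; i++) {
            if (unchecked[flows[i].from] && !unchecked[flows[i].to]) {
                unchecked[flows[i].to] = 1;
                changed = 1;
            }
        }
    }
    for (int v = 0; v < n_vars; v++)
        if (unchecked[v] && must_be_checked[v])
            return 1;
    return 0;
}
```

With variable 0 as an unchecked `filp`, variable 1 as the parameter of a controlled operation, and variable 2 as a fresh checked copy, the flow `0 -> 1` is reported and `2 -> 1` is accepted. Adding `0 -> 2` anywhere, even after the use, makes the analysis report an error again, which is the reassignment case the slide delegates to GCC.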
CQUAL Postscript • Flow-sensitive CQUAL • Initial performance was not good • Field level data flow • Extensions at UC Berkeley • We switched to a new tool (JaBA) • Interprocedural control flow • Intraprocedural control flow (flow-sensitive) • Context-sensitive • Variable and field-level data flow • Replicated analyses of Examples 1 and 3 while preventing false positives of Example 4
Meta-compilation • Compilers • Have program source • Can implement straightforward rules for source checking • Lack domain semantics of programs • Programmers • Have domain semantics of programs • Need a means to express these semantics such that they can be checked
Meta-compilation • Model • GCC abstract syntax tree • Compute interprocedural control flow graph • Compute intraprocedural control flow graph • Properties • Finite state automata • Generate extensions from specification • Computation • FSA state transitions are represented by patterns • Find syntactic patterns in code • Build intraprocedural paths with relevant state changes • For each path, compute resultant state transitions
Properties: Meta Language (metal)
{ #include "linux-includes.h" }
sm check_interrupts {
    // Variables used in patterns
    decl { unsigned } flags;
    // Patterns to specify enable/disable fns
    pat enable = { sti(); }
               | { restore_flags(flags); } ;
    pat disable = { cli(); } ;
    // States - implicit initial state
    is_enabled: disable ==> is_disabled
              | enable ==> { err("double enable"); } ;
    is_disabled: enable ==> is_enabled
               | disable ==> { err("double disable"); }
               | $end_of_path$ ==> { err("exiting w/ intr disabled"); } ;
}
[Figure: interrupt FSA. Transitions: enable, disable, End Op. Error states: Exit w/ disabled, double_disable, double_enable]
Example 2 Processing
get_free_buffer(struct stripe_head *sh, …)
{
    struct buffer_head *bh;
    unsigned long flags;

    save_flags(flags);
    cli();                          /* disable */
    if ((bh = sh->buffer_pool) == NULL)
        return NULL;                /* end of path: err */
    sh->buffer_pool = bh->b_next;
    bh->b_size = b_size;
    restore_flags(flags);           /* enable */
    return bh;                      /* end of path */
}
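One way to repair the flagged path is to restore the saved flags before the early return. The sketch below is a user-space stand-in rather than the kernel code: the two struct definitions, the global `irq_enabled`, and the macro versions of `save_flags`, `cli`, and `restore_flags` are assumptions made so that the example compiles on its own, and the elided parameters are replaced by a hypothetical `int b_size` argument.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-ins (assumptions) for the kernel types and primitives. */
struct buffer_head { struct buffer_head *b_next; int b_size; };
struct stripe_head { struct buffer_head *buffer_pool; };

int irq_enabled = 1;                       /* models the interrupt flag */
#define save_flags(f)    ((f) = (unsigned long)irq_enabled)
#define cli()            (irq_enabled = 0)
#define restore_flags(f) (irq_enabled = (int)(f))

/* Hypothetical fixed variant: every exit path re-enables interrupts. */
struct buffer_head *get_free_buffer_fixed(struct stripe_head *sh, int b_size)
{
    struct buffer_head *bh;
    unsigned long flags;

    assert(sh != NULL);
    save_flags(flags);
    cli();                                 /* disable */
    if ((bh = sh->buffer_pool) == NULL) {
        restore_flags(flags);              /* enable on the early exit too */
        return NULL;
    }
    sh->buffer_pool = bh->b_next;
    bh->b_size = b_size;
    restore_flags(flags);                  /* enable */
    return bh;
}
```

With this change both paths pass through an enable transition before the end of the path, so the automaton never ends in the disabled state.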
Meta-Compilation System • Compile Metal State Machine (SM) with mcc • Dynamically link SM into xg++ • Selected at compile time with a command-line flag • At a branch, the SM state is “pushed down” both paths • Paths are built and checked against SM • All paths vs. one pass (flow-sensitive vs. flow-insensitive) • Prune paths that reach a join in the same state • Fixed point: loop until all possible paths have been reached
Prune Paths • Choice of paths does not matter, so only one needs to be kept
[Figure: paths with disable and enable transitions meeting at a join]
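Pruning can be sketched as a worklist that remembers which (node, state) pairs it has already explored: a path that arrives at a join in a state seen before adds nothing new and is dropped, and loops terminate for the same reason. This is an illustration of the idea only, not xgcc's implementation; the `cfg_node` layout, the two-state encoding, and the function `check_cfg` are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAX_NODES 16
#define N_STATES  2               /* 0 = enabled, 1 = disabled */

enum ev { EV_NONE, EV_DISABLE, EV_ENABLE };

/* Hypothetical CFG node: one event, up to two successors (-1 = none). */
struct cfg_node { enum ev event; int succ[2]; };

/* Explores each (node, incoming state) pair at most once. Returns the
 * number of violations found; *visits counts the pairs explored. */
int check_cfg(const struct cfg_node *cfg, int n, int entry, int *visits)
{
    unsigned char seen[MAX_NODES][N_STATES];
    int work_node[MAX_NODES * N_STATES], work_state[MAX_NODES * N_STATES];
    int top = 0, errors = 0;

    assert(n > 0 && n <= MAX_NODES && entry >= 0 && entry < n);
    memset(seen, 0, sizeof seen);
    *visits = 0;
    seen[entry][0] = 1;
    work_node[top] = entry; work_state[top] = 0; top++;

    while (top > 0) {                       /* fixed point: worklist empty */
        int node, state, is_exit;

        top--;
        node = work_node[top]; state = work_state[top];
        (*visits)++;
        if (cfg[node].event == EV_DISABLE) {
            if (state == 1) { errors++; continue; }   /* double disable */
            state = 1;
        } else if (cfg[node].event == EV_ENABLE) {
            if (state == 0) { errors++; continue; }   /* double enable */
            state = 0;
        }
        is_exit = cfg[node].succ[0] < 0 && cfg[node].succ[1] < 0;
        if (is_exit && state == 1) { errors++; continue; } /* exit disabled */
        for (int i = 0; i < 2; i++) {
            int s = cfg[node].succ[i];
            if (s < 0 || seen[s][state])
                continue;                   /* prune: same state at join */
            seen[s][state] = 1;
            work_node[top] = s; work_state[top] = state; top++;
        }
    }
    return errors;
}
```

On a diamond-shaped graph whose two branches carry no events, the join node and everything after it are explored once rather than once per path.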
Assertion Checking – Side Effects
{ #include "linux-includes.h" }
sm Assert flow-insensitive {
    // Match expressions
    decl { any } expr, x, y, z;
    decl { any_call } any_fcall;
    decl { any_args } args;
    // States: find asserts and detect side effects
    start: { assert(expr); } ==>
        { mgk_expr_recurse(expr, in_assert); } ;
    in_assert: { any_fcall(args) } ==> { err("fn call"); }
             | { x = y } ==> { err("assignment"); }
             | { z++ } ==> { err("post-increment"); }
             | { z-- } ==> { err("post-decrement"); } ;
}
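The reason side effects inside an assertion are worth flagging is that the standard `assert` macro expands to `((void)0)` when `NDEBUG` is defined, so the expression is never evaluated in such builds. The sketch below makes both behaviours visible in one program; `ASSERT_CHECKED`, `ASSERT_NDEBUG`, `take_checked`, and `take_ndebug` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical macros mirroring the two behaviours of assert():
 * checked builds evaluate the expression, NDEBUG builds do not. */
#define ASSERT_CHECKED(expr) ((expr) ? (void)0 : abort())
#define ASSERT_NDEBUG(expr)  ((void)0)

/* The two functions are identical at the source level. The asserted
 * expression post-increments *next, which is the kind of side effect
 * the checker above reports. */
int take_checked(int *next)
{
    assert(next != NULL);            /* side-effect free: safe in any build */
    ASSERT_CHECKED((*next)++ >= 0);
    return *next;
}

int take_ndebug(int *next)
{
    assert(next != NULL);
    ASSERT_NDEBUG((*next)++ >= 0);   /* the increment silently disappears */
    return *next;
}
```

Starting from a counter of 0, `take_checked` returns 1 and `take_ndebug` returns 0, so the program's behaviour depends on whether assertions are compiled in.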
xgcc Extension (PLDI 2002) • Match patterns to statements • Identify state transitions • Compute intraprocedural paths • Prune those that cannot matter (no state changes) • Combine intraprocedural paths into complete paths • Analysis instance based on a transition from a start state • Paths are generated for each instance • Assignments result in creating a new instance that is a copy
Checking memory management [Figure: FSA for a pointer. States: unknown, null, not-null, freed, stop. Transition labels: allocation; conditional check on ptr implying not null; conditional check on ptr implying null; dereference; free; overwrite; end path]
Checking memory management • Intraprocedural control flow • Distinguish between paths with null and non-null pointers • Interprocedural control flow • “Global analysis” done in the PLDI paper by combining intraprocedural paths • Data flow • None, pure syntactic comparison • Assignment does result in replication of state machine for assigned variable • Finds bugs, but does not guarantee absence • Does not track assignments to structure fields • No aliases • False positives • Syntactic path-sensitivity keeps them moderate
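The "no aliases" limitation can be made concrete with a toy checker that keeps one state per variable name and, on assignment, copies the source variable's state into a new instance. It is a simplified sketch, not the xgcc checker: variables are single lowercase letters, allocation is assumed to yield a non-null pointer, and `check_stmts` and its types are hypothetical.

```c
#include <assert.h>

enum pstate { P_UNTRACKED, P_NOT_NULL, P_FREED };
enum op { OP_ALLOC, OP_ASSIGN, OP_FREE, OP_DEREF };

/* One statement over single-letter variables; src is used by OP_ASSIGN only. */
struct stmt { enum op op; char dst; char src; };

/* Returns the number of double-free / use-after-free reports. */
int check_stmts(const struct stmt *s, int n)
{
    enum pstate st[26];
    int errors = 0;

    for (int i = 0; i < 26; i++)
        st[i] = P_UNTRACKED;
    for (int i = 0; i < n; i++) {
        int d = s[i].dst - 'a';

        assert(d >= 0 && d < 26);
        switch (s[i].op) {
        case OP_ALLOC:
            st[d] = P_NOT_NULL;
            break;
        case OP_ASSIGN: {
            int src = s[i].src - 'a';
            assert(src >= 0 && src < 26);
            st[d] = st[src];          /* new instance: copy of current state */
            break;
        }
        case OP_FREE:
        case OP_DEREF:
            if (st[d] == P_FREED)
                errors++;             /* double free or use after free */
            else if (s[i].op == OP_FREE && st[d] == P_NOT_NULL)
                st[d] = P_FREED;
            break;
        }
    }
    return errors;
}
```

The sequence `p = alloc; free(p); free(p)` is reported, but `p = alloc; q = p; free(p); free(q)` is not, because freeing `p` does not update the copied instance for `q`. That is the sense in which the analysis finds bugs without guaranteeing their absence.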
Other Example Analyses • Example 3 – (check fcntl and set_fowner) • If we know the required authorizations for each operation, we can define the states of these ops • We don’t know this (tedious to specify) • We use a consistency analysis (ACM TISSEC, May 2004) • Example 4 – (distinguish between dentry->inode and inode) • Specify that { inode = dentry->inode } links inode state with dentry state • Note that this is not computed from first principles, so manual effort is required to ensure it is correct
xgcc Postscript • Lots of papers on finding bugs using these techniques • Lots of simple errors in code • Other aspects • Automating annotation • Statistical analysis • Coverity, Inc.
GCC Architecture • Compilers for C, C++, Java • Consists of a sequence of compilation steps, all of which can be hooked (3.0 and greater) • Eventually, all languages share a single representation (GIMPLE) • Then converts to Register Transfer Language (RTL), at which point all typing is lost
MOPS • Aim to provide a ‘sound’ analysis architecture • That is, no false negatives for their model • Program model • Pushdown automata of program • Property model • Finite state automata of security property • Temporal properties • Like xgcc, there is no real data flow analysis • Unlike xgcc, language for properties is not defined
Formal Basis • FSA M accepts a language B of security property violations • All operation sequences accepted by M violate the security property • PDA P accepts all feasible program traces T • Traces are interprocedural combinations of intraprocedural control flow paths • Note that traces are a control flow representation • Problem: decide if any trace violates the security property • Ask whether $T \cap B = \emptyset$ • Represented by $L(M) \cap L(P) = \emptyset$ • Intersection of PDA and FSA can be computed efficiently • Note that $T \subseteq L(P)$, so some infeasible traces are in $L(P)$
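The soundness argument on this slide can be spelled out in a few lines, using the slide's own symbols: B is the language of violations accepted by M, T is the set of feasible traces, and L(P) is the language of the PDA. The last line is the reason a reported violation still has to be inspected by hand.

```latex
\begin{align*}
B &= L(M)
  && \text{(violating sequences are exactly those $M$ accepts)} \\
T &\subseteq L(P)
  && \text{(the PDA over-approximates the feasible traces)} \\
T \cap B &\subseteq L(P) \cap L(M)
  && \text{(intersect both sides with $L(M)$)} \\
L(M) \cap L(P) = \emptyset &\;\Rightarrow\; T \cap B = \emptyset
  && \text{(no feasible trace violates the property)} \\
L(M) \cap L(P) \neq \emptyset &\;\nRightarrow\; T \cap B \neq \emptyset
  && \text{(the witness may be an infeasible trace)}
\end{align*}
```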
Example 2
get_free_buffer(struct stripe_head *sh, …)
{
    struct buffer_head *bh;
    unsigned long flags;

    save_flags(flags);
    cli();
    if ((bh = sh->buffer_pool) == NULL)
        return NULL;
    sh->buffer_pool = bh->b_next;
    bh->b_size = b_size;
    restore_flags(flags);
    return bh;
}
[Figure: interrupt FSA. Transitions: enable, disable, End Op. Error states: Exit w/ disabled, double_disable, double_enable]
Example 1 [Figure: two FSAs. Labels in the first: assign; zero, free; check; use; unmediated; Unassigned Use. Labels in the second: assign; check; use; unmediated]
MOPS Distinguishing Features • Modularity • Can create a hierarchy of FSAs • Haven’t seen this used… • Pattern variables • “bound to any expression that satisfies context constraints” • Difference from xgcc patterns? • Modeling • PDA and FSA are combined into a composite PDA that accepts $L(M) \cap L(P)$ • Can determine all the FSA states that an instruction can be executed in
Modeling OS for MOPS • Find all kernel variables that affect security • Done manually • Determine the states in the FSA for each • Done manually • Determine transitions between states • Transition in FSA • Automated state space explorer • Execute all paths and create transitions automatically
Setuid • Variable euid determines privilege • euid can be modified by several functions: • setuid, seteuid, setreuid, setresuid • Value of euid depends on the values of other variables on input to these system calls • ruid, suid • cap_effective, cap_permitted • These variables are found manually • Transitions indicate system calls that lead to changes in variables
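How the new value of euid depends on ruid and suid can be illustrated with a deliberately simplified model of `seteuid`. It assumes the classic rule that a process with euid 0 may choose any value while any other process may only pick its current ruid, euid, or suid; capabilities (cap_effective, cap_permitted) are ignored, and `struct ids`, `model_seteuid`, and `is_privileged` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* The kernel variables that the FSA states are built from (simplified). */
struct ids { int ruid, euid, suid; };

/* Simplified seteuid transition. Returns 0 on success and -1 when the
 * request is refused. Capabilities are not modeled (an assumption). */
int model_seteuid(struct ids *p, int new_euid)
{
    assert(p != NULL);
    if (p->euid == 0 ||
        new_euid == p->ruid || new_euid == p->euid || new_euid == p->suid) {
        p->euid = new_euid;
        return 0;
    }
    return -1;
}

/* In this model the FSA state of interest is derived from euid alone. */
int is_privileged(const struct ids *p)
{
    assert(p != NULL);
    return p->euid == 0;
}
```

For a setuid-root program started by user 1000 (ruid 1000, euid 0, suid 0), `seteuid(1000)` moves to the unprivileged state, but `seteuid(0)` succeeds afterwards because suid is still 0, so the drop was only temporary. An automated state-space explorer would enumerate such transitions for every call and every reachable combination of values.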