CodeSurfer / x86: A Platform for Analyzing x86 Executables. Gogul Balakrishnan, Radu Gruian, and Thomas Reps. Computer Science Dept., Univ. of Wisconsin; GrammaTech, Inc. April 2005.
Contents • Introduction • CodeSurfer / x86 Architecture • CodeSurfer / x86 Facilities • CodeSurfer / x86 Limitations • Recent Work
Introduction • Motivation • Ensuring that third-party applications do not perform malicious operations • Issues • Symbol-table and debugging information is either absent or cannot be relied upon • No abstract-location information (variables) is available • Existing binary-analysis tools cannot deal with these issues
Introduction • CodeSurfer • Program analysis and inspection tool • Programming API is bundled with the CodeSurfer programmable package
Introduction • IDAPro • Powerful commercial disassembly toolkit • Provides APIs for writing plug-ins
Introduction • CodeSurfer / x86 • Prototype system for analyzing x86 executables • Combines Value-Set Analysis (VSA) with facilities provided by the IDAPro and CodeSurfer toolkits • Recovers Intermediate Representations (IRs) of programs using VSA • Provides a platform for investigating the properties and behaviors of potentially malicious code
CodeSurfer / x86 Architecture • Overall Architecture
CodeSurfer / x86 Architecture • Value-Set Analysis (VSA) • Purpose • Over-approximate, at each program point, the set of values that each memory location (registers, stack slots, ...) might hold • Description • Separate the address space into a set of disjoint memory regions • Memory locations are represented as a-locs (abstract locations) • Ex) EAX -> (⊥, 4[0, 1]-20, ⊤) means that EAX holds no meaningful value in the global environment, may hold the value 4 * [0, 1] - 20 + ESP in one local environment, and may hold any value in another local environment
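The value-set in the EAX example can be modeled as a mapping from memory regions to sets of offsets. The following is a minimal Python sketch, not the tool's implementation: the class `StridedInterval`, the markers `TOP` and `BOTTOM`, and the region names `Global`, `AR_f`, and `AR_g` are all hypothetical, and the sketch assumes that 4[0, 1]-20 denotes the two offsets 4 * 0 - 20 and 4 * 1 - 20, as the slide reads it.

```python
from dataclasses import dataclass
from math import gcd

TOP = "TOP"      # any value is possible in this region
BOTTOM = None    # no value in this region


@dataclass(frozen=True)
class StridedInterval:
    """The set {lower, lower + stride, ..., upper}; stride 0 denotes one value."""
    stride: int
    lower: int
    upper: int

    def values(self):
        """Enumerate the concrete values this interval stands for."""
        if self.stride == 0:
            return {self.lower}
        return set(range(self.lower, self.upper + 1, self.stride))

    def join(self, other):
        """Return a strided interval that covers both operands."""
        stride = gcd(gcd(self.stride, other.stride),
                     abs(self.lower - other.lower))
        return StridedInterval(stride,
                               min(self.lower, other.lower),
                               max(self.upper, other.upper))


# EAX -> (bottom, 4[0, 1] - 20, top), using hypothetical region names.
eax = {
    "Global": BOTTOM,
    "AR_f": StridedInterval(4, -20, -16),   # 4 * [0, 1] - 20 = {-20, -16}
    "AR_g": TOP,
}
```

The `join` method shows where the over-approximation comes from: merging {0, 4, 8} with {2} yields a stride-2 interval over [0, 8], which also contains 6 even though neither operand did.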
CodeSurfer / x86 Architecture • IDAPro • Input • x86 Executable • Process • Disassemble x86 binary executable • Analyze static information • Output • Assembly code • Control Flow Graphs(CFGs) • Procedure boundaries • Statically known memory addresses and offsets
CodeSurfer / x86 Architecture • Connector – Parsing • Process • Parse the input data into the Connector’s data structures for VSA • Output • Parsed data that preserves all of the input information
CodeSurfer / x86 Architecture • Connector – Abstraction • Process • Run Value-Set Analysis over the a-locs • Output • Parsed data annotated with abstract information, including a-locs and their value-sets
CodeSurfer / x86 Architecture • Connector – Augmentation • Process • Augment the call graph and CFGs, which are incomplete at indirect jumps and indirect calls, using the a-locs and value-sets at each program point • Output • Data (IRs) in a CodeSurfer-compatible format
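The augmentation step can be pictured as adding one CFG edge for every target that the value-set of a jump register allows. The sketch below is a single-pass illustration under simplifying assumptions: the function `augment_cfg`, the use of block labels instead of addresses, and the convention that a missing value-set means "any value" are all hypothetical and are not taken from the Connector.

```python
def augment_cfg(cfg, indirect_jumps, value_sets):
    """Add an edge for each resolved indirect-jump target.

    cfg            -- dict: block label -> set of successor labels
    indirect_jumps -- dict: block label -> register used by the jump
    value_sets     -- dict: (block label, register) -> set of target labels;
                      a missing entry stands for "any value" (unresolved)
    Returns the augmented CFG and the list of unresolved jump sites.
    """
    augmented = {block: set(succs) for block, succs in cfg.items()}
    unresolved = []
    for site, register in indirect_jumps.items():
        targets = value_sets.get((site, register))
        if targets is None:
            unresolved.append(site)      # nothing known: leave the CFG as is
            continue
        for target in targets:
            augmented.setdefault(site, set()).add(target)
            augmented.setdefault(target, set())
    return augmented, unresolved


# A jump-table dispatch whose targets the disassembler could not see.
cfg = {"entry": {"dispatch"}, "dispatch": set(),
       "case0": {"exit"}, "case1": {"exit"}, "exit": set()}
indirect_jumps = {"dispatch": "eax"}
value_sets = {("dispatch", "eax"): {"case0", "case1"}}

augmented, unresolved = augment_cfg(cfg, indirect_jumps, value_sets)
```

After the call, `augmented["dispatch"]` holds both case blocks, while the input `cfg` is left unchanged. Indirect calls could be handled the same way on the call graph.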
CodeSurfer / x86 Architecture • CodeSurfer • Input • Data in a CodeSurfer-compatible format • Output • Collection of IRs, consisting of abstract syntax trees, CFGs, a call graph, and a System Dependence Graph (SDG)
CodeSurfer / x86 Architecture • Overall Architecture (revisit)
CodeSurfer / x86 Facilities • Standard Compilation Model Check • Checkpoints • Runtime stack • Self-modification • Separation of the program’s data from its instructions • If it cannot be confirmed that the executable conforms to the model, the IR may be incorrect
CodeSurfer / x86 Facilities • CodeSurfer’s GUI • SDG browser • CodeSurfer’s API • Access to lower-level information • Individual nodes and edges of the program’s SDG • Call graph • CFGs • Use in conjunction with GrammaTech’s Path Inspector • Detect possibly problematic paths
CodeSurfer / x86 Limitations • Limitations • Dynamically Determined Information • IDAPro and VSA cannot fully recover dynamically determined information such as heap-allocated data, indirect calls, and indirect jumps • Complex Data Structures • Only very coarse information about arrays is recovered • Value-sets are suited only to congruent (regularly strided), contiguous data structures
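The last limitation can be illustrated by abstracting concrete address sets into a stride-and-bounds form. This is a hedged sketch: the helpers `abstract` and `concretize` and the example addresses are made up for illustration, and the abstraction shown (greatest common divisor of the differences, plus lower and upper bounds) is a simplified stand-in for the value-set domain.

```python
from functools import reduce
from math import gcd


def abstract(values):
    """Abstract a non-empty set of integers into (stride, lower, upper)."""
    ordered = sorted(values)
    lower, upper = ordered[0], ordered[-1]
    # The stride is the gcd of all distances from the lowest value.
    stride = reduce(gcd, (v - lower for v in ordered), 0)
    return stride, lower, upper


def concretize(stride, lower, upper):
    """Enumerate every value the abstraction stands for."""
    if stride == 0:
        return {lower}
    return set(range(lower, upper + 1, stride))


# Regular, contiguous layout: one field in four consecutive 8-byte elements.
array_field = {100, 108, 116, 124}
# Irregular layout: three unrelated addresses.
scattered = {100, 104, 200}

exact = concretize(*abstract(array_field))
coarse = concretize(*abstract(scattered))
```

For `array_field` the abstraction is exact: `exact` equals the original four addresses. For `scattered` the stride collapses to 4 over [100, 200], so `coarse` contains 26 addresses although only 3 are real, which is the loss of precision that the slide describes.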