1 / 24

A Framework for Binary Code Analysis, and Static and Dynamic Patching

A Framework for Binary Code Analysis, and Static and Dynamic Patching. Barton P. Miller University of Wisconsin bart@cs.wisc.edu. Jeffrey Hollingsworth University of Maryland hollings@cs.umd.edu. Multi-platform Open architecture Extensible Open source. Testable

sherry
Download Presentation

A Framework for Binary Code Analysis, and Static and Dynamic Patching

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. A Framework for Binary Code Analysis, and Static and Dynamic Patching Barton P. Miller University of Wisconsin bart@cs.wisc.edu Jeffrey Hollingsworth University of Maryland hollings@cs.umd.edu Binary Code Analysis and Editing

  2. Multi-platform Open architecture Extensible Open source Testable Suitable for batch processing Accurate Efficient Motivation • Binary code analysis is a basic tool of security analysts, application developers, system designers and tool developers. • We are designing and building a new foundation to support such analysis. • Existing binary analysis tools have significant limitations. Binary Code Analysis and Editing

  3. Why Binary Code? • Access to the source code often is not possible: • Proprietary software packages. • Stripped executables. • Proprietary libraries: communication (MPI, PVM), linear algebra (NGA), database query (SQL libraries). • Binary code is the only authoritative version of the program. • Changes occurring in the compile, optimize and link steps can create non-trivial semantic differences from the source and binary. • Worms and viruses are rarely provided with source code Binary Code Analysis and Editing

  4. Binary Analysis and Editing • Analysis: processing of the binary code to extract syntactic and symbolic information. • Symbol tables (if present) • Decode (disassemble) instructions • Control-flow information: basic blocks, loops, functions • Data-flow information: from basic register information to highly sophisticated (and expensive) analyses. Binary Code Analysis and Editing

  5. Binary Analysis and Editing • Binary rewriting: static (before execution) modification of a binary program: • Analyze the program and then insert, remove, or change the binary code, producing a new binary. • Dynamic instrumentation: dynamic (during execution) modification of a binary program: • Analyze the code of the running program and then insert, remove, or change the binary code, changing the execution of the program. • Can operate on running programs and servers. Binary Code Analysis and Editing

  6. Uses of Binary Analysis and Editing • Cyber-forensics • Analysis: understand the nature of malicious code • Binary-rewriting: produce a new version of the code that might be instrumented, sandboxed, or modified for study. • Dynamic instrumentation: same features, but can do it interactively on an executing program. • Hybrid static/dynamic: control execution and produce intermediate versions of the binary that can be re-executed (and further instrumented). • Program tracing: instructions, memory accesses, function calls, system calls, . . . • Debugging • Testing • Performance profiling • Performance modeling • Reverse engineering Binary Code Analysis and Editing

  7. Our Starting Point: Dyninst • A machine-independent library for machine level code patching. • Functions for binary code analysis • Functions for binary code patching • Clean abstractions to encapsulate the tool complexity. • Originally designed as part of the Paradyn performance profiling tool, but now widely used in many areas, including cyber-security. Binary Code Analysis and Editing

  8. Dynamic Instrumentation • Does not require recompiling or relinking • Saves time: compile and link times are significant in real systems. • Can instrument without the source code (e.g., proprietary libraries). • Can instrument without linking (relinking is not always possible. • Instrument optimized code. Binary Code Analysis and Editing

  9. Dynamic Instrumentation (con’d) • Only instrument what you need, when you need • No hidden cost of latent instrumentation. • Enables “one pass” tools. • Can instrument running programs (such as Web or database servers) • Production systems. • Embedded systems. • Systems with complex start-up procedures. Binary Code Analysis and Editing

  10. Instrumentation Relocated Instruction(s) The Basic Mechanism ApplicationProgram Trampoline Function foo Binary Code Analysis and Editing

  11. The DynInst Interface • Machine independent representation • Write-once, analyze/instrument-many (portable) • Object-based interface to insert new code: Abstract Syntax Trees (AST’s) • Hides most of the complexity in the API • Easy to build tools: e.g., an MPI tracer: 250 lines of C++ code. Binary Code Analysis and Editing

  12. Machine Independent Code Abstract Syntax Trees: SPARC Code Power Code sethi %hi(ctr) ld [. . .],%o1 add %o1,%o1,1 st %o1,[. . .] incl ctr cau r3,r0,hi%ctrl r4,lo%ctr(r3)addi r4,1(r4)st r4,lo%ctr(r3) IA32 Code Binary Code Analysis and Editing

  13. Basic DynInst Operations • Code query routines: • Find control-flow elements: modules, procedures, loops, basic blocks, instructions • For functions, find entry, exit, call sites. • For loops, find entry, exit, body. • Find data elements: variables and parameters • Call graph (parent/child) queries • Intra-procedural control-flow graph • Other symbol table information, e.g., line numbers. Binary Code Analysis and Editing

  14. Basic DynInst Operations • Code modification routines: • Remove Function Call • Disable an existing function call in the application • Replace Function Call • Redirect a function call to a new function • Replace Function • Redirect all calls (current and future) to a function to a new function. • Wrap Function • Allow the new function to call the replaced one (potentially with all its original parameters). Binary Code Analysis and Editing

  15. Basic DynInst Operations • Process control: • Attach/create process • Monitor process status changes • Callbacks for fork/exec/exit • Inferior (application processor) operations: • Malloc/free • Allocate heap space in application process • Inferior RPC • Asynchronously execute a function in the application. • Load module • Cause a new .so/.dll to be loaded into the application. Binary Code Analysis and Editing

  16. Basic DynInst Operations • Building AST code sequences: • Control structures: if and goto • Arithmetic and Boolean expressions • Get PID/TID operations • Read/write registers and global variables • Read/write parameters and return value • Function call Binary Code Analysis and Editing

  17. Dyninst Automated Testing • A test suite of almost 100 operation-specific tests. • Runs each night on each platform on the nightly build. • Variations for different compilers, languages (C, C++, Fortran), stripped vs. non-stripped code, etc. • Results reported on the web (reachable from paradyn.org or dyninst.org home pages): http://www.paradyn.org/testresults/dyntable.html Binary Code Analysis and Editing

  18. BinInst Design Goals • Tool-kit component architecture for binary analysis and editing • Open source • Open data structure definitions • Machine-independent abstract interfaces • Batch-enabled analyses • Static and dynamic code patching • All major analysis products are exportable • Enhanced testability and accompanying test suites Binary Code Analysis and Editing

  19. Static Editing Scenario (Binary Rewriting) Code Queries and Instrumentation Requests AST Symbol Table Dump Call Graph Binary Decode and Parsing Binary Code Code Gen Instr Control Intra-Proc CFG Idiom Signatures Raw Disassembly Binary Code Analysis and Editing

  20. Interactive Editing Scenario (Static or Dynamic) Symbol Table Dump Call Graph Binary Decode and Parsing Binary Code Code Gen Instr Control Intra-Proc CFG Idiom Signatures Raw Disassembly Binary Code Analysis and Editing

  21. Dynamic Editing Scenario (Dynamic Instrumentation) Code Queries and Instrumentation Requests AST Symbol Table Dump Call Graph Binary Decode and Parsing Binary Code Code Gen Instr Control Intra-Proc CFG Process Control Stack Walker Idiom Signatures Raw Disassembly User Process Binary Code Analysis and Editing

  22. Analysis Scenario VSA Buffer Overrun Symbol Table Dump Call Graph Binary Decode and Parsing Binary Code Connector 2 Intra-Proc CFG Idiom Signatures Other Tool Code Surfer Raw Disassembly Binary Code Analysis and Editing

  23. AST Code Queries and Instrumentation Requests Symbol Table Parser PE Symbol Table Dump ELF Call Graph Code Gen Code Parser COFF Binary Code Intra Proc CFG Instr Control Instruction Decoder Idiom Detector IA32 Process Control Idiom Signatures AMD64 Stack Walker Power Raw Disassembly Binary Code Analysis and Editing

  24. Binary Code Analysis and Editing

More Related