
Partial Automation of an Integrated Reverse Engineering Environment of Binary Code

Author: Cristina Cifuentes. Proceedings of the Third Working Conference on Reverse Engineering, pp. 50-56, 8-10 Nov. 1996, Monterey, CA, USA.


Presentation Transcript


  1. Partial Automation of an Integrated Reverse Engineering Environment of Binary Code • Author: Cristina Cifuentes • Proceedings of the Third Working Conference on Reverse Engineering, pp. 50-56, 8-10 Nov. 1996, Monterey, CA, USA

  2. Introduction • What’s the problem? • Preserving the investment made in software when a newer machine becomes available. • Two points of view on the migration of software: • From a commercial point of view: • Software needs to be available on the new machine at the same time the machine becomes available. • From a software developer’s point of view: • Software developed in-house is an investment and an asset to an organization. • Software migration is not a trivial problem!

  3. Four approaches to solve this problem • Use a native compiler to recompile the source code for the new platform. • Emulate the old machine’s instructions using microcode hardware in the new machine. • Emulate the old machine’s instructions in software on the new machine. • Binary translation.

  4. Problems • On using a native compiler to compile the source code: • Compilation requires access to all source code, which may not be feasible. • On emulation of the old machine’s instructions using microcode hardware: • It requires special micro-programmable hardware, which is not included in today’s RISC machines. • On emulation of the old machine’s instructions in software: • Software emulation is easy to implement but slow.

  5. Structure of a Binary Translator and a Decompiler • Front-end: • The front-end is a source-machine-dependent module that loads the source binary program, disassembles it, and translates it into an intermediate representation. • Middle-end: • Performs the code analysis needed for the translation and optimizes the code. • Back-end: • The back-end is a target-machine-dependent module that generates code for the target machine.
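
The three-stage split on this slide can be sketched as a small pipeline. This is only an illustration under stated assumptions: the two-byte source encoding, the `SOURCE_OPCODES` table, the IR tuples and the target assembler syntax are all invented here and do not come from the paper.

```python
from typing import List, Tuple

IR = Tuple[str, int]  # (operation, operand) in a made-up intermediate form

# Hypothetical source machine: every instruction is two bytes, opcode then operand.
SOURCE_OPCODES = {0x00: "nop", 0x01: "push", 0x02: "add"}


def front_end(binary: bytes) -> List[IR]:
    """Source-machine dependent: decode raw bytes into the intermediate form."""
    return [(SOURCE_OPCODES[binary[i]], binary[i + 1])
            for i in range(0, len(binary), 2)]


def middle_end(ir: List[IR]) -> List[IR]:
    """Machine independent: analyse and optimise (here it only drops nops)."""
    return [ins for ins in ir if ins[0] != "nop"]


def back_end(ir: List[IR]) -> List[str]:
    """Target-machine dependent: emit text for an invented target assembler."""
    templates = {"push": "PUSH #{0}", "add": "ADD #{0}"}
    return [templates[op].format(arg) for op, arg in ir]


def translate(binary: bytes) -> List[str]:
    """Chain the three stages; only the first and last know about a machine."""
    return back_end(middle_end(front_end(binary)))
```

Retargeting in this sketch means replacing `front_end` or `back_end` while `middle_end` stays unchanged, which is the point of the intermediate representation.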

  6. Integrated Reverse Engineering Environment for Binary Code

  7. A Compiler’s Structure

  8. An Integrated Reverse Engineering Environment for Binary Code • Loader • Disassembler • Signature generator • Prototype generator • New Jersey machine-code toolkit (NJMC) • Idiom analyzer • Control flow graph generator • UBM/UDM

  9. Loader • Works like the operating system loader. • Reads the binary file by decoding the binary-file format used to store the program, and determines the file’s structure (instructions, tables, symbol tables).
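
As a rough illustration of "decoding the binary-file format", the sketch below reads the fixed-size header of a 64-bit little-endian ELF image. ELF is an assumption chosen for familiarity; the slides do not say which formats the environment handles, and the `ImageInfo` record and `load_header` function are hypothetical names.

```python
import struct
from typing import NamedTuple

# ELF64 file header: 16 identification bytes followed by fixed-width fields.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")  # 64 bytes in total


class ImageInfo(NamedTuple):
    machine: int        # e_machine: target architecture code
    entry: int          # e_entry: address where execution starts
    section_count: int  # e_shnum: number of section headers


def load_header(image: bytes) -> ImageInfo:
    """Decode the file header and report the parts later stages need."""
    if len(image) < ELF64_HEADER.size:
        raise ValueError("image too short for an ELF64 header")
    fields = ELF64_HEADER.unpack_from(image)
    ident = fields[0]
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    # ident[4] is the class (2 = 64-bit), ident[5] the byte order (1 = little).
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("only 64-bit little-endian images are handled here")
    return ImageInfo(machine=fields[2], entry=fields[4], section_count=fields[12])
```

A real loader would go on to read the section and symbol tables that the header points to; the entry address is what a disassembler would use as its starting point.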

  10. Disassembler • Parses the binary image of the program and translates it to assembler or some equivalent representation. • Parsing starts at the entry point and follows all paths from that point. • Analyzes the target addresses of indexed and indirect jumps and calls.
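
Starting at the entry point and following all paths can be sketched with a worklist. The instruction set below is a toy invented for this example (two bytes per instruction, operand used as an absolute address for jumps); indexed and indirect jumps, which need extra analysis, are deliberately left out.

```python
from typing import Dict, Tuple

# Hypothetical encoding: opcode byte followed by one operand byte.
MNEMONICS = {0x01: "mov", 0x02: "jmp", 0x03: "jz", 0x04: "ret"}


def disassemble(code: bytes, entry: int) -> Dict[int, Tuple[str, int]]:
    """Decode only the bytes reachable from the entry point."""
    listing: Dict[int, Tuple[str, int]] = {}
    worklist = [entry]
    while worklist:
        addr = worklist.pop()
        if addr in listing or addr + 1 >= len(code):
            continue  # already decoded, or not enough bytes left
        opcode = code[addr]
        if opcode not in MNEMONICS:
            continue  # not a known instruction, stop following this path
        mnemonic, operand = MNEMONICS[opcode], code[addr + 1]
        listing[addr] = (mnemonic, operand)
        if mnemonic in ("jmp", "jz"):
            worklist.append(operand)   # follow the branch target
        if mnemonic in ("mov", "jz"):
            worklist.append(addr + 2)  # follow the fall-through path
    return listing
```

Because decoding follows control flow instead of sweeping linearly, data bytes placed between instructions are never mistaken for code, as long as nothing jumps into them.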

  11. Idiom analyzer • Detects idioms and translates the sequence of instructions into intermediate instructions. • An idiom is a sequence of instructions that has a special meaning that can't be derived from the semantics of the individual instructions alone. • Examples: • ARM: • bl foo • x86: • sub ax, immedLo • sbb dx, immedHi • = sub dx:ax, immedHi:immedLo
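
The x86 example can be turned into a small pattern matcher. This sketch assumes the usual 16-bit convention in which `sub` updates the low word in `ax` and `sbb` propagates the borrow into the high word in `dx`; the `Instr` tuple layout and the function name are made up for illustration.

```python
from typing import List, Tuple

Instr = Tuple[str, str, int]  # (mnemonic, destination, immediate)


def fold_long_sub(instrs: List[Instr]) -> List[Instr]:
    """Replace each `sub ax, lo` / `sbb dx, hi` pair with one 32-bit subtract."""
    out: List[Instr] = []
    i = 0
    while i < len(instrs):
        cur = instrs[i]
        nxt = instrs[i + 1] if i + 1 < len(instrs) else None
        if nxt is not None and cur[:2] == ("sub", "ax") and nxt[:2] == ("sbb", "dx"):
            # Combine the two 16-bit immediates into a single 32-bit constant.
            out.append(("sub", "dx:ax", (nxt[2] << 16) | cur[2]))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out
```

Neither instruction on its own says "32-bit subtraction"; only the pair does, which is why this is handled as an idiom rather than by per-instruction translation.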

  12. Control flow graph generator • Constructs a control flow graph for each subroutine of the program. • The control flow graph is part of the intermediate representation of any reverse engineering tool that deals with binary code.
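
A common textbook way to build such a graph is to find the "leaders" that start basic blocks and then connect the blocks. The sketch below follows that approach on an invented instruction tuple; the mnemonics `jmp`, `jcc` and `ret` are placeholders, and nothing here is claimed to be the paper's own algorithm.

```python
from typing import Dict, List, Optional, Set, Tuple

Instr = Tuple[int, str, Optional[int]]  # (address, mnemonic, branch target)


def build_cfg(instrs: List[Instr]) -> Dict[int, Set[int]]:
    """Map each basic block's start address to its successor blocks."""
    if not instrs:
        return {}
    addrs = [addr for addr, _, _ in instrs]
    # Leaders: the first instruction, every branch target, and every
    # instruction that follows a transfer of control.
    leaders = {addrs[0]}
    for idx, (_, op, target) in enumerate(instrs):
        if op in ("jmp", "jcc", "ret"):
            if target is not None:
                leaders.add(target)
            if idx + 1 < len(instrs):
                leaders.add(addrs[idx + 1])
    cfg: Dict[int, Set[int]] = {leader: set() for leader in leaders}
    block = addrs[0]
    for idx, (addr, op, target) in enumerate(instrs):
        if addr in leaders:
            block = addr  # a new basic block starts here
        nxt = addrs[idx + 1] if idx + 1 < len(instrs) else None
        if op in ("jmp", "jcc") and target is not None:
            cfg[block].add(target)  # edge to the branch target
        if op not in ("jmp", "ret") and nxt is not None and nxt in leaders:
            cfg[block].add(nxt)     # fall-through edge into the next block
    return cfg
```

Running this once per subroutine gives the per-subroutine graphs the slide describes.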

  13. Second Generation Tools • Signature generator • Automatically determines library signatures. • Prototype generator • Automatically determines the types of the formal arguments of library subroutines, and the type of the return value for functions. • New Jersey machine-code toolkit (NJMC) • Facilitates the decoding of machine instructions by providing a specification language to define machine instructions.
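
One simple way to picture a library signature is as the leading bytes of a routine with the bytes that vary between builds (for example, relocated addresses) replaced by wildcards. The sketch below takes that view; it is an assumption made for illustration, not the scheme the signature generator in the paper necessarily uses, and all names are hypothetical.

```python
from typing import List, Optional, Sequence

Signature = List[Optional[int]]  # None marks a wildcard byte


def make_signature(samples: Sequence[bytes], length: int) -> Signature:
    """Derive a pattern from several copies of the same library routine."""
    prefixes = [sample[:length] for sample in samples]
    if not prefixes or any(len(prefix) < length for prefix in prefixes):
        raise ValueError("need at least one sample, each at least `length` bytes")
    # A byte position stays fixed only if every sample agrees on it.
    return [column[0] if len(set(column)) == 1 else None
            for column in zip(*prefixes)]


def matches(signature: Signature, code: bytes) -> bool:
    """Check whether `code` starts with bytes compatible with the signature."""
    if len(code) < len(signature):
        return False
    return all(want is None or want == got
               for want, got in zip(signature, code))
```

Once a routine in the binary matches a signature, it can be replaced by the library routine's name instead of being analysed as user code.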

  14. UBM/UDM • Universal binary-translation machine • Generates binary programs for the target machine. • Universal decompilation machine • Generates high-level language code (such as C).

  15. Conclusions • This paper presents an integrated environment for the reverse engineering of binary programs. • Such an environment is suitable for the development of disassemblers, binary translators and decompilers. • Retargetable techniques are essential in order to develop such tools for a variety of machines rather than for one specific machine.
