The AMD64/EM64T Port of Dyninst and Paradyn Greg Quinn gquinn@cs.wisc.edu Ray Chen rchen@cs.umd.edu
Goals • 64-bit Dyninst library and Paradyn daemon that handle both 32-bit and 64-bit mutatees • Leverage as much existing functionality as possible
Talk Outline • 32-Bit Compatibility • 64-Bit Mode • Architectural Overview • Issues for Dyninst • Current status and timeline for the port
Problematic Porting • Conceptually simple • ISA extension • Hardware compatibility • Pre-existing code base • Nightly regression tests
System Structures
• What’s wrong with this code?

    struct link_map {
        /* Base address shared object is loaded at. */
        ElfW(Addr) l_addr;
        /* Absolute file name object was found in. */
        char *l_name;
        /* Dynamic section of the shared object. */
        ElfW(Dyn) *l_ld;
        /* Chain of loaded objects. */
        struct link_map *l_next, *l_prev;
    };
System Structures
• Compile-time decisions unacceptable
• Structure size depends on target platform
  • X86: sizeof( ElfW(Addr) ) == 4
  • X86_64: sizeof( ElfW(Addr) ) == 8
• Similar problem with pointer data types

    #define ElfW(type) \
        Elf ## __ELF_NATIVE_CLASS ## type
System Structures
• No backwards compatible structure
• Must create and maintain our own
• Multiple structures affected
  • link_map, r_debug, libelf routines

    struct link_map_dyn32 {
        Elf32_Addr l_addr;
        uint32_t l_name;
        uint32_t l_ld;
        uint32_t l_next, l_prev;
    };
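A minimal sketch of why the fixed-width structure helps: a 64-bit tool can copy raw bytes out of a 32-bit mutatee and interpret them with a layout that does not depend on how the tool itself was compiled. The names `link_map_dyn64` and `parse_link_map32` are hypothetical, `Elf32_Addr` is spelled `uint32_t` so the sketch needs no `<elf.h>`, and the bytes are assumed to have been read from the mutatee already (for example via the debugger interface, not shown) on a host with the same byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Fixed-width mirror of a 32-bit mutatee's link_map (layout from the slide).
struct link_map_dyn32 {
    uint32_t l_addr;          // Elf32_Addr in the slide
    uint32_t l_name;          // 32-bit pointer value, not dereferenceable here
    uint32_t l_ld;
    uint32_t l_next, l_prev;
};

// Hypothetical 64-bit sibling, shown only to contrast the sizes.
struct link_map_dyn64 {
    uint64_t l_addr;
    uint64_t l_name;
    uint64_t l_ld;
    uint64_t l_next, l_prev;
};

// The sizes are fixed no matter whether the tool is built 32-bit or 64-bit.
static_assert(sizeof(link_map_dyn32) == 20, "five 4-byte fields");
static_assert(sizeof(link_map_dyn64) == 40, "five 8-byte fields");

// Interpret bytes copied out of a 32-bit mutatee.
// Returns false when the buffer is too short to hold the structure.
inline bool parse_link_map32(const unsigned char* buf, std::size_t len,
                             link_map_dyn32& out) {
    if (len < sizeof(out)) return false;
    std::memcpy(&out, buf, sizeof(out));
    return true;
}
```

Pointer fields are kept as plain integers because they are addresses in the mutatee, not in the tool's own address space.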
System Structures • Class based solution • Hierarchy with 32-bit and 64-bit siblings • Virtual functions instead of control structures • Multiple benefits • No code duplication • Less source clutter • Minor function call overhead
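One way the class-based approach could look, as a sketch rather than the Dyninst implementation: a common interface with 32-bit and 64-bit siblings, selected once per mutatee, so callers never branch on word size. All names here (`LinkMapReader`, `make_link_map_reader`, and so on) are hypothetical, and generating both siblings from one template is an assumption made to illustrate the "no code duplication" point.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>

// Hypothetical interface: callers always see 64-bit-wide values,
// whatever the mutatee's word size is.
class LinkMapReader {
public:
    virtual ~LinkMapReader() = default;
    virtual std::size_t size() const = 0;  // bytes to read from the mutatee
    virtual uint64_t l_addr(const unsigned char* raw) const = 0;
    virtual uint64_t l_next(const unsigned char* raw) const = 0;
};

// Both siblings come from one template, so the field logic is written once.
template <typename Word>
class LinkMapReaderImpl : public LinkMapReader {
    struct Layout { Word l_addr, l_name, l_ld, l_next, l_prev; };

    // Copy the raw mutatee bytes into the layout for this word size.
    static Layout load(const unsigned char* raw) {
        Layout m;
        std::memcpy(&m, raw, sizeof(m));
        return m;
    }

public:
    std::size_t size() const override { return sizeof(Layout); }
    uint64_t l_addr(const unsigned char* raw) const override {
        return load(raw).l_addr;
    }
    uint64_t l_next(const unsigned char* raw) const override {
        return load(raw).l_next;
    }
};

using LinkMapReader32 = LinkMapReaderImpl<uint32_t>;
using LinkMapReader64 = LinkMapReaderImpl<uint64_t>;

// The word-size decision is made once, at run time, per mutatee.
inline std::unique_ptr<LinkMapReader> make_link_map_reader(bool mutatee_is_64bit) {
    if (mutatee_is_64bit)
        return std::unique_ptr<LinkMapReader>(new LinkMapReader64());
    return std::unique_ptr<LinkMapReader>(new LinkMapReader32());
}
```

The cost is one virtual call per access, which matches the "minor function call overhead" noted on the slide.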
What Works? • Operation on 32-bit binaries at 95% • Passes most nightly regression tests • Tests 1-12, attach, relocate • Save the World not fully tested • Existing x86 shared library issue
64-Bit Mode Architectural Overview
Registers • 32-bit Mode: Eight 32-bit registers (EAX EBX ECX EDX EBP ESP EDI ESI) • 64-bit Mode: Registers extended to 64 bits (RAX RBX RCX RDX RBP RSP RDI RSI) • Eight additional registers (R8 R9 R10 R11 R12 R13 R14 R15)
Registers • Encoded using REX prefix: 0100 W R X B • W determines width of operation (32/64) • R, X, B serve as high-order bits for register numbers
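The REX layout above can be sketched as a pair of helpers. The bit positions follow the `0100 W R X B` pattern on the slide; the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// The four flag bits of a REX prefix.
struct Rex {
    bool w;  // 1 = 64-bit operand size
    bool r;  // high bit of the ModRM reg field
    bool x;  // high bit of the SIB index field
    bool b;  // high bit of the ModRM r/m, SIB base, or opcode register field
};

// A REX prefix is any byte whose upper nibble is 0100 (0x40 through 0x4F).
inline bool is_rex(uint8_t byte) { return (byte & 0xF0) == 0x40; }

// Pack the flags into the low nibble under the fixed 0100 pattern.
inline uint8_t encode_rex(Rex f) {
    return static_cast<uint8_t>(0x40 | (f.w << 3) | (f.r << 2) | (f.x << 1) | (f.b << 0));
}

// Unpack the low nibble of a REX byte into its flags.
inline Rex decode_rex(uint8_t byte) {
    return Rex{(byte & 0x08) != 0, (byte & 0x04) != 0,
               (byte & 0x02) != 0, (byte & 0x01) != 0};
}

// Combine a REX extension bit with a 3-bit register field to get a
// register number in 0..15 (8..15 are R8..R15).
inline unsigned full_reg(bool ext, unsigned low3) {
    return (ext ? 8u : 0u) | (low3 & 7u);
}
```

For example, a prefix with only W set encodes to `0x48`, the byte that commonly precedes 64-bit operations on the original eight registers.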
Immediate Values • Variable-length instructions allow for register-sized immediates (8 bytes) • MOV RAX, 0x1234567890abcdef • This is the only way to specify an 8-byte value in an instruction • Most importantly for Dyninst: • there is no JMP w/ 8-byte displacement
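To illustrate why the missing 8-byte JMP matters, here is a hedged sketch of the decision a code generator faces: use the 5-byte `JMP rel32` when the target is within reach, otherwise fall back to loading the full address with `MOV RAX, imm64` and jumping through the register. The function names are hypothetical, and clobbering RAX is a simplification for illustration, not a description of what Dyninst does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// JMP rel32 (opcode E9) is 5 bytes long; its displacement is relative
// to the address of the instruction that follows it.
inline bool fits_rel32(uint64_t from, uint64_t to) {
    const int64_t disp = static_cast<int64_t>(to - (from + 5));
    return disp >= INT32_MIN && disp <= INT32_MAX;
}

// MOV RAX, imm64: REX.W (0x48), opcode B8, then 8 little-endian bytes.
inline std::size_t emit_mov_rax_imm64(uint8_t* out, uint64_t imm) {
    out[0] = 0x48;
    out[1] = 0xB8;
    for (int i = 0; i < 8; ++i)
        out[2 + i] = static_cast<uint8_t>(imm >> (8 * i));
    return 10;
}

// Emit a jump located at address `from` that lands on `to`.
// Returns the number of bytes written; `out` needs room for 12 bytes.
inline std::size_t emit_jump(uint8_t* out, uint64_t from, uint64_t to) {
    if (fits_rel32(from, to)) {
        const uint32_t disp = static_cast<uint32_t>(to - (from + 5));
        out[0] = 0xE9;
        for (int i = 0; i < 4; ++i)
            out[1 + i] = static_cast<uint8_t>(disp >> (8 * i));
        return 5;
    }
    // Far target: load the absolute address, then JMP RAX (FF E0).
    // Assumption for this sketch only: RAX may be overwritten.
    std::size_t n = emit_mov_rax_imm64(out, to);
    out[n++] = 0xFF;
    out[n++] = 0xE0;
    return n;
}
```

The far form is more than twice as long as the near form, which is one reason to keep instrumentation code close to the instrumentation point.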
Instruction Parsing • x86 instruction parser collects basic block information and searches for instrumentation points • We can use the same parsing algorithm for 64-bit mutatees • Architectural changes are abstracted away by instruction decoding • Bonus: support for stripped binaries
Executing Instrumentation • Dyninst maintains a heap of non-contiguous memory areas in the mutatee • Instrumentation points jump to code in nearby heap region • Code for this already exists (AIX, Solaris)
[Figure: Mutatee Address Space. The executable has a dyninst heap region next to it; separated by >> 4GB spacing, the library code has its own dyninst heap region next to it.]
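The idea of jumping to a nearby heap region can be sketched as a lookup over the known regions: pick the closest one that a 32-bit relative jump can reach from the instrumentation point. The `HeapRegion` type, the function names, and the 16-byte safety margin are assumptions for illustration; the existing AIX and Solaris code mentioned above is not shown here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One contiguous area of instrumentation heap inside the mutatee.
struct HeapRegion {
    uint64_t base;
    uint64_t size;
};

inline uint64_t abs_diff(uint64_t a, uint64_t b) {
    return a > b ? a - b : b - a;
}

// True when both ends of the region lie within rel32 range of the point.
// The 16-byte margin is an assumed allowance for instruction lengths.
inline bool within_reach(uint64_t point, const HeapRegion& r) {
    const uint64_t limit = (1ull << 31) - 16;
    return abs_diff(point, r.base) <= limit &&
           abs_diff(point, r.base + r.size) <= limit;
}

// Return the closest reachable region, or nullptr if none is in range
// (in which case a new region would have to be allocated nearby).
inline const HeapRegion* pick_region(const std::vector<HeapRegion>& heap,
                                     uint64_t point) {
    const HeapRegion* best = nullptr;
    for (const HeapRegion& r : heap) {
        if (!within_reach(point, r)) continue;
        if (!best || abs_diff(point, r.base) < abs_diff(point, best->base))
            best = &r;
    }
    return best;
}
```

With one region near the executable and another near the library code, a point in either range resolves to its own neighbor, while a point far from both gets no match.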
Code Generation • Improved architecture allows for more efficient code generation • Stack no longer used for passing arguments • More registers means stack no longer needed for temporary values
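As a concrete illustration of register-based argument passing, the sketch below maps an argument index to its register. It assumes the System V AMD64 calling convention used on Linux, which the slides do not name; under that convention the first six integer or pointer arguments travel in registers and any further ones still go on the stack. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Register that carries the integer/pointer argument at `index`
// (0-based) under the System V AMD64 calling convention, or nullptr
// when that argument is passed on the stack instead.
inline const char* int_arg_register(std::size_t index) {
    static const char* const regs[] = {"rdi", "rsi", "rdx", "rcx", "r8", "r9"};
    return index < 6 ? regs[index] : nullptr;
}
```

A code generator emitting a call with up to six integer arguments can therefore place them with register moves alone.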
Good Things™ • We have been able to leverage x86 port extensively (code reuse) • Some 32-bit headaches go away • Non-standard optimizations in mutatee code (_dl_open example) • More registers allow for more efficient instrumentation code
Status/Timeline • Now working: • 32-bit support • Instruction decoding, parsing • Left to do: • Code generation • Memory allocation • Counters, timers, and sampling code for Paradyn • Beta release: 2Q05 • Available for partners and friends • Production release: 3Q05