
Part 2: Advanced Static Analysis

Chapter 4: A Crash Course in x86 Disassembly. Chapter 5: IDA Pro. Chapter 6: Recognizing C Code Constructs in Assembly. Chapter 7: Analyzing Malicious Windows Programs.

fecteau

Presentation Transcript


  1. Part 2: Advanced Static Analysis • Chapter 4: A Crash Course in x86 Disassembly • Chapter 5: IDA Pro • Chapter 6: Recognizing C Code Constructs in Assembly • Chapter 7: Analyzing Malicious Windows Programs

  2. Chapter 4: A Crash Course in x86 Disassembly

  3. How software works • The gcc compiler driver preprocesses, compiles, assembles, and links to generate an executable • The linker combines object code (e.g., game.o) and static libraries (e.g., libc.a) to form the final executable • It also records references to dynamic libraries whose code is loaded at load time (e.g., libc.so.1) • The executable may still load additional dynamic libraries at run time • Pipeline: hello.c (program source) → preprocessor → hello.i (modified source) → compiler → hello.s (assembly code) → assembler → hello.o (object code) → linker → hello (executable code)

  4. Executables • Various file formats • Linux = Executable and Linkable Format (ELF) • Windows = Portable Executable (PE)

  5. ELF Object File Format • ELF header • Magic number, type (.o, exec, .so), machine, byte ordering, etc. • Program header table • Page size, virtual addresses of memory segments (sections), segment sizes, entry point • .text section • Code • .data section • Initialized (static) data • .bss section • Uninitialized (static) data • “Block Started by Symbol” • File layout, starting at offset 0: ELF header; program header table (required for executables); .text section; .data section; .bss section; .symtab; .rel.text; .rel.data; .debug; section header table (required for relocatables)
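
The magic number in the ELF header is what lets a tool recognize the format before parsing anything else. A minimal sketch in C, assuming the file's first bytes have already been read into a buffer; the function name `looks_like_elf` is hypothetical and not part of any library:

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf starts with the 4-byte ELF magic number
 * (0x7f followed by the ASCII letters 'E', 'L', 'F'), otherwise 0.
 * looks_like_elf is an illustrative name, not a real API. */
int looks_like_elf(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    /* A buffer shorter than the magic cannot be an ELF file. */
    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}
```

This only identifies the format; the type, machine, and byte-ordering fields that follow the magic still have to be parsed before the rest of the file can be interpreted.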

  6. ELF Object File Format (cont.) • .rel.text section • Relocation info for the .text section • Addresses of instructions that will need to be modified in the executable • Instructions for modifying them • .rel.data section • Relocation info for the .data section • Addresses of pointer data that will need to be modified in the merged executable • .symtab section • Symbol table • Procedure and static variable names • Section names and locations • .debug section • Info for symbolic debugging (gcc -g)

  7. PE (Portable Executable) file format • Windows file format for executables • Based on COFF Format • Magic Numbers, Headers, Tables, Directories, Sections
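
As a companion to the "magic numbers" bullet, here is a hedged C sketch of the usual first check on a suspected PE file: the DOS header starts with "MZ", and the 32-bit little-endian field at offset 0x3c gives the file offset of the "PE\0\0" signature. The function name `looks_like_pe` is hypothetical, and the sketch checks nothing beyond these two signatures:

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if buf looks like a PE image: "MZ" at offset 0 and the
 * signature "PE\0\0" at the offset stored (little-endian) at 0x3c.
 * looks_like_pe is an illustrative name, not a real API. */
int looks_like_pe(const unsigned char *buf, size_t len)
{
    uint32_t off;

    /* Need the whole 64-byte DOS header and the "MZ" magic. */
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return 0;

    /* Read the 4-byte little-endian offset of the PE signature. */
    off = (uint32_t)buf[0x3c]
        | ((uint32_t)buf[0x3d] << 8)
        | ((uint32_t)buf[0x3e] << 16)
        | ((uint32_t)buf[0x3f] << 24);

    /* The 4-byte signature must lie entirely inside the buffer. */
    if (off > len || len - off < 4)
        return 0;

    return buf[off] == 'P' && buf[off + 1] == 'E'
        && buf[off + 2] == 0 && buf[off + 3] == 0;
}
```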

  8. Example C Program
     m.c:
         int e=7;
         int main() { int r = a(); exit(0); }
     a.c:
         extern int e;
         int *ep=&e;
         int x=15;
         int y;
         int a() { return *ep+x+y; }

  9. Merging Relocatable Object Files into an Executable Object File • Relocatable object files • System objects: system code (.text), system data (.data) • m.o: main() (.text), int e = 7 (.data) • a.o: a() (.text), int *ep = &e and int x = 15 (.data), int y (.bss) • Executable object file, starting at offset 0 • headers • .text: system code, main(), a(), more system code • .data: system data, int e = 7, int *ep = &e, int x = 15 • .bss: int y • .symtab • .debug

  10. Program execution • Operating system provides • Protection and resource allocation • Abstract view of resources (files, system calls) • Virtual memory • Uniform memory space abstraction for each process • Gives the illusion that each process has the entire memory space

  11. How does a program get loaded? • The operating system creates a new process • Including, among other things, a virtual memory space • System loader • Loads the executable file from the file system into the memory space • Done via DMA (direct memory access) • Executable contains code and statically linked libraries • Executable in the file system remains and can be executed again • Loads dynamic shared objects/libraries into the memory space • Done via DMA from the file system, as with the original executable • Resolves addresses in code (using .rel.text and .rel.data information) based on where code/data is loaded • Starts a thread of execution at the entry point specified in the ELF/PE header

  12. Loading Executable Binaries • Executable object file for example program p, starting at offset 0: ELF header; program header table (required for executables); .text section; .data section; .bss section; .symtab; .rel.text; .rel.data; .debug; section header table (required for relocatables) • Process image, by virtual address • 0x080483e0: init and shared lib segments • 0x08048494: .text segment (r/o) • 0x0804a010: .data segment (initialized r/w) • 0x0804a3b0: .bss segment (uninitialized r/w)

  13. Example: Linux virtual memory space (32-bit), from high to low addresses • 0xffffffff down to 0xc0000000: kernel virtual memory (code, data, heap, stack), invisible to user code • Below 0xc0000000: user stack (created at runtime), with %esp as the stack pointer • From 0x40000000: memory-mapped region for shared libraries • Run-time heap (managed by malloc), with brk marking its top • Read/write segment (.data, .bss) and read-only segment (.init, .text, .rodata), both loaded from the executable file, starting at 0x08048000 • Below 0x08048000 down to 0: unused • To view a live layout: cat /proc/self/maps

  14. Relocation • Virtual memory abstraction makes compilation and linking easy • Compared to a single, shared real memory address space (e.g., original Mac) • Linker statically binds all program code and data to absolute virtual addresses • Linker decides the entire memory layout at link time • Example: the DOS/Windows ".com" format is effectively a memory image • Issues • Need to support dynamic libraries to avoid statically linking things like libc into all processes • Dynamic libraries might want to be loaded at the same address! • Need to support relative addressing and relocation again • Want to support address-space layout randomization • Security defense mechanism requiring everything to be relocatable • What Meltdown/Spectre malware might attack first

  15. More on relocation • Relocation in Windows PE (.exe) and Linux ELF • Requires position-independent code • Compiler makes all jumps and branches relative to the current location or relative to a base register set at run time • Compiler labels any accesses to absolute addresses and has the loader rewrite them to their actual run-time values • Compiler uses indirection and dynamically generated offset tables to determine addresses • Example: Procedure Linkage Table (PLT) and Global Offset Table (GOT) in ELF • GOT contains the addresses where imported library functions are loaded at run time • Library calls index the GOT to determine the location to jump to • Note: Can be targeted by malware for hooks!
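
The GOT indirection, and the reason it is attractive for hooking, can be modelled with a table of function pointers. This is a deliberately simplified sketch, not the real ELF mechanism: `toy_got`, `real_impl`, `hook_impl`, `call_via_got`, and `install_hook` are all hypothetical names, and a real GOT is filled in by the dynamic linker rather than by the program itself.

```c
/* A library function as seen through the table. */
typedef int (*lib_fn)(int);

/* Stand-in for the genuine library routine. */
static int real_impl(int x) { return 2 * x; }

/* Stand-in for attacker-supplied code. */
static int hook_impl(int x) { (void)x; return -1; }

/* Toy "GOT": one slot per imported function, holding its address. */
static lib_fn toy_got[1] = { real_impl };

/* Every call goes through the table, as a PLT stub would. */
int call_via_got(int slot, int arg)
{
    return toy_got[slot](arg);
}

/* Overwriting a slot redirects all later calls: this is the hook. */
void install_hook(int slot)
{
    toy_got[slot] = hook_impl;
}
```

Because callers never hold the target address themselves, a single write to the table silently changes what every call site reaches.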

  16. Program execution • Diagram: the CPU (registers, EIP, condition codes) sends addresses to memory; data and instructions move between the CPU and memory, which holds object code, program data, OS data, and the stack • Program-visible state • EIP: instruction pointer • a.k.a. program counter • Address of next instruction • Register file • Heavily used program data • Condition codes • Store status information about the most recent arithmetic operation • Used for conditional branching • Memory • Byte-addressable array • Code, user data, OS data • Includes the stack used to support procedures

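  17. IA32 Register file • General purpose registers (mostly), 32 bits each (bits 31..0), each with a 16-bit low half (bits 15..0); the first four also have high-byte (bits 15..8) and low-byte (bits 7..0) names • %eax (%ax, %ah, %al) • %ecx (%cx, %ch, %cl) • %edx (%dx, %dh, %dl) • %ebx (%bx, %bh, %bl) • %esi (%si) • %edi (%di) • Special purpose registers • %esp (%sp): stack pointer • %ebp (%bp): frame pointer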

  18. Registers • The processor operates on data in registers (usually) • movl (%eax), %ecx • Fetch data at the address contained in %eax • Store it in register %ecx • movl $array, %ecx • Move the address of variable array into %ecx • Typically, data is loaded into registers, manipulated or used, and then written back to memory • The IA32 architecture is "register poor" • Few general purpose registers • Source or destination operand is often a memory location • Makes context-switching amongst processes easy (less register state to store)

  19. Operand types • A typical instruction acts on 1 or more operands • addl %ecx, %edx adds the contents of ecx to edx • Three general types of operands • Immediate • Like a C constant, but preceded by $ • e.g., $0x1F, $-533 • Encoded with 1, 2, or 4 bytes based on instruction • Register: the value in one of the 8 integer registers • Memory: a memory address • There are many modes for addressing memory

  20. Operand examples using mov
      Source  Destination  Instruction           C analog
      Imm     Reg          movl $0x4,%eax        temp = 0x4;
      Imm     Mem          movl $-147,(%eax)     *p = -147;
      Reg     Reg          movl %eax,%edx        temp2 = temp1;
      Reg     Mem          movl %eax,(%edx)      *p = temp;
      Mem     Reg          movl (%eax),%edx      temp = *p;

  21. Addressing Modes • Immediate and registers have only one mode • Memory on the other hand needs many (so that a load from memory can take a single instruction) • Absolute • specify the address of the data • Indirect • use register to calculate address • Base + displacement • use register plus absolute address to calculate address • Indexed • Add contents of an index register • Scaled index • Add contents of an index register scaled by a constant

  22. Addressing Modes • Absolute: movl 0x08049000, %eax • Indirect: movl (%edx), %eax • Base + displacement: movl 8(%ebp), %eax • Indexed: movl (%ecx, %edx), %eax • Scaled index: movl (%ecx, %edx, 4), %eax
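
All of these modes are special cases of one address computation. In AT&T syntax a memory operand has the general form disp(base, index, scale), and the address is disp + base + index * scale. A small C sketch of that arithmetic, where `effective_address` is a hypothetical helper name:

```c
#include <stdint.h>

/* Computes the address named by the AT&T operand disp(base,index,scale).
 * Omitted parts are passed as 0 (and scale as 1), so one function covers
 * the absolute, indirect, base + displacement, indexed, and scaled modes.
 * Unsigned arithmetic wraps modulo 2^32, as 32-bit addresses do. */
uint32_t effective_address(uint32_t disp, uint32_t base,
                           uint32_t index, uint32_t scale)
{
    return disp + base + index * scale;
}
```

For example, with %ebp holding 0x1000, the operand 8(%ebp) corresponds to effective_address(8, 0x1000, 0, 1), which is 0x1008.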

  23. x86 instructions • Rules • Source operand can be memory, register or constant • Destination can be memory or register • Only one of source and destination can be memory • Source and destination must be same size

  24. What’s the "l" for on the end? • movl 8(%ebp),%eax • It stands for “long” and is 32-bits • Size of the operands • Baggage from the days of 16-bit processors • For x86, x86_64 • 8 bits is a byte (movb) • 16 bits is a word (movw) • 32 bits is a double or long word (movl) • 64 bits is a quad word (movq)

  25. Global vs. Local variables • Global variables are stored in either the .data or the .bss section of the process • Local variables are stored on the stack • Which variables are which?
      m.c:
          int e=7;
          int main() { int r = a(); exit(0); }
      a.c:
          extern int e;
          int *ep=&e;
          int x=15;
          int y;
          int a() { return *ep+x+y; }

  26. Global vs. local: Which is which?
      Program with local variables:
          void a() { int x = 1; int y = 2; x = x+y; printf("Total = %d\n",x); }
          int main() { a(); }
      Program with global variables:
          int x = 1; int y = 2;
          void a() { x = x+y; printf("Total = %d\n",x); }
          int main() { a(); }
      First listing (x and y live at -0x8(%ebp) and -0x4(%ebp), i.e. on the stack, so this is the local version):
          080483c4 <a>:
           80483c4: push %ebp
           80483c5: mov %esp,%ebp
           80483c7: sub $0x18,%esp
           80483ca: movl $0x1,-0x8(%ebp)
           80483d1: movl $0x2,-0x4(%ebp)
           80483d8: mov -0x4(%ebp),%eax
           80483db: add %eax,-0x8(%ebp)
           80483de: mov -0x8(%ebp),%eax
           80483e1: mov %eax,0x4(%esp)
           80483e5: movl $0x80484f0,(%esp)
           80483ec: call 80482dc <printf@plt>
           80483f1: leave
           80483f2: ret
      Second listing (x and y live at the absolute addresses 0x804966c and 0x8049670, so this is the global version):
          080483c4 <a>:
           80483c4: push %ebp
           80483c5: mov %esp,%ebp
           80483c7: sub $0x8,%esp
           80483ca: mov 0x804966c,%edx
           80483d0: mov 0x8049670,%eax
           80483d5: lea (%edx,%eax,1),%eax
           80483d8: mov %eax,0x804966c
           80483dd: mov 0x804966c,%eax
           80483e2: mov %eax,0x4(%esp)
           80483e6: movl $0x80484f0,(%esp)
           80483ed: call 80482dc <printf@plt>
           80483f2: leave
           80483f3: ret

  27. Arithmetic operations
      void f() { int a = 0; int b = 1; a = a+11; a = a-b; a--; b++; }
      int main() { f(); }
      08048394 <f>:
       8048394: pushl %ebp
       8048395: movl %esp,%ebp
       8048397: subl $0x10,%esp
       804839a: movl $0x0,-0x8(%ebp)
       80483a1: movl $0x1,-0x4(%ebp)
       80483a8: addl $0xb,-0x8(%ebp)
       80483ac: movl -0x4(%ebp),%eax
       80483af: subl %eax,-0x8(%ebp)
       80483b2: subl $0x1,-0x8(%ebp)
       80483b6: addl $0x1,-0x4(%ebp)
       80483ba: leave
       80483bb: ret

  28. Condition codes • The IA32 processor has a register called eflags • (extended flags) • Each bit is a flag, or condition code CF Carry Flag SF Sign Flag ZF Zero Flag OF Overflow Flag • As programmers, we don’t write to this register and seldom read it directly • Flags are set or cleared by hardware on each arithmetic/logical operation depending on the result of an instruction • Conditional branches handled via EFLAGS

  29. Condition codes (cont.) • Setting condition codes via the compare instruction: cmpl b,a • Computes a-b without storing the result in a destination • CF set if carry out from most significant bit • Used for unsigned comparisons • ZF set if a == b • SF set if (a-b) < 0 • OF set if two’s complement overflow • (a>0 && b<0 && (a-b)<0) || (a<0 && b>0 && (a-b)>0) • Byte and word versions: cmpb, cmpw
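
To make the four flag rules concrete, the following C sketch computes what cmpl b,a would leave in CF, ZF, SF, and OF for 32-bit operands. It is a model written for illustration, with the hypothetical names `struct flags` and `cmp_flags`; the subtraction is done in unsigned arithmetic so that wraparound is well defined in C.

```c
#include <stdint.h>

/* The four condition codes discussed above, each 0 or 1. */
struct flags { int cf, zf, sf, of; };

/* Models the flags after "cmpl b,a", which computes a - b. */
struct flags cmp_flags(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t diff = ua - ub;           /* wraps modulo 2^32 */
    struct flags f;

    f.cf = ua < ub;                    /* borrow: unsigned a below b  */
    f.zf = diff == 0;                  /* a == b                      */
    f.sf = (diff >> 31) & 1;           /* sign bit of the result      */
    /* Signed overflow: a and b differ in sign, and the result's sign
     * differs from a's sign. */
    f.of = (((ua ^ ub) & (ua ^ diff)) >> 31) & 1;
    return f;
}
```

A signed "less than" is then SF != OF, while an unsigned "below" is simply CF, which is why the two kinds of comparison use different conditional jumps.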

  30. Condition codes (cont.) • Setting condition codes via the test instruction: testl b,a • Computes a&b without storing the result in a destination • Sets condition codes based on the result • Useful to have one of the operands be a mask • Often used to test for zero or positive: testl %eax, %eax • ZF set when a&b == 0 • SF set when a&b < 0 • Byte and word versions: testb, testw
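
The same kind of model works for test. This C sketch, using the hypothetical names `struct test_result` and `test_flags`, returns the ZF and SF values that testl b,a would produce:

```c
#include <stdint.h>

/* ZF and SF after a test instruction, each 0 or 1. */
struct test_result { int zf, sf; };

/* Models "testl b,a": AND the operands, keep only the flags. */
struct test_result test_flags(int32_t a, int32_t b)
{
    uint32_t r = (uint32_t)a & (uint32_t)b;
    struct test_result t;

    t.zf = r == 0;            /* no bit in common (or both zero) */
    t.sf = (r >> 31) & 1;     /* sign bit of the AND result      */
    return t;
}
```

With both operands equal, as in testl %eax, %eax, the AND is just the register's value, so ZF reports "is zero" and SF reports "is negative".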

  31. if statements
      void f() { int x = 1; int y = 2; if (x==y) printf("x equals y.\n"); else printf("x is not equal to y.\n"); }
      int main() { f(); }
      080483c4 <f>:
       80483c4: pushl %ebp
       80483c5: movl %esp,%ebp
       80483c7: subl $0x18,%esp
       80483ca: movl $0x1,-0x8(%ebp)
       80483d1: movl $0x2,-0x4(%ebp)
       80483d8: movl -0x8(%ebp),%eax
       80483db: cmpl -0x4(%ebp),%eax
       80483de: jne 80483ee <f+0x2a>
       80483e0: movl $0x80484f0,(%esp)
       80483e7: call 80482d8 <puts@plt>
       80483ec: jmp 80483fa <f+0x36>
       80483ee: movl $0x80484fc,(%esp)
       80483f5: call 80482d8 <puts@plt>
       80483fa: leave
       80483fb: ret

  32. if statements • Note: Microsoft assembly (Intel syntax) puts the destination operand first, the reverse of the AT&T order used on the other slides
      int a = 1, b = 3, c;
      if (a > b) c = a; else c = b;
      mov dword ptr [ebp-4],1        ; store a = 1
      mov dword ptr [ebp-8],3        ; store b = 3
      mov eax,dword ptr [ebp-4]      ; move a into EAX register
      cmp eax,dword ptr [ebp-8]      ; compare a with b (subtraction)
      jle 00000036                   ; if (a<=b) jump to 00000036
      mov ecx,dword ptr [ebp-4]      ; else move a (1) into ECX register,
      mov dword ptr [ebp-0Ch],ecx    ; move ECX into c (12 bytes down),
      jmp 0000003C                   ; and jump unconditionally to 0000003C
      mov edx,dword ptr [ebp-8]      ; move b (3) into EDX register
      mov dword ptr [ebp-0Ch],edx    ; and move EDX into c (12 bytes down)

  33. Loops
      int factorial_do(int x) {
          int result = 1;
          do { result *= x; x = x-1; } while (x > 1);
          return result;
      }
      factorial_do:
          pushl %ebp
          movl %esp, %ebp
          movl 8(%ebp), %edx
          movl $1, %eax
      .L2:
          imull %edx, %eax
          decl %edx
          cmpl $1, %edx
          jg .L2
          leave
          ret

  34. C switch statements switch (x) { case 1: case 5: code at L0 case 2: case 3: code at L1 default: code at L2 }

  35. C switch statements • Implementation options • Series of conditionals • testl followed by je • OK if few cases and large ranges of values • Slow if many cases • Jump table (example below) • Look up the branch target from a table • Possible with a small range of integer constants • GCC picks the implementation based on structure • Example: switch (x) { case 1: case 5: code at L0 case 2: case 3: code at L1 default: code at L2 } • Jump table at .L3, one entry per value of x from 0 to 5: .L2, .L0, .L1, .L1, .L2, .L0 • 1. init jump table at .L3 • 2. get address at .L3+4*x • 3. jump to that address

  36. Example int switch_eg(int x) { int result = x; switch (x) { case 100: result *= 13; break; case 102: result += 10; /* Fall through */ case 103: result += 11; break; case 104: case 106: result *= result; break; default: result = 0; } return result; }

  37. switch_eg compiled (C source on the previous slide) • Key is the jump table at .L10 • Array of pointers to jump locations
          leal -100(%edx),%eax       # eax = x - 100
          cmpl $6,%eax
          ja .L9                     # unsigned compare: out of range goes to default
          jmp *.L10(,%eax,4)         # indirect jump through the table
          .p2align 4,,7
          .section .rodata
          .align 4
          .align 4
      .L10:
          .long .L4                  # x = 100
          .long .L9                  # x = 101 (default)
          .long .L5                  # x = 102
          .long .L6                  # x = 103
          .long .L8                  # x = 104
          .long .L9                  # x = 105 (default)
          .long .L8                  # x = 106
          .text
          .p2align 4,,7
      .L4:
          leal (%edx,%edx,2),%eax    # eax = 3*x
          leal (%edx,%eax,4),%edx    # edx = x + 4*(3*x) = 13*x
          jmp .L3
          .p2align 4,,7
      .L5:
          addl $10,%edx              # falls through into .L6
      .L6:
          addl $11,%edx
          jmp .L3
          .p2align 4,,7
      .L8:
          imull %edx,%edx
          jmp .L3
          .p2align 4,,7
      .L9:
          xorl %edx,%edx             # default: result = 0
      .L3:
          movl %edx,%eax
          leave
          ret
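
The dispatch in this listing can be written back out in C as an explicit table of function pointers, which makes the three steps (bias the index, range-check it with an unsigned compare, jump through the table) visible. This is a sketch of the technique, not what gcc literally generates; `switch_eg_table`, `handler`, and the `case_*` helpers are hypothetical names, and the fall-through from case 102 into case 103 is folded into one helper.

```c
/* One handler per distinct jump target in the listing. */
typedef int (*handler)(int);

static int case_100(int r)     { return r * 13; }       /* .L4 */
static int case_102(int r)     { return r + 10 + 11; }  /* .L5, then falls into .L6 */
static int case_103(int r)     { return r + 11; }       /* .L6 */
static int case_104_106(int r) { return r * r; }        /* .L8 */
static int case_default(int r) { (void)r; return 0; }   /* .L9 */

/* Mirrors the seven .long entries at .L10, for x = 100..106. */
static const handler jump_table[7] = {
    case_100, case_default, case_102, case_103,
    case_104_106, case_default, case_104_106
};

int switch_eg_table(int x)
{
    unsigned idx = (unsigned)x - 100u;   /* leal -100(%edx),%eax */

    if (idx > 6u)                        /* cmpl $6,%eax ; ja .L9 */
        return case_default(x);
    return jump_table[idx](x);           /* jmp *.L10(,%eax,4)    */
}
```

The unsigned comparison is the detail worth noticing: values of x below 100 wrap around to very large unsigned numbers, so a single check rejects both ends of the range.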

  38. Avoiding conditional branches • Modern CPUs with deep pipelines • Instructions fetched far in advance of execution • Mask the latency going to memory • Problem: What if you hit a conditional branch? • Must predict which branch to take and speculatively fetch/execute! • Branch prediction in CPUs well-studied, fairly effective (except when it's not… ) (1/2018) • But, best to avoid conditional branching altogether

  39. x86 REP prefixes • Loops require a decrement, comparison, and conditional branch for each iteration • Incur branch prediction penalty and overhead even for trivial loops • Repeat instruction prefixes (REP, REPE, REPNE) • Inserted just before some instructions (movsb, movsw, movsd, cmpsb, cmpsw, cmpsd) • REP (repeat for a fixed count) • Direction flag (DF) cleared via cld and set via std • esi and edi contain pointers to the arguments • ecx contains the count • REPE (repeat while equal, i.e. while ZF is set and ecx is nonzero), REPNE (repeat while not equal, i.e. while ZF is clear and ecx is nonzero) • Used in conjunction with cmpsb, cmpsw, cmpsd

  40. x86 REP example
      .data
          source DWORD 20 DUP (?)
          target DWORD 20 DUP (?)
      .code
          cld                          ; clear direction flag = forward
          mov ecx, LENGTHOF source
          mov esi, OFFSET source
          mov edi, OFFSET target
          rep movsd
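
In C terms, rep movsd with the direction flag cleared is a counted copy loop that the CPU runs as a single instruction. The sketch below models that behaviour; the function name `rep_movsd` and the returned count are additions for illustration, and only the forward direction (DF = 0) is modelled.

```c
#include <stddef.h>
#include <stdint.h>

/* Models "rep movsd" with DF = 0: copy ecx doublewords from the
 * address in esi to the address in edi, advancing both pointers.
 * Returns how many doublewords were copied (not part of the real
 * instruction, just convenient for checking). */
size_t rep_movsd(uint32_t *edi, const uint32_t *esi, size_t ecx)
{
    size_t copied = 0;

    while (ecx != 0) {       /* REP: repeat until ecx reaches zero */
        *edi++ = *esi++;     /* MOVSD: copy one dword, advance by 4 */
        ecx--;
        copied++;
    }
    return copied;
}
```

A call such as rep_movsd(target, source, 20) corresponds to the assembly example, with the three arguments playing the roles of edi, esi, and ecx.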

  41. x86 SCAS • Repeat a search until a condition is met SCASB SCASW SCASD • Search for a specific element in an array • Search for the first element that does not match a given value

  42. x86 SCAS
      .data
          alpha BYTE "ABCDEFGH",0
      .code
          mov edi,OFFSET alpha
          mov al,'F'                   ; search for 'F'
          mov ecx,LENGTHOF alpha
          cld
          repne scasb                  ; repeat while not equal
          jnz quit
          dec edi                      ; EDI points to 'F'
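
The search above can be modelled in C to show why the final dec edi is needed: scasb advances edi after every comparison, including the one that matches. In this sketch `repne_scasb` is a hypothetical name, only DF = 0 is modelled, and returning NULL stands in for taking the jnz quit branch.

```c
#include <stddef.h>

/* Models "repne scasb" followed by "jnz quit / dec edi".
 * Scans at most ecx bytes starting at edi for the byte al.
 * Returns a pointer to the first match, or NULL if none is found. */
const char *repne_scasb(const char *edi, char al, size_t ecx)
{
    int zf = 0;                 /* zero flag from the last compare */

    while (ecx != 0) {
        zf = (*edi == al);      /* SCASB: compare al with [edi]    */
        edi++;                  /* ...then advance edi (DF = 0)    */
        ecx--;
        if (zf)                 /* REPNE stops once ZF is set      */
            break;
    }
    if (!zf)
        return NULL;            /* jnz quit: no match              */
    return edi - 1;             /* dec edi: step back to the match */
}
```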

  43. x86-64 Conditionals • Conditional instruction execution • cmovXX src, dest • Move value from src to dest if condition XX holds • No branching • Conditional handled as an operation within the execution unit • Added with the P6 microarchitecture (Pentium Pro onward) • Must ensure gcc compiles with the proper target to use it • Example: (x < y) ? (x) : (y) • Performance • 14 cycles on all data • More efficient than conditional branching (simple control flow) • But overhead: both branches are evaluated
          movl 8(%ebp),%edx      # Get x
          movl 12(%ebp),%eax     # rval=y
          cmpl %edx,%eax         # rval:x
          cmovgl %edx,%eax       # If rval > x, rval=x

  44. x86-64 conditional example
      int absdiff(int x, int y) {
          int result;
          if (x > y) { result = x-y; } else { result = y-x; }
          return result;
      }
      # x in %edi, y in %esi
      absdiff:
          movl %edi, %eax        # eax = x
          movl %esi, %edx        # edx = y
          subl %esi, %eax        # eax = x-y
          subl %edi, %edx        # edx = y-x
          cmpl %esi, %edi        # x:y
          cmovle %edx, %eax      # eax=edx if <=
          ret
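
The shape of the compiled code, computing both candidate results and then selecting one, can also be written directly in C. The sketch below mirrors the assembly step by step; `absdiff_select` is a hypothetical name, and whether a compiler turns the final selection into cmov depends on the target and optimization level, so that is an assumption rather than a guarantee.

```c
/* Same result as absdiff, written in "compute both, then select" form.
 * As with the original, inputs whose difference does not fit in an int
 * are outside what this sketch handles. */
int absdiff_select(int x, int y)
{
    int x_minus_y = x - y;    /* subl %esi, %eax */
    int y_minus_x = y - x;    /* subl %edi, %edx */

    /* cmpl %esi, %edi ; cmovle %edx, %eax */
    return (x <= y) ? y_minus_x : x_minus_y;
}
```

Both subtractions always execute, which is the overhead mentioned on the previous slide; the payoff is that no branch needs to be predicted.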

  45. IA32 function calls • Handled based on calling convention used by the processor and compiler for each language • First, some data structures

  46. IA32 Stack • Region of memory managed with stack discipline • Grows toward lower addresses • Register %esp (the stack pointer) indicates the lowest stack address • Address of the top element • Diagram: addresses increase toward the stack “bottom”; the stack grows down toward the stack “top”, where %esp points

  47. IA32 Stack Pushing • Pushing • pushl Src • Decrement %esp by 4 • Fetch operand at Src • Write operand at address given by %esp • e.g., pushl %eax is equivalent to subl $4,%esp followed by movl %eax,(%esp) • Diagram: addresses increase toward the stack “bottom”; the push moves %esp down by 4 (-4), and the stack “top” is now the newly written element

  48. IA32 Stack Popping • Popping • popl Dest • Read operand at address given by %esp • Write to Dest • Increment %esp by 4 • e.g., popl %eax is equivalent to movl (%esp),%eax followed by addl $4,%esp • Diagram: addresses increase toward the stack “bottom”; the pop moves %esp up by 4 (+4), and the stack “top” is the element above the one just read

  49. Stack Operation Examples
                 Initially      After pushl %eax   After popl %edx
      0x110
      0x10c
      0x108      123 (top)      123                123 (top)
      0x104                     213 (top)          213
      %eax       213            213                213
      %edx       555            555                213
      %esp       0x108          0x104              0x108
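
The push and pop steps can be replayed with a toy model in C. This is a sketch under simplifying assumptions: memory is a small array indexed by address / 4, and `struct toy_cpu`, `pushl`, and `popl` are hypothetical names chosen to match the mnemonics.

```c
#include <stdint.h>

/* Toy machine: a stack pointer and 0x200 bytes of word-addressed memory. */
struct toy_cpu {
    uint32_t esp;
    uint32_t mem[0x200 / 4];
};

/* pushl: decrement %esp by 4, then store the operand at (%esp). */
void pushl(struct toy_cpu *c, uint32_t value)
{
    c->esp -= 4;
    c->mem[c->esp / 4] = value;
}

/* popl: read the operand at (%esp), then increment %esp by 4.
 * The old value stays in memory; only %esp moves. */
uint32_t popl(struct toy_cpu *c)
{
    uint32_t value = c->mem[c->esp / 4];
    c->esp += 4;
    return value;
}
```

Starting with %esp at 0x108 and 123 stored there, pushing 213 leaves %esp at 0x104; popping returns 213 and restores %esp to 0x108, while 213 remains in memory at 0x104, just as in the example values above.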

  50. Procedure Control Flow • Procedure call: • call label • Push address of next instruction (after the call) on the stack • Jump to label • Procedure return: • ret • Pop address from the stack into the eip register
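
call and ret are push and pop applied to the instruction pointer. The C sketch below models both; `struct toy_machine`, `do_call`, and `do_ret` are hypothetical names, and the length of the call instruction is passed in explicitly because the model does not decode instructions.

```c
#include <stdint.h>

/* Toy machine: instruction pointer, stack pointer, and a small memory
 * indexed by address / 4. */
struct toy_machine {
    uint32_t eip;
    uint32_t esp;
    uint32_t mem[128];
};

/* call label: push the address of the instruction after the call,
 * then jump to the target. insn_len is the size of the call itself. */
void do_call(struct toy_machine *m, uint32_t target, uint32_t insn_len)
{
    m->esp -= 4;
    m->mem[m->esp / 4] = m->eip + insn_len;   /* return address */
    m->eip = target;
}

/* ret: pop the return address from the stack into eip. */
void do_ret(struct toy_machine *m)
{
    m->eip = m->mem[m->esp / 4];
    m->esp += 4;
}
```

As a usage example, take the call at 80483ec in the earlier local-variable listing: with eip = 0x80483ec and a 5-byte call instruction, do_call pushes 0x80483f1 (the address of the following leave) and sets eip to 0x80482dc; do_ret later restores eip to 0x80483f1.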
