1 / 42

Assemblers and Linkers

Assemblers and Linkers. Professor Jennifer Rexford COS 217. Goals of This Lecture. Compilation process Compile, assemble, archive, link, execute Assembling Representing instructions Prefix, opcode, addressing modes, operands Translating labels into memory addresses

shiloh
Download Presentation

Assemblers and Linkers

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Assemblers and Linkers Professor Jennifer Rexford COS 217

  2. Goals of This Lecture • Compilation process • Compile, assemble, archive, link, execute • Assembling • Representing instructions • Prefix, opcode, addressing modes, operands • Translating labels into memory addresses • Symbol table, and filling in local addresses • Connecting symbolic references with definitions • Relocation records • Specifying the regions of memory • Generating sections (data, BSS, text, etc.) • Linking • Concatenating object files • Patching references

  3. Compilation Pipeline .c • Compiler (gcc): .c  .s • High-level to assembly language • Assembler (as): .s  .o • Assembly to machine language • Archiver (ar): .o  .a • Object files into single library • Linker (ld): .o + .a  a.out • Builds an executable file • Execution (execlp) • Loads executable and starts it Compiler .s Assembler .o Archiver .a Linker/Loader a.out Execution

  4. Assembler • Assembly language • A symbolic representation of machine instructions • Machine language • Contains everything needed to link, load, and execute the program • Assembler • Translates assembly language into machine language • Translate instruction mnemonics into op-codes • Translate symbolic names for memory locations • Stores the result in an object file (.o)

  5. General IA32 Instruction Format • Prefixes: we won’t worry about these for now • Opcode • ModR/M and SIB (scale-index-base): for memory operands • Displacement and immediate: depending on opcode, ModR/M and SIB • Note: byte order is little-endian (low-order byte of word at lower addresses) Instruction prefixes Opcode ModR/M SIB Displacement Immediate Up to 4prefixes of 1 byte each (optional) 1, 2, or 3 byteopcode 1 byte (if required) 1 byte (if required) 0, 1, 2, or 4 bytes 0, 1, 2, or 4 bytes 7 6 5 3 2 0 7 6 5 3 2 0 Mod Reg/Opcode R/M Scale Index Base

  6. Example: Push on to Stack • Assembly language:pushl %edx • Machine code: • IA32 has a separate opcode for push for each register operand • 50: pushl %eax • 51: pushl %ecx • 52: pushl %edx • … • Results in a one-byte instruction • Observe: sometimes one assembly language instruction can map to a group of different opcodes 0101 0010

  7. Example: Load Effective Address • Assembly language:leal (%eax,%eax,4), %eax • Machine code: • Byte 1: 8D (opcode for “load effective address”) • Byte 2: 04 (dest %eax, with scale-index-base) • Byte 3: 80 (scale=4, index=%eax, base=%eax) 1000 1101 0000 0100 1000 0000 Load the address %eax + 4 * %eax into register %eax

  8. 4 4 4 4 8 2 2 3 3 3 3 1000 1000 1000 1000 00001100 01 11 011 011 001 001 Mod=01 table [EAX]+disp8 0 [ECX]+disp8 1 [EDX]+disp8 2 [EBX]+disp8 3 [--][--]+disp8 4 [EBP]+disp8 5 [ESI]+disp8 6 [EDI]+disp8 7 movl 12(%ecx), %ebx 12 ebx (ecx) modedisp8(%_), %_ Example: Movl (Opcode 44) ModR/M SIB Mod=11 table EAX 0 ECX 1 EDX 2 EBX 3 ESP 4 EBP 5 ESI 6 EDI 7 Instruction prefixes Opcode Displacement Immediate M o d reg R/M B I S movl %ecx, %ebx ebx ecx mov r/m32,r32 mode%_, %_ Reference: IA-32 Intel Architecture Software Developer’s Manual, volume 2, page 2-1, page 2-6, and page 3-441

  9. 4 4 8 32 2 3 3 01100001 00000000000000000000001111100111 1100 0110 00 000 101 Example: Mov Immediate to Memory ModR/M SIB Instruction prefixes Opcode Displacement Immediate M o d reg R/M S I B mov r/m8,imm8 movb $97, 999 999 97 modedisp32 Mod=00 table [EAX] 0 [ECX] 1 [EDX] 2 [EBX] 3 [--][--] 4 disp32 5 [ESI] 6 [EDI] 7

  10. 4 4 8 32 2 3 3 01100001 00000000000000000000001111100111 1100 0110 00 000 101 Encoding as Byte String ModR/M SIB Instruction prefixes Opcode Displacement Immediate M o d reg R/M S I B movb $97, 999 C6 05 E7 03 00 0061 little-endian

  11. 4 4 8 32 2 3 3 00000000000000000000001111100111 01100001 1100 0110 00 000 101 Assembly Language movb $97, 999 C6 05 E7 03 00 00 61 char grade = 67; … grade = ‘a’; .globl grade .data grade: .byte 67 .text . . . movb $97, grade . . . located at address 999

  12. .text ... movl count, %eax ... .data count: .word 0 ... .globl loop loop: cmpl %edx, %eax jge done pushl %edx call foo jmp loop done: Symbol Manipulation Create labels and remember their addresses Deal with the “forward reference problem”

  13. Dealing with Forward References • Most assemblers have two passes • Pass 1: symbol definition • Pass 2: instruction assembly • Or, alternatively, • Pass 1: instruction assembly • Pass 2: patch the cross-reference

  14. symbol table instruction data section instruction text section instruction bss section instruction Implementing an Assembler .s file .o file input assemble output disk in memory structure in memory structure disk

  15. symbol table instruction data section instruction text section instruction bss section instruction Input Functions • Read assembly language and produce list of instructions .s file .o file input assemble output

  16. Input Functions • Lexical analyzer • Group a stream of characters into tokens add %g1 , 10 , %g2 • Syntactic analyzer • Check the syntax of the program <MNEMONIC><REG><COMMA><REG><COMMA><REG> • Instruction list producer • Produce an in-memory list of instruction data structures instruction instruction

  17. ... loop: cmpl%edx, %eax jge done pushl%edx call foo jmp loop done: D 0 3 9 Instruction Assembly [0] 1 byte [2] disp? 7 D 4 bytes [4] 5 2 [5] disp? E 8 [10] disp? E 9 [15] How to compute the address displacements?

  18. .globl loop loop: cmpl %edx, %eax jge done pushl %edx call foo jmp loop done: type def loop label 0 address D 0 3 9 Symbol Table loop done disp_s done 2 disp_l foo 5 disp_l loop 10 def done 15 [0] [2] disp_s 7 D [4] 5 2 [5] disp_l E 8 [10] disp_l E 9 [15]

  19. .globl loop loop: cmpl %edx, %eax jge done pushl %edx call foo jmploop done: type def loop label 0 address D 0 3 9 Symbol Table loop done disp_s done 2 disp_l foo 5 disp_l loop 10 def done 15 [0] [2] disp_s 7 D [4] 5 2 [5] disp_l E 8 [10] disp_l E 9 [15]

  20. .globl loop loop: cmpl %edx, %eax jge done pushl %edx call foo jmploop done: type def loop label 0 address D 0 3 9 Symbol Table loop done disp_s done 2 disp_l foo 5 disp_l loop 10 def done 15 [0] [2] disp_s 7 D +13 [4] 5 2 [5] disp_l E 8 [10] disp_l E 9 [15]

  21. .globl loop loop: cmpl %edx, %eax jge done pushl %edx call foo jmploop done: type def loop label 0 address D 0 3 9 Symbol Table loop done disp_s done 2 disp_l foo 5 disp_l loop 10 def done 15 [0] [2] disp_s 7 D +13 [4] 5 2 [5] disp_l E 8 [10] disp_l E 9 [15]

  22. .globl loop loop: cmpl %edx, %eax jge done pushl %edx call foo jmploop done: type label address D 0 3 9 Filling in Local Addresses loop def loop 0 done disp_s done 2 disp_l foo 5 disp_l loop 10 def done 15 [0] [2] +13 7 D [4] 5 2 [5] disp_l E 8 [10] -10 E 9 [15]

  23. .globl loop loop: cmpl %edx, %eax jge done pushl %edx call foo jmploop done: type label address D 0 3 9 Filling in Local Addresses loop def loop 0 done disp_s done 2 disp_l foo 5 disp_l loop 10 def done 15 [0] [2] +13 7 D [4] 5 2 [5] disp_l E 8 [10] -10 E 9 [15]

  24. .globl loop loop: cmpl %edx, %eax jge done pushl %edx call foo jmploop done: type label address D 0 3 9 Filling in Local Addresses loop def loop 0 done disp_s done 2 disp_l foo 5 disp_l loop 10 def done 15 [0] [2] +13 7 D [4] 5 2 [5] disp_l E 8 [10] -10 E 9 [15]

  25. ... .globl loop loop: cmpl %edx, %eax jge done pushl %edx call foo jmploop done: D 0 3 9 Relocation Records def loop 0 disp_l foo 5 [0] +13 7 D [2] 5 2 [4] disp_l E 8 [5] -10 E 9 [10] [15]

  26. Assembler Directives • Delineate segments • .section • Allocate/initialize data and bss segments • .word .half .byte • .ascii .asciz • .align .skip • Make symbols in text externally visible • .global

  27. symbol table instruction data section instruction text section instruction bss section instruction Assemble into Sections • Process instructions and directives to produce object file output structures .s file .o file input assemble output

  28. symbol table instruction data section instruction text section instruction bss section instruction Output Functions • Machine language output • Write symbol table and sections into object file .s file .o file input assemble output

  29. optional for .o files ELF Header Program Hdr Table Section 1 . . . optional for a.out files Section n Section Hdr Table ELF: Executable and Linking Format • Format of .o and a.out files • Output by the assembler • Input and output of linker

  30. output (also in “.o” format, but no undefined symbols) library (contains more .o files) Invoking the Linker • ld bar.o main.o –l libc.a –o a.out compiled program modules Invoked automatically by gcc, but you can call it directly if you like.

  31. D 0 3 9 Multiple Object Files main.o bar.o start main 0 def foo 8 def loop 0 disp_l loop 15 disp_l foo 5 [0] [0] [2] [2] +13 7 D [4] [4] 5 2 [7] [5] disp_l E 8 [8] [10] -10 E 9 [12] [15] [15] disp_l [20]

  32. D 0 3 9 Step 1: Pick An Order main.o bar.o start main 15+0 def foo 15+8 def loop 0 disp_l loop 15+15 disp_l foo 5 [15] [0] [17] [2] +13 7 D [19] [4] 5 2 [22] [5] disp_l E 8 [23] [10] -10 E 9 [27] [15] [30] disp_l [35]

  33. D 0 3 9 Step 1: Pick An Order main.o bar.o start main 15+0 def foo 15+8 def loop 0 disp_l loop 15+15 disp_l foo 5 [15] [0] [17] [2] +13 7 D [19] [4] 5 2 [22] [5] disp_l E 8 [23] [10] -10 E 9 [27] [15] [30] disp_l [35]

  34. D 0 3 9 Step 2: Patch main.o bar.o start main 15+0 def foo 15+8 def loop 0 disp_l loop 15+15 disp_l foo 5 [15] [0] [17] [2] +13 7 D [19] [4] 5 2 [22] [5] 15+8-5=18 E 8 [23] [10] -10 E 9 [27] [15] [30] 0-(15+15)=-30 disp_l [35]

  35. D 0 3 9 Step 2: Patch main.o bar.o start main 15+0 def foo 15+8 def loop 0 disp_l loop 15+15 disp_l foo 5 [15] [0] [17] [2] +13 7 D [19] [4] 5 2 [22] [5] 15+8-5=18 E 8 [23] [10] -10 E 9 [27] [15] [30] 0-(15+15)=-30 disp_l [35]

  36. D 0 3 9 Step 2: Patch main.o bar.o start main 15+0 def foo 15+8 def loop 0 disp_l loop 15+15 disp_l foo 5 [15] [0] [17] [2] +13 7 D [19] [4] 5 2 [22] [5] 15+8-5=18 E 8 [23] [10] -10 E 9 [27] [15] [30] 0-(15+15)=-30 disp_l [35]

  37. D 0 3 9 Step 2: Patch main.o bar.o start main 15+0 def foo 15+8 def loop 0 disp_l loop 15+15 disp_l foo 5 [15] [0] [17] [2] +13 7 D [19] [4] 5 2 [22] [5] 15+8-5=18 E 8 [23] [10] -10 E 9 [27] [15] [30] 0-(15+15)=-30 disp_l [35]

  38. D 0 3 9 Step 2: Patch main.o bar.o start main 15+0 def foo 15+8 def loop 0 disp_l loop 15+15 disp_l foo 5 [15] [0] [17] [2] +13 7 D [19] [4] 5 2 [22] [5] 15+8-5=18 E 8 [23] [10] -10 E 9 [27] [15] [30] 0-(15+15)=-30 [35]

  39. D 0 3 9 Step 2: Patch main.o bar.o start main 15+0 def foo 15+8 def loop 0 disp_l loop 15+15 disp_l foo 5 [15] [0] [17] [2] +13 7 D [19] [4] 5 2 [22] [5] 15+8-5=18 E 8 [23] [10] -10 E 9 [27] [15] [30] 0-(15+15)=-30 [35]

  40. D 0 3 9 Step 2: Patch main.o bar.o start main 15+0 def foo 15+8 def loop 0 disp_l loop 15+15 disp_l foo 5 [15] [0] [17] [2] +13 7 D [19] [4] 5 2 [22] [5] 15+8-5=18 E 8 [23] [10] -10 E 9 [27] [15] [30] 0-(15+15)=-30 [35]

  41. Step 3: Concatenate a.out start main 15+0 [0] D 0 3 9 [2] +13 7 D [4] 5 2 [5] +18 E 8 [10] -10 E 9 [15] [17] [19] [22] [23] [27] [30] -30 [35]

  42. Summary • Assember • Read assembly language • Two-pass execution (resolve symbols) • Produce object file • Linker • Order object codes • Patch and resolve displacements • Produce executable

More Related