Principles of Computer Architecture, Miles Murdocca and Vincent Heuring. Chapter 5: Languages and the Machine
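Topics • 5.2 The Assembly Process • 5.3 Linking and Loading • 5.4 Macros
[Figure: assembly language programs are translated by the assembler into machine code modules (prog1, prog2, … prog n, plus libs); the linker combines them into an exe file; the loader places it in memory.]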
The Assembly Process • The process of translating an assembly language program into a machine language program is referred to as the assembly process. • Assemblers generally provide this support: — Allow programmer to specify locations of data and code. — Provide assembly-language mnemonics for all machine instructions and addressing modes, and translate valid assembly language statements into the equivalent machine language. — Permit symbolic labels to represent addresses and constants. — Provide a means for the programmer to specify the starting address of the program, if there is one; and provide a degree of assemble-time arithmetic. — Include a mechanism that allows variables to be defined in one assembly language program and used in another, separately assembled program. — Support macro expansion.
Assembly Example • We explore how the assembly process proceeds by “hand assembling” a simple ARC assembly language program.
Macro expansion for PUSH • Macro definition: … • Example uses: PUSH %r2; PUSH %r3; PUSH %r1
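Macro expansion is textual substitution performed before assembly, and a small Python sketch can make that concrete. The body given to `push` below (decrement a stack pointer assumed to be %r14, then store the argument) is a hypothetical stand-in, since the definition is not reproduced on the slide; the `MACROS` table and `expand` function are likewise illustrative names.

```python
import re

# Hypothetical macro table: name -> (formal parameters, body lines).
# The push body is an assumption, not the definition from the slide.
MACROS = {
    "push": (["arg1"], ["addcc %r14, -4, %r14", "st arg1, [%r14]"]),
}


def expand(lines, macros):
    """Replace each macro invocation with its body, substituting arguments."""
    expanded = []
    for line in lines:
        parts = line.split(None, 1)
        name = parts[0].lower() if parts else ""
        if name not in macros:
            expanded.append(line)  # ordinary statement: copy through unchanged
            continue
        params, body = macros[name]
        args = [a.strip() for a in parts[1].split(",")] if len(parts) > 1 else []
        for template in body:
            for param, arg in zip(params, args):
                # Whole-word match so that only the formal parameter is replaced.
                template = re.sub(rf"\b{re.escape(param)}\b", lambda m: arg, template)
            expanded.append(template)
    return expanded


result = expand(["PUSH %r2", "PUSH %r3", "PUSH %r1"], MACROS)
```

Each `PUSH` line becomes two ordinary statements, which the assembler then processes like any other source line.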
Assembled Code
Statement              Machine code
ld [x], %r1            1100 0010 0000 0000 0010 1000 0001 0100
ld [y], %r2            1100 0100 0000 0000 0010 1000 0001 1000
addcc %r1,%r2,%r3      1000 0110 1000 0000 0100 0000 0000 0010
st %r3, [z]            1100 0110 0010 0000 0010 1000 0001 1100
jmpl %r15+4, %r0       1000 0001 1100 0011 1110 0000 0000 0100
15                     0000 0000 0000 0000 0000 0000 0000 1111
9                      0000 0000 0000 0000 0000 0000 0000 1001
0                      0000 0000 0000 0000 0000 0000 0000 0000
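The bit patterns above can be reproduced by packing each instruction's fields into a 32-bit word. The Python sketch below assumes a field layout inferred from those patterns (op, rd, op3, rs1, i, then either a 13-bit immediate or a second source register) and takes x, y and z at 2068, 2072 and 2076, as in the symbol table for this example. The names `encode` and `bits` are illustrative only.

```python
# Field layout inferred from the bit patterns on the slide (an assumption):
#   op(2) | rd(5) | op3(6) | rs1(5) | i=1 | simm13(13)
#   op(2) | rd(5) | op3(6) | rs1(5) | i=0 | 00000000 | rs2(5)
OP3 = {"ld": 0b000000, "st": 0b000100, "addcc": 0b010000, "jmpl": 0b111000}
MEMORY_OPS = {"ld", "st"}  # these use op = 11; the others shown use op = 10


def encode(mnemonic, rd, rs1, *, rs2=None, simm13=None):
    """Pack one instruction into a 32-bit word."""
    op = 0b11 if mnemonic in MEMORY_OPS else 0b10
    word = (op << 30) | (rd << 25) | (OP3[mnemonic] << 19) | (rs1 << 14)
    if simm13 is not None:
        word |= (1 << 13) | (simm13 & 0x1FFF)  # i = 1: immediate operand
    else:
        word |= rs2  # i = 0: second source register in the low 5 bits
    return word


def bits(word):
    """Format a 32-bit word as eight groups of four bits."""
    text = format(word, "032b")
    return " ".join(text[i:i + 4] for i in range(0, 32, 4))


SYMBOLS = {"x": 2068, "y": 2072, "z": 2076}

assembled = [
    bits(encode("ld", 1, 0, simm13=SYMBOLS["x"])),     # ld [x], %r1
    bits(encode("ld", 2, 0, simm13=SYMBOLS["y"])),     # ld [y], %r2
    bits(encode("addcc", 3, 1, rs2=2)),                # addcc %r1,%r2,%r3
    bits(encode("st", 3, 0, simm13=SYMBOLS["z"])),     # st %r3, [z]
    bits(encode("jmpl", 0, 15, simm13=4)),             # jmpl %r15+4, %r0
]
```

Note that the symbol values appear directly in the low 13 bits of the `ld` and `st` words, which is why the addresses must be known before code can be generated.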
2-Pass Assembler • Goes over the program twice • 1st pass: • Finds the addresses of all data items and machine instructions • Uses a location counter, so that forward references can be resolved later • Keeps track of the address of the current instruction or data item as assembly proceeds • Creates the symbol table • Identifies the mnemonic and addressing mode of each instruction, but does not generate machine code yet • Location counter (similar to the PC) • Initialized to the value that .org specifies, or to zero if there is none • Incremented by the size of each instruction or data item
Address for each instruction
2048   ld [x], %r1
2052   ld [y], %r2
2056   addcc %r1,%r2,%r3
2060   st %r3, [z]
2064   jmpl %r15+4, %r0
2068   15   (x)
2072   9    (y)
2076   0    (z)
Symbol table • Contains all labels and constant values in the assembly program • After the first pass, the assembler will have identified all symbols and entered them into the symbol table • During the second pass, the assembler generates the machine code, inserting the values of the symbols, which are now known
Symbol table • Symbol table for example 1:
Symbol   Value
main     2048
x        2068
y        2072
z        2076
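As a rough illustration of how the first pass arrives at this table, the Python sketch below walks the source once with a location counter. The source listing is a reconstruction: the `.org 2048` line and the position of the `main:` label are assumptions made so that the result agrees with the table. Every statement is taken to occupy 4 bytes.

```python
def first_pass(lines, origin=0):
    """Assign an address to every statement and record each label's address."""
    location_counter = origin
    symbol_table = {}
    addresses = []
    for line in lines:
        text = line.split("!")[0].strip()  # '!' is taken to start a comment
        if not text or text in (".begin", ".end"):
            continue
        if text.startswith(".org"):
            location_counter = int(text.split()[1])  # reset the location counter
            continue
        if ":" in text:
            label, text = text.split(":", 1)
            symbol_table[label.strip()] = location_counter
        if text.strip():
            addresses.append(location_counter)
            location_counter += 4  # each instruction or data word is 4 bytes
    return symbol_table, addresses


# Reconstructed source; .org 2048 and the main: label are assumptions.
source = [
    ".begin",
    ".org 2048",
    "main: ld [x], %r1",
    "      ld [y], %r2",
    "      addcc %r1, %r2, %r3",
    "      st %r3, [z]",
    "      jmpl %r15+4, %r0",
    "x:    15",
    "y:    9",
    "z:    0",
    ".end",
]

symbol_table, addresses = first_pass(source)
```

No machine code is produced here; the pass only records where things will live, which is all the second pass needs in order to fill in the address fields.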
Symbol table
Symbol    Value
a_start   3000
length    --
address   --
loop      2060
done      2088
Symbol table cont'd
Symbol    Value
a_start   3000
length    2092 (1)
address   2096 (2)
loop      2060
done      2088
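The two tables show forward referencing in action: a symbol that is used before it is defined is entered with no value, and the value is filled in once the label is reached. The Python sketch below mimics this. The statement list is hypothetical, chosen only so that the labels fall at the addresses in the tables; it is not taken from the slides. Each entry is (label, statement text, symbols referenced).

```python
def first_pass(statements, origin, stop_before=None):
    """Build the symbol table, entering None for symbols used before definition."""
    table = {}
    location_counter = origin
    for label, text, references in statements:
        if label is not None and label == stop_before:
            break  # lets us inspect the table part-way through the pass
        if text.startswith(".equ"):
            table[label] = int(text.split()[1])  # constant: occupies no memory
            continue
        if label is not None:
            table[label] = location_counter  # define, or fill in a forward ref
        for name in references:
            table.setdefault(name, None)  # forward reference: value not yet known
        location_counter += 4
    return table


# Hypothetical program, consistent with the addresses in the tables above.
statements = [
    ("a_start", ".equ 3000", []),
    (None, "ld [length], %r1", ["length"]),
    (None, "ld [address], %r2", ["address"]),
    (None, "andcc %r3, %r0, %r3", []),
    ("loop", "andcc %r1, %r1, %r0", []),
    (None, "be done", ["done"]),
    (None, "addcc %r1, -4, %r1", []),
    (None, "addcc %r1, %r2, %r4", []),
    (None, "ld [%r4], %r5", []),
    (None, "ba loop", ["loop"]),
    (None, "addcc %r3, %r5, %r3", []),
    ("done", "jmpl %r15+4, %r0", []),
    ("length", "20", []),
    ("address", "a_start", ["a_start"]),
]

partial = first_pass(statements, origin=2048, stop_before="length")
final = first_pass(statements, origin=2048)
```

`partial` corresponds to the first table (length and address still unknown) and `final` to the second.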
Second pass • After the symbol table is created, the program is read a second time, starting from .begin • Machine code is generated using the symbol table
Final tasks of the assembler • Extra info is added to the program for the linker (which links modules together) and the loader (which puts the program into memory): • Module name and size • Memory and segment info, such as code, data, stack • The address of the start symbol (if defined) • The info about global and external symbols • The address of any global symbols • Info about any library routines that are referenced by the module • The value of any constants to be loaded into memory • Some loaders expect data initialization to be specified separately from the binary code • Relocation info
Relocation info • Linker will link all the modules by concatenating them • To do so some of the modules need to be relocated • Example:
Location of programs in memory • A program can be placed anywhere in a user-accessible segment, so it might not end up at the address indicated by .org • Modules are concatenated one after the other • After linking, some modules are relocated to other addresses • Most of the addresses we use are relocatable • Exception: • fixed addresses (I/O devices)
Linking and loading • A program usually consists of a number of separately compiled/assembled modules • Linker: a software program • Combines separately assembled programs (object modules) into a single program (load module/exe) • Loader: a software program that places the load module into main memory • Loads the various memory segments with the appropriate values and initializes certain registers (e.g. %sp, %pc)
Linking • Resolves address references that are external to modules as it links them • Relocates each module in order to combine them • Specifies the starting symbol of the load module • Identifies different memory segments • Needs to distinguish local symbol names from global symbol names • Only address labels can be global or external • Some symbols cannot be relocated • External symbols, which are defined in another module • Absolute numbers (defined by pseudo-ops) • ONE .equ 1 • a_start .equ 3000 • When an address is relocated, its content does not change • e.g. x: 105 • If x is relocated, its address changes, but the content (105) remains the same
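The distinction between relocatable and absolute symbols can be sketched in a few lines of Python. The starting address of x (2048) and the offset (1024) are made-up values for illustration; `relocate_module` is a hypothetical helper, not part of any real linker.

```python
def relocate_module(symbols, memory, offset):
    """Shift relocatable symbols and memory addresses; leave the rest alone.

    symbols: name -> (value, relocatable flag)
    memory:  address -> content
    """
    moved_symbols = {
        name: (value + offset if relocatable else value, relocatable)
        for name, (value, relocatable) in symbols.items()
    }
    # Addresses move; the stored contents are copied unchanged.
    moved_memory = {address + offset: content for address, content in memory.items()}
    return moved_symbols, moved_memory


symbols = {
    "x": (2048, True),         # address label: relocatable (address is illustrative)
    "ONE": (1, False),         # ONE .equ 1: absolute
    "a_start": (3000, False),  # a_start .equ 3000: absolute
}
memory = {2048: 105}           # x: 105

new_symbols, new_memory = relocate_module(symbols, memory, offset=1024)
```

After relocation x refers to 3072 and the word stored there is still 105, while ONE and a_start keep their values.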
Linking: Using .global and .extern • A .global is used in the module where a symbol is defined and a .extern is used in every other module that refers to it.
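A minimal Python sketch of how a linker might use this information follows. Each module is described by its size, the symbols it declares `.global` (as offsets from the module start) and the symbols it declares `.extern`. The module names, sizes and the symbol `sub` are hypothetical.

```python
def link(modules, load_base=0):
    """Concatenate modules and resolve .extern references against .global symbols."""
    bases = {}
    global_table = {}
    base = load_base
    for module in modules:
        bases[module["name"]] = base
        for symbol, offset in module["global"].items():
            if symbol in global_table:
                raise ValueError(f"duplicate global symbol: {symbol}")
            global_table[symbol] = base + offset  # final address after relocation
        base += module["size"]  # the next module is placed directly after this one

    resolved = {}
    for module in modules:
        for symbol in module["extern"]:
            if symbol not in global_table:
                raise KeyError(f"unresolved external symbol: {symbol}")
            resolved[(module["name"], symbol)] = global_table[symbol]
    return bases, resolved


# Hypothetical modules: a main program that calls a routine defined elsewhere.
modules = [
    {"name": "main_module", "size": 20, "global": {}, "extern": ["sub"]},
    {"name": "sub_module", "size": 16, "global": {"sub": 0}, "extern": []},
]

bases, resolved = link(modules, load_base=2048)
```

The second module is relocated to follow the first, and the reference to `sub` in the main module resolves to the relocated address.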
Linking and Loading: Symbol Tables • Symbol tables for the previous example:
Loading • Relocating loader • When there is more than one module, relocates a module by adding an offset to all relocatable code in that module • Linking loader • Does both linking and loading • Header info • An exe file contains a header (inserted by the linker) • Where to load the program • Starting address • Relocation info • What routines are available externally • DLL (Microsoft)
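A relocating loader driven by header information might look roughly like the Python sketch below. The header fields (`entry_offset`, `relocations`) and the image contents are hypothetical, and patching a whole word is a simplification: a real loader would adjust only the address field inside an instruction.

```python
def load(image, memory, load_address):
    """Copy an image into memory, patch relocatable words, return the start address."""
    words = list(image["words"])
    for index in image["relocations"]:
        words[index] += load_address  # word holds an address relative to the image start
    for index, word in enumerate(words):
        memory[load_address + 4 * index] = word
    return load_address + image["entry_offset"]


# Hypothetical image: word 1 holds the image-relative address (8) of word 2.
image = {
    "entry_offset": 0,
    "relocations": [1],
    "words": [100, 8, 105],
}

memory = {}
start = load(image, memory, load_address=4096)
```

Only the word listed in the relocation info changes (8 becomes 4104); the data value 105 is loaded unchanged, and `start` is the address at which execution would begin.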