330 likes | 541 Views
Computer Architecture & Organization. Assemblers and Compilers Engr. Umbreen Sabir Computer Engineering Department, University of Engg. & Technology Taxila. Path from Programs to Bits. C OMPILERS.
E N D
Computer Architecture & Organization CA&O Lecture 7 by Engr. Umbreen Sabir Assemblers and Compilers Engr. Umbreen Sabir Computer Engineering Department, University of Engg. & Technology Taxila.
Path from Programs to Bits CA&O Lecture 7 by Engr. Umbreen Sabir
COMPILERS • The compiler transforms the C program into an assembly language program, a symbolic form of what the machine understands. • Higher level language programs take many fewer lines of code than assembly language. • So, programmer productivity is higher. CA&O Lecture 7 by Engr. Umbreen Sabir
How an Assembler Works • Three major components of assembly 1) Allocating and initialing data storage 2) Conversion of mnemonics to binary instructions 3) Resolving addresses • Pseudoinstructions: • Assembler can also treat common variations of machine level instructions as if they were instructions in their own right. Such instructions are called Pseudoinstructions. e.g. Move $t0, $t1 #register $t0 gets register $t1. • Assembler converts this assembly language instruction into the machine language equivalent of the following instruction: add $t0, $zero, $t1 • It converts bltinto two instructions sltand bne. • MIPS assembler also allows 32-bit constants to be loaded into a register despite the 16-bit limit of immediate instructions. CA&O Lecture 7 by Engr. Umbreen Sabir
How an Assembler Works Cont. • The assembler turn the assembly language program into an object file, which is a combination of machine language instructions, data and information needed to place instructions in memory. • Assembler keeps track of labels used in branches and data transfer instructions in a symbol table. CA&O Lecture 7 by Engr. Umbreen Sabir
Object File Components • Object file typically contains six distinct pieces. • Object File Header: describes size and position of other pieces. • Text Segment: Contains machine code. • Data segment: static or dynamic data. • Relocation information: identifies instructions and data words that depend on absolute addresses when the program is loaded in memory. • Symbol Table: contains the remaining labels such as external references. • Debugging Information CA&O Lecture 7 by Engr. Umbreen Sabir
Example CA&O Lecture 7 by Engr. Umbreen Sabir
Resolving Addresses- 1st Pass“Old-style” 2-pass assembler approach CA&O Lecture 7 by Engr. Umbreen Sabir
Resolving Addresses – 2nd Pass CA&O Lecture 7 by Engr. Umbreen Sabir
Modern Way – 1-Pass Assemblers • Modern assemblers keep more information in their symbol table which allows them to resolve addresses in a single pass. • Known addresses (backward references) are immediately resolved. • Unknown addresses (forward references) are “back-filled” once they are resolved. CA&O Lecture 7 by Engr. Umbreen Sabir
The Role of a Linker • Some aspects of address resolution cannot be handled by the assembler alone. 1) References to data or routines in other object modules 2)The layout of all segments in memory 3) Support for REUSABLE code modules 4) Support for RELOCATABLE code modules • This final step of resolution is the job of a LINKER CA&O Lecture 7 by Engr. Umbreen Sabir
Linker • Also called link editor. • A system program that combines independently assembled machine language programs and resolves all undefined labels into an executable file. • Three steps of linker • Place code and data modules symbolically in memory. • Determine the addresses of data and instruction labels. • Place both the internal and external references. • The linker uses the relocation information and symbol table in each object module to resolve all undefined labels. • The linker produces an executable file that can be run on any computer. This file has same format as object file, except that it contains no unresolved references. CA&O Lecture 7 by Engr. Umbreen Sabir
Loader • A system program that places an object program in main memory so that it is ready to execute. • It follows following steps • Reads the exe file header to determine size of the text and data segments. • Creates an address space large enough for text and data. • Copies the instructions and data from the exe to memory. • Copies the parameters to the main program onto the stack. • Initializes the machine registers and sets the stack pointer to the first free location. • Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program. Terminates program by exit system call. CA&O Lecture 7 by Engr. Umbreen Sabir
Static and Dynamic Libraries • LIBRARIES are commonly used routines stored as a concatenation of “Object files”. A global symbol table is maintained for the entire library with entry points for each routine. • When routines in LIBRARIES are referenced by assembly modules, the routine’s entry points are resolved by the LINKER, and the appropriate code is added to the executable. This sort of linking is called STATIC linking. • Many programs use common libraries. It is wasteful of both memory and disk space to include the same code in multiple executables. The modern alternative to STATIC linking is to allow the LOADER and THE PROGRAM ITSELF to resolve the addresses of libraries routines. This form of linking is called DYNAMIC linking (e.x. .dll). CA&O Lecture 7 by Engr. Umbreen Sabir
Dynamically Linked Libraries • Library routines are not linked and loaded until the program is run. • Program and library routines keep extra information on the location of nonlocal procedures and their names. • In Initial version, loader ran a dynamic linker to linker, using extra information in file to find appropriate libraries but it still linked all routines of the library. • Then came lazy procedure linker version of DLLs, where each routine is linked after it is called. CA&O Lecture 7 by Engr. Umbreen Sabir
Dynamically Linked Libraries CA&O Lecture 7 by Engr. Umbreen Sabir
Dynamically Linked Libraries CA&O Lecture 7 by Engr. Umbreen Sabir
Dynamically Linked Libraries CA&O Lecture 7 by Engr. Umbreen Sabir
Modern Languages • Intermediate “object code language” • Rather than compiled to assembly language, java is compiled to instructions that are easy to interpret, the Java Byte Code, • A software interpreter, Java Virtual Machine (JVM) can execute java bytecodes. • An interpreter is a program that simulates an instruction set architecture. E.g. MIPS simulator is an interpreter. • Upside of interpreter is portability. • Downside of interpreter is low performance. • To preserve portability and improve performance, Just In Time (JIT) compilers were introduced, which translate while the program is running. CA&O Lecture 7 by Engr. Umbreen Sabir
Modern Languages • Intermediate “object code language” CA&O Lecture 7 by Engr. Umbreen Sabir
Modern Languages • Intermediate “object code language” CA&O Lecture 7 by Engr. Umbreen Sabir
Compiler Optimizations • High Level Optimization • Transformations that are done at something close to the source level. • Local & Global Optimization • Local Optimization • Works within a single basic block. • Global Optimization • Works across multiple basic blocks. • Global Register Allocation • Allocates variables to registers for regions of the code. CA&O Lecture 7 by Engr. Umbreen Sabir
Compiler Optimizations • Common Subexpression Elimination: • Finds multiple instances of the same expression and replaces the second one by a reference to the first. • Strength Reduction: • Replaces complex operations by simpler ones. • Constant Propagation/Constant Folding: • Find constants in code and propagates them, collapsing constant values whenever possible. • Copy Propagations: • Propagates values that are simple copies, eliminating the need to reload values. • Dead Store Elimination: • Finds stores to values that are not used again and eliminates the stores. CA&O Lecture 7 by Engr. Umbreen Sabir
Compiler Optimizations • Global Code Optimization • Code motion • Finds code that is loop invariant: a particular piece of code computes the same value on every loop iteration and may be computed once outside the loop. • Induction variable elimination • Combination of transformations that reduce overhead on indexing arrays, essentially replacing array indexing with pointer access. CA&O Lecture 7 by Engr. Umbreen Sabir
Compiler Optimizations CA&O Lecture 7 by Engr. Umbreen Sabir
Compiler Optimizations • Example “C” Code: CA&O Lecture 7 by Engr. Umbreen Sabir
Unoptimized Assembly Output • With debug flags set: CA&O Lecture 7 by Engr. Umbreen Sabir
Register Allocation • Assign local variables to registers CA&O Lecture 7 by Engr. Umbreen Sabir
Loop-Invariant Code Motion • Assign globals to temp registers and moves assignments outside of loop CA&O Lecture 7 by Engr. Umbreen Sabir
Remove Unnecessary Tests • Since “i” is initially set to “0”, we already know it is less than “10”, so why test it the first time through? CA&O Lecture 7 by Engr. Umbreen Sabir
Remove Unnecessary Stores • All we care about it the value of total after the loop, and simplify loop CA&O Lecture 7 by Engr. Umbreen Sabir
Unrolling Loop • Two copies of the inner loop reduce the branching overhead CA&O Lecture 7 by Engr. Umbreen Sabir