Chapter 8 ICS 412
Code Generation
• Code generation is the final phase of a compiler.
• It generates executable code for a target machine.
• A compiler may instead generate some form of assembly code that must be processed further by an assembler, a linker, and a loader.
Intermediate Code
• An intermediate representation that looks like target code is called intermediate code.
• Intermediate code can take many forms.
• Two popular forms are:
  • Three-address code
  • P-code
Intermediate Code
• Intermediate code is particularly useful when the goal of the compiler is to produce efficient code.
• Producing efficient code requires analyzing the properties of the target code, and that analysis is easier to carry out on intermediate code.
• Intermediate code can also be useful in making the compiler independent of the target machine.
• To generate code for a different target machine, we only need to write a translator from the intermediate code to the new target code.
Three-Address Code
• Three-address code has the following general form: x = y op z
• The name "three-address code" comes from this form of instruction.
• Each of x, y, and z represents an address in memory (y and z may also be constants).
Example
• Consider the arithmetic expression: 2 * a + (b - 3)
• The corresponding three-address code is:
    t1 = 2 * a
    t2 = b - 3
    t3 = t1 + t2
Three-Address Code
• t1, t2, and t3 are temporaries. They correspond to the interior nodes of the syntax tree and represent their computed values, with the final temporary (t3, in this example) representing the value of the root.
• Syntax tree for 2 * a + (b - 3):
          +
        /   \
       *     -
      / \   / \
     2   a b   3
Three-Address Code
• The above three-address code represents a left-to-right linearization of the syntax tree.
• Another order is possible for this three-address code, namely (with a different meaning for the temporaries):
    t1 = b - 3
    t2 = 2 * a
    t3 = t2 + t1
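The left-to-right linearization can be sketched as a postorder walk of the syntax tree that creates one temporary per interior node. This is only an illustration: the tuple encoding of the tree and the name gen_tac are assumptions of the sketch, not part of the slides.

```python
from itertools import count


def gen_tac(tree):
    """Return (code, result_name) for an expression tree.

    Assumed encoding: a leaf is a str (variable) or an int (constant);
    an interior node is a tuple (op, left, right).
    """
    code = []
    temps = count(1)

    def visit(node):
        if not isinstance(node, tuple):
            return str(node)            # leaves need no instruction
        op, left, right = node
        l = visit(left)                 # left subtree first: left-to-right order
        r = visit(right)
        t = f"t{next(temps)}"           # one temporary per interior node
        code.append(f"{t} = {l} {op} {r}")
        return t

    result = visit(tree)
    return code, result


# Syntax tree for 2 * a + (b - 3)
tree = ("+", ("*", 2, "a"), ("-", "b", 3))
code, result = gen_tac(tree)
print("\n".join(code))
```

Visiting the right subtree before the left one would yield the alternative ordering, with the temporaries numbered differently.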
Three-Address Code
• A single form of three-address code is insufficient to represent all language features.
• For instance, a unary operator such as negation uses a two-address variation of three-address code: t2 = - t1
• It is necessary to vary the form of three-address code to represent all programming constructs.
Factorial Example
• Source program:
    read x;
    if 0 < x then
      fact := 1;
      repeat
        fact := fact * x;
        x := x - 1
      until x = 0;
      write fact
    end
• Three-address code:
    read x
    t1 = x > 0
    if_false t1 goto L1
    fact = 1
    label L2
    t2 = fact * x
    fact = t2
    t3 = x - 1
    x = t3
    t4 = x == 0
    if_false t4 goto L2
    write fact
    label L1
    halt
Factorial Example
• This code contains a number of different forms of three-address code:
  • The built-in input/output operations read and write have been translated directly into one-address instructions.
  • The conditional jump instruction if_false, which is used to translate both the if-statement and the repeat-statement, contains two addresses: the value to test and the jump target.
  • A one-address label instruction marks the position of a jump target.
  • A halt instruction marks the end of the code.
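To see how these instruction forms behave, here is a minimal interpreter sketch for the factorial three-address code. The tuple encoding of instructions (with the hypothetical tags "binop" and "copy") and the function name run are assumptions made for illustration; the slides do not define an execution model for three-address code.

```python
import operator

OPS = {"*": operator.mul, "-": operator.sub,
       ">": operator.gt, "==": operator.eq}

# The factorial three-address code, one tuple per instruction (assumed encoding).
PROGRAM = [
    ("read", "x"),
    ("binop", "t1", "x", ">", 0),
    ("if_false", "t1", "L1"),
    ("copy", "fact", 1),
    ("label", "L2"),
    ("binop", "t2", "fact", "*", "x"),
    ("copy", "fact", "t2"),
    ("binop", "t3", "x", "-", 1),
    ("copy", "x", "t3"),
    ("binop", "t4", "x", "==", 0),
    ("if_false", "t4", "L2"),
    ("write", "fact"),
    ("label", "L1"),
    ("halt",),
]


def run(program, inputs):
    """Execute the program; return the list of written values."""
    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "label"}
    env, out, pc = {}, [], 0
    inputs = iter(inputs)

    def val(a):
        # A str operand names a variable or temporary; anything else is a constant.
        return env[a] if isinstance(a, str) else a

    while True:
        ins = program[pc]
        pc += 1
        kind = ins[0]
        if kind == "read":
            env[ins[1]] = next(inputs)
        elif kind == "write":
            out.append(val(ins[1]))
        elif kind == "copy":
            env[ins[1]] = val(ins[2])
        elif kind == "binop":
            _, dst, a, op, b = ins
            env[dst] = OPS[op](val(a), val(b))
        elif kind == "if_false":
            if not val(ins[1]):
                pc = labels[ins[2]]     # jump to the labelled position
        elif kind == "halt":
            return out
        # "label" only marks a position, so it falls through as a no-op


print(run(PROGRAM, [5]))
```

With input 5 the loop runs five times and the program writes 120; with input 0 the first if_false jumps straight to L1 and nothing is written.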
P-Code
• P-code began as a standard target assembly code produced by a number of Pascal compilers of the 1970s and early 1980s.
• It was designed to be the actual code for a hypothetical stack machine, called the P-machine, for which an interpreter was written on various actual machines.
• The idea was to make Pascal compilers easily portable by requiring only that the P-machine interpreter be rewritten for a new platform.
P-Code
• The P-machine consists of:
  • A code memory.
  • An unspecified data memory for named variables.
  • A stack for temporary data, together with whatever registers are needed to maintain the stack and support execution.
Example 1
• Consider the expression: 2 * a + (b - 3)
• P-code for this expression is as follows:
    ldc 2    ; load constant 2
    lod a    ; load value of variable a
    mpi      ; integer multiplication
    lod b    ; load value of variable b
    ldc 3    ; load constant 3
    sbi      ; integer subtraction
    adi      ; integer addition
Example 1
• These instructions are to be viewed as representing the following P-machine operations:
  • ldc 2 pushes the value 2 onto the temporary stack.
  • lod a pushes the value of the variable a onto the stack.
  • mpi pops these two values from the stack, multiplies them, and pushes the result onto the stack.
  • lod b and ldc 3 push the value of b and the constant 3 onto the stack (there are now three values on the stack).
  • sbi pops the top two values from the stack, subtracts the top value from the one below it, and pushes the result.
  • adi pops the remaining two values from the stack, adds them, and pushes the result.
• The code ends with a single value on the stack, representing the result of the computation.
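The stack behaviour described above can be traced with a small sketch. It handles only the five instructions used in this example; the function name run_expr and the sample values a = 5 and b = 10 are assumptions chosen for illustration.

```python
def run_expr(code, memory):
    """Run straight-line P-code and return the final temporary stack."""
    stack = []
    for instr in code:
        op, *arg = instr.split()
        if op == "ldc":
            stack.append(int(arg[0]))        # push a constant
        elif op == "lod":
            stack.append(memory[arg[0]])     # push the value of a variable
        elif op in ("mpi", "sbi", "adi"):
            right = stack.pop()              # the top of the stack is the right operand
            left = stack.pop()
            if op == "mpi":
                stack.append(left * right)
            elif op == "sbi":
                stack.append(left - right)
            else:
                stack.append(left + right)
        else:
            raise ValueError(f"unsupported instruction: {instr}")
        print(f"{instr:<6} stack = {stack}")  # show the stack after each step
    return stack


code = ["ldc 2", "lod a", "mpi", "lod b", "ldc 3", "sbi", "adi"]
final = run_expr(code, {"a": 5, "b": 10})
```

With these sample values the stack holds three entries just before sbi and ends as [17], which matches 2 * 5 + (10 - 3).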
Example 2
• Consider the assignment statement: x = y + 1
• The corresponding P-code is:
    lda x    ; load address of x
    lod y    ; load value of y
    ldc 1    ; load constant 1
    adi      ; add
    sto      ; store top to address below top & pop both
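P-code for expressions and assignments like the ones above can be produced by a postorder walk of the syntax tree: operands first, then the operator. The sketch below assumes a tuple encoding of the tree and uses hypothetical helper names (gen_expr, gen_assign); it covers only the three integer operators that appear in the examples.

```python
ARITH = {"+": "adi", "-": "sbi", "*": "mpi"}


def gen_expr(node, code):
    """Append P-code that leaves the value of node on top of the stack."""
    if isinstance(node, int):
        code.append(f"ldc {node}")       # constant leaf
    elif isinstance(node, str):
        code.append(f"lod {node}")       # variable leaf
    else:
        op, left, right = node           # interior node: (op, left, right)
        gen_expr(left, code)
        gen_expr(right, code)
        code.append(ARITH[op])           # operator after both operands


def gen_assign(target, expr):
    """P-code for `target = expr`: address first, then the value, then sto."""
    code = [f"lda {target}"]
    gen_expr(expr, code)
    code.append("sto")
    return code


print("\n".join(gen_assign("x", ("+", "y", 1))))
```

No temporaries are named anywhere in the generator; the stack holds intermediate values implicitly.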
Example 3
    read x;
    if 0 < x then
      fact := 1;
      repeat
        fact := fact * x;
        x := x - 1
      until x = 0;
      write fact
    end
P-Code
    lda x      ; load address of x
    rdi        ; read an integer, store to address on top of stack (& pop it)
    lod x      ; load the value of x
    ldc 0      ; load constant 0
    grt        ; pop and compare top two values, push Boolean result
    fjp L1     ; pop Boolean value, jump to L1 if false
    lda fact   ; load address of fact
    ldc 1      ; load constant 1
    sto        ; pop two values, storing first to address of second
    lab L2     ; definition of label L2
    lda fact   ; load address of fact
    lod fact   ; load value of fact
    lod x      ; load value of x
    mpi        ; multiply
    sto        ; store top to address of second & pop
    lda x      ; load address of x
    lod x      ; load value of x
    ldc 1      ; load constant 1
    sbi        ; subtract
    sto        ; store (as before)
    lod x      ; load value of x
    ldc 0      ; load constant 0
    equ        ; test for equality
    fjp L2     ; jump to L2 if false
    lod fact   ; load value of fact
    wri        ; write top of stack & pop
    lab L1     ; definition of label L1
    stp        ; stop execution
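The factorial P-code can be executed by a small stack-machine sketch. Several things here are assumptions rather than details of the real P-machine: an address is modeled simply as the variable's name, data memory is a dictionary, and only the instructions that occur in this listing are supported. The name run_pcode is hypothetical.

```python
FACTORIAL = """
lda x
rdi
lod x
ldc 0
grt
fjp L1
lda fact
ldc 1
sto
lab L2
lda fact
lod fact
lod x
mpi
sto
lda x
lod x
ldc 1
sbi
sto
lod x
ldc 0
equ
fjp L2
lod fact
wri
lab L1
stp
"""

BINARY = {
    "mpi": lambda a, b: a * b,
    "sbi": lambda a, b: a - b,
    "adi": lambda a, b: a + b,
    "grt": lambda a, b: a > b,
    "equ": lambda a, b: a == b,
}


def run_pcode(text, inputs):
    """Interpret P-code text; return the list of written values."""
    code = [line.split() for line in text.strip().splitlines()]
    labels = {ins[1]: i for i, ins in enumerate(code) if ins[0] == "lab"}
    memory, stack, out = {}, [], []
    inputs = iter(inputs)
    pc = 0
    while True:
        op, *arg = code[pc]
        pc += 1
        if op == "lda":
            stack.append(arg[0])             # address modeled as the variable name
        elif op == "lod":
            stack.append(memory[arg[0]])
        elif op == "ldc":
            stack.append(int(arg[0]))
        elif op == "rdi":
            address = stack.pop()
            memory[address] = next(inputs)   # read an integer into that address
        elif op == "sto":
            value = stack.pop()
            address = stack.pop()
            memory[address] = value
        elif op in BINARY:
            b = stack.pop()
            a = stack.pop()
            stack.append(BINARY[op](a, b))
        elif op == "fjp":
            if not stack.pop():
                pc = labels[arg[0]]          # jump when the popped value is false
        elif op == "wri":
            out.append(stack.pop())
        elif op == "stp":
            return out
        elif op != "lab":                    # lab only marks a position
            raise ValueError(f"unsupported instruction: {op}")


print(run_pcode(FACTORIAL, [5]))
```

For input 5 the program writes 120; for input 0 the first fjp jumps to L1 and the program stops without writing anything.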
P-Code vs. Three-Address Code
• P-code
  • Closer to actual machine code.
  • Instructions need fewer addresses (one or zero).
  • The stack handles temporaries implicitly, so the compiler does not need to generate names or locations for them.
• Three-address code
  • Needs fewer instructions for the same computation.
  • Each instruction is more complex, so there is less code to generate.