630 likes | 1.21k Views
Intermediate Code Generation. Recall. The objective of a compiler is to analyze a source program and produce target code Front end analyzes the source program and generates an intermediate code Back end takes the Intermediate code as input and generates the target code.
E N D
Recall • The objective of a compiler is to analyze a source program and produce target code • Front end analyzes the source program and generates an intermediate code • Back end takes the Intermediate code as input and generates the target code
Language Vs. Target machine Lexical analyzer Parser Static Checker Intermediate Code Generator • Front End • Language dependent • Machine Independent • Input : source code • Output: Intermediate Code Code Generator • Back End • Language independent • Machine dependent • Input: Intermediate Code • Output : Target code
Advantages of using Intermediate Code • Retargeting to a different machine • Optimization of the code at intermediate level Front end Intermediate Representation Target machine 2 Language 1 Back end Target machine 1 Language 2 Target machine 3 Language 3
What is an Intermediate code instruction? • Any representation of the small units of HLL instruction, which can easily be translated into target code • Example: Consider a HLL instruction x=a+b*c; • Considering the limitation of the target machine to execute one operation at a time, this must be broken down to instructions t=b*c; x=a+t;
Example target machine t=b*c; x=a+t; Intermediate code Load R0,b Load R1,c Mul R2,R1,R0 Store t,R2 Load R0,t Load R1,a Add R2,R0,R1 Store R2, x • Available instructions • Load /store • Add • Mul • Addressing modes • Name x refers to the location holding value of x • a(R) to fetch the contents(a+contents(R)) • Etc. [more during code generation] Target code • Can be optimized • In terms of cost of instructions • In terms of less no of registers used
What do you have in hand at this stage? • Recognized the symbols in the source code • Created the input token stream • Checked the syntax of the source code • Reported errors, with respect to the line no. • Created and populated the symbol table • Lexemes, tokens, type, width, offset, value(static) • Created the abstract syntax tree (AST)
Abstract syntax Tree • Abstract syntax Tree • Each leaf node has a pointer to its symbol table entry • Each non-leaf node has an operation associated with it • Purpose of the AST • To extract meaning associated with each operation • This information along with symbol table will help in generating the target code
Abstract syntax tree for x=a+b*c; t=b*c; x=a+t; Intermediate code = Does this AST give any information? x + • We need to associate semantic rules with each node • We need to associate appropriate attributes with each node • We need to traverse the AST for the purpose of generating information using the defined semantic rules a * b c
Abstract syntax tree for x=a+b*c; t=b*c; x=a+t; Intermediate code = Let GenerateCode(op,left,right) generate New temp = left.lexeme op right.lexeme x + • Question: How would you prefer to traverse so that the desired code is generated? • Preorder? • Inorder? • Postorder? a * b c
Example : Intermediate code generation using AST t=b*c; x=a+t; Intermediate code t1=b * c = a=b * c x + a * b c All names are just the pointers to the corresponding symbol table
Three address code • It is linearized representation of an AST • At most one operator on the right side of an instruction is permissible • Temporary names can be generated to store temporary results • At most three addresses (any combination of the following) are permitted • Names of variables (representing memory locations) • A constant • Names of temporaries (may be mapped to registers or to memory)
Symbolic three address instructions (Intermediate Language used in the text book) • x = y op z //op is binary operator • x = op y //op is unary operator • x = y //copy instruction • goto L //unconditional jump to L • if x goto L //conditional jump • if x relop y goto L //conditional jump • param x //parameters in function • call p,n //function p with n parameters • return y //y being the return value
Symbolic three address instructions (Intermediate Language used in the text book) • x=a[i] //indexed copy • b[i]= y //indexed copy • x=&y //address and pointer • x=*y //assignments • *x= y
Example : Create intermediate code for the following C like codex=a[i]+b[i]; t1= a[i] t2= b[i] x= t1+t2 The code generation will require exact offsets of the elements Equivalent target code may be as follows Load R0,i Mul R1,i,4 Load R2,a(R1) //contents(a+contents(R1)) Load R3,b(R1) Add R2,R2,R3
Syntax-Directed Translation for generating 3-address code. • Attributes • E.addr: the name that will hold the value of E • E.code: holds the three address code statements that evaluate E • Use functions • Newtemp() • gen()
Translation of Expressions • S id = E { S.code = E.code||gen(id.addr‘=’E.addr ‘;’) } • E E1+E2{E.addr= newtemp () E.code = E1.code || E2.code || || gen(E.addr‘=’E1.addr ‘+’ E2.addr) } • E E1*E2{E.addr= newtemp () E.code = E1.code || E2.code || || gen(E.addr‘=’E1.addr‘*’E2.addr) } • E -E1{E.addr= newtemp() E.code = E1.code || || gen(E.addr‘=’ ‘uminus’ E1.addr) } • E (E1){E.addr= E1.addr ; E.code = E1.code} • E id {E.addr= id.lexe me E.code = ‘ ’} NOTE: E.addr represents the name of the value holder e.g. a,b,c,t1,t2 etc..
Recall following Example : create Intermediate code generation using AST and SDT (previous slide) t=b*c; x=a+t; Intermediate code Rule:1 Class Assignment: Work out the SDT scheme Generate 3 address code = Rule:-- Rule:2 x + Rule:3 Rule:6 a * Rule:6 Rule:6 What are the attributes? Are they synthesized or inherited? What is the evaluation order? b c
Implementations of 3-address statements • Quadruples t1:=- c t2:=b * t1 t3:=- c t4:=b * t3 t5:=t2 + t4 a:=t5
Translation of control flow statements • Example statements • Sif (B) S • If ((x<y) || (x>30)) z=x+y; • Sif (B) S else S • If ((x<y) || (x>30)) z=x+y; else z=x-y;
Boolean grammar Parse the following input Boolean expression x<y || x>30 • B B || B • B B && B • B ~B • B E relop E • E id • E num B B || B E relop E E relop E id id num id
Statement grammar with control flow instructions S if (B) S S if (B) S else S S while(B) S S S S S id = E • B B || B • B B && B • B ~B • B E relop E • E id • E num
if x<y || x>30 z=x+y Think about the semantic rules that generate 3 address code for this HLL instruction S S if B id = E B || B E + E E relop E E relop E id id id num id id
Expected intermediate code for HLL instructionif x<y || x>30 z=x+y if x < y goto L1 if x > 30 goto L1 goto L2 z=x+y L1: L2:
Expected intermediate code for HLL instructionif x<y || x>30 z=x+y else z=x-y if x < y goto L1 if x > 30 goto L1 goto L2 z=x+y L1: z=x-y L2:
Expected intermediate code for HLL instructionif x<y || x>30 z=x+y if x < y goto L1 if x > 30 goto L1 goto L2 Analyse the rule B B1 || B2 z=x+y L1: L2: Associate a label say L1 when B1 is true Or when B2is true
Expected intermediate code for HLL instructionif x<y || x>30 z=x+y else z=x-y if x < y goto L1 if x > 30 goto L1 goto L2 B is a non terminal z=x+y L1: Associate attributes true and false such that B1.true = L1 or B2.true = L1 or B2.false = L2 B1.false = ????? think z=x-y L2:
Parse tree for if x<y || x>30 z=x+y S S if B id = E B || B E + E E relop E E relop E id id id num id id
Abstract Syntax tree for if x<y || x>30 z=x+y S = || id + relop relop id id id id num id
B E1relop E2 B.Code = E1.code || E2.code || gen(‘if’ E1.addr relop.lexeme E2.addr ‘goto’ B.true) || gen(‘goto’ B.false) Final code if x < y goto L1 gotoL2 S Input x < y = || id + E.code=‘ ‘ || ‘ ‘ || if x < y goto L1 relop relop || goto L2 id id E.addr=id.lexeme E.code=‘ ‘ E.addr=num.lexeme E.code=‘ ‘ id id num id
B E1relop E2 B.Code = E1.code || E2.code || gen(‘if’ E1.addr relop.lexeme E2.addr ‘goto’ B.true) || gen(‘goto’ B.false) Final code if x > 30 goto L1 goto L3 S Input x > 30 = || id + E.code=‘ ‘ || ‘ ‘ || if x > 30 goto L1 relop relop || goto L3 id id id E.addr=id.lexeme E.code=‘ ‘ E.addr=num.lexeme E.code=‘ ‘ id num id
B1.true = B.true B1.false = newlabel() B2.true = B.true B2.false= B.false B.code = B1.code || label( B1.false) || B2.code B B1 || B2 S B1.true = L1 (inherited by left child) B1.false = L2 (inherited by left child) Input x < y || x > 30 = || B2.true = L1 (inherited by right child) B2.false = L3 (to be used later) id + B.code= relop relop if x < y goto L1 gotoL2 if x > 30 goto L1 goto L3 L2: id id id id num id
B.true=newlabel() B.false=S1.next=S.next S.code=B.code || label(B.true) || S1.code B.true=newlabel() B.false=S1.next=S.next S.code=B.code || label(B.true) || S1.code S if (B) S2 Already seen (inherited from here) B.true=L1 B.false=L3 if x < y goto L1 goto L2 S Input if (x < y || x > 30) z=x+y if x > 30 goto L1 goto L3 L2: = || L1: z= x+y L3: Code after S1 id + relop relop if x < y goto L1 goto L2 if x > 30 goto L1 goto L2 L2: id id id id num id
Generated Intermediate code using semantic rule if x < y goto L1 goto L2 Input if (x < y || x > 30) z=x+y if x > 30 goto L1 goto L3 L2: L1: z= x+y L3: Code after S1
Generated Intermediate code using semantic rule if x < y goto L1 goto L2 Input if (x < y || x > 30) z=x+y + v if x > 30 goto L1 goto L3 L2: L1: z= x+y t1= x + y t2= t1 + v L3: Code after S1 z = t2 L3: Code after S1
Next class • Code Generation