8 Intermediate code generation

Zhang Zhizheng, seu_zzz@seu.edu.cn

8.0 Overview
1. Position of the intermediate code generator
[Figure: token stream → parser → syntax tree → static checker → syntax tree → intermediate code generator → intermediate code → code generator]

2. Benefits of using a machine-independent intermediate form
• Retargeting is facilitated: a compiler for a different machine can be created by attaching a back end for the new machine to an existing front end.
• A machine-independent code optimizer can be applied to the intermediate representation.

3. Implementation of the intermediate code generator
• Syntax-directed translation, folded into parsing
• Top-down parsing
• Bottom-up parsing

8.1 Intermediate languages
1. Intermediate representations
• Syntax tree
• Directed acyclic graph (DAG)
• Postfix notation
• Three-address code
• Quadruple

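A syntax tree and a DAG for the assignment statement a := b*-c + b*-c
[Figure: in the syntax tree, the root assign has children a and +, and each operand of + is a separate * node with children b and uminus c. In the DAG, both operands of + point to one shared * node with children b and uminus c.]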

Production of a syntax tree (and DAG)
Production          Semantic rule
S → id := E         id.value = E.value
E → E1 + E2         E.value = E1.value + E2.value
E → E1 * E2         E.value = E1.value * E2.value
E → - E1            E.value = - E1.value
E → ( E1 )          E.value = E1.value
E → id              E.value = id.lexval
Semantic rules for the value attribute of an assignment statement

Production          Semantic rule
S → id := E         S.nptr := mknode('assign', mkleaf(id, id.place), E.nptr)
E → E1 + E2         E.nptr := mknode('+', E1.nptr, E2.nptr)
E → E1 * E2         E.nptr := mknode('*', E1.nptr, E2.nptr)
E → - E1            E.nptr := mknode('uminus', E1.nptr)
E → ( E1 )          E.nptr := E1.nptr
E → id              E.nptr := mkleaf(id, id.place)
Semantic rules for producing the syntax tree of an assignment statement
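The difference between building a tree and building a DAG can be sketched in Python. This is only an illustration under assumptions: `Node`, `DagBuilder`, `count_nodes`, and `build` are hypothetical names not taken from the slides, and the DAG variant simply returns an existing node whenever the same label and children are requested again.

```python
class Node:
    """A syntax-tree node: a label plus an ordered tuple of children."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = tuple(children)

def mkleaf(name):
    return Node(name)

def mknode(op, *children):
    return Node(op, children)

class DagBuilder:
    """Like mknode/mkleaf, but reuses structurally identical nodes."""
    def __init__(self):
        self._table = {}

    def mknode(self, op, *children):
        # Children are already shared, so their identities form a valid key.
        key = (op, tuple(id(c) for c in children))
        if key not in self._table:
            self._table[key] = Node(op, children)
        return self._table[key]

    def mkleaf(self, name):
        return self.mknode(name)

def count_nodes(root):
    """Count distinct node objects reachable from root."""
    seen = set()
    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for child in node.children:
            visit(child)
    visit(root)
    return len(seen)

def build(leaf, node):
    """Build a := b * -c + b * -c with the given constructors."""
    def term():
        return node('*', leaf('b'), node('uminus', leaf('c')))
    return node('assign', leaf('a'), node('+', term(), term()))

tree = build(mkleaf, mknode)
builder = DagBuilder()
dag = build(builder.mkleaf, builder.mknode)
print(count_nodes(tree), count_nodes(dag))
```

With plain constructors the subexpression b * -c is built twice; with the table-based constructors the second request returns the node created by the first.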

Representation of the syntax tree: a graph data structure
• See Fig. 8.4

2. Three-address code (TAC)
A sequence of statements of the general form x := y op z
Here x, y, and z are names, constants, or compiler-generated temporaries; op stands for any operator.

Notes:
1) There is only one operator on the right side of a statement.
2) Three-address code is a linearized representation of a syntax tree or a DAG in which explicit names correspond to the interior nodes of the graph.
3) Each three-address statement contains three addresses: two for the operands and one for the result.

E.g., three-address code corresponding to the tree and DAG above
Code for the syntax tree:
t1 := -c
t2 := b * t1
t3 := -c
t4 := b * t3
t5 := t2 + t4
a := t5
Code for the DAG:
t1 := -c
t2 := b * t1
t5 := t2 + t2
a := t5

3. Types of TAC
• x := y op z   // assignment statement; op is a binary arithmetic or logical operation
• x := op y   // assignment statement; op is a unary operation such as unary minus, logical negation, or a conversion operator
• x := y   // copy statement
• goto L   // unconditional jump
• if x relop y goto L   // conditional jump: if x stands in relation relop to y, the statement labeled L is executed next; otherwise the following statement is executed
• param x1 … param xn, call p, n, return y   // call procedure p with n parameters (x1, …, xn)

• x = y[i] and x[i] = y   // indexed assignments
• x = &y   // the value of x is the location of y
• x = *y and *x = y   // pointer assignments

4. Syntax-directed translation into TAC

E.g., a := b*-c + b*-c can be translated into
• t1 := -c
• t2 := b * t1
• t3 := -c
• t4 := b * t3
• t5 := t2 + t4
• a := t5
How is this translation produced?

Production          Semantic rules
S → id := E         S.code := E.code || gen(id.place ':=' E.place)
E → E1 + E2         E.place := newtemp();
                    E.code := E1.code || E2.code || gen(E.place ':=' E1.place '+' E2.place)
E → E1 * E2         E.place := newtemp();
                    E.code := E1.code || E2.code || gen(E.place ':=' E1.place '*' E2.place)
E → - E1            E.place := newtemp();
                    E.code := E1.code || gen(E.place ':=' 'uminus' E1.place)
E → id              E.place := id.place; E.code := ''
E.place: the name that will hold the value of E
E.code: the sequence of three-address statements evaluating E
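The scheme above can be sketched as a small recursive translator in Python. It is a sketch under assumptions: expressions are given as nested tuples such as `('+', left, right)`, `('uminus', operand)`, and `('id', name)`, and `translate_assignment` is a hypothetical helper name. The function `gen` returns E.place and appends E.code to a shared list, which plays the role of the || concatenation.

```python
import itertools

def translate_assignment(name, expr):
    """Return the three-address code for `name := expr` as a list of strings."""
    counter = itertools.count(1)
    code = []

    def newtemp():
        return f"t{next(counter)}"

    def gen(e):
        kind = e[0]
        if kind == 'id':                 # E -> id: no code, place is the name
            return e[1]
        if kind == 'uminus':             # E -> - E1
            place1 = gen(e[1])
            temp = newtemp()
            code.append(f"{temp} := -{place1}")
            return temp
        op, left, right = e              # E -> E1 op E2
        place1 = gen(left)
        place2 = gen(right)
        temp = newtemp()
        code.append(f"{temp} := {place1} {op} {place2}")
        return temp

    place = gen(expr)                    # S -> id := E
    code.append(f"{name} := {place}")
    return code

term = ('*', ('id', 'b'), ('uminus', ('id', 'c')))
tac = translate_assignment('a', ('+', term, term))
print("\n".join(tac))
```

Because a temporary is allocated only after the operands have been translated, the numbering follows the bottom-up order of the reductions.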

Production          Semantic rules
S → while E do S1   S.begin = newlabel(); S.after = newlabel();
                    S.code = gen(S.begin ':') || E.code || gen('if' E.place '=' '0' 'goto' S.after) || S1.code || gen('goto' S.begin) || gen(S.after ':')

Production          Semantic rules
S → if E then S1    S.after = newlabel();
                    S.code = E.code || gen('if' E.place '=' '0' 'goto' S.after) || S1.code || gen(S.after ':')

Production                  Semantic rules
S → if E then S1 else S2    S.after = newlabel(); E.false = newlabel();
                            S.code = E.code || gen('if' E.place '=' '0' 'goto' E.false) || S1.code || gen('goto' S.after) || gen(E.false ':') || S2.code || gen(S.after ':')
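The three statement rules can be sketched in Python as functions that splice already-translated pieces together. This is an illustration under assumptions: code is a list of strings, the code and place of E and the code of S1 and S2 are passed in ready-made, and the helper names (`make_newlabel`, `gen_while`, `gen_if`, `gen_if_else`) are hypothetical.

```python
import itertools

def make_newlabel():
    """Return a newlabel() function producing L1, L2, ..."""
    counter = itertools.count(1)
    return lambda: f"L{next(counter)}"

def gen_while(newlabel, e_code, e_place, s1_code):
    begin, after = newlabel(), newlabel()
    return ([f"{begin}:"] + e_code
            + [f"if {e_place} = 0 goto {after}"]
            + s1_code
            + [f"goto {begin}", f"{after}:"])

def gen_if(newlabel, e_code, e_place, s1_code):
    after = newlabel()
    return (e_code + [f"if {e_place} = 0 goto {after}"]
            + s1_code + [f"{after}:"])

def gen_if_else(newlabel, e_code, e_place, s1_code, s2_code):
    after, e_false = newlabel(), newlabel()
    return (e_code + [f"if {e_place} = 0 goto {e_false}"]
            + s1_code + [f"goto {after}", f"{e_false}:"]
            + s2_code + [f"{after}:"])

# while n do n := n - 1   (E is just the name n, so E.code is empty)
loop = gen_while(make_newlabel(), [], "n", ["n := n - 1"])
print("\n".join(loop))
```

In all three rules the condition is evaluated into E.place first, and a zero value is treated as false.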

5. Addressing array elements
1) One-dimensional array
Addr(A[i]) = base + (i - low) * w = i * w + (base - low * w)
Notes:
1) We assume that the width of each array element is w and that the start address of the array block is base.
2) The array is declared as array[low..upper] of type.
3) The subexpression c = base - low * w can be evaluated when the declaration of the array is seen; we assume that c is saved in the symbol-table entry for the array.

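2) Two-dimensional array
(1) Row-major form
Addr(A[i1, i2]) = base + ((i1 - low1) * n2 + (i2 - low2)) * w
                = (i1 * n2 + i2) * w + base - (low1 * n2 + low2) * w
where n2 = upper2 - low2 + 1
Three-address code:
t1 = low1 * n2
t2 = t1 + low2
t3 = t2 * w
t4 = base - t3
t5 = i1 * n2
t6 = t5 + i2
t7 = t6 * w
t4[t7] = x   (store into A[i1, i2]), or
x = t4[t7]   (load from A[i1, i2])
(2) Column-major form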
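As a quick check of the row-major formula, the following Python sketch evaluates the temporaries t1 to t7 exactly as listed and compares the result with the direct formula. The function names and the sample numbers are assumptions chosen for illustration; `t4 + t7` stands for the address denoted by t4[t7].

```python
def addr_2d_direct(base, low1, low2, n2, w, i1, i2):
    """Direct row-major address: base + ((i1-low1)*n2 + (i2-low2)) * w."""
    return base + ((i1 - low1) * n2 + (i2 - low2)) * w

def addr_2d_tac(base, low1, low2, n2, w, i1, i2):
    """Same address, computed step by step like the three-address code."""
    # t1..t4 depend only on the declaration and can be computed at compile time.
    t1 = low1 * n2
    t2 = t1 + low2
    t3 = t2 * w
    t4 = base - t3
    # t5..t7 depend on the index values and are computed at run time.
    t5 = i1 * n2
    t6 = t5 + i2
    t7 = t6 * w
    return t4 + t7

# Hypothetical array [1..10, 1..20] of one-unit elements starting at 100.
args = dict(base=100, low1=1, low2=1, n2=20, w=1, i1=3, i2=5)
print(addr_2d_direct(**args), addr_2d_tac(**args))
```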

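3) n-dimensional array
Array[l1:u1, l2:u2, …, ln:un]
Let $d_i = u_i - l_i + 1$ for $i = 1, 2, \dots, n$, let $a$ be the start address of the array, and let $m$ be the width of each element. Then
$$D = a + \big((i_1 - l_1)\, d_2 d_3 \cdots d_n + (i_2 - l_2)\, d_3 d_4 \cdots d_n + \cdots + (i_{n-1} - l_{n-1})\, d_n + (i_n - l_n)\big)\, m$$
This can be rewritten as $D = \mathit{conspart} + \mathit{varpart}$, where
$$\mathit{conspart} = a - C$$
$$C = \big((\cdots((l_1 d_2 + l_2)\, d_3 + l_3)\, d_4 + \cdots + l_{n-1})\, d_n + l_n\big)\, m$$
$$\mathit{varpart} = \big((\cdots((i_1 d_2 + i_2)\, d_3 + i_3)\, d_4 + \cdots + i_{n-1})\, d_n + i_n\big)\, m$$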
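The split into a constant part and a variable part can be sketched in Python. This is a minimal illustration under assumptions: `horner`, `make_array`, and `address_direct` are hypothetical names, bounds are given as (l, u) pairs, and the sample array is made up. Both parts use the same nested (Horner-style) evaluation, once over the lower bounds and once over the indices.

```python
def horner(values, dims):
    """Compute (...((v1*d2 + v2)*d3 + v3)... )*dn + vn."""
    acc = values[0]
    for v, d in zip(values[1:], dims[1:]):
        acc = acc * d + v
    return acc

def make_array(a, bounds, m):
    """Return (conspart, address) for an array at a with the given bounds."""
    lows = [l for l, _ in bounds]
    dims = [u - l + 1 for l, u in bounds]
    conspart = a - horner(lows, dims) * m        # known at declaration time

    def address(indices):
        varpart = horner(list(indices), dims) * m  # depends on the indices
        return conspart + varpart

    return conspart, address

def address_direct(a, bounds, m, indices):
    """Reference version: sum of (i_k - l_k) times the product of later extents."""
    dims = [u - l + 1 for l, u in bounds]
    offset = 0
    for k, (idx, (low, _)) in enumerate(zip(indices, bounds)):
        stride = 1
        for d in dims[k + 1:]:
            stride *= d
        offset += (idx - low) * stride
    return a + offset * m

bounds = [(1, 3), (0, 4), (2, 5)]
conspart, address = make_array(1000, bounds, 2)
print(conspart, address((2, 3, 4)), address_direct(1000, bounds, 2, (2, 3, 4)))
```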

6. Short-circuit code for Boolean expressions
• Translate a Boolean expression into intermediate code that does not necessarily evaluate the entire expression: evaluation stops as soon as the result is known.

7. Translation of flow-of-control statements into short-circuit code
1) Associate E with two labels
• E.true: the label to which control flows if E is true
• E.false: the label to which control flows if E is false

2) Associate S with a label
• S.next: the label to which control flows after the code for S has been executed

Production                  Semantic rules
S → if E then S1            E.true = newlabel(); E.false = S.next; S1.next = S.next;
                            S.code = E.code || gen(E.true ':') || S1.code
S → if E then S1 else S2    E.true = newlabel(); E.false = newlabel(); S1.next = S.next; S2.next = S.next;
                            S.code = E.code || gen(E.true ':') || S1.code || gen('goto' S.next) || gen(E.false ':') || S2.code

Production                  Semantic rules
S → while E do S1           S.begin = newlabel(); E.true = newlabel(); E.false = S.next; S1.next = S.begin;
                            S.code = gen(S.begin ':') || E.code || gen(E.true ':') || S1.code || gen('goto' S.begin)

Production                  Semantic rules
E → E1 or E2                E1.true = E.true; E1.false = newlabel(); E2.true = E.true; E2.false = E.false;
                            E.code = E1.code || gen(E1.false ':') || E2.code
E → E1 and E2               E1.true = newlabel(); E1.false = E.false; E2.true = E.true; E2.false = E.false;
                            E.code = E1.code || gen(E1.true ':') || E2.code
E → id1 relop id2           E.code = gen('if' id1.place relop.op id2.place 'goto' E.true) || gen('goto' E.false)

3) Examples
(1) a<b or c<d and e<f
if a<b goto Ltrue
goto L1
L1: if c<d goto L2
goto Lfalse
L2: if e<f goto Ltrue
goto Lfalse
Here we assume that the true and false exits for the entire expression are Ltrue and Lfalse, respectively.
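The rules for or, and, and relop can be sketched in Python as a function that receives the inherited labels E.true and E.false and returns E.code. The tuple encoding of expressions and the name `make_translator` are assumptions made for this illustration; each label is emitted on its own line.

```python
import itertools

def make_translator():
    """Return translate(e, true, false) with a private label counter."""
    counter = itertools.count(1)

    def newlabel():
        return f"L{next(counter)}"

    def translate(e, true, false):
        kind = e[0]
        if kind == 'or':                     # E -> E1 or E2
            e1_false = newlabel()
            return (translate(e[1], true, e1_false)
                    + [f"{e1_false}:"]
                    + translate(e[2], true, false))
        if kind == 'and':                    # E -> E1 and E2
            e1_true = newlabel()
            return (translate(e[1], e1_true, false)
                    + [f"{e1_true}:"]
                    + translate(e[2], true, false))
        _, op, x, y = e                      # E -> id1 relop id2
        return [f"if {x} {op} {y} goto {true}", f"goto {false}"]

    return translate

expr = ('or', ('rel', '<', 'a', 'b'),
              ('and', ('rel', '<', 'c', 'd'), ('rel', '<', 'e', 'f')))
code = make_translator()(expr, 'Ltrue', 'Lfalse')
print("\n".join(code))
```

As in the example, the output contains the redundant pair `goto L1` followed directly by `L1:`; the rules make no attempt to remove such jumps.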

(2) while a<b do if c<d then x=y+z else x=y-z
L1: if a<b goto L2
goto Lnext
L2: if c<d goto L3
goto L4
L3: t1=y+z
x=t1
goto L1
L4: t2=y-z
x=t2
goto L1
Lnext:
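The while and if-then-else rules with inherited labels can be sketched in Python as follows. This is a simplified illustration under assumptions: conditions are restricted to a single relational test, assignments have the form x = y op z, statements are nested tuples, and the class name `Translator` is hypothetical. The label S.next is passed down as a parameter, mirroring the inherited attribute.

```python
import itertools

class Translator:
    def __init__(self):
        self._labels = itertools.count(1)
        self._temps = itertools.count(1)

    def newlabel(self):
        return f"L{next(self._labels)}"

    def newtemp(self):
        return f"t{next(self._temps)}"

    def cond(self, e, true, false):
        _, op, x, y = e                       # ('rel', op, x, y)
        return [f"if {x} {op} {y} goto {true}", f"goto {false}"]

    def stmt(self, s, next_label):
        kind = s[0]
        if kind == 'assign':                  # ('assign', x, op, y, z)
            _, x, op, y, z = s
            temp = self.newtemp()
            return [f"{temp} = {y} {op} {z}", f"{x} = {temp}"]
        if kind == 'if_else':                 # ('if_else', E, S1, S2)
            _, e, s1, s2 = s
            e_true, e_false = self.newlabel(), self.newlabel()
            return (self.cond(e, e_true, e_false)
                    + [f"{e_true}:"] + self.stmt(s1, next_label)
                    + [f"goto {next_label}", f"{e_false}:"]
                    + self.stmt(s2, next_label))
        if kind == 'while':                   # ('while', E, S1)
            _, e, s1 = s
            begin, e_true = self.newlabel(), self.newlabel()
            return ([f"{begin}:"] + self.cond(e, e_true, next_label)
                    + [f"{e_true}:"] + self.stmt(s1, begin)
                    + [f"goto {begin}"])
        raise ValueError(f"unknown statement kind: {kind}")

program = ('while', ('rel', '<', 'a', 'b'),
           ('if_else', ('rel', '<', 'c', 'd'),
            ('assign', 'x', '+', 'y', 'z'),
            ('assign', 'x', '-', 'y', 'z')))
code = Translator().stmt(program, 'Lnext') + ['Lnext:']
print("\n".join(code))
```

The final `goto L1` comes from the while rule, while the one after `x = t1` comes from the if-then-else rule, whose S.next is the loop's S.begin.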

8. Implementations of three-address statements
• Quadruples
  • (op, arg1, arg2, result)
• Triples
  • (n) (op, arg1, arg2)
  • (m) (op, (n), arg), where (n) refers to the result of triple n
Notes: A three-address statement is an abstract form of intermediate code.
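The two representations can be compared with a short Python sketch. It is an illustration under assumptions: `Quad` and `quads_to_triples` are hypothetical names, every non-assignment result is taken to be a compiler temporary, and a reference to triple n is written as the string "(n)".

```python
from collections import namedtuple

Quad = namedtuple("Quad", "op arg1 arg2 result")

# Quadruples for a := b * -c + b * -c
quads = [
    Quad("uminus", "c", None, "t1"),
    Quad("*", "b", "t1", "t2"),
    Quad("uminus", "c", None, "t3"),
    Quad("*", "b", "t3", "t4"),
    Quad("+", "t2", "t4", "t5"),
    Quad(":=", "t5", None, "a"),
]

def quads_to_triples(quads):
    """Drop the result field; refer to earlier results by triple position."""
    position = {}          # temporary name -> "(index of defining triple)"
    triples = []
    for q in quads:
        arg1 = position.get(q.arg1, q.arg1)
        arg2 = position.get(q.arg2, q.arg2)
        if q.op == ":=":
            triples.append(("assign", q.result, arg1))
        else:
            position[q.result] = f"({len(triples)})"
            triples.append((q.op, arg1, arg2))
    return triples

triples = quads_to_triples(quads)
for n, t in enumerate(triples):
    print(f"({n})", t)
```

Triples avoid naming temporaries, but because they refer to each other by position, moving a statement means renumbering the references; quadruples do not have that problem.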

9. Advantages of quadruples
• Easy to generate target code
• Good for optimizing

Exercises
Translate the following program fragment into three-address code in the form of short-circuit code.
i=2; loop=0;
while (loop==0 && i<=10) {
    j=1;
    while (loop==0 && j<i)
        if (a[i,j] == x) loop=1;
        else j=j+1;
    if (loop==0) i=i+1;
}
Notes: Here we assume that array A is declared as array [1..10,1..10], that each element of A uses 2 storage units, and that the start address of A's storage area is addrA.

Translate the following program fragment into three-address code.
i=2; m=0; loop=0;
while (loop==0 && i<=10) {
    j=1;
    while (loop==0 && j<=i)
        if (a[i,j] != a[j,i])   // "!=" means "not equal to"
        { loop=1; m=1; }
        else j=j+1;
    if (loop==0) i=i+1;
}
Notes: Here we assume that array A is declared as array [1..10,1..10], that each element of A uses only 1 storage unit, and that the start address of A's storage area is addrA.

8.2 Assignment statements
1. Assignment statements with only id
1) Functions
• NEWTEMP()
• GEN(OP, ARG1, ARG2, RESULT)
2) Semantic rules for quadruple code generation

(1) A → i = E          { GEN(=, E.PLACE, _, i.entry) }
(2) E → - E(1)         { T = NEWTEMP(); GEN(@, E(1).PLACE, _, T); E.PLACE = T }
(3) E → E(1) * E(2)    { T = NEWTEMP(); GEN(*, E(1).PLACE, E(2).PLACE, T); E.PLACE = T }
(4) E → E(1) + E(2)    { T = NEWTEMP(); GEN(+, E(1).PLACE, E(2).PLACE, T); E.PLACE = T }
(5) E → ( E(1) )       { E.PLACE = E(1).PLACE }
(6) E → i              { E.PLACE = i.entry }
Here @ denotes unary minus.

3. The translation scheme for addressing array elements
1) Grammar
A → V := E
V → i[Elist] | i
Elist → Elist, E | E
E → E op E | (E) | V

2) Rewriting of the grammar
A → V := E
V → Elist] | i
Elist → Elist(1), E | i[E
E → E op E | (E) | V
Notes: The grammar is rewritten so that the dimensional limits nj of the array are available as the index expressions are grouped into an Elist.

3) Semantic variables
• ARRAY
• DIM
• PLACE
• OFFSET

4) Translation code
(1) A → V := E         { if (V.OFFSET = null) GEN(=, E.PLACE, _, V.PLACE);
                         else GEN([ ]=, E.PLACE, _, V.PLACE[V.OFFSET]) }

(2) E → E(1) op E(2)   { T = NEWTEMP(); GEN(op, E(1).PLACE, E(2).PLACE, T); E.PLACE = T }
(3) E → ( E(1) )       { E.PLACE = E(1).PLACE }
(4) E → V              { if (V.OFFSET = null) E.PLACE = V.PLACE;
                         else { T = NEWTEMP(); GEN(=[ ], V.PLACE[V.OFFSET], _, T); E.PLACE = T; } }

(5) V → Elist ]        { if (TYPE[ARRAY] <> 1) { T = NEWTEMP(); GEN(*, Elist.PLACE, TYPE[ARRAY], T); Elist.PLACE = T; }
                         V.OFFSET = Elist.PLACE;
                         T = NEWTEMP(); GEN(-, HEAD[ARRAY], CONS[ARRAY], T); V.PLACE = T }
(6) V → i              { V.PLACE = ENTRY[i]; V.OFFSET = null }
Here TYPE[ARRAY] is the element width, HEAD[ARRAY] is the start address of the array, and CONS[ARRAY] is the constant C.

(7) Elist → Elist(1), E   { T = NEWTEMP(); k = Elist(1).DIM + 1; dk = LIMIT(Elist(1).ARRAY, k);
                            GEN(*, Elist(1).PLACE, dk, T);
                            T1 = NEWTEMP(); GEN(+, T, E.PLACE, T1);
                            Elist.ARRAY = Elist(1).ARRAY; Elist.PLACE = T1; Elist.DIM = k; }

(8) Elist → i [ E         { Elist.PLACE = E.PLACE; Elist.DIM = 1; Elist.ARRAY = ENTRY(i) }

E.g., let A be an array ARRAY[1:10, 1:20] whose start address is a, with m = 1. The constant C is computed as
(low1 * n2 + low2) * m = (1 * 20 + 1) * 1 = 21
The quadruples for X = A[I,J] are:
(1) (*, I, 20, T1)
(2) (+, T1, J, T2)
(3) (-, a, 21, T3)
(4) (=[ ], T3[T2], _, T4)
(5) (=, T4, _, X)
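The quadruples of this example can be reproduced with a short Python sketch that follows the order of the reductions for an assignment X = A[…]. Everything here is an assumption made for illustration: the function name `array_ref_quads`, the dictionary-based symbol table with keys `limits`, `width`, `head`, and `cons`, and the restriction to index expressions that are simple names.

```python
import itertools

def array_ref_quads(array, indices, target, symtab):
    """Quadruples for target = array[indices], indices being simple names."""
    info = symtab[array]
    counter = itertools.count(1)
    quads = []

    def newtemp():
        return f"T{next(counter)}"

    place = indices[0]                               # Elist -> i [ E
    for k, index in enumerate(indices[1:], start=2):  # Elist -> Elist(1), E
        temp = newtemp()
        quads.append(("*", place, info["limits"][k - 1], temp))
        temp1 = newtemp()
        quads.append(("+", temp, index, temp1))
        place = temp1
    if info["width"] != 1:                           # V -> Elist ]
        temp = newtemp()
        quads.append(("*", place, info["width"], temp))
        place = temp
    offset = place
    base = newtemp()
    quads.append(("-", info["head"], info["cons"], base))
    value = newtemp()                                # E -> V (array element)
    quads.append(("=[]", f"{base}[{offset}]", None, value))
    quads.append(("=", value, None, target))         # A -> V := E
    return quads

# A is ARRAY[1:10, 1:20]: d1 = 10, d2 = 20, width 1, start address a, C = 21
symtab = {"A": {"limits": [10, 20], "width": 1, "head": "a", "cons": 21}}
quads = array_ref_quads("A", ["I", "J"], "X", symtab)
for number, quad in enumerate(quads, start=1):
    print(number, quad)
```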

8.3 Boolean expressions
1. Primary purposes of Boolean expressions
• Compute logical values
• Serve as conditional expressions in statements that alter the flow of control, such as if or while statements
2. Grammar
• E → E and E | E or E | not E | (E) | i | Ea rop Ea

3. Numerical representation
(1) E → Ea(1) rop Ea(2)   { T = NEWTEMP(); GEN(rop, Ea(1).PLACE, Ea(2).PLACE, T); E.PLACE = T }
(2) E → E(1) bop E(2)     { T = NEWTEMP(); GEN(bop, E(1).PLACE, E(2).PLACE, T); E.PLACE = T }

(3) E → not E(1)          { T = NEWTEMP(); GEN(not, E(1).PLACE, _, T); E.PLACE = T }
(4) E → ( E(1) )          { E.PLACE = E(1).PLACE }
(5) E → i                 { E.PLACE = ENTRY(i) }
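For contrast with short-circuit code, the numerical representation can be sketched in Python. This is an illustration under assumptions: expressions are nested tuples, the operands of a relational operator are simple names, and `translate_numeric` is a hypothetical name. Every operand is evaluated into a temporary, so no jumps are generated.

```python
import itertools

def translate_numeric(expr):
    """Return (E.PLACE, quadruples) for a Boolean expression."""
    counter = itertools.count(1)
    quads = []

    def newtemp():
        return f"T{next(counter)}"

    def walk(e):
        kind = e[0]
        if kind == "id":                    # E -> i
            return e[1]
        if kind == "not":                   # E -> not E(1)
            place = walk(e[1])
            temp = newtemp()
            quads.append(("not", place, None, temp))
            return temp
        op, left, right = e                 # rop or bop (and, or)
        place1 = walk(left)
        place2 = walk(right)
        temp = newtemp()
        quads.append((op, place1, place2, temp))
        return temp

    return walk(expr), quads

def rel(op, x, y):
    return (op, ("id", x), ("id", y))

# a<b or c<d and e<f
expr = ("or", rel("<", "a", "b"), ("and", rel("<", "c", "d"), rel("<", "e", "f")))
place, quads = translate_numeric(expr)
for quad in quads:
    print(quad)
```

Here all three comparisons are always performed, whereas the short-circuit translation of the same expression skips c<d and e<f whenever a<b holds.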
