Compiler Lecture Note, Intermediate Language
Introduction to Compilers
Chapter 9: Intermediate Language
Contents
• Introduction
• Polish Notation
• Three Address Code
• Tree Structured Code
• Abstract Machine Code
• Concluding Remarks
Introduction
• Compiler Model
  Source Program → Lexical Analyzer → (tokens) → Syntax Analyzer → (AST) → Semantic Analyzer → Intermediate Code Generator → (IL) → Code Optimizer → (IC) → Target Code Generator → Object Program
  • Front-End - language dependent part
  • Back-End - machine dependent part
• Why an IL is needed
  • Modular Construction
  • Automatic Construction
  • Easy Translation
  • Portability
  • Optimization
  • Bootstrapping
• Classification of ILs
  • Polish Notation --- Postfix, IR
  • Three Address Code --- Quadruple, Triple, Indirect triple
  • Tree Structured Code --- PT, AST, TCOL
  • Abstract Machine Code --- P-code, EM-code, U-code, Bytecode
• Two level Code Generation
  • ILS
    • a form that can be obtained from the source automatically
    • source-language dependent and high level
  • ILT
    • a form that an automated back-end can translate to the target machine very easily
    • target-machine dependent and low level
  • ILS to ILT
    • translating ILS into ILT is the main task
  Source → Front-End → ILS → ILS-ILT → ILT → Back-End → Target
Polish Notation
☞ The Polish mathematician Łukasiewicz invented the parenthesis-free notation.
• Postfix (Suffix) Polish Notation
  • earliest IL
  • popular for interpreted languages - SNOBOL, BASIC
  • general form: e1 e2 ... ek OP (k ≥ 1)
    where OP : k-ary operator
          ei : any postfix expression (1 ≤ i ≤ k)
• example:
    if a then if c-d then a+c else a*c else a+b
    ⇒ a L1 BZ c d - L2 BZ a c + L3 BR L2: a c * L3 BR L1: a b + L3:
• note
  1) high level: source to IL - fast & easy translation; IL to target - difficult
  2) easy evaluation - operand stack
  3) unsuitable for optimization - translation into another IL is needed
  4) parentheses free notation - arithmetic expression
• suitable for interpretive languages
  Source → Translator → Postfix → Evaluator → Result
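The "easy evaluation - operand stack" note can be illustrated with a minimal Python sketch that runs the postfix example above. It rests on assumptions not stated on the slide: BZ pops a label and a condition and branches when the condition is zero, BR pops a label and branches unconditionally, and a token ending in ":" only marks a position. The names run_postfix, BINARY_OPS and CODE are hypothetical.

```python
import operator

# Binary operators used by the slide's example (the general form allows any k-ary OP).
BINARY_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

# Postfix form of: if a then if c-d then a+c else a*c else a+b
CODE = "a L1 BZ c d - L2 BZ a c + L3 BR L2: a c * L3 BR L1: a b + L3:".split()


def run_postfix(tokens, env):
    """Evaluate postfix tokens with an operand stack.

    Assumed semantics: BZ = branch if zero, BR = unconditional branch.
    """
    # Map each label name to the index of its "name:" definition token.
    labels = {tok[:-1]: i for i, tok in enumerate(tokens) if tok.endswith(":")}
    stack, pc = [], 0
    while pc < len(tokens):
        tok = tokens[pc]
        pc += 1
        if tok.endswith(":"):          # label definition: nothing to execute
            continue
        if tok in labels:              # label used as an operand of BZ/BR
            stack.append(tok)
        elif tok == "BR":
            pc = labels[stack.pop()]
        elif tok == "BZ":
            target = stack.pop()
            condition = stack.pop()
            if condition == 0:
                pc = labels[target]
        elif tok in BINARY_OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY_OPS[tok](left, right))
        else:                          # variable: push its value
            stack.append(env[tok])
    return stack.pop()


print(run_postfix(CODE, {"a": 1, "b": 2, "c": 5, "d": 3}))  # takes the a+c branch
```

No parentheses or precedence rules are needed at evaluation time; the order of the tokens alone determines the result.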
• Internal Representation (IR)
  • low-level prefix Polish notation - addressing structure of target machine
  • compiler-compiler IL - table driven code generation
  • IR program - a sequence of root-level IR expressions
  • IR expression: OP e1 e2 ... ek (k ≥ 1)
    where OP : k-ary operator - 1-1 correspondence with target machine instruction
               ┌ root-level operator - does not appear in an operand ⇒ root-level IR expression
               └ internal operator - appears in an operand ⇒ internal IR expression
          ei : operand --- single symbol or internal IR expression
• example
    D := E ⇔ := + d r ↑ + e r
    where r : local base register
          d, e : locations of variables D and E
          + : additive operator
          ↑ : unary operator giving the value of the location
          := : assignment operator (root-level)
• example
    FOR D := E TO F DO Loop body;

    expanded form:
           D := E;
           TEMP := F;
           GOTO 2
        1: Loop body
           D := D + 1;
        2: IF D <= TEMP THEN GOTO 1;

    IR form:
        := + d r ↑+ e r
        := + temp r ↑+ f r
        j L2
        :L1
        Loop body
        := + d r + ↑+ d r 1
        :L2
        <= L1 ? ↑+ d r ↑+ temp r
• Note
  1) Shift-reduce parser --- prefix: fewer states than postfix
  2) Several addressing modes
     ┌ prefix: decided by looking at the operator alone (no backup)
     └ postfix: backup required
     ex) assumption: first operand computed in register r.
         r.1 ::= (/ d.1 r.2)
         r.1 ::= (+ r.1 r.2)
         ┌ prefix - [r -> / . d r]
         │          first operand changed to d and continue
         └ postfix - [r -> . d r /] [r -> . r r +]
                     shift r, shift r and block ([r -> r r . +]) ⇒ backup
  3) Easy translation: IR to target - easy; source to IR - difficult
Three Address Code
• most popular IL, used in optimizing compilers
• General form: A := B op C
    where A : result address
          B, C : operand addresses
          op : operator
  (1) Quadruple - 4-tuple notation
      <operator>,<operand1>,<operand2>,<result>
  (2) Triple - 3-tuple notation
      <operator>,<operand1>,<operand2>
  (3) Indirect triple - execution order table & triples
• example
    A ← B + C * D / E
    F ← C * D
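As a rough illustration of the three forms for the two statements above, the following Python sketch emits quadruples, triples and indirect triples from small expression trees. The temporary names T1, T2, ..., the ":=" operator spelling, the "(n)" notation for triple references, and all function names are assumptions made for this sketch, not the lecture's exact tables.

```python
def to_quadruples(statements):
    """statements: list of (target, expr); expr is a name or (op, left, right)."""
    quads, counter = [], 0

    def emit(expr):
        nonlocal counter
        if isinstance(expr, str):
            return expr
        op, left, right = expr
        l, r = emit(left), emit(right)
        counter += 1
        temp = f"T{counter}"               # each result gets a named temporary
        quads.append((op, l, r, temp))
        return temp

    for target, expr in statements:
        quads.append((":=", emit(expr), "", target))
    return quads


def to_triples(statements):
    """Same input; results are referenced by triple number instead of temporaries."""
    triples = []

    def emit(expr):
        if isinstance(expr, str):
            return expr
        op, left, right = expr
        l, r = emit(left), emit(right)
        triples.append((op, l, r))
        return f"({len(triples) - 1})"

    for target, expr in statements:
        triples.append((":=", target, emit(expr)))
    return triples


def to_indirect_triples(triples):
    """Share identical triples and keep a separate execution order table."""
    table, order, remap = [], [], {}
    for i, (op, a, b) in enumerate(triples):
        entry = (op, remap.get(a, a), remap.get(b, b))
        if entry not in table:
            table.append(entry)
        index = table.index(entry)
        remap[f"({i})"] = f"({index})"
        order.append(index)
    return table, order


# A <- B + C * D / E ;  F <- C * D   (with the usual precedence: B + ((C * D) / E))
PROGRAM = [
    ("A", ("+", "B", ("/", ("*", "C", "D"), "E"))),
    ("F", ("*", "C", "D")),
]

for quad in to_quadruples(PROGRAM):
    print(quad)
table, order = to_indirect_triples(to_triples(PROGRAM))
print(table, order)
```

Sharing the triple for C * D is safe in this example only because neither C nor D is assigned between the two uses; a real optimizer would have to check that.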
• Note
  • Quadruple vs. Triple
    • quadruple - easy to optimize
    • triple - removal of temporary addresses
    ⇒ Indirect Triple
  • extensive code optimization is easy
    • the IL can be rearranged (except for triples)
  • easy translation - source to IL
  • difficult to generate good code
    • quadruple to two-address machine
    • triple to three-address machine
Tree Structured Code
• Abstract Syntax Tree
  • removes redundant information from the parse tree.
  • ┌ leaf node --- variable name, constant
    └ internal node --- operator
  • [Example 8] --- Text p.377
      {
        x = 0;
        y = z + 2 * y;
        while ((x<n) and (v[x] != z)) x = x+1;
        return x;
      }
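To make the leaf/internal node distinction concrete, here is a small Python sketch of an AST for the statement y = z + 2 * y from Example 8. The Node class and the operator labels assign, add and mul are hypothetical names chosen for this illustration; the textbook's own node names may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """AST node: internal nodes carry an operator, leaves a name or constant."""
    label: str
    children: list = field(default_factory=list)


def show(node, depth=0):
    """Return the tree as indented lines, one node per line (preorder)."""
    lines = ["  " * depth + node.label]
    for child in node.children:
        lines.extend(show(child, depth + 1))
    return lines


# y = z + 2 * y : no parentheses or grammar nonterminals survive in the tree.
tree = Node("assign", [
    Node("y"),
    Node("add", [
        Node("z"),
        Node("mul", [Node("2"), Node("y")]),
    ]),
])

print("\n".join(show(tree)))
```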
• Tree Structured Common Language (TCOL)
  • Variant of AST - contains the result of semantic analysis.
  • TCOL operator - type & context specific operator
  • Context
    ┌ value ----- rhs of assignment statement
    ├ location ----- lhs of assignment statement
    ├ boolean ----- conditional control statement
    └ statement ----- statement
    ex) . : operand --- location
            result --- value
        while : operand --- boolean, statement
                result --- statement
• Example) int a; float b; ... b = a + 1;
    AST:  assign(b, add(a, 1))
    TCOL: assign(b, float(addi(.(a), 1)))
• Representation ----- graph orientation
  ┌ internal notation ------ efficient
  └ external notation ------ debug, interface
                             linear graph notation
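The step from the AST to the TCOL tree for b = a + 1 can be sketched in Python as follows. This is only an illustration under assumptions: ASTs are nested tuples, the operator names addi and float and the dereference operator "." follow the example, while addf and the function name to_tcol are hypothetical.

```python
def to_tcol(node, types, context="value"):
    """Return (tcol_tree, type) for an AST given as nested tuples.

    context is "location" for the lhs of an assignment, "value" otherwise.
    """
    if isinstance(node, int):                      # integer constant
        return node, "int"
    if isinstance(node, str):                      # variable name
        if context == "location":
            return node, types[node]
        return (".", node), types[node]            # "." turns a location into a value
    op, left, right = node
    if op == "assign":
        target, target_type = to_tcol(left, types, "location")
        value, value_type = to_tcol(right, types)
        if value_type != target_type:
            value = (target_type, value)           # explicit coercion operator
        return ("assign", target, value), target_type
    if op == "add":
        l, l_type = to_tcol(left, types)
        r, r_type = to_tcol(right, types)
        if l_type == r_type == "int":
            return ("addi", l, r), "int"           # type-specific integer add
        if l_type == "int":
            l = ("float", l)
        if r_type == "int":
            r = ("float", r)
        return ("addf", l, r), "float"             # addf is a hypothetical name
    raise ValueError(f"unsupported operator: {op}")


# int a; float b; b = a + 1;
tree, _ = to_tcol(("assign", "b", ("add", "a", 1)), {"a": "int", "b": "float"})
print(tree)
```

The dereference, the coercion and the choice of addi are all implicit in the AST and explicit in the TCOL tree, which is the sense in which TCOL carries the result of semantic analysis.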
• Note
  • AST ----- automatic AST generation (output of parser)
    Parser Generator
    ┌ leaf node specification
    └ operator node specification
  • TCOL ----- automatic code generation: PQCC
    (1) intermediate level:
        high level --- parse tree like notation, control structure
        low level --- data access
    (2) semantic specification:
        dereferencing, coercion, type specific operator,
        dynamic subscript and type checking
    (3) loop optimization ----- high level control structure, easy reconstruction
    (4) extensibility ----- define new TCOL operator
Abstract Machine Code
• Motivation
  ┌ rapid development of machine architectures
  └ proliferation of programming languages
  • M front-ends + N back-ends ⇒ M compilers for N target machines
  • portable & adaptable compiler design --- P_CODE
    • porting --- rewriting only back-end
  • compiler building system --- EM_CODE
• Model
  [Figure: source program → front-end → abstract machine code (the interface) → back-end → target code → target machine; the abstract machine code can also be executed by an abstract machine interpreter]
• Pascal-P Code
  • Pascal P Compiler --- portable compiler producing P_CODE for an abstract machine (P_Machine).
  • P_Machine ----- hypothetical stack machine designed for the Pascal language.
    (1) Instruction --- closely related to the Pascal language.
    (2) Registers
        ┌ PC --- program counter
        │ NP --- new pointer
        │ SP --- stack pointer
        └ MP --- mark pointer
    (3) Memory
        ┌ CODE --- instruction part
        └ STORE --- data part (constant area, stack, heap)
  [Figure: P_Machine memory. CODE is addressed by PC. STORE holds the constant area, the stack and the heap; MP marks the current activation record, SP points into the stack, NP points into the heap.]
Ucode
• Ucode
  • the intermediate form used by the Stanford Portable Pascal compiler.
  • stack-based, defined in terms of a hypothetical stack machine.
  • Ucode Interpreter: Appendix B.
• Addressing
  • stack addressing ⇒ a tuple (B, O)
    • B : the number of the block containing the address
    • O : the offset in words from the beginning of the block; offsets start at 1.
• label
  • any Ucode instruction can be labeled through its label field.
  • All targets of jumps and procedures must be labeled.
  • All labels must be unique for the entire program.
• Example:
  • Consider the following skeleton:
      program main
        procedure P
        procedure Q
          var i : integer;
              j : integer;
  • block number
    • main : 1
    • P : 2
    • Q : 3
  • variable addressing
    • i : (3,1)
    • j : (3,2)
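The (B, O) addresses in this example can be reproduced with a short Python sketch. It assumes that blocks are numbered from 1 in declaration order and that every variable occupies exactly one word; the function name assign_addresses and the input layout are hypothetical.

```python
def assign_addresses(blocks):
    """blocks: list of (block_name, [variable names]) in declaration order.

    Assumes one word per variable; block numbers and offsets both start at 1.
    """
    table = {}
    for block_number, (name, variables) in enumerate(blocks, start=1):
        for offset, variable in enumerate(variables, start=1):
            table[(name, variable)] = (block_number, offset)
    return table


skeleton = [("main", []), ("P", []), ("Q", ["i", "j"])]
addresses = assign_addresses(skeleton)
print(addresses[("Q", "i")])  # (3, 1)
print(addresses[("Q", "j")])  # (3, 2)
```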
• Ucode Operations (35 operations)
  • Unary --- notop, neg
  • Binary --- add, sub, mult, divop, modop, swp, andop, orop, gt, lt, ge, le, eq, ne
  • Stack Operations --- lod, str, ldr, ldp
  • Immediate Operation --- ldc
  • Control Flow --- ujp, tjp, fjp, cal, ret
  • Range Checking --- chkh, chkl
  • Indirect Addressing --- ixa, sta
  • Procedure Specification --- proc, endop
  • Program Specification --- bgn
  • Procedure Calling Sequence --- cal
  • Symbol Table Information --- sym
• Example:
  • x = a + b * c;
              lod 1 1    /* a */
              lod 1 2    /* b */
              lod 1 3    /* c */
              mult
              add
              str 1 4    /* x */
  • if (a>b) a = a + b;
              lod 1 1    /* a */
              lod 1 2    /* b */
              gt
              fjp next
              lod 1 1    /* a */
              lod 1 2    /* b */
              add
              str 1 1    /* a */
    next
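The two examples above can be traced with a very small stack-machine sketch in Python. This is not the ucodei interpreter from Appendix B: it supports only lod, str, add, mult, gt and fjp, represents memory as a dictionary keyed by (block, offset), and treats an entry whose opcode is None as a bare label, which is a convention invented for this sketch.

```python
def run_ucode(program, memory):
    """program: list of (label, opcode, operands); memory: {(block, offset): value}."""
    labels = {label: i for i, (label, _, _) in enumerate(program) if label}
    binary = {
        "add": lambda x, y: x + y,
        "mult": lambda x, y: x * y,
        "gt": lambda x, y: int(x > y),
    }
    stack, pc = [], 0
    while pc < len(program):
        _, op, args = program[pc]
        pc += 1
        if op is None:                 # label-only entry (sketch convention)
            continue
        if op == "lod":                # push the value stored at (block, offset)
            stack.append(memory[args])
        elif op == "str":              # pop the stack top into (block, offset)
            memory[args] = stack.pop()
        elif op in binary:
            right = stack.pop()
            left = stack.pop()
            stack.append(binary[op](left, right))
        elif op == "fjp":              # jump if the popped condition is false
            if not stack.pop():
                pc = labels[args[0]]
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return memory


# x = a + b * c;
ASSIGN = [
    ("", "lod", (1, 1)), ("", "lod", (1, 2)), ("", "lod", (1, 3)),
    ("", "mult", ()), ("", "add", ()), ("", "str", (1, 4)),
]

# if (a>b) a = a + b;
IF_STMT = [
    ("", "lod", (1, 1)), ("", "lod", (1, 2)), ("", "gt", ()),
    ("", "fjp", ("next",)),
    ("", "lod", (1, 1)), ("", "lod", (1, 2)), ("", "add", ()),
    ("", "str", (1, 1)),
    ("next", None, ()),
]

print(run_ucode(ASSIGN, {(1, 1): 2, (1, 2): 3, (1, 3): 4})[(1, 4)])
print(run_ucode(IF_STMT, {(1, 1): 5, (1, 2): 3})[(1, 1)])
```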
• Indirect Addressing
  • used to access both array elements and var parameters.
  • ixa --- indirect load
    • replaces stacktop by the value of the item at location stacktop.
    • to retrieve A[i]:
          lod i    /* actually (Bi, Oi) */
          ldr A    /* also (block number, offset) */
          add      /* effective address */
          ixa      /* indirect load gets contents of A[i] */
    • to retrieve var parameter x:
          lod x    /* loads address of actual - since x is var */
          ixa      /* indirect load */
  • sta --- indirect store
    • sta stores stacktop into the address at stack[stacktop-1]; both items are popped.
    • A[i] = j;
          lod i
          ldr A
          add
          lod j
          sta
    • x := y, where x is a var parameter
          lod x
          lod y
          sta
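The stack effects of ixa and sta can be followed in a small Python sketch. To keep the address arithmetic visible it uses flat integer addresses instead of (block, offset) tuples and one word per array element; these simplifications, the Machine class and the concrete addresses are assumptions of the sketch.

```python
class Machine:
    """Tiny model of the operand stack and a word-addressed store."""

    def __init__(self, memory):
        self.memory = memory           # address -> value
        self.stack = []

    def lod(self, address):            # push the value stored at address
        self.stack.append(self.memory[address])

    def ldr(self, address):            # push the address itself
        self.stack.append(address)

    def add(self):
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(left + right)

    def ixa(self):                     # indirect load: replace address by its contents
        self.stack.append(self.memory[self.stack.pop()])

    def sta(self):                     # indirect store: pop value, then address
        value = self.stack.pop()
        address = self.stack.pop()
        self.memory[address] = value


# Hypothetical layout: i at 10, j at 11, array A occupying addresses 100..103.
I, J, A = 10, 11, 100
m = Machine({I: 2, J: 99, 100: 0, 101: 0, 102: 0, 103: 0})

# A[i] = j;
m.lod(I); m.ldr(A); m.add(); m.lod(J); m.sta()

# retrieve A[i]
m.lod(I); m.ldr(A); m.add(); m.ixa()
print(m.stack[-1])
```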
• Procedure Calling Sequence
  • procedure definition:
    • procedure A(var a : integer; b,c : integer);
  • procedure call:
    • A(x, expr1, expr2);
  • calling sequence:
        ldp
        ldr x    /* load the address of actual for var parameter */
        …        /* code to evaluate expr1 --- left on the stack */
        …        /* code to evaluate expr2 --- left on the stack */
        cal A
• Ucode Interpreter
  • The Ucode interpreter is called ucodei; its source is on plac.dongguk.ac.kr.
  • The interpreter uses the following files:
    • *.ucode : file containing the Ucode program.
    • *.lst : Ucode listing and output from the program.
  • Ucode format
        label-field    op-code    operand-field
        1-10           12-m       m+2
    • m is exactly enough to hold the opcode.
    • label field --- a 10-character label (make sure it is exactly 10 characters, padded with blanks)
    • op-code --- starts at column 12.
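The column layout above can be produced with a short Python helper. The function name format_ucode_line is hypothetical, and separating several operands by single blanks is an assumption, since the slide only fixes where the operand field begins.

```python
def format_ucode_line(label, opcode, *operands):
    """Lay out one Ucode line: label in columns 1-10, opcode from column 12,
    operand field starting two columns after the opcode's last column (m+2)."""
    if len(label) > 10:
        raise ValueError("label must fit in 10 characters")
    line = label.ljust(10) + " " + opcode          # column 11 stays blank
    if operands:
        # Assumption: operands are separated by single blanks.
        line += " " + " ".join(str(operand) for operand in operands)
    return line


print(format_ucode_line("", "lod", 1, 1))
print(format_ucode_line("next", "add"))
```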
Programming Assignment #3
• Install the Ucode interpreter listed in Appendix B on your own PC and write, in Ucode, a program that finds the prime numbers up to 100.
  • You may instead write and submit a program for a different problem.
  • Submit the output listing of the Ucode interpreter.
• Reference:
  • #1 : recursive-descent parser
  • #2 : MiniPascal LR parser
Concluding Remarks
• IL criteria
  • intermediate level
    • input language --- high level
    • output machine --- low level
  • efficient processing
    • translation --- source to IL, IL to target
    • interpretation
    • optimization
  • extensibility
  • external representation
  • clean separation
    • language dependence & machine dependence
A : good   B : fair   C : poor