Course Overview
Mooly Sagiv
msagiv@post.tau.ac.il
Schreiber 317
03-640-7606
Wed 10:00-12:00
http://www.math.tau.ac.il/~msagiv/courses/wcc02.html
Textbook: Modern Compiler Implementation in C, Andrew Appel, ISBN 0-521-58390-X
CS 0368-4452-01@listserv.tau.ac.il
Outline
• High level programming languages
• Interpreter vs. Compiler
• Abstract Machines
• Why study compilers?
• Main Compiler Phases
High Level Programming Languages
• Imperative
  • Algol, PL/1, Fortran, Pascal, Ada, Modula, and C
  • Closely related to “von Neumann” computers
• Object-oriented
  • Simula, Smalltalk, Modula-3, C++, Java, C#
  • Data abstraction and an “evolutionary” form of program development
  • Class: an implementation of an abstract data type (data + code)
  • Objects: instances of a class
  • Fields: data (structure fields)
  • Methods: code (procedures/functions with overloading)
  • Inheritance: refining the functionality of a class with different fields and methods
• Functional
  • Lisp, Scheme, ML, Miranda, Hope, Haskell
• Logic programming
  • Prolog
Other Languages
• Hardware description languages
  • VHDL
  • The program describes hardware components
  • The compiler generates hardware layouts
• Shell languages: Shell, C-shell, REXX
  • Include primitive constructs from the current software environment
• Graphics and text processing: TeX, LaTeX, PostScript
  • The compiler generates page layouts
• Web/Internet
  • HTML, MAWL, Telescript, Java
• Intermediate languages
  • P-Code, Java bytecode, IDL, CLR
Interpreter
• Input
  • A program
  • An input for the program
• Output
  • The required output
[Figure: the source-program and the program's input both enter the interpreter, which produces the required output]
Example
C interpreter

  int x;
  scanf("%d", &x);
  x = x + 1;
  printf("%d", x);

Input: 5
Output: 6
Compiler
• Input
  • A program
• Output
  • An object program that reads the input and writes the output
[Figure: the compiler translates the source-program into an object-program; the object-program then reads the program's input and writes the output]
Example
Sparc-cc-compiler

Source program:

  int x;
  scanf("%d", &x);
  x = x + 1;
  printf("%d", x);

Compiler output, passed to the assembler/linker to produce the object-program:

  add %fp,-8,%l1
  mov %l1,%o1
  call scanf
  ld [%fp-8],%l0
  add %l0,1,%l0
  st %l0,[%fp-8]
  ld [%fp-8],%l1
  mov %l1,%o1
  call printf

Input: 5
Output: 6
Interpreter vs. Compiler
• Conceptually simpler (the definition of the programming language)
• Easier to port
• Can provide more specific error report
• Normally faster
• More efficient
  • Compilation is done once for all the inputs, so many computations can be performed at compile-time
  • Sometimes even compile-time + execution-time < interpretation-time
• Can report errors before input is given
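Interpreters provide specific error report
• Input-program

    scanf("%d", &y);
    if (y < 0)
        x = 5;
    ...
    if (y <= 0)
        z = x + 1;

• Input data

    y = 0

• With y = 0 the first branch is skipped, so x is read in z = x + 1 without having been set; the interpreter can report exactly this use for exactly this input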
Compilers are usually more efficient
Sparc-cc-compiler

Source program:

  scanf("%d", &x);
  y = 5;
  z = 7;
  x = x + y*z;
  printf("%d", x);

Compiler output (the product y*z has been replaced by the constant 35 at compile-time):

  add %fp,-8,%l1
  mov %l1,%o1
  call scanf
  mov 5,%l0
  st %l0,[%fp-12]
  mov 7,%l0
  st %l0,[%fp-16]
  ld [%fp-8],%l0
  ld [%fp-8],%l0
  add %l0,35,%l0
  st %l0,[%fp-8]
  ld [%fp-8],%l1
  mov %l1,%o1
  call printf
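The replacement of y*z by 35 above is an instance of constant folding. The following is a minimal sketch of that idea on a small expression tree, not the method used by the Sparc compiler on the slide. The names Expr, Kind, constant, variable, binary and fold are hypothetical, and the sketch assumes an earlier step has already substituted the known values 5 and 7 for y and z.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical expression node: a constant, a variable, or a binary operation. */
typedef enum { CONST, VAR, ADD, MUL } Kind;

typedef struct Expr {
    Kind kind;
    int value;                  /* used when kind == CONST */
    char name;                  /* used when kind == VAR */
    struct Expr *left, *right;  /* used when kind is ADD or MUL */
} Expr;

static Expr *mk(Kind kind, int value, char name, Expr *l, Expr *r) {
    Expr *e = malloc(sizeof *e);
    if (!e) abort();
    e->kind = kind; e->value = value; e->name = name;
    e->left = l; e->right = r;
    return e;
}

Expr *constant(int v)                  { return mk(CONST, v, 0, NULL, NULL); }
Expr *variable(char n)                 { return mk(VAR, 0, n, NULL, NULL); }
Expr *binary(Kind k, Expr *l, Expr *r) { return mk(k, 0, 0, l, r); }

/* Fold bottom-up: an operation whose operands are both constants
   is replaced in place by a single constant node.
   Integer overflow is ignored in this sketch. */
Expr *fold(Expr *e) {
    assert(e != NULL);
    if (e->kind != ADD && e->kind != MUL) return e;
    e->left = fold(e->left);
    e->right = fold(e->right);
    if (e->left->kind == CONST && e->right->kind == CONST) {
        int l = e->left->value, r = e->right->value;
        int v = (e->kind == ADD) ? l + r : l * r;
        free(e->left);
        free(e->right);
        e->kind = CONST; e->value = v;
        e->left = e->right = NULL;
    }
    return e;
}
```

Applied to the tree for x + 5*7, fold leaves the addition in place (x is only known at run time) and rewrites its right operand into the constant 35, which matches the single add instruction in the generated code.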
Compilers can provide errors before actual input is given
• Input-program

    int a[100], x, y;
    scanf("%d", y);
    if (y < 0)
        /* line 4 */ y = a;

• Compiler-Output

    line 4: improper pointer/integer combination: op =
Compilers can provide errors before actual input is given
• Input-program

    scanf("%d", &y);
    if (y < 0)
        x = 5;
    ...
    if (y <= 0)
        /* line 88 */ z = x + 1;

• Compiler-Output

    line 88: x may be used before set
Abstract Machines
• A compromise between compilers and interpreters
• An intermediate program representation
• The intermediate representation is interpreted
• Example: Zurich P4 Pascal Compiler (1981)
• Other examples: Java bytecode, MS .NET
• The intermediate code can be compiled
[Figure: the Pascal compiler translates the Pascal program into P-code; the interpreter runs the P-code on the program's input]
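To make the idea of an interpreted intermediate representation concrete, here is a minimal sketch of a stack-based abstract machine. The instruction set (OP_PUSH, OP_LOAD, OP_STORE, OP_ADD, OP_MUL, OP_HALT) and the names Instr and run are invented for illustration; this is not real P-code or Java bytecode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instruction set for a tiny stack machine (not real P-code). */
typedef enum { OP_PUSH, OP_LOAD, OP_STORE, OP_ADD, OP_MUL, OP_HALT } Opcode;

typedef struct {
    Opcode op;
    int arg;   /* constant for OP_PUSH, variable slot for OP_LOAD and OP_STORE */
} Instr;

enum { STACK_MAX = 64, VARS_MAX = 8 };

/* Interpret code until OP_HALT; vars holds the program's variables. */
void run(const Instr *code, int vars[VARS_MAX]) {
    int stack[STACK_MAX];
    int sp = 0;                          /* number of values on the stack */
    assert(code != NULL && vars != NULL);
    for (size_t pc = 0; ; pc++) {
        Instr in = code[pc];
        switch (in.op) {
        case OP_PUSH:                    /* push a constant */
            assert(sp < STACK_MAX);
            stack[sp++] = in.arg;
            break;
        case OP_LOAD:                    /* push the value of a variable */
            assert(sp < STACK_MAX && in.arg >= 0 && in.arg < VARS_MAX);
            stack[sp++] = vars[in.arg];
            break;
        case OP_STORE:                   /* pop into a variable */
            assert(sp > 0 && in.arg >= 0 && in.arg < VARS_MAX);
            vars[in.arg] = stack[--sp];
            break;
        case OP_ADD:                     /* pop two values, push their sum */
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:                     /* pop two values, push their product */
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            return;
        }
    }
}
```

Under this encoding, the statement x = x + 1 with x stored in slot 0 becomes the sequence OP_LOAD 0, OP_PUSH 1, OP_ADD, OP_STORE 0, OP_HALT. The same instruction sequence runs unchanged on any machine that has the interpreter, which is the portability argument for abstract machines.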
Why Study Compilers
• Become a compiler writer
  • New programming languages
  • New machines
  • New compilation modes: “just-in-time”
• Using some of the techniques in other contexts
• Design a very big software program using a reasonable effort
• Learn applications of many CS results (formal languages, decidability, graph algorithms, dynamic programming, ...)
• Better understanding of programming languages and machine architectures
• Become a better programmer
Course Requirements
• Compiler Project 35%
  • Develop a Tiger Front-End in C
  • Two parts:
    • Lex+Yacc (Chapters 2, 3, 4)
    • Semantic analysis (Chapters 5, 12)
  • Tight schedule
  • Bonus 10%
• Theoretical Exercises 15%
• Final exam 50%
Compiler Phases
• The compiler program is usually written as a sequence of well defined phases
• The interfaces between the phases are well defined (another language)
• It is sometimes convenient to use auxiliary global information (e.g., symbol table)
• Advantages of the phase separation:
  • Modularity
  • Simplicity
  • Reusability
Basic Compiler Phases
(related technique in brackets)
Source program (string)
→ lexical analysis [finite automata] → Tokens
→ syntax analysis [pushdown automata] → Abstract syntax tree
→ semantic analysis, Translate [memory organization] → Intermediate representation
→ Instruction selection [dynamic programming] → Assembly
→ Register allocation [graph algorithms] → Final assembly
Example: straight-line programming

  Stm     ::= Stm ; Stm         // (CompoundStm)
  Stm     ::= id := Exp         // (AssignStm)
  Stm     ::= print (ExpList)   // (PrintStm)
  Exp     ::= id                // (IdExp)
  Exp     ::= num               // (NumExp)
  Exp     ::= Exp Binop Exp     // (OpExp)
  Exp     ::= (Stm, Exp)        // (EseqExp)
  ExpList ::= Exp, ExpList      // (PairExpList)
  ExpList ::= Exp               // (LastExpList)
  Binop   ::= +                 // (Plus)
  Binop   ::= -                 // (Minus)
  Binop   ::= *                 // (Times)
  Binop   ::= /                 // (Div)
Example Input

  a := 5 + 3;
  b := (print(a, a-1), 10 * a);
  print(b)
Questions
• How to check that a program is correct?
• How to internally represent the compiled program?
Lexical Analysis
• Input string

    a\b := 5 + 3 ;\nb := (print(a, a-1), 10 * a) ;\nprint(b)

• Tokens

    id("a") assign num(5) + num(3) ;
    id("b") assign ( print ( id("a") , id("a") - num(1) ) , num(10) * id("a") ) ;
    print ( id("b") )
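The step from input string to tokens can be sketched with a small hand-written scanner. This is only an illustration under stated assumptions: the course project uses Lex to generate the scanner, whereas the code below is written by hand, and the names TokenKind, Token and next_token are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

typedef enum {
    T_ID, T_NUM, T_PRINT, T_ASSIGN, T_PLUS, T_MINUS, T_TIMES, T_DIV,
    T_LPAREN, T_RPAREN, T_COMMA, T_SEMI, T_EOF, T_ERROR
} TokenKind;

typedef struct {
    TokenKind kind;
    char text[32];   /* identifier spelling, meaningful for T_ID */
    int value;       /* numeric value, meaningful for T_NUM */
} Token;

/* Scan one token starting at *src and advance *src past it. */
Token next_token(const char **src) {
    Token t = { T_EOF, "", 0 };
    assert(src != NULL && *src != NULL);
    const char *p = *src;
    while (isspace((unsigned char)*p)) p++;        /* skip blanks and newlines */
    if (*p == '\0') {
        t.kind = T_EOF;
    } else if (isalpha((unsigned char)*p)) {       /* identifier or keyword */
        size_t n = 0;
        while (isalnum((unsigned char)*p)) {
            if (n < sizeof t.text - 1) t.text[n++] = *p;  /* truncate long names */
            p++;
        }
        t.text[n] = '\0';
        t.kind = strcmp(t.text, "print") == 0 ? T_PRINT : T_ID;
    } else if (isdigit((unsigned char)*p)) {       /* decimal number */
        t.kind = T_NUM;
        while (isdigit((unsigned char)*p))
            t.value = t.value * 10 + (*p++ - '0');
    } else if (p[0] == ':' && p[1] == '=') {       /* two-character operator */
        t.kind = T_ASSIGN;
        p += 2;
    } else {
        switch (*p++) {                            /* single-character tokens */
        case '+': t.kind = T_PLUS;   break;
        case '-': t.kind = T_MINUS;  break;
        case '*': t.kind = T_TIMES;  break;
        case '/': t.kind = T_DIV;    break;
        case '(': t.kind = T_LPAREN; break;
        case ')': t.kind = T_RPAREN; break;
        case ',': t.kind = T_COMMA;  break;
        case ';': t.kind = T_SEMI;   break;
        default:  t.kind = T_ERROR;  break;
        }
    }
    *src = p;
    return t;
}
```

Calling next_token repeatedly on the input string yields the token sequence shown on the slide, ending with T_EOF. Each branch corresponds to a regular expression, which is why finite automata are the natural tool for this phase.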
Syntax Analysis
• Tokens

    id("a") assign num(5) + num(3) ;
    id("b") assign ( print ( id("a") , id("a") - num(1) ) , num(10) * id("a") ) ;
    print ( id("b") )

• Abstract syntax tree

    CompoundStm
      AssignStm
        id a
        opExp
          numExp 5
          Plus
          numExp 3
      CompoundStm
        AssignStm
          id b
          eseqExp
            PrintStm ...
            opExp ...
        ...
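One possible answer to the question of how to represent the program internally is a C data type with one variant per grammar production. The sketch below follows the production names of the straight-line grammar, but it is an illustration and not the textbook's definitions: the struct layouts, the helper ExpListCons, and the functions execStm and evalExp are hypothetical, and identifiers are assumed to be single letters so that a plain array can serve as the environment.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct Stm Stm;
typedef struct Exp Exp;
typedef struct ExpList ExpList;
typedef enum { PLUS, MINUS, TIMES, DIV } Binop;

struct Stm {                         /* one variant per Stm production */
    enum { COMPOUND_STM, ASSIGN_STM, PRINT_STM } kind;
    union {
        struct { Stm *first, *second; } compound;
        struct { char id; Exp *exp; } assign;   /* assumption: one-letter ids */
        ExpList *print;
    } u;
};

struct Exp {                         /* one variant per Exp production */
    enum { ID_EXP, NUM_EXP, OP_EXP, ESEQ_EXP } kind;
    union {
        char id;
        int num;
        struct { Exp *left; Binop op; Exp *right; } op;
        struct { Stm *stm; Exp *exp; } eseq;
    } u;
};

struct ExpList { Exp *head; ExpList *tail; };   /* tail is NULL for the last element */

static void *alloc(size_t n) { void *p = malloc(n); if (!p) abort(); return p; }

Stm *CompoundStm(Stm *a, Stm *b) {
    Stm *s = alloc(sizeof *s); s->kind = COMPOUND_STM;
    s->u.compound.first = a; s->u.compound.second = b; return s;
}
Stm *AssignStm(char id, Exp *e) {
    Stm *s = alloc(sizeof *s); s->kind = ASSIGN_STM;
    s->u.assign.id = id; s->u.assign.exp = e; return s;
}
Stm *PrintStm(ExpList *l) {
    Stm *s = alloc(sizeof *s); s->kind = PRINT_STM; s->u.print = l; return s;
}
Exp *IdExp(char id) { Exp *e = alloc(sizeof *e); e->kind = ID_EXP; e->u.id = id; return e; }
Exp *NumExp(int n)  { Exp *e = alloc(sizeof *e); e->kind = NUM_EXP; e->u.num = n; return e; }
Exp *OpExp(Exp *l, Binop op, Exp *r) {
    Exp *e = alloc(sizeof *e); e->kind = OP_EXP;
    e->u.op.left = l; e->u.op.op = op; e->u.op.right = r; return e;
}
Exp *EseqExp(Stm *s, Exp *x) {
    Exp *e = alloc(sizeof *e); e->kind = ESEQ_EXP;
    e->u.eseq.stm = s; e->u.eseq.exp = x; return e;
}
ExpList *ExpListCons(Exp *head, ExpList *tail) {
    ExpList *l = alloc(sizeof *l); l->head = head; l->tail = tail; return l;
}

int evalExp(const Exp *e, int env[256]);

/* Execute a statement; env maps each one-letter identifier to its value. */
void execStm(const Stm *s, int env[256]) {
    assert(s != NULL);
    switch (s->kind) {
    case COMPOUND_STM:
        execStm(s->u.compound.first, env);
        execStm(s->u.compound.second, env);
        break;
    case ASSIGN_STM:
        env[(unsigned char)s->u.assign.id] = evalExp(s->u.assign.exp, env);
        break;
    case PRINT_STM:                  /* values separated by blanks, then a newline */
        for (const ExpList *l = s->u.print; l != NULL; l = l->tail)
            printf("%d%s", evalExp(l->head, env), l->tail ? " " : "\n");
        break;
    }
}

/* Evaluate an expression; an EseqExp runs its statement first. */
int evalExp(const Exp *e, int env[256]) {
    assert(e != NULL);
    switch (e->kind) {
    case ID_EXP:  return env[(unsigned char)e->u.id];
    case NUM_EXP: return e->u.num;
    case ESEQ_EXP:
        execStm(e->u.eseq.stm, env);
        return evalExp(e->u.eseq.exp, env);
    case OP_EXP: {
        int l = evalExp(e->u.op.left, env), r = evalExp(e->u.op.right, env);
        switch (e->u.op.op) {
        case PLUS:  return l + r;
        case MINUS: return l - r;
        case TIMES: return l * r;
        case DIV:   assert(r != 0); return l / r;
        }
    }
    }
    return 0;  /* not reached for well-formed trees */
}
```

Building the tree for the example input with these constructors and passing it to execStm with a zeroed environment prints 8 7 on one line and 80 on the next, and leaves a = 8 and b = 80 in the environment. The tree walker is in effect a small interpreter for the straight-line language.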
Summary
• Separating the compiler into phases drastically simplifies the problem of writing a good compiler
• The textbook offers a reasonable partition into phases with interface definitions (in C)
• Every week we will study a new compiler phase