Learn about compiler vs interpreter, Java compilation process, phases of compilation, and more in this insightful lecture on CSCE 531 Compiler Construction. Understand the importance of studying compilers and explore the history of programming languages.
Lecture 1 Overview CSCE 531 Compiler Construction • Topics • Instructions • Readings: January 16, 2018
Overview • Today’s Lecture • Pragmatics • A little History • Compilers vs Interpreters • Data-Flow View of Compilers • Why Study Compilers? • Course Pragmatics • References • Chapter 1 • Appendix A – sample compiler • Lord J.M. Keynes's Aphorism • "What's not worth doing isn't worth doing well."
Course Pragmatics • Instructor: Manton Matthews • Office: 2233 Storey Innovation Center • Office Hours: 11:30-1:30 T-TH others by appointment • Phone: 777-3285 • Email: “mm” at “sc” in domain “edu” • Class Web Site: http://www.cse.sc.edu/class/csce531-001/ • Class Directory on Linux Machines ~matthews/public/class/csce531
"Compilers: Principles, Techniques and Tools,” Aho, Lam, Sethi and Ullman, Addison-Wesley, 2nd edition, 2007. • C reference – Kernighan and Ritchie • K&R • Many others
Class Directory on Linux Boxes • Class directory /acct/matthews/public/csce531/ • Website - https://cse.sc.edu/~matthews/Courses/212/index.html • Submissions – “Moodle dropbox” https://dropbox.cse.sc.edu/
Evolution of Programming • First electronic computers – programmed by plugboard • Von Neumann – concept of stored program computer • Program stored in memory as data (1946) • Fortran – first compiler, John Backus (1957) (BNF) • Lisp 1958 • Cobol 1959 • Algol 68, Pascal • C – Ritchie, 1972 • http://cm.bell-labs.com/cm/cs/who/dmr/chist.html • C++, Java • Perl, Python, Ruby – scripting languages
Compiler vs Interpreter • Compiler • Interpreter
Where’s Java? • Hybrid • Compilation step – translate source to bytecode • Interpretive step – run bytecode program on Java Virtual Machine (JVM)
Example of Dataflow of Compilation Process • Source Program in file • main(){ • int i, sum; • for(i=0, sum=0; i < 100; ++i){ • sum = sum + i; • } • printf("Sum=%d\n", sum); • } • Token Stream • Token Code / Lexeme • ID main • LPAREN ( • RPAREN ) • LBRACE { • INT int (keyword) • ID i • COMMA , • ID sum • SEMICOLON ; • …
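The token stream on this slide can be reproduced with a small regular-expression scanner. What follows is a minimal sketch in Python, not the course's lexer: the token codes (ID, LPAREN, INT, and so on) come from the slide, while the names `TOKEN_SPEC`, `MASTER`, and `tokenize`, the patterns themselves, and the restriction to the first few lexemes of the program are assumptions made for illustration.

```python
import re

# Token codes are taken from the slide; the patterns are illustrative assumptions.
TOKEN_SPEC = [
    ("INT",       r"\bint\b"),       # the keyword is tried before the ID pattern
    ("ID",        r"[A-Za-z_]\w*"),
    ("LPAREN",    r"\("),
    ("RPAREN",    r"\)"),
    ("LBRACE",    r"\{"),
    ("COMMA",     r","),
    ("SEMICOLON", r";"),
    ("SKIP",      r"\s+"),           # whitespace separates lexemes and is dropped
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (token code, lexeme) pairs; raise on unknown input."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

tokens = tokenize("main(){ int i, sum;")
for code, lexeme in tokens:
    print(code, lexeme)
```

Trying the keyword pattern before the identifier pattern, with word boundaries, is one simple way to keep `int` from being scanned as an ID while still treating a name such as `inti` as a single identifier.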
Assembly Code • more ex1.s • .file "ex1.c" • .def ___main; • .text • LC0: • .ascii "Sum=%d\12\0" • .globl _main • .def _main; .scl • _main: • pushl %ebp • movl %esp, %ebp • subl $24, %esp • andl $-16, %esp • movl $0, %eax • movl %eax, -12(%ebp) • movl -12(%ebp), %eax • call __alloca • call ___main • movl $0, -4(%ebp) • movl $0, -8(%ebp) • L2: • cmpl $99, -4(%ebp) • jle L5 • jmp L3 • L5: • movl -4(%ebp), %eax • leal -8(%ebp), %edx • addl %eax, (%edx) • leal -4(%ebp), %eax • incl (%eax) • jmp L2 • L3: • movl -8(%ebp), %eax • movl %eax, 4(%esp) • movl $LC0, (%esp) • call _printf • leave • ret • .def _printf;
Machine Code • Loader phase – link in libraries • #include <stdio.h> • /usr/lib/libc.a - standard library
GCC and some of its flags • GCC – GNU (GNU's Not Unix) C compiler • gcc phases • Preprocessor only: gcc -E ex1.c > ex1.i • Stop with assembly: gcc -S ex1.c • Compile only: gcc -c ex1.c • Loader – link in libraries • Disassemblers • objdump
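Dynamic vs Static typing • Static means types are checked at compile time • Dynamic means types are determined at run time • Python – • from math import sqrt • x = parseInput(line) • if x < 0: • sq = "sqrt of x is not a real number" • else: • sq = sqrt(x) • print("sq=", sq) • Here sq holds a string or a float depending on the run-time value of x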
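The slide's Python fragment can be turned into a self-contained sketch that makes the run-time nature of the type visible. The function name `describe_sqrt` and the two sample inputs are assumptions introduced here; they stand in for the slide's `parseInput(line)`, which is not defined in the lecture.

```python
from math import sqrt

def describe_sqrt(x):
    """Return a float for non-negative x, otherwise an explanatory string."""
    if x < 0:
        return "sqrt of x is not a real number"
    return sqrt(x)

for value in (9.0, -4.0):
    sq = describe_sqrt(value)
    # The type bound to sq is only known once this assignment has executed.
    print(type(sq).__name__, sq)
```

A statically typed language would reject a function whose result is sometimes a string and sometimes a float unless the two were wrapped in a common type declared in advance.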
Parameter Passing • Call by value • Call by reference • Call by copy-restore
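The three mechanisms can be told apart when the same variable is passed for two parameters. The sketch below simulates them in Python by modelling each variable as a one-element list (a storage "cell"); the callee `incr_both` and the three `call_by_*` wrappers are hypothetical names made up for this illustration, and Python itself does not offer these mechanisms directly.

```python
def incr_both(x, y):
    """Hypothetical callee: x = x + 1; y = y + 1, acting on storage cells."""
    x[0] = x[0] + 1
    y[0] = y[0] + 1

def call_by_value(a):
    # The callee works on fresh copies; the caller's cell is never touched.
    incr_both([a[0]], [a[0]])

def call_by_reference(a):
    # Both formals alias the caller's cell, so it is incremented twice.
    incr_both(a, a)

def call_by_copy_restore(a):
    # Copy in, run the callee on private copies, then copy each result back.
    x, y = [a[0]], [a[0]]
    incr_both(x, y)
    a[0] = x[0]
    a[0] = y[0]

results = {}
for name, call in [("value", call_by_value),
                   ("reference", call_by_reference),
                   ("copy-restore", call_by_copy_restore)]:
    a = [1]          # simulates the call p(a, a) with a = 1
    call(a)
    results[name] = a[0]
print(results)
```

Starting from a = 1, the simulated call leaves a at 1 under call by value, 3 under call by reference, and 2 under copy-restore, which is why aliasing is the usual test case for distinguishing the last two.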
Formal Language Theory • Mathematical models of languages • where a language is just a set of strings of characters from a particular alphabet • Examples • L1 = {legitimate English words} • L2 = {keywords of C} • L3 = {strings of zeroes and ones that have more zeroes than ones}
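Because a language is just a set of strings, it can be modelled as a membership predicate. The following sketch does this for L3, assuming "more zeroes" means more zeroes than ones; the names `in_L3` and `members` are illustrative only.

```python
from itertools import product

def in_L3(s):
    """Membership test for L3: binary strings with more zeroes than ones."""
    return set(s) <= {"0", "1"} and s.count("0") > s.count("1")

# Enumerate the members of L3 of length at most 3, shortest first.
members = [
    "".join(bits)
    for n in range(4)
    for bits in product("01", repeat=n)
    if in_L3("".join(bits))
]
print(members)
```

L3 is infinite, so only a finite slice can be listed; the predicate, not the list, is the definition of the language.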
Chomsky Hierarchy • Noam Chomsky, the famous linguist, came up with the Chomsky Hierarchy • Level 1 – regular languages (Chomsky's Type 3) • Level 2 – context-free languages (Type 2) • Level 3 – context-sensitive languages (Type 1) • Level 4 – unrestricted languages (Type 0) • We will look at a few of these in our attempts to define efficient compilers
Regular languages (level 1) • Regular expressions are expressions that represent regular languages • The base operators are • Concatenation: . or juxtaposition • Alternation: | • Kleene closure: * • For a regular expression R, the language that it represents is denoted L(R)
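The three base operators map directly onto Python's `re` syntax: juxtaposition for concatenation, `|` for alternation, and `*` for Kleene closure. As a small sketch, the expression R = (a|b)*abb is used here; that choice of R and the helper name `in_language` are assumptions for illustration.

```python
import re

# R = (a|b)*abb : any string of a's and b's that ends in abb.
R = re.compile(r"(a|b)*abb")

def in_language(s):
    """True when the whole string s belongs to L(R)."""
    return R.fullmatch(s) is not None

for s in ["abb", "babb", "ab", ""]:
    print(repr(s), in_language(s))
```

`fullmatch` is used rather than `search` because membership in L(R) is a statement about the entire string, not about some substring of it.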
Deterministic Finite Automata • A deterministic finite automaton (DFA) is a mathematical model that consists of • 1. a set of states S • 2. a set of input symbols Σ, the input alphabet • 3. a transition function δ: S × Σ → S that maps each state and each input symbol to the next state • 4. a state s0 that is distinguished as the start state • 5. a set of states F distinguished as accepting (or final) states
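The five components of the definition translate almost literally into a table-driven simulator. The sketch below encodes a DFA for the language of (a|b)*abb; the particular automaton and the names `STATES`, `ALPHABET`, `DELTA`, `START`, `ACCEPTING`, and `accepts` are assumptions chosen for illustration.

```python
# The five components of a DFA, here for strings over {a, b} ending in abb.
STATES = {0, 1, 2, 3}                 # 1. set of states S
ALPHABET = {"a", "b"}                 # 2. input alphabet
DELTA = {                             # 3. transition function S x alphabet -> S
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 1, (3, "b"): 0,
}
START = 0                             # 4. start state s0
ACCEPTING = {3}                       # 5. accepting states F

def accepts(s):
    """Run the DFA on s and report whether it halts in an accepting state."""
    state = START
    for ch in s:
        if ch not in ALPHABET:
            return False              # symbols outside the alphabet are rejected
        state = DELTA[(state, ch)]
    return state in ACCEPTING

for s in ["abb", "aabb", "ab", "abba"]:
    print(s, accepts(s))
```

Because the transition function is total and deterministic, the simulator does exactly one table lookup per input symbol, which is what makes DFAs attractive for lexical analysis.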
A complete front-end: Appendix A • A.1 – The Source Language • A.2 – Main.java • A.3 – Lexical Analyzer – Tag.java • A.4 – Symbol Tables and Types • A.5 – Intermediate Code for Expressions • A.6 – Jumping Code for Boolean Expressions • A.7 – Intermediate Code for Statements • A.8 – Parser • A.9 – Creating the Front End
Tree – directory of source code • tree -a /acct/matthews/Courses/531/Code
/acct/matthews/Courses/531/Code
├── dragon-book-source-code-master
│   ├── inter
│   │   ├── Access.java
│   │   ├── And.java
│   │   ├── Arith.java
│   │   ├── ...
│   │   └── While.java
│   ├── lexer
│   │   ├── Lexer.java
│   │   ├── Num.java
│   │   ├── Real.java
│   │   ├── Tag.java
│   │   ├── Token.java
│   │   └── Word.java
│   ├── main
│   │   └── Main.java
│   ├── Makefile
│   ├── parser
│   │   └── Parser.java
│   ├── README
│   ├── symbols
│   │   ├── Array.java
│   │   ├── Env.java
│   │   └── Type.java
│   ├── tests
│   │   ├── block1.i
│   │   ├── ...
│   │   ├── prog3.i
│   │   ├── prog3.t
│   │   ├── prog4.i
│   │   └── prog4.t
│   └── tmp
│       ├── block1.i
│       ├── ...
│       └── prog4.i
└── flexbison
Copying and unpacking • $ mkdir 531 • $ cd 531 • $ cp -fr ~matthews/public/csce531/dragon-book-source-code-master.tgz . • $ tar xvfz dragon-book-source-code-master.tgz • $ tree -a .
Make – Unix build utility • Targets • Dependency list • Target tree (forest) • Action list • Minimal recompilation • Usage: • make // make first target • make targ // make target “targ” • make -f makefileNotNamedMakefile • http://www.eecs.umich.edu/courses/eecs380/HANDOUTS/Makefile-Tutorial.html
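The idea behind minimal recompilation is that a target's actions run only when the target is missing or older than one of its dependencies. The following Python sketch shows that timestamp test for a single target; it is not how make is implemented, it does not walk the target tree recursively, and the name `needs_rebuild` and the file names are assumptions for illustration.

```python
import os
import tempfile

def needs_rebuild(target, dependencies):
    """Return True if target is missing or older than any dependency."""
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_time for dep in dependencies)

demo = []
with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "ex1.c")
    obj = os.path.join(d, "ex1.o")
    open(src, "w").close()
    demo.append(needs_rebuild(obj, [src]))   # target does not exist yet

    open(obj, "w").close()
    os.utime(src, (100, 100))                # source last modified at t=100
    os.utime(obj, (200, 200))                # object built later, at t=200
    demo.append(needs_rebuild(obj, [src]))   # target is up to date

    os.utime(src, (300, 300))                # source edited after the build
    demo.append(needs_rebuild(obj, [src]))   # target is now stale
print(demo)
```

The timestamps are set explicitly with `os.utime` so that the outcome does not depend on the clock resolution of the file system.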
Makefile Example • CC=gcc • CFLAGS=-DYYDEBUG • LEX=flex • YACC=bison • YACCFLAGS=-t • simple3: simple3.tab.h simple3.tab.o lex.yy.o • $(CC) $(CFLAGS) simple3.tab.o lex.yy.o -ly -o simple3 • simple3.tab.h: simple3.y • bison -d $(YACCFLAGS) simple3.y • simple3.tab.c: simple3.y • bison $(YACCFLAGS) simple3.y • lex.yy.c: simple3.l • flex simple3.l • clean: • -rm *.o lex.yy.c *.tab.[ch] simple[0-9] *.output *.act
Make: Variables and builtin rules • tree: $(OBJS) • $(CC) $(LDFLAGS) -o $(TREE_DEST) $(OBJS) • %.o: %.c tree.h • $(CC) $(CFLAGS) -c -o $@ $< • make -n // print what would be done but don’t do it • make -p // print rules
Appendix A: Front End Makefile • build: compile test • compile: • javac lexer/*.java • javac symbols/*.java • javac inter/*.java • javac parser/*.java • javac main/*.java • yacc: • /usr/ccs/bin/yacc -v doc/front.y • rm y.tab.c • mv y.output doc • …
Makefile: testing target • test: • @for i in `(cd tests; ls *.t | sed -e 's/.t$$//')`;\ • do echo $$i.t;\ • java main.Main <tests/$$i.t >tmp/$$i.i;\ • diff tests/$$i.i tmp/$$i.i;\ • done
clean: • (cd lexer; rm *.class) • (cd symbols; rm *.class) • (cd inter; rm *.class) • (cd parser; rm *.class) • (cd main; rm *.class)
Main.java • package main; • import java.io.*; import lexer.*; import parser.*; • public class Main { • public static void main(String[] args) throws IOException { • Lexer lex = new Lexer(); • Parser parse = new Parser(lex); • parse.program(); • System.out.write('\n'); • } • }