CST 320 Compiler Methods
Week 1 • Introduction • Go over syllabus • Grammar Review • Compiler Overview • Preprocessor • Symbol Table • Preprocessor Directives • Adding a lexical analyzer
Instructor • Sherry Yang • sherry.yang@oit.edu or csetyang@gmail.com • Wilsonville Room 213 • Office Hours: Mon/Thurs 4-6 or by appointment • Class webpage: http://www.oit.edu/faculty/sherry.yang/CST320
Instructor Background • Professor of Software Engineering Technology • Department of Computer Systems Engineering Technology • Klamath Falls • Ph.D. in Computer Science • Senior Software Engineer • Application Software Engineer
Getting to Know Each Other • Pair up with one other person. • Find out a little more about the person. • Name • Year in program • Something interesting about the person • Any previous compiler experience • Introduce the person to the class.
Course Description • This course is designed to introduce the basic concepts of compiler design and operation. Topics include lexical and syntactical analysis, parsing, translation, semantic processing and code generation. In addition, students will implement a small compiler. • We might use other tools (Spirit, Pargen, etc.)
Evaluation Methods • 2 Tests: 40% • Homework & Labs: 35% • Project: 15% • Class Participation (including in-class exercises): 10%
Grading • Your grade will be calculated as follows:* • 90%+ = A • 80%+ = B • 70%+ = C • 60%+ = D • Below 60% = F • * Class participation will be considered in evaluating "borderline" grades. • You must turn in ALL of the labs and complete the project to pass the course with a C or better. • An Incomplete will be given if you fail to turn in all labs and the project.
Textbook • Text: • Cooper, Keith D. & Linda Torczon, Engineering A Compiler, 2nd edition, Morgan Kaufmann, 2012. • References: • Parsons, Introduction to Compiler Construction • Aho, Sethi, and Ullman, Compilers: Principles, Techniques, and Tools, Addison-Wesley, 1986. • Fischer and LeBlanc, Crafting a Compiler with C, Benjamin Cummings, 1991.
Student Responsibilities • Lecture and Lab Attendance: • Students are expected to attend all class sessions. If you know you will be absent on a certain day, please inform the instructor in advance so arrangements can be made to provide you with the materials covered. Please make every effort to attend all class sessions. There will be no make-up in-class exercises. • Lab sessions will be used as help sessions and to check off lab assignments.
Student Responsibilities • Tests: • All tests are open book, open notes. No electronic devices are allowed. • There will be no make-up tests unless there is an emergency. If you miss a test for any reason, you can do an additional project to make it up. • In case of emergency, please contact the Student Affairs office. They will inform all of your instructors.
Student Responsibilities • Academic Dishonesty: • No plagiarism or cheating is allowed in this class. Please refer to your student handbook regarding policies on academic dishonesty. A copy of the policy is posted on the class webpage. • It is okay to get help on your assignments. Please acknowledge all sources of help, including them in the program documentation as appropriate.
Student Responsibilities • Homework & Labs: • All labs are due via email by midnight on the due date. You must follow the assignment submission guidelines below. • All labs must be checked off by the instructor. There will be a check-off list posted for each lab.
Lab Submission Guidelines • All labs are due via email by midnight on the due date. The instructor will send out an email upon receiving your lab. If you do not receive an email within 24 hours of submitting the lab, it is YOUR responsibility to contact the instructor by email or phone. If you do not contact the instructor within 48 hours after the due date, the lab is considered late. • There will be a 20% penalty per week for late labs. • All labs, the project, and any late labs must be turned in by Wednesday of Finals week to be graded.
Lab Submission Guidelines • 1. Zip up all files required to build the lab. • 2. Include a “Readme” file as appropriate. • 3. The archive should also include any other deliverables as called out in the assignment write-up. • 4. The archive will be attached to an email with subject line: CST320 Lab #x – first name last name • Email the archive to csetyang@gmail.com
Any student with a disability who anticipates a need for accommodation in this course is encouraged to talk with the instructor about his/her needs as soon as possible.
Grammar Review • Three main concepts • Language • Machine • Grammar • Regular vs. Context-Free Languages • Notation for describing languages • Regular Expression • Context-Free grammar • Recognizers • Finite automata • Pushdown Automata
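To make the "recognizer" idea concrete, a finite automaton can be coded as a transition table. The sketch below is a hypothetical illustration, not taken from the course materials: the language (strings over ∑ = {0, 1} with an even number of 1's), the state numbering, and the function name acceptsEvenOnes are all assumptions chosen for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical example: DFA over {0, 1} accepting strings that contain
// an even number of 1's. State 0 = even (start, accepting), state 1 = odd.
bool acceptsEvenOnes(const std::string& input) {
    // transition[state][symbol] gives the next state.
    static const int transition[2][2] = {
        {0, 1},  // from state 0: '0' -> 0, '1' -> 1
        {1, 0},  // from state 1: '0' -> 1, '1' -> 0
    };
    int state = 0;
    for (char c : input) {
        if (c != '0' && c != '1') return false;  // symbol outside the alphabet
        state = transition[state][c - '0'];
    }
    return state == 0;  // accept only if we end in the accepting state
}
```

A pushdown automaton would need a stack in addition to the state variable, which is why a fixed table like this one can recognize only regular languages.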
In-Class Exercise#1 Given ∑={0, 1} L1 = { wv | w, v ∈ ∑* and v = 00}. Define a regular expression to describe L1.
In-Class Exercise#1 Given ∑={0, 1} L2 = {w | w ∈ ∑* and w contains 3 consecutive 0’s}. Define a deterministic finite automaton (DFA) to recognize this language.
In-Class Exercise#1 Given ∑={0, 1} Lp = {ww^R | w ∈ ∑*}, where w^R is the reverse of w. Define a context-free grammar for Lp. Is Lp regular?
In-Class Exercise#1 Find the regular expressions for the following automata. Is this a deterministic finite automaton?
In-Class Exercise#1 Remove lambda productions from the following grammar:
S -> ABc
A -> aaA
A -> λ
B -> Bb
B -> λ
Conventional Translator
source program → preprocessor → modified source program → compiler → target assembly program → assembler → relocatable machine code → loader / linker (with library, relocatable object files) → absolute machine code
Compilers
Source Program → Lexical Analyzer (scanner) → Tokens → Parser → Parse Tree → Semantic Analysis → Intermediate Representation → Optimizer → Code Generator → Target code
Symbol Table (used by all phases)
Lexical Analyzer • Uses Regular Expressions to define tokens • Is a Finite Automaton • Structure of tokens is Regular
Parser • Uses Context-Free Grammar to define program structures • Is a Pushdown Automaton • Structure of program is Context-Free
Why study compilers? • Ties lots of things you know together: • Theory (finite automata, grammars) • Data structures • Modularization • Utilization of software tools • You might build a parser. • The theory of computation/formal language still applies today. • As long as we still program with 1-D text. • Helps you to be a better programmer
One-dimensional Text
int x;
cin >> x;
if (x > 5)
    cout << "Hello";
else
    cout << "BOO";
The formatting has no impact on the meaning of the program:
int x;cin >> x;if(x>5) cout << "Hello"; else …
What is a translator? • Takes input (SOURCE) and produces output (TARGET) • SOURCE → translator → TARGET, ERROR
Conventional Translator
skeletal source program → preprocessor → source program → compiler → target assembly program → assembler → relocatable machine code → loader / linker (with library, relocatable object files) → absolute machine code
Translator for Java
Java source code → Java compiler → Java bytecode → Java interpreter
Java bytecode → Bytecode compiler → absolute machine code
Types of Translators • Compilers • Conventional (textual source code) • Imperative, ALGOL-like languages • Other paradigms • Interpreters • Macro processors • Text formatters • Silicon compilers
Types of Translators (cont.) • Visual programming language • Interface • Database • User interface • Operating System
Structure of Compilers
skeletal source program → preprocessor → modified source program → Lexical Analyzer (scanner) → tokens → Syntax Analysis (Parser) → syntactic structure → Semantic Analysis → intermediate representation → Optimizer → Code Generator → target machine code
Symbol Table (used by all phases)
Symbol Table • What is a symbol? • Variable name • Function name • Type name • Constant • Class name • Method name • …. • Any ID that you use in a program
Symbol Table • Information about a symbol • Name • Type (int, double, char, string, etc.) • Use (variable name, constant name, type name, function name, etc.) • Value (i.e. value of constant) • Scope
Symbol table operations • Insert a symbol into the symbol table • Flag as error if symbol already exists in some cases • Search for a symbol in the symbol table • Delete a symbol from the symbol table
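The three operations above can be sketched on top of a hash map. This is a minimal illustration under stated assumptions, not the lab's required design: the names SymbolInfo, SymbolTable, insert, search, and remove are hypothetical, and the record keeps only a few of the fields listed on the previous slide.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical record holding some of the information kept per symbol.
struct SymbolInfo {
    std::string type;   // e.g. "int", "double"
    std::string use;    // e.g. "variable", "constant"
    std::string value;  // value of a constant, empty otherwise
};

class SymbolTable {
public:
    // Insert a symbol; returns false (an error) if the name already exists.
    bool insert(const std::string& name, const SymbolInfo& info) {
        return table_.emplace(name, info).second;
    }
    // Search for a symbol; returns nullptr if it is not in the table.
    const SymbolInfo* search(const std::string& name) const {
        auto it = table_.find(name);
        return it == table_.end() ? nullptr : &it->second;
    }
    // Delete a symbol; returns false if it was not in the table.
    bool remove(const std::string& name) {
        return table_.erase(name) > 0;
    }
private:
    std::unordered_map<std::string, SymbolInfo> table_;
};
```

Whether a duplicate insert is an error depends on the caller: a parser would normally report it, while a preprocessor might instead overwrite the earlier definition.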
Symbol table examples w/ preprocessor
#define MAX 50
#define SOMESYMBOL
#define SOMESYMBOL
#undef SOMESYMBOL
#define MIN 10
#define MAX 100
Code example
#define MAX 5
void main()
{
    int x;
    int y;
    x = MAX;
#define MAX 10
    y = MAX;
}
Symbol table example w/ parser (lab 2)
void main()
{
    int x;
    string str1;
    int x;
    x = 3;
    y = 10;
    str1 = 30;
    {
        double x;
        x = 4.301;
    }
}
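The point of the code example is that the preprocessor works from top to bottom, so each use of MAX sees the most recent definition. The sketch below shows the same effect in a form a standard C++ compiler accepts; the function names firstUse and secondUse are hypothetical, and the #undef is an addition, since compilers typically warn about or reject redefining a macro with a different value.

```cpp
#include <cassert>

#define MAX 5
int firstUse() { return MAX; }   // expands to: return 5;

#undef MAX        // assumption: added so the redefinition is valid C++
#define MAX 10
int secondUse() { return MAX; }  // expands to: return 10;
```

A preprocessor written for the lab can make its own choice here, for example by overwriting the symbol table entry or by flagging the second #define as an error.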
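One way to handle the nested block in the example is to keep one table per open scope. The sketch below is an assumption about how this could be done, not the lab 2 specification; ScopedSymbolTable and its member names are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical scoped symbol table: one name -> type map per open block.
class ScopedSymbolTable {
public:
    ScopedSymbolTable() { enterScope(); }          // outermost scope
    void enterScope() { scopes_.emplace_back(); }  // called on '{'
    void exitScope() {                             // called on '}'
        if (scopes_.size() > 1) scopes_.pop_back();
    }
    // Declare a name in the innermost scope; false means it was
    // already declared in that same scope.
    bool declare(const std::string& name, const std::string& type) {
        return scopes_.back().emplace(name, type).second;
    }
    // Look a name up from the innermost scope outward; "" means undeclared.
    std::string lookup(const std::string& name) const {
        for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
            auto found = it->find(name);
            if (found != it->end()) return found->second;
        }
        return "";
    }
private:
    std::vector<std::unordered_map<std::string, std::string>> scopes_;
};
```

Applied to the example, the second `int x;` fails to declare, `y` looks up as undeclared, and the inner `double x;` is accepted because it lives in a new scope. Checking that `str1 = 30;` mismatches types would use the type string returned by lookup.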
Preprocessor • Remove all comments • If a language is not case sensitive, the preprocessor may change the program text to all uppercase or all lowercase. • Process preprocessor directives. • C/C++ directives: • #include • #define (unlike C#’s #define, C/C++ can define a constant value) • #if / #else / #endif • #undef • #ifdef • #ifndef
skeletal source program → preprocessor → source program
#include
a.h:
#include "b.h"
#define MIN 10
…
int x;
if (x < MIN)
…
x = MAX;
b.h:
#define MAX 5
#ifdef
#ifndef A_H
#define A_H
…
#endif
#ifdef
#if DLEVEL == 0
#define STACK 0
#elif DLEVEL == 1
#define STACK 100
#elif DLEVEL > 5
display( debugptr );
#else
#define STACK 200
#endif
Standalone Preprocessor
input.cpp:
#define MAX 50 //this is a comment
void main()
{
    int x; //more comments
    x = MAX;
#define MIN 10
    int y;
    x = y - MIN; //blah
}
→ preprocessor (produces a modified source file) →
temp.cpp:
void main()
{
    int x;
    x = 50;
    int y;
    x = y - 10;
}
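The transformation from input.cpp to temp.cpp can be sketched as a single pass over the lines. This is a deliberately small illustration under stated assumptions, not the lab solution: the function name preprocess is hypothetical, only `//` comments and `#define NAME VALUE` are handled, and `//` inside string literals is not treated specially.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <sstream>
#include <string>

// Hypothetical sketch: handles only "//" comments and "#define NAME VALUE".
std::string preprocess(const std::string& source) {
    std::map<std::string, std::string> defines;  // the preprocessor's symbol table
    std::istringstream in(source);
    std::string line, out;
    while (std::getline(in, line)) {
        // Drop a "//" comment (string literals are not considered here).
        std::size_t comment = line.find("//");
        if (comment != std::string::npos) line.erase(comment);

        // Record a #define and emit nothing for that line.
        std::istringstream words(line);
        std::string first, name, value;
        words >> first;
        if (first == "#define") {
            words >> name >> value;
            defines[name] = value;  // a later #define replaces an earlier one
            continue;
        }

        // Copy the line, replacing each identifier that names a defined symbol.
        std::size_t i = 0;
        while (i < line.size()) {
            unsigned char c = static_cast<unsigned char>(line[i]);
            if (std::isalpha(c) || c == '_') {
                std::size_t start = i;
                while (i < line.size() &&
                       (std::isalnum(static_cast<unsigned char>(line[i])) ||
                        line[i] == '_'))
                    ++i;
                std::string id = line.substr(start, i - start);
                auto it = defines.find(id);
                out += (it != defines.end()) ? it->second : id;
            } else {
                out += line[i++];
            }
        }
        out += '\n';
    }
    return out;
}
```

Replacing whole identifiers rather than raw substrings matters: a name such as MAXIMUM must not be rewritten just because MAX is defined.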
Standalone Lexical Analyzer
Input:
void main()
{
    int x;
    x = 50;
    int y;
    x = y - 10;
}
→ Lexical Analyzer (produces a list of tokens) →
void keyword
main ID
( symbol
) symbol
{ symbol
int keyword
Preprocessor & Lexical Analyzer
Input:
#define MAX 50 //this is a comment
void main()
{
    int x; //more comments
    x = MAX;
#define MIN 10
    int y;
    x = y - MIN; //blah
}
→ both (produces a list of tokens) →
void keyword
main ID
( symbol
) symbol
{ symbol
int keyword
Output from Lab1
List of tokens → Print out of tokens:
void keyword
main ID
( symbol
) symbol
{ symbol
int keyword
…..
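A hand-written scanner that produces this kind of token list can be sketched as follows. The function name tokenize, the Token alias, the keyword set, and the "number" category are assumptions made for the example; the slide itself only shows the keyword, ID, and symbol categories.

```cpp
#include <cassert>
#include <cctype>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Each token is a (lexeme, category) pair, e.g. ("void", "keyword").
using Token = std::pair<std::string, std::string>;

// Hypothetical sketch: the keyword set and category names are assumptions.
std::vector<Token> tokenize(const std::string& source) {
    static const std::set<std::string> keywords = {"void", "int", "double", "string"};
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < source.size()) {
        unsigned char c = static_cast<unsigned char>(source[i]);
        if (std::isspace(c)) {
            ++i;                                   // skip whitespace
        } else if (std::isalpha(c) || c == '_') {  // keyword or identifier
            std::size_t start = i;
            while (i < source.size() &&
                   (std::isalnum(static_cast<unsigned char>(source[i])) ||
                    source[i] == '_'))
                ++i;
            std::string word = source.substr(start, i - start);
            tokens.emplace_back(word, keywords.count(word) ? "keyword" : "ID");
        } else if (std::isdigit(c)) {              // integer literal
            std::size_t start = i;
            while (i < source.size() &&
                   std::isdigit(static_cast<unsigned char>(source[i])))
                ++i;
            tokens.emplace_back(source.substr(start, i - start), "number");
        } else {                                   // any other single character
            tokens.emplace_back(std::string(1, source[i]), "symbol");
            ++i;
        }
    }
    return tokens;
}
```

Each branch corresponds to a small finite automaton (letters followed by letters or digits, digits followed by digits), which is the link back to regular expressions from the grammar review.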
Preprocessor • Preprocessor symbols • Defined by #define • #define MYHEADER_H • #define LARGEST 10 • Defined in the compilation process • Command Line (/D) • Preprocessor Definitions
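A symbol defined in the compilation process behaves exactly like one defined with #define in the source. The sketch below assumes the symbol names LARGEST and MYSYMBOL and the hypothetical functions largest and hasMySymbol; the command-line flag is /D with the Microsoft compiler and -D with GCC or Clang.

```cpp
#include <cassert>

// LARGEST may be supplied by the build, e.g. /DLARGEST=100 (MSVC)
// or -DLARGEST=100 (GCC/Clang); otherwise fall back to 10.
#ifndef LARGEST
#define LARGEST 10
#endif

// Returns true only when the build defined MYSYMBOL.
bool hasMySymbol() {
#ifdef MYSYMBOL
    return true;
#else
    return false;
#endif
}

int largest() { return LARGEST; }
```

Built with no extra definitions, largest() returns 10 and hasMySymbol() returns false; adding the flags to the command line changes both results without editing the file.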
In-Class Exercise #2
#include <iostream>
//comment
#define LARGEST 100
void main()
{
    int x, y;
    x = 10;
    y = LARGEST;
#ifdef MYSYMBOL
    cout << "X=" << x;
#endif
#if TEST == 1
    cout << "1" << endl;
#elif TEST == 2
    cout << "2" << endl;
#else
    cout << "Blah" << endl;
#endif
    cout << "The end" << endl;
}
• Show result of preprocessor • What’s left in the file? • What’s changed in the file?