CS 3240: Languages and Computation Course Overview Sasha Boldyreva
Personnel • Instructor: Alexandra (Sasha) Boldyreva • Email: aboldyre@cc.gatech.edu • Office: Klaus 3444 • Office Hours: • Tue. & Wed. at 2:00 to 3:00 pm • Or by appointment • TAs: TBA • Email: TBA • Office Hours: TBA
Required Textbooks Bundle ISBN# 1418879746, including • “Compiler Construction: Principles and Practice” by Kenneth C. Louden, Thomson Course Technology, 1997, ISBN 0534939724 • “Introduction to the Theory of Computation, Second Edition” by Michael Sipser, Thomson Course Technology, 2005, ISBN 0534950973
Course Objectives • Formal languages • Understand definitions of regular and context-free languages and their corresponding “machines” • Understand their computational powers and limitations • Compiler concepts • Understand their applications in compilers • Front-end of compiler • Lexical analysis, parsing, semantic analysis • Theory of computation • Understand Turing machines • Understand decidability
Course Syllabus • Lexical analysis, scanners, pattern matching • Regular expressions, DFAs, NFAs and automata • Limits on regular expressions, pumping lemma • Practical parsing, LL and LR parsing • Context-free languages, grammars, Chomsky hierarchy • Pushdown automata, deterministic vs. non-deterministic • Attribute grammars, type inference • Context-free vs. context-sensitive grammars • Decidable vs. undecidable problems, Turing machines, halting problem • Complexity of computation, classes of languages • P/NP, space and time completeness
Grading • Homework: 25% • Mini-project: 15% • Midterm: 30% • Final: 30% • Homework is to be submitted in class as hard copy • No late homework or assignments • Homework should be concise, complete, and precise • Tests will be in class
Class Policies • Students must write solutions to assignments completely independently • General discussions are allowed on assignments among students, but names of collaborators must be reported • Cell phones off, silence please
Resources • Class webpage: see T-Square • Check for schedule changes.
Compilers • What is a compiler? • A program that translates an executable program from source language into target language • Usually source language is high-level language, and target language is object (or machine) code • Related to interpreters • Why compilers? • Programming in machine (or assembly) language is tedious, error prone, and machine dependent • Historical note: In 1954, IBM started developing FORTRAN language and its compiler
Why study the theory of compilers? • Besides being required… • Prerequisite for developing advanced compilers, which continues to be an active area as new computer architectures emerge • Useful for developing software tools that parse computer code or strings • E.g., editors, debuggers, interpreters, preprocessors, … • Important to understand how compilers work in order to program more effectively
How Does a Compiler Work? • Front End: Analysis of program syntax and semantics • Diagram labels: Start, Parser, Scanner (Request Token / Get Token), Semantic Action, Semantic Error Checking, Intermediate Representation
Parts of Compilers • Analysis (Front End), the focus of this class: 1. Lexical Analysis 2. Syntax Analysis 3. Semantic Analysis • Synthesis (Back End): 4. Code Generation 5. Optimization
The Big Picture • Parsing: Translating code to rules of a grammar. Building a representation of the code. • Scanning: Converting input text into a stream of known objects called tokens. Simplifies the parsing process. • The grammar dictates the syntactic rules of the language, i.e., how a legal sentence can be formed • The lexical rules of the language dictate how a legal word is formed by concatenating characters of the alphabet.
Overall Operation • Parser is in control of the overall operation • Demands scanner to produce a token • Scanner reads input file into token buffer & forms a token (How?) • Token is returned to parser • Parser attempts to match the token (How?) • Failure: Syntax Error! • Success: • Does nothing and returns to get next token, or • Takes semantic action
Overall Operation • Semantic action: look up variable name • If found okay • If not: put in symbol table • If semantic checks succeed, do code-generation (How?) • Continue to get next token • No more tokens? Done!
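The semantic action on this slide (look the name up, insert it if it is missing) can be sketched in a few lines. This is only an illustration under assumptions: the symbol table is modeled as a plain dictionary, and the function name `semantic_action` and its `var_type` parameter are hypothetical, not taken from the course material.

```python
def semantic_action(symbol_table, name, var_type=None):
    """Look up a variable name; insert it if it is not there yet.

    Returns True if the name was already known ("found: okay"),
    False if it had to be put in the symbol table.
    """
    if name in symbol_table:
        return True
    symbol_table[name] = var_type  # not found: put in symbol table
    return False


symbols = {}
first = semantic_action(symbols, "a", "int")  # first sighting of a
second = semantic_action(symbols, "a")        # a is now already known
```

After these two calls `first` is `False`, `second` is `True`, and the table maps `a` to `int`.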
Scanning/Tokenization • Diagram: Input File and Token Buffer, connected in both directions • What does the Token Buffer contain? • The token being identified • Why a two-way street? • Characters can be read • and unread • Unreading is needed to detect the termination of a token
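Example: scanning main() • Token buffer after each character read: m, ma, mai, main, main( • The frames show the buffer with the newest character leftmost: m, am, iam, niam, (niam • ( cannot extend the token, so it is unread and the buffer goes back to main • Token recognized: Keyword: main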
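The read/unread behavior in this example might look roughly like the sketch below. It is a simplified illustration, not the scanner used in the course: the class name `Scanner`, its methods, the keyword set, and the token kinds `KEYWORD`, `VAR`, `SYMBOL`, and `EOF` are all assumptions made for the example.

```python
class Scanner:
    """Character-level scanner with a token buffer and one-character unread."""

    KEYWORDS = {"main", "int", "float"}  # assumed keyword set

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def read(self):
        """Return the next input character, or '' at end of input."""
        ch = self.text[self.pos:self.pos + 1]
        self.pos += 1
        return ch

    def unread(self):
        """Push the most recently read character back onto the input."""
        self.pos -= 1

    def next_token(self):
        """Form one token and return it as a (kind, text) pair."""
        ch = self.read()
        while ch.isspace():
            ch = self.read()
        if ch == "":
            return ("EOF", "")
        if ch.isalpha():
            buffer = ch
            ch = self.read()
            while ch.isalnum():
                buffer += ch
                ch = self.read()
            # The character that ended the token belongs to the next one.
            self.unread()
            kind = "KEYWORD" if buffer in self.KEYWORDS else "VAR"
            return (kind, buffer)
        return ("SYMBOL", ch)


scanner = Scanner("main()")
tokens = [scanner.next_token() for _ in range(4)]
```

On the input `main()` the scanner reads `m`, `a`, `i`, `n`, then `(`, sees that `(` cannot extend the identifier, unreads it, and returns the keyword `main`; the `(` is then delivered as the next token.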
Parser • Translating code to rules of a grammar • Control the overall operation • Demands scanner to produce a token • Failure: Syntax Error! • Success: • Does nothing and returns to get next token, or • Takes semantic action
Grammar Rules
<C-PROG> → MAIN OPENPAR <PARAMS> CLOSEPAR <MAIN-BODY>
<PARAMS> → NULL
<PARAMS> → VAR <VAR-LIST>
<VAR-LIST> → , VAR <VAR-LIST>
<VAR-LIST> → NULL
<MAIN-BODY> → CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE
<DECL-STMT> → <TYPE> VAR <VAR-LIST> ;
<ASSIGN-STMT> → VAR = <EXPR> ;
<EXPR> → VAR
<EXPR> → VAR <OP> <EXPR>
<OP> → +
<OP> → -
<TYPE> → INT
<TYPE> → FLOAT
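One way to make these rules concrete is to hold them in a data structure and check that every nonterminal used on a right-hand side has a rule of its own. The sketch below is an illustration under assumptions: it uses the single spelling `<VAR-LIST>` throughout, represents NULL as an empty right-hand side, and the names `GRAMMAR`, `is_nonterminal`, and `undefined_nonterminals` are hypothetical.

```python
# Each nonterminal maps to its list of alternative right-hand sides.
# An empty list stands for NULL (the empty string).
GRAMMAR = {
    "<C-PROG>": [["MAIN", "OPENPAR", "<PARAMS>", "CLOSEPAR", "<MAIN-BODY>"]],
    "<PARAMS>": [[], ["VAR", "<VAR-LIST>"]],
    "<VAR-LIST>": [[",", "VAR", "<VAR-LIST>"], []],
    "<MAIN-BODY>": [["CURLYOPEN", "<DECL-STMT>", "<ASSIGN-STMT>", "CURLYCLOSE"]],
    "<DECL-STMT>": [["<TYPE>", "VAR", "<VAR-LIST>", ";"]],
    "<ASSIGN-STMT>": [["VAR", "=", "<EXPR>", ";"]],
    "<EXPR>": [["VAR"], ["VAR", "<OP>", "<EXPR>"]],
    "<OP>": [["+"], ["-"]],
    "<TYPE>": [["INT"], ["FLOAT"]],
}


def is_nonterminal(symbol):
    """Nonterminals are written in angle brackets, as on the slide."""
    return symbol.startswith("<") and symbol.endswith(">")


def undefined_nonterminals(grammar):
    """Return nonterminals that appear in a rule body but have no rule."""
    used = {
        symbol
        for alternatives in grammar.values()
        for rhs in alternatives
        for symbol in rhs
        if is_nonterminal(symbol)
    }
    return sorted(used - set(grammar))


missing = undefined_nonterminals(GRAMMAR)
```

With a consistent spelling, `missing` is empty. If the rules for `<VAR-LIST>` were instead filed under `<VARLIST>`, the check would report `<VAR-LIST>` as undefined, which is why the spelling matters.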
Demo: scanning main() { int a,b; a = b; } • Parser to scanner: "Please, get me the next token" • Scanner reads characters into the token buffer: m, ma, mai, main, main( • ( is unread, leaving main in the token buffer • Scanner returns Token: main • Parser: "I recognize this"
Parsing (Matching) • Start matching using a rule • When a match takes place at a certain position, move further (get next token & repeat) • If expansion needs to be done, choose the appropriate rule (How to decide which rule to choose?) • If no rule is found, declare an error • If several rules apply, the grammar (set of rules) is ambiguous or needs more lookahead to choose among them
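One common answer to "how to decide which rule to choose" is to look at the next token and consult a table indexed by (nonterminal, token). The sketch below shows that idea for the declaration part of the grammar only. It is an assumption-laden illustration, not the course's parser: the table `TABLE`, the function `parse`, and the exception `ParseError` are hypothetical names, and the table entries were worked out by hand for this fragment.

```python
class ParseError(Exception):
    """Raised when no rule applies or a terminal does not match."""


# (nonterminal, lookahead token) -> right-hand side to expand.
# An empty list stands for NULL.
TABLE = {
    ("<DECL-STMT>", "INT"): ["<TYPE>", "VAR", "<VAR-LIST>", ";"],
    ("<DECL-STMT>", "FLOAT"): ["<TYPE>", "VAR", "<VAR-LIST>", ";"],
    ("<TYPE>", "INT"): ["INT"],
    ("<TYPE>", "FLOAT"): ["FLOAT"],
    ("<VAR-LIST>", ","): [",", "VAR", "<VAR-LIST>"],
    ("<VAR-LIST>", ";"): [],
}


def parse(tokens, start="<DECL-STMT>"):
    """Match tokens against the grammar; return the rules applied, in order."""
    stack = [start]
    pos = 0
    applied = []
    while stack:
        top = stack.pop()
        lookahead = tokens[pos] if pos < len(tokens) else "EOF"
        if top.startswith("<"):
            # Expansion: choose the rule from the lookahead token.
            rule = TABLE.get((top, lookahead))
            if rule is None:
                raise ParseError(f"no rule for {top} on {lookahead}")
            applied.append((top, rule))
            stack.extend(reversed(rule))
        elif top == lookahead:
            pos += 1  # match: move on to the next token
        else:
            raise ParseError(f"expected {top}, got {lookahead}")
    if pos != len(tokens):
        raise ParseError("unexpected trailing input")
    return applied


steps = parse(["INT", "VAR", ",", "VAR", ";"])  # tokens of: int a,b;
```

For `int a,b;` the parser expands `<DECL-STMT>`, then `<TYPE>`, then `<VAR-LIST>` twice (once for the comma, once for NULL when it sees `;`). A missing table entry corresponds to the "no rule found" error case; two candidate entries for the same cell would be the "several rules" case.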
Scanning & Parsing Combined
Input: main() { int a,b; a = b; }
Before each token, the parser asks the scanner: "Please, get me the next token"
• Token: MAIN. Parser starts with <C-PROG> → MAIN OPENPAR <PARAMS> CLOSEPAR <MAIN-BODY> and matches MAIN
• Token: OPENPAR. Matches OPENPAR in the <C-PROG> rule
• Token: CLOSEPAR. Parser expands <PARAMS> → NULL, then matches CLOSEPAR in the <C-PROG> rule
• Token: CURLYOPEN. Parser expands <MAIN-BODY> → CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE and matches CURLYOPEN
• Token: INT. Parser expands <DECL-STMT> → <TYPE> VAR <VAR-LIST> ; and <TYPE> → INT, then matches INT
• Token: VAR. Matches VAR in the <DECL-STMT> rule; candidate rules for what follows: <VAR-LIST> → , VAR <VAR-LIST> and <VAR-LIST> → NULL
• Token: ',' [COMMA]. Parser chooses <VAR-LIST> → , VAR <VAR-LIST> and matches the comma
• Token: VAR. Matches VAR in the <VAR-LIST> rule
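The parser-driven walkthrough can be sketched end to end as a scanner that yields tokens only on request and a recursive-descent parser that pulls them one at a time. This is a rough sketch under assumptions, not the course's implementation: the scanner here uses regular expressions instead of a character buffer, the recursive rules in `<VAR-LIST>` and `<EXPR>` are written as loops, and every name (`TOKEN_SPEC`, `scan`, `Parser`, `ParseError`, and the token kinds `COMMA`, `SEMI`, `ASSIGN`, `OP`) is hypothetical.

```python
import re


class ParseError(Exception):
    """Raised on a syntax error."""


# Keywords come before VAR so that main/int/float are not scanned as names.
TOKEN_SPEC = [
    ("MAIN", r"main\b"), ("INT", r"int\b"), ("FLOAT", r"float\b"),
    ("VAR", r"[A-Za-z_]\w*"),
    ("OPENPAR", r"\("), ("CLOSEPAR", r"\)"),
    ("CURLYOPEN", r"\{"), ("CURLYCLOSE", r"\}"),
    ("COMMA", r","), ("SEMI", r";"), ("ASSIGN", r"="), ("OP", r"[+-]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))


def scan(text):
    """Yield token kinds lazily, one per request from the parser."""
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ParseError(f"unexpected character {text[pos]!r}")
        pos = match.end()
        if match.lastgroup != "SKIP":
            yield match.lastgroup
    yield "EOF"


class Parser:
    def __init__(self, text):
        self.tokens = scan(text)
        self.trace = []  # every token handed over by the scanner
        self.advance()

    def advance(self):
        """Ask the scanner for the next token."""
        self.current = next(self.tokens)
        self.trace.append(self.current)

    def expect(self, kind):
        if self.current != kind:
            raise ParseError(f"expected {kind}, got {self.current}")
        if kind != "EOF":
            self.advance()

    def c_prog(self):  # <C-PROG>
        self.expect("MAIN")
        self.expect("OPENPAR")
        self.params()
        self.expect("CLOSEPAR")
        self.main_body()
        self.expect("EOF")

    def params(self):  # <PARAMS>: NULL unless the next token is VAR
        if self.current == "VAR":
            self.advance()
            self.var_list()

    def var_list(self):  # <VAR-LIST>: zero or more ", VAR"
        while self.current == "COMMA":
            self.advance()
            self.expect("VAR")

    def main_body(self):  # <MAIN-BODY>
        self.expect("CURLYOPEN")
        self.decl_stmt()
        self.assign_stmt()
        self.expect("CURLYCLOSE")

    def decl_stmt(self):  # <DECL-STMT>
        if self.current not in ("INT", "FLOAT"):
            raise ParseError(f"expected a type, got {self.current}")
        self.advance()
        self.expect("VAR")
        self.var_list()
        self.expect("SEMI")

    def assign_stmt(self):  # <ASSIGN-STMT>
        self.expect("VAR")
        self.expect("ASSIGN")
        self.expr()
        self.expect("SEMI")

    def expr(self):  # <EXPR>: VAR followed by zero or more "<OP> VAR"
        self.expect("VAR")
        while self.current == "OP":
            self.advance()
            self.expect("VAR")


parser = Parser("main() { int a,b; a = b; }")
parser.c_prog()
```

After a successful parse, `parser.trace` holds the tokens in the order the parser requested them, starting with `MAIN`, `OPENPAR`, `CLOSEPAR`, `CURLYOPEN`, `INT`, `VAR`, `COMMA`, `VAR`, which mirrors the steps of the walkthrough.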