CSC 415: Translators and Compilers Spring 2009. Dr. Chuck Lillie. Course Overview. Translators and Compilers Textbook Programming Language Processors in Java, Authors: David A. Watt & Deryck F. Brown, 2000, Prentice Hall Syllabus http://www.uncp.edu/home/lilliec Homework & Project
CSC 415: Translators and Compilers Spring 2009 Dr. Chuck Lillie
Course Overview • Translators and Compilers • Textbook • Programming Language Processors in Java, Authors: David A. Watt & Deryck F. Brown, 2000, Prentice Hall • Syllabus • http://www.uncp.edu/home/lilliec • Homework & Project • First half of semester • Problems • Second half of semester • Modify Triangle compiler
Course Outline • Major Programming Project • Project Definition and Planning • Implementation • Weekly Status Reports • Project Presentation • Translators and Compilers • Language Processors • Compilation • Syntactic Analysis • Contextual Analysis • Run-Time Organization • Code Generation • Interpretation
Project • Modify a compiler for the programming language Triangle • Appendix B: Informal Specification of the Programming Language Triangle • Appendix D: Class Diagrams for the Triangle Compiler • Present Project Plan • What and How • Weekly Status Reports • Work accomplished during the reporting period • Deliverable progress, as a percentage of completion • Problem areas • Planned activities for the next reporting period
CSC 415: Translators and Compilers Spring 2009 Chapter 1 Introduction to Programming Languages
Chapter 1: Introduction to Programming Languages • Programming Language: A formal notation for expressing algorithms. • Programming Language Processors: Tools to enter, edit, translate, and interpret programs on machines. • Machine Code: Basic machine instructions • Keep track of exact address of each data item and each instruction • Encode each instruction as a bit string • Assembly Language: Symbolic names for operations, registers, and addresses.
Programming Languages • High Level Languages: Notation similar to familiar mathematical notation • Expressions: +, -, *, / • Data Types: truth values, characters, integers, records, arrays • Control Structures: if, case, while, for • Declarations: constant values, variables, procedures, functions, types • Abstraction: separates what is to be performed from how it is to be performed • Encapsulation (or data abstraction): group together related declarations and selectively hide some
Programming Language Processors • Any system that manipulates programs expressed in some particular programming language • Editors: enter, modify, and save program text • Translators and Compilers: A translator translates text from one language to another. A compiler translates a program from a high-level language to a low-level language, preparing it to be run on a machine • Checks program for syntactic and contextual errors • Interpreters: Run a program without compiling it first • Command languages • Database query languages
Programming Languages Specifications • Syntax • Form of the program • Defines symbols • How phrases are composed • Contextual constraints • Scope: determine scope of each declaration • Type: ensures each operation is supplied with operands of the correct type • Semantics • Meaning of the program – behavior when run on a machine
Representation • Syntax • Backus-Naur Form (BNF): context-free grammar • Terminal symbols (>=, while, ;) • Non-terminal symbols (Program, Command, Expression, Declaration) • Start symbol (Program) • Production rules (defines how phrases are composed from terminals and sub-phrases) • N::=a|b|…. • Syntax Tree • Used to define language in terms of strings and terminal symbols
Representation • Semantics • Abstract Syntax • Concentrate on phrase structure alone • Abstract Syntax Tree
Contextual Constraints • Scope • Binding • Static: determined by language processor • Dynamic: determined at run-time • Type • Statically typed: language processor can detect all type errors • Dynamically typed: type errors cannot be detected until run-time • This course will assume static binding and static typing
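To make the idea of a static type check concrete, here is a minimal sketch in Java of how a language processor might reject an ill-typed operator application before the program ever runs. The class name `StaticTypeCheck`, the `Type` enum, and the operator typing rules (arithmetic and comparison on integers, `=` on operands of equal type) are assumptions made for illustration, not code from the Triangle compiler.

```java
// Hypothetical sketch of a static type check for binary operators.
class StaticTypeCheck {
    enum Type { INT, BOOL }

    // Returns the result type of "left op right", or throws on a type error.
    // Assumed rules: + - * / and < > take integers; = takes two operands of the same type.
    static Type checkBinary(Type left, String op, Type right) {
        switch (op) {
            case "+": case "-": case "*": case "/":
                require(left == Type.INT && right == Type.INT, op);
                return Type.INT;
            case "<": case ">":
                require(left == Type.INT && right == Type.INT, op);
                return Type.BOOL;
            case "=":
                require(left == right, op);
                return Type.BOOL;
            default:
                throw new IllegalArgumentException("unknown operator " + op);
        }
    }

    // Reports a type error found at compile time, before any code is run.
    private static void require(boolean ok, String op) {
        if (!ok) throw new IllegalStateException("type error at operator " + op);
    }
}
```

In a dynamically typed language the same check would have to be performed each time the operator is applied at run-time.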
Semantics • Concerned with meaning of program • Behavior when run • Usually specified informally • Declarative sentences • Could include side effects • Correspond to production rules
Structure of a Compiler • Source code → Lexical Analyzer → tokens → Parser & Semantic Analyzer → parse tree → Intermediate Code Generation → intermediate representation → Optimization → intermediate representation → Assembly Code Generation → Assembly code • Symbol Table
Mini-Triangle Syntax
Program            ::= single-Command
Command            ::= single-Command
                     | Command ; single-Command
single-Command     ::= V-name := Expression
                     | Identifier ( Expression )
                     | if Expression then single-Command else single-Command
                     | while Expression do single-Command
                     | let Declaration in single-Command
                     | begin Command end
Expression         ::= primary-Expression
                     | Expression Operator primary-Expression
primary-Expression ::= Integer-Literal
                     | V-name
                     | Operator primary-Expression
                     | ( Expression )
V-name             ::= Identifier
Declaration        ::= single-Declaration
                     | Declaration ; single-Declaration
single-Declaration ::= const Identifier ~ Expression
                     | var Identifier : Type-denoter
Type-denoter       ::= Identifier
Operator           ::= + | - | * | / | < | > | = | \
Identifier         ::= Letter | Identifier Letter | Identifier Digit
Integer-Literal    ::= Digit | Integer-Literal Digit
Comment            ::= ! Graphic* eol
Digit              ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Letter             ::= a | b | c | d | … | z | A | B | C | … | Z
Syntax Tree – let var y: Integer in y := y + 1
Program
  single-Command
    let
    Declaration
      single-Declaration
        var
        Identifier: y
        :
        Type-denoter
          Identifier: Integer
    in
    single-Command
      V-name
        Identifier: y
      :=
      Expression
        Expression
          primary-Expression
            V-name
              Identifier: y
        Operator: +
        primary-Expression
          Integer-Literal: 1
Mini-Triangle Program
! This is a comment. It continues to the end-of-line
let
  const m ~ 7;
  var n: Integer
in
  begin
    n := 2 * m * m;
    putint (n)
  end
Mini-Triangle Terminal Symbols begin const do else end if in let then var while ; : := ~ ( ) + - * / < > = \
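As an illustration of how these terminal symbols might be picked out of program text, the following Java sketch splits a Mini-Triangle source string into token spellings. The class `MiniTriangleScanner` and its `scan` method are hypothetical names for this example and are not the scanner supplied with the Triangle compiler; reserved words and identifiers are both returned as plain spellings, without being classified.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a scanner for Mini-Triangle token spellings.
class MiniTriangleScanner {
    private static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Splits source text into token spellings, skipping blanks and comments.
    static List<String> scan(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '!') {
                // Comment ::= ! Graphic* eol
                while (i < src.length() && src.charAt(i) != '\n') i++;
            } else if (isLetter(c)) {
                // Identifier or reserved word: a letter followed by letters or digits.
                int start = i;
                while (i < src.length() && (isLetter(src.charAt(i)) || isDigit(src.charAt(i)))) i++;
                tokens.add(src.substring(start, i));
            } else if (isDigit(c)) {
                // Integer-Literal: one or more digits.
                int start = i;
                while (i < src.length() && isDigit(src.charAt(i))) i++;
                tokens.add(src.substring(start, i));
            } else if (c == ':' && i + 1 < src.length() && src.charAt(i + 1) == '=') {
                tokens.add(":=");
                i += 2;
            } else if (";:~()+-*/<>=\\".indexOf(c) >= 0) {
                tokens.add(String.valueOf(c));
                i++;
            } else {
                throw new IllegalArgumentException("Unexpected character: " + c);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String program = "! demo\nlet const m ~ 7; var n: Integer\n"
                       + "in begin n := 2 * m * m; putint (n) end";
        System.out.println(scan(program));
    }
}
```

The one point that needs lookahead is the colon: the scanner must check the next character to tell `:` from `:=`.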
Mini-Triangle Non-Terminals Program (start symbol) Command single-Command Expression primary-Expression V-name Declaration single-Declaration Type-denoter Operator Identifier Integer-Literal
Mini-Triangle Production Rules
Program            ::= single-Command
Command            ::= single-Command
                     | Command ; single-Command
single-Command     ::= V-name := Expression
                     | Identifier ( Expression )
                     | if Expression then single-Command else single-Command
                     | while Expression do single-Command
                     | let Declaration in single-Command
                     | begin Command end
Expression         ::= primary-Expression
                     | Expression Operator primary-Expression
primary-Expression ::= Integer-Literal
                     | V-name
                     | Operator primary-Expression
                     | ( Expression )
V-name             ::= Identifier
Declaration        ::= single-Declaration
                     | Declaration ; single-Declaration
single-Declaration ::= const Identifier ~ Expression
                     | var Identifier : Type-denoter
Type-denoter       ::= Identifier
Operator           ::= + | - | * | / | < | > | = | \
Identifier         ::= Letter | Identifier Letter | Identifier Digit
Integer-Literal    ::= Digit | Integer-Literal Digit
Comment            ::= ! Graphic* eol
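The Expression rule is left-recursive (Expression appears first on its own right-hand side), so a hand-written top-down parser cannot follow it literally. A common workaround is to read it as primary-Expression (Operator primary-Expression)*, which accepts the same strings. The Java sketch below recognizes expressions that way from an array of token spellings; `ExpressionRecognizer` and `accepts` are hypothetical names for this example, and it only accepts or rejects input without building a tree.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: recognizes Mini-Triangle expressions from token spellings.
class ExpressionRecognizer {
    private static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
        "begin", "const", "do", "else", "end", "if", "in", "let", "then", "var", "while"));

    private final String[] tokens;
    private int pos = 0;

    ExpressionRecognizer(String... tokens) { this.tokens = tokens; }

    // Current token, or a sentinel once the input is exhausted.
    private String peek() { return pos < tokens.length ? tokens[pos] : "<eot>"; }

    private static boolean isOperator(String t) {
        return t.length() == 1 && "+-*/<>=\\".contains(t);
    }

    // Expression ::= primary-Expression (Operator primary-Expression)*
    // The left-recursive rule rewritten as iteration.
    private void parseExpression() {
        parsePrimary();
        while (isOperator(peek())) {
            pos++;
            parsePrimary();
        }
    }

    // primary-Expression ::= Integer-Literal | V-name
    //                      | Operator primary-Expression | ( Expression )
    private void parsePrimary() {
        String t = peek();
        boolean isLiteral = t.matches("[0-9]+");
        boolean isVname = t.matches("[a-zA-Z][a-zA-Z0-9]*") && !KEYWORDS.contains(t);
        if (isLiteral || isVname) {
            pos++;
        } else if (isOperator(t)) {
            pos++;
            parsePrimary();
        } else if (t.equals("(")) {
            pos++;
            parseExpression();
            if (!peek().equals(")")) throw new IllegalStateException("expected ) but found " + peek());
            pos++;
        } else {
            throw new IllegalStateException("unexpected token " + t);
        }
    }

    // Returns true when the whole token array forms exactly one Expression.
    static boolean accepts(String... tokens) {
        ExpressionRecognizer r = new ExpressionRecognizer(tokens);
        try {
            r.parseExpression();
            return r.pos == r.tokens.length;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

For example, `ExpressionRecognizer.accepts("2", "*", "(", "m", "+", "1", ")")` follows the bracketed alternative of primary-Expression, while `accepts("y", "+")` is rejected because no primary-Expression follows the operator.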
Mini-Triangle Abstract Syntax (each production alternative is followed by its label)
Program      ::= Command                                    Program
Command      ::= V-name := Expression                       AssignCommand
               | Identifier ( Expression )                  CallCommand
               | Command ; Command                          SequentialCommand
               | if Expression then Command else Command    IfCommand
               | while Expression do Command                WhileCommand
               | let Declaration in Command                 LetCommand
Expression   ::= Integer-Literal                            IntegerExpression
               | V-name                                     VnameExpression
               | Operator Expression                        UnaryExpression
               | Expression Operator Expression             BinaryExpression
V-name       ::= Identifier                                 SimpleVname
Declaration  ::= const Identifier ~ Expression              ConstDeclaration
               | var Identifier : Type-denoter              VarDeclaration
               | Declaration ; Declaration                  SequentialDeclaration
Type-denoter ::= Identifier                                 SimpleTypeDenoter
Abstract Syntax Tree – let var y: Integer in y := y + 1
Program
  LetCommand
    VarDeclaration
      Identifier: y
      SimpleTypeDenoter
        Identifier: Integer
    AssignCommand
      SimpleVname
        Identifier: y
      BinaryExpression
        VnameExpression
          SimpleVname
            Identifier: y
        Operator: +
        IntegerExpression
          Integer-Literal: 1
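One way to represent such a tree in Java is to give each abstract-syntax label its own class. The sketch below does this for just the labels needed by the example and builds the tree for let var y: Integer in y := y + 1. The wrapper class `MiniTriangleAst`, the field names, and the `toString` format are assumptions for illustration; the class hierarchy of the actual Triangle compiler is documented in Appendix D of the textbook and may differ. For brevity the type denoter is stored as a plain string rather than as a SimpleTypeDenoter node.

```java
// Hypothetical sketch: one class per abstract-syntax label (only those the example needs).
class MiniTriangleAst {
    abstract static class Node {}
    abstract static class Command extends Node {}
    abstract static class Expression extends Node {}
    abstract static class Declaration extends Node {}

    static class SimpleVname extends Node {
        final String id;
        SimpleVname(String id) { this.id = id; }
        @Override public String toString() { return "SimpleVname(" + id + ")"; }
    }

    static class IntegerExpression extends Expression {
        final String literal;
        IntegerExpression(String literal) { this.literal = literal; }
        @Override public String toString() { return "IntegerExpression(" + literal + ")"; }
    }

    static class VnameExpression extends Expression {
        final SimpleVname vname;
        VnameExpression(SimpleVname vname) { this.vname = vname; }
        @Override public String toString() { return "VnameExpression(" + vname + ")"; }
    }

    static class BinaryExpression extends Expression {
        final Expression left; final String op; final Expression right;
        BinaryExpression(Expression left, String op, Expression right) {
            this.left = left; this.op = op; this.right = right;
        }
        @Override public String toString() {
            return "BinaryExpression(" + left + " " + op + " " + right + ")";
        }
    }

    static class VarDeclaration extends Declaration {
        final String id; final String type;  // type: spelling of the type denoter's identifier
        VarDeclaration(String id, String type) { this.id = id; this.type = type; }
        @Override public String toString() { return "VarDeclaration(" + id + ": " + type + ")"; }
    }

    static class AssignCommand extends Command {
        final SimpleVname vname; final Expression expr;
        AssignCommand(SimpleVname vname, Expression expr) { this.vname = vname; this.expr = expr; }
        @Override public String toString() { return "AssignCommand(" + vname + " := " + expr + ")"; }
    }

    static class LetCommand extends Command {
        final Declaration decl; final Command body;
        LetCommand(Declaration decl, Command body) { this.decl = decl; this.body = body; }
        @Override public String toString() { return "LetCommand(" + decl + " in " + body + ")"; }
    }

    // Builds the tree for: let var y: Integer in y := y + 1
    static Command example() {
        return new LetCommand(
            new VarDeclaration("y", "Integer"),
            new AssignCommand(new SimpleVname("y"),
                new BinaryExpression(new VnameExpression(new SimpleVname("y")),
                                     "+", new IntegerExpression("1"))));
    }

    public static void main(String[] args) {
        System.out.println(example());
    }
}
```

Unlike the concrete syntax tree, this structure has no nodes for let, in, := or the single-Command and primary-Expression layers; only the phrase structure remains.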
Semantics • Concerned with the meaning of the program • Its behavior when run • Specifying semantics • Specify in general terms what will be the semantics of each class of phrase in the language • Semantics of commands, expressions, and declarations • A command is executed to update variables • May also have side effect of performing input-output • An expression is evaluated to yield a value • May also have side effect of updating variables • A declaration is elaborated to produce bindings • May also have the side effect of allocating and initializing variables • Specify the semantics of each specific form of command, expression, declaration, and so on • One clause for each form of phrase
Mini-Triangle Semantics • A command C is executed in order to update variables (this includes input and output) • The assignment command V := E is executed as follows. The expression E is evaluated to yield a value v; then v is assigned to the value-or-variable-name V. • The call-command I (E) is executed as follows. The expression E is evaluated to yield a value v; then the procedure bound to I is called with v as its argument. • The sequence command C1 ; C2 is executed as follows. First C1 is executed; then C2 is executed.
Mini-Triangle Semantics (cont) • A command C is executed in order to update variables (this includes input and output) cont… • The if-command if E then C1 else C2 is executed as follows. The expression E is evaluated to yield a truth-value t; If t is true, C1 is executed; if t is false, C2 is executed. • The while-command while E do C is executed as follows. The expression E is evaluated to yield a truth-value t; if t is true, C is executed, and then the while-command is executed again; if t is false, execution of the while-command is completed.
Mini-Triangle Semantics (cont) • A command C is executed in order to update variables (this includes input and output) cont… • The let-command let D in C is executed as follows. The declaration D is elaborated to produce bindings b; C is executed, in the environment of the let-command overlaid by the bindings b. The bindings b have no effect outside the let-command.
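The clauses above can be read almost directly as an interpreter. The following Java sketch gives one method per form of command, with an environment chain standing in for "the environment of the let-command overlaid by the bindings b". Everything here is an illustrative assumption rather than code from the textbook: the names `MiniTriangleInterpreter`, `Env`, `Proc` and the builder methods, the choice to initialize variables to 0, and the fact that assignment to a constant is not rejected (that would be the job of contextual analysis).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: Mini-Triangle semantics as a small tree-less interpreter.
class MiniTriangleInterpreter {
    // Bindings of one let-command, overlaid on the enclosing environment.
    static class Env {
        final Map<String, Object> vars = new HashMap<>();
        final Env outer;
        Env(Env outer) { this.outer = outer; }
        Object lookup(String id) {
            for (Env e = this; e != null; e = e.outer)
                if (e.vars.containsKey(id)) return e.vars.get(id);
            throw new IllegalStateException("unbound: " + id);
        }
        void assign(String id, Object v) {
            for (Env e = this; e != null; e = e.outer)
                if (e.vars.containsKey(id)) { e.vars.put(id, v); return; }
            throw new IllegalStateException("unbound: " + id);
        }
    }

    interface Expr { Object eval(Env env); }      // evaluated to yield a value
    interface Cmd { void exec(Env env); }         // executed to update variables
    interface Decl { void elaborate(Env env); }   // elaborated to produce bindings
    interface Proc { void call(Object arg); }     // a procedure bound to an identifier

    static Expr lit(int n) { return env -> n; }
    static Expr name(String id) { return env -> env.lookup(id); }
    static Expr binary(Expr l, String op, Expr r) {
        return env -> {
            int a = (Integer) l.eval(env);
            int b = (Integer) r.eval(env);
            switch (op) {
                case "+": return a + b;
                case "-": return a - b;
                case "*": return a * b;
                case "/": return a / b;
                case "<": return a < b;
                case ">": return a > b;
                case "=": return a == b;
                default: throw new IllegalArgumentException("unknown operator " + op);
            }
        };
    }

    static Cmd assign(String v, Expr e) { return env -> env.assign(v, e.eval(env)); }
    static Cmd call(String id, Expr e) { return env -> ((Proc) env.lookup(id)).call(e.eval(env)); }
    static Cmd seq(Cmd c1, Cmd c2) { return env -> { c1.exec(env); c2.exec(env); }; }
    static Cmd ifCmd(Expr e, Cmd c1, Cmd c2) {
        return env -> { if ((Boolean) e.eval(env)) c1.exec(env); else c2.exec(env); };
    }
    static Cmd whileCmd(Expr e, Cmd c) {
        return env -> { while ((Boolean) e.eval(env)) c.exec(env); };
    }
    // The new bindings live in "inner" only, so they vanish when the let-command ends.
    static Cmd let(Decl d, Cmd c) {
        return env -> { Env inner = new Env(env); d.elaborate(inner); c.exec(inner); };
    }

    static Decl constDecl(String id, Expr e) { return env -> env.vars.put(id, e.eval(env)); }
    static Decl varDecl(String id) { return env -> env.vars.put(id, 0); }  // assumed initial value
    static Decl seqDecl(Decl d1, Decl d2) { return env -> { d1.elaborate(env); d2.elaborate(env); }; }

    // Runs a program in an environment where putint appends to the returned list.
    static List<Object> run(Cmd program) {
        List<Object> out = new ArrayList<>();
        Env std = new Env(null);
        std.vars.put("putint", (Proc) out::add);
        program.exec(std);
        return out;
    }

    // let const m ~ 7; var n: Integer in begin n := 2 * m * m; putint (n) end
    static Cmd demoProgram() {
        return let(seqDecl(constDecl("m", lit(7)), varDecl("n")),
                   seq(assign("n", binary(binary(lit(2), "*", name("m")), "*", name("m"))),
                       call("putint", name("n"))));
    }

    public static void main(String[] args) {
        System.out.println(run(demoProgram()));  // prints [98]
    }
}
```

Because `let` creates a fresh `Env` whose `outer` is the current one, an assignment inside the let-command still reaches variables declared further out, while identifiers declared by the let-command itself cannot be looked up once it finishes.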