The Model of Compilation Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University
Outline • Overview. • Front-End • Lexical Analysis. • Syntactic Analysis. • Semantic Analysis. • Back-End • Code Generation. • Code Optimization.
Overview • Translate a “source” program (in language S) into an “equivalent” “object” program (in language O). • Diagram: source program (S) → Translator → object program (O), with error messages as a second output.
The Model of Compilation • Analysis (Front-End): source → Intermediate Representation. • Synthesis (Back-End): Intermediate Representation → object. • Reduce Complexity • Source/Target Independent • Plug-able Compiler • IR: contains sufficient information • tree-like structure: the “syntax tree”, or • assembly-like format: “three-address code”.
Front-End • Lexical Analysis • group the input stream into tokens • Syntactic Analysis • see if the source is “valid” or “correct” • Contextual/Semantic Analysis • make sure the program is “meaningful” or semantically correct.
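Front-End Components • Scanner: groups the source program (text stream) into tokens, e.g. the characters m a i n ( ) { yield identifier main, then symbol (. • Parser: requests the next token from the scanner and constructs the parse tree. • Semantic Analyzer: checks semantic/contextual constraints on the parse tree and produces the Intermediate Representation (file or in memory). • Symbol Table: shown alongside the three components.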
Lexical Analysis • Scanner. • Group the input stream into tokens • identifiers. • numbers. • keywords. • symbols & signs. • Lexeme: Character sequence forming a token. • Eliminate all blanks and comments.
Example: Tokens • position := initial + rate * 60 • 1. identifier position • 2. assignment symbol := • 3. identifier initial • 4. plus symbol + • 5. identifier rate • 6. multiplication symbol * • 7. integer-literal 60
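The token list above can be reproduced with a small regular-expression scanner. This is a minimal sketch, not the course's scanner: the token names are taken from the slide, while `TOKEN_SPEC`, `MASTER`, and `scan` are hypothetical names, and only the symbols that occur in the example are recognized.

```python
import re

# Token classes from the slide; the patterns themselves are assumptions.
TOKEN_SPEC = [
    ("integer-literal", r"\d+"),
    ("identifier", r"[A-Za-z][A-Za-z0-9]*"),
    ("assignment symbol", r":="),
    ("plus symbol", r"\+"),
    ("multiplication symbol", r"\*"),
    ("blank", r"\s+"),
]

# One alternative per token class, each in its own named group g0, g1, ...
MASTER = re.compile(
    "|".join(f"(?P<g{i}>{pat})" for i, (_, pat) in enumerate(TOKEN_SPEC))
)


def scan(text):
    """Group the input stream into (kind, lexeme) pairs, dropping blanks."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        kind = TOKEN_SPEC[int(m.lastgroup[1:])][0]
        if kind != "blank":
            tokens.append((kind, m.group()))
        pos = m.end()
    return tokens


for number, (kind, lexeme) in enumerate(scan("position := initial + rate * 60"), 1):
    print(number, kind, lexeme)
```

Blanks are matched like any other token and then discarded, which mirrors the "eliminate all blanks" duty of the scanner; comment handling is left out of the sketch.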
Syntax Analysis • Parser. • Check if the source is “grammatically” correct. • Construct a parse tree.
Mini-Triangle Syntax single-Command ::= V-name := Expression | Identifier ( Expression ) | if Expression then single-Command else single-Command | while Expression do single-Command | let Declaration in single-Command | begin Command end
Mini-Triangle Syntax Expression ::= primary-Expression | Expression Operator primary-Expression primary-Expression ::= Integer-Literal | V-name | Operator primary-Expression | ( Expression ) V-name ::= Identifier ... Operator ::= + | - | * | / | < | > | = | \
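The Expression rules above can be turned into a small recursive-descent parser. This is a sketch under assumptions: the left-recursive rule `Expression ::= Expression Operator primary-Expression` is rewritten as a loop, V-name is treated as a bare identifier (the rest of that rule is elided on the slide), and `tokenize`, `parse_expression`, and `parse_primary` are hypothetical names. Note that the grammar as written gives every operator the same priority and groups to the left.

```python
import re

OPERATORS = set("+-*/<>=\\")


def tokenize(text):
    # Integer literals, identifiers, parentheses, single-character operators.
    return re.findall(r"\d+|[A-Za-z]\w*|[()+\-*/<>=\\]", text)


def parse_expression(tokens, i=0):
    """Expression ::= primary-Expression (Operator primary-Expression)*"""
    left, i = parse_primary(tokens, i)
    while i < len(tokens) and tokens[i] in OPERATORS:
        op = tokens[i]
        right, i = parse_primary(tokens, i + 1)
        left = (op, left, right)  # left-associative, no precedence levels
    return left, i


def parse_primary(tokens, i):
    """primary-Expression ::= Integer-Literal | V-name
    | Operator primary-Expression | ( Expression )"""
    if i >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[i]
    if tok.isdigit():
        return ("int", tok), i + 1
    if tok[0].isalpha():
        return ("name", tok), i + 1
    if tok in OPERATORS:
        operand, i = parse_primary(tokens, i + 1)
        return (tok, operand), i
    if tok == "(":
        inner, i = parse_expression(tokens, i + 1)
        if i >= len(tokens) or tokens[i] != ")":
            raise SyntaxError("expected ')'")
        return inner, i + 1
    raise SyntaxError(f"unexpected token {tok!r}")


tree, _ = parse_expression(tokenize("initial + (rate * 60)"))
print(tree)
```

Without the parentheses, `initial + rate * 60` parses as `(initial + rate) * 60` under this grammar, so the parentheses are needed to get the conventional grouping.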
Semantic Analysis • Make sure that the program is “meaningful”. • Walk the parse tree to perform • Type checking. • Type conversion. • Example: rate * 60 • rate is a real variable, so the expression becomes rate * inttoreal(60) • Generate IR (can also be done by the parser).
Example of IR: Abstract Syntax Tree (AST) • position := initial + rate * 60 • Tree: :=(position, +(initial, *(rate, 60))) • interior node = operation • children = arguments • leaves = identifiers or constants
Example of IR: Three-Address Code • position := initial + rate * 60 • tmp := rate * 60 • tmp := initial + tmp • position := tmp
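The type-conversion example can be sketched as a tree walk that inserts `inttoreal` where an integer meets a real. The tuple-based tree shape, the `check` function, and the restriction to the two types `integer` and `real` are assumptions made for illustration; only `inttoreal` comes from the slide.

```python
def check(node, types):
    """Return (typed_node, type) for a tuple-based expression tree.

    Leaves are ("int", text) or ("name", text); interior nodes are
    (operator, left, right). `types` maps declared names to their type.
    """
    kind = node[0]
    if kind == "int":
        return node, "integer"
    if kind == "name":
        if node[1] not in types:
            raise TypeError(f"undeclared identifier {node[1]!r}")
        return node, types[node[1]]
    op, left, right = node
    left, left_type = check(left, types)
    right, right_type = check(right, types)
    if left_type != right_type:
        # Mixed operands: widen the integer side to real.
        if left_type == "integer":
            left = ("inttoreal", left)
        else:
            right = ("inttoreal", right)
        left_type = "real"
    return (op, left, right), left_type


typed, result_type = check(("*", ("name", "rate"), ("int", "60")), {"rate": "real"})
print(typed, result_type)
```

A fuller checker would also reject operators applied to unsuitable types; the sketch only shows the conversion step.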
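One way to obtain three-address code is a post-order walk over the expression tree. The sketch below is an assumption-laden illustration: `to_three_address` is a hypothetical name, leaves are plain strings, and it allocates a fresh temporary (`t1`, `t2`, ...) per operation rather than reusing a single `tmp` as the slide does.

```python
import itertools


def to_three_address(target, expr):
    """Flatten `target := expr` into a list of three-address instructions.

    `expr` is either a string (identifier or constant) or a tuple
    (operator, left, right).
    """
    code = []
    counter = itertools.count(1)

    def emit(node):
        if isinstance(node, str):
            return node  # leaves need no instruction
        op, left, right = node
        left_place = emit(left)
        right_place = emit(right)
        tmp = f"t{next(counter)}"
        code.append(f"{tmp} := {left_place} {op} {right_place}")
        return tmp

    result = emit(expr)
    code.append(f"{target} := {result}")
    return code


for line in to_three_address("position", ("+", "initial", ("*", "rate", "60"))):
    print(line)
```

Each instruction has at most one operator on its right-hand side, which is the defining property of the format.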
Back-End • Code Optimization • improve IR: machine-independent. • improve object code: machine-dependent. • optimizing compilers are widely used. • Code Generation • generate object code. • assign memory/register locations. • instruction selection.
Back-End Components • IR Optimizer: machine-independent optimization, Intermediate Representation (file or in memory) → IR. • Code Generator: generates object code from the IR. • Peephole Optimizer: machine-dependent optimization, object code → object code (assembly or binary). • Symbol Table: shown alongside the three components.
Other Phases • Symbol-Table Management • information about the identifiers in use: • name • type • scope • The scanner creates an entry in the table. • Error Handler • decides what to do when errors are found in the source.
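A peephole optimizer inspects a short window of adjacent instructions and replaces it with something cheaper. The sketch below shows the idea on the deck's own `position := initial + rate * 60` statement rather than on real machine code. It assumes instructions are `(destination, source)` pairs and that a temporary is never read again after it has been copied; `peephole` and its `temporaries` parameter are hypothetical names.

```python
def peephole(code, temporaries):
    """Fold `tmp := expr` followed by `x := tmp` into `x := expr`.

    Assumes the temporary is dead after the copy, so dropping the
    assignment to it cannot change the program's result.
    """
    out = []
    for dest, src in code:
        if out and src in temporaries and out[-1][0] == src:
            # The previous instruction computed the value being copied:
            # retarget it at the final destination and drop the copy.
            _, prev_src = out.pop()
            out.append((dest, prev_src))
        else:
            out.append((dest, src))
    return out


code = [
    ("tmp", "rate * 60"),
    ("tmp", "initial + tmp"),
    ("position", "tmp"),
]
for dest, src in peephole(code, {"tmp"}):
    print(f"{dest} := {src}")
```

A real peephole pass would work on target instructions and would need liveness information before discarding a store.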
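A symbol table that records name, type, and scope can be sketched as a stack of dictionaries, one per open scope. The class and method names below are hypothetical, and the choice to raise `NameError` for duplicate or missing identifiers is an assumption about error handling, not something the slides specify.

```python
class SymbolTable:
    """Stack of scopes; each scope maps an identifier name to its type."""

    def __init__(self):
        self.scopes = [{}]  # start with the outermost scope

    def open_scope(self):
        self.scopes.append({})

    def close_scope(self):
        self.scopes.pop()

    def enter(self, name, type_):
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(f"{name!r} is already declared in this scope")
        scope[name] = type_

    def lookup(self, name):
        # Search from the innermost scope outwards.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"{name!r} is not declared")


table = SymbolTable()
table.enter("rate", "real")
table.open_scope()
table.enter("rate", "integer")   # inner declaration hides the outer one
print(table.lookup("rate"))
table.close_scope()
print(table.lookup("rate"))
```

Searching from the innermost scope outwards lets an inner declaration hide an outer one of the same name.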
Compiler-Construction Tools • Parser generators. • Generate a parser from a CFG. • Yacc, Bison. • Scanner generators. • Generate a scanner from regular expressions. • Lex, Flex.