CPSC 325 - Compiler
Tutorial 1: Scanner & Parser

Scanner and Parser
Input → Scanner → tokens → Parser → Parse Tree (Syntax Tree)
The Grammar is the second input to the Parser.

Token – example 1
Jeremy sees the cute monkey.
• Jeremy – noun
• sees – verb
• the – determiner
• cute – adjective
• monkey – noun

Token – example 2
int x = ( 3 + 21 ) * 6;
• int – type (reserved/key word)
• x – variable
• = – assign (reserved/key word)
• ( – left parenthesis (reserved/key word)
• 3 – number
• + – plus (reserved/key word)
• 21 – number
• ) – right parenthesis (reserved/key word)
• * – multiply (reserved/key word)
• 6 – number
• ; – semicolon/end (reserved/key word)

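The token list in example 2 can be produced by a small hand-written scanner. The following is a minimal sketch in C, not the tutorial's own code: the names `TokenKind`, `Token`, `next_token` and `print_tokens` are made up for illustration, and the set of type keywords (`int`, `double`) and symbols is an assumption based on the examples in these slides.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Token categories; the names are illustrative, not from the tutorial. */
typedef enum { TOK_TYPE, TOK_IDENT, TOK_NUMBER, TOK_SYMBOL, TOK_ERROR, TOK_END } TokenKind;

typedef struct {
    TokenKind kind;
    char text[32];   /* the matched characters; longer runs are split */
} Token;

/* Read one token starting at *src and advance *src past it. */
Token next_token(const char **src)
{
    assert(src != NULL && *src != NULL);
    Token tok = { TOK_END, "" };
    const char *p = *src;
    size_t n = 0;

    while (isspace((unsigned char)*p)) p++;            /* skip white space */

    if (*p == '\0') {
        /* end of input: tok is already TOK_END */
    } else if (isdigit((unsigned char)*p)) {           /* 3, 21, 6 */
        while (isdigit((unsigned char)*p) && n < sizeof tok.text - 1)
            tok.text[n++] = *p++;
        tok.kind = TOK_NUMBER;
    } else if (isalpha((unsigned char)*p) || *p == '_') {   /* int, x */
        while ((isalnum((unsigned char)*p) || *p == '_') && n < sizeof tok.text - 1)
            tok.text[n++] = *p++;
        tok.text[n] = '\0';
        /* Assumed keyword set: only the types used in these slides. */
        tok.kind = (strcmp(tok.text, "int") == 0 || strcmp(tok.text, "double") == 0)
                       ? TOK_TYPE : TOK_IDENT;
    } else if (strchr("=()+-*/;", *p) != NULL) {        /* reserved symbols */
        tok.text[n++] = *p++;
        tok.kind = TOK_SYMBOL;
    } else {                                            /* e.g. a stray '&' */
        tok.text[n++] = *p++;
        tok.kind = TOK_ERROR;
    }
    tok.text[n] = '\0';
    *src = p;
    return tok;
}

static const char *kind_name(TokenKind k)
{
    switch (k) {
    case TOK_TYPE:   return "type";
    case TOK_IDENT:  return "variable";
    case TOK_NUMBER: return "number";
    case TOK_SYMBOL: return "symbol";
    case TOK_ERROR:  return "error";
    default:         return "end";
    }
}

/* Print one "text - category" line per token in src. */
void print_tokens(const char *src)
{
    for (Token t = next_token(&src); t.kind != TOK_END; t = next_token(&src))
        printf("%s - %s\n", t.text, kind_name(t.kind));
}
```

Calling `print_tokens("int x = ( 3 + 21 ) * 6;")` from a `main` function lists the eleven tokens of the example in order. The scanner only classifies characters; deciding whether the tokens form a valid statement is left to the parser.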
Parsing
• Top-down parsing
• Bottom-up parsing
• Ambiguity
Examples:
Jeremy sees the cute monkeys sleeps
int x = ( 3 + 21 ) * 6;

Grammar
Context-Free Grammar (CFG)

Simple Debug
• Missing terminator: int x = 3 // ";" is missing – add it
• Extra character: int x = 3&; // "&" is extra – remove it
• Report the possible missing parts: x = 3 + // what happened here? (The type, the second operand, the semicolon, etc.)
Note: Some errors are impossible to fix automatically (for example, a misspelling or a missing argument).

Practice
• Parse the following:
  The big dog crush the small kid.
  double x = y + 3 / 2; // Syntax
• Write the grammar for the following parse tree:
  sentence
    noun phrase
      noun: I
    verb phrase
      verb: sit
      prep. phrase
        prep: on
        noun: you

Practice
Parse the following expression and write its grammar:
( 5 + 2 ) * 4 - ( 1 + 2 )

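One possible grammar for this expression, written so that * and / bind tighter than + and -, is:

expr → term { ("+" | "-") term }
term → factor { ("*" | "/") factor }
factor → number | "(" expr ")"

A top-down (recursive-descent) parser has one function per grammar rule. The sketch below is one way to do it in C under that assumed grammar; the names `Parser`, `parse_expr`, `parse_term`, `parse_factor` and `evaluate` are illustrative, and the code evaluates the expression while parsing instead of building a tree. It expects the ASCII minus sign `-`.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    const char *pos;  /* next unread character */
    bool ok;          /* cleared on the first syntax error */
} Parser;

static void skip_spaces(Parser *ps)
{
    while (isspace((unsigned char)*ps->pos)) ps->pos++;
}

static long parse_expr(Parser *ps);

/* factor -> number | "(" expr ")" */
static long parse_factor(Parser *ps)
{
    long value = 0;
    skip_spaces(ps);
    if (*ps->pos == '(') {
        ps->pos++;
        value = parse_expr(ps);
        skip_spaces(ps);
        if (*ps->pos == ')') ps->pos++;
        else ps->ok = false;               /* missing ")" */
    } else if (isdigit((unsigned char)*ps->pos)) {
        while (isdigit((unsigned char)*ps->pos))
            value = value * 10 + (*ps->pos++ - '0');
    } else {
        ps->ok = false;                    /* missing operand */
    }
    return value;
}

/* term -> factor { ("*" | "/") factor } */
static long parse_term(Parser *ps)
{
    long value = parse_factor(ps);
    for (;;) {
        skip_spaces(ps);
        char op = *ps->pos;
        if (!ps->ok || (op != '*' && op != '/')) return value;
        ps->pos++;
        long rhs = parse_factor(ps);
        if (op == '*') value *= rhs;
        else if (rhs != 0) value /= rhs;
        else ps->ok = false;               /* division by zero */
    }
}

/* expr -> term { ("+" | "-") term } */
static long parse_expr(Parser *ps)
{
    long value = parse_term(ps);
    for (;;) {
        skip_spaces(ps);
        char op = *ps->pos;
        if (!ps->ok || (op != '+' && op != '-')) return value;
        ps->pos++;
        long rhs = parse_term(ps);
        value = (op == '+') ? value + rhs : value - rhs;
    }
}

/* Parse all of text; on success store the value in *result and return true. */
bool evaluate(const char *text, long *result)
{
    assert(text != NULL && result != NULL);
    Parser ps = { text, true };
    long value = parse_expr(&ps);
    skip_spaces(&ps);
    if (!ps.ok || *ps.pos != '\0') return false;   /* error or leftover input */
    *result = value;
    return true;
}
```

With this sketch, `evaluate("( 5 + 2 ) * 4 - ( 1 + 2 )", &v)` succeeds and stores 25 in `v`, while an incomplete input such as `"3 +"` is rejected because `parse_factor` finds no second operand.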
Lex
• Lex matches the strings in its input that fit the given patterns.
• It is useful for searching through a program or a document.
• You specify, in C or C++, the action to take when a matching input string is found.

Lex – Structure
• Declarations/Definitions
%%
• Rules/Productions
  - Lex expression
  - white space
  - C statement (optional)
%%
• Additional Code/Subroutines

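Lex – statement Example
%%
[+++]+.* ;
• Removes comments that begin with + characters (such as +++ ...) from Aldor code.
• After the Lex expression, the C statement is empty (just ";"), so Lex takes no action and the matched text is discarded.
• Note: inside [ ], repeating a character adds nothing, so [+++]+ is the same as [+]+ and matches one or more + characters. The rule therefore also matches from a single + used as an operator to the end of the line.
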
Lex – Basic operators
• * - zero or more occurrences
• . - any character except newline
• .* - matches any sequence of characters up to the end of the line
• | - alternation (separates alternatives)
• + - one or more occurrences (a+ :== aa*)
• ? - zero or one of something (b? :== (b | empty))
• [ ] - choice, so [12345] is the same as (1|2|3|4|5)
  (Note: [*+] represents a choice between star and plus. Inside [ ] they lose their special meaning.)
• - - range: [a-zA-Z] matches a to z and A to Z, that is, all the letters
• \ - escape: \* matches a literal *, and \. matches a literal period or decimal point

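These operators can be tried out without Lex, because POSIX extended regular expressions use the same notation for them. The sketch below uses `regcomp`, `regexec` and `regfree` from the POSIX header `<regex.h>`; the helper name `full_match` and the 256-byte buffer are illustrative choices. It is not how Lex itself works: Lex compiles all rules into one scanner and picks the longest match, and unlike Lex, the POSIX `.` can match a newline unless `REG_NEWLINE` is set.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Return true if the whole of text matches the extended regular expression. */
bool full_match(const char *pattern, const char *text)
{
    char anchored[256];
    regex_t re;

    /* Wrap the pattern as ^(pattern)$ so that it must cover the entire text. */
    int n = snprintf(anchored, sizeof anchored, "^(%s)$", pattern);
    assert(n > 0 && (size_t)n < sizeof anchored);

    if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
        return false;                     /* invalid pattern */
    bool matched = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

For example, `full_match("a+", "aaa")` and `full_match("aa*", "aaa")` both return true, `full_match("[*+]", "*")` returns true because the star is literal inside the brackets, and `full_match("\\.", "x")` returns false because the escaped period only matches a period.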
Lex – Additional Code
%%
int main(void)
{
    yylex();
    /* any C statements */
    return 0;
}
- When you run Lex on the Lex file (__.lex or __.l), it generates the file lex.yy.c, which defines yylex().
- Type "man lex" on a UNIX system for more information.