Parsing
Front-End: Parser • Checks the stream of words and their parts of speech for grammatical correctness [Diagram: source code → scanner → tokens → parser → IR; scanner and parser report errors]
Front-End: Parser • Determines if the input is syntactically well formed
Front-End: Parser • Guides context-sensitive (“semantic”) analysis (type checking)
Front-End: Parser • Builds IR for source program
Syntactic Analysis • Natural language analogy: consider the sentence “He wrote the program”
Syntactic Analysis • Parts of speech: He (noun), wrote (verb), the (article), program (noun)
Syntactic Analysis • Grammatical roles: He (subject), wrote (predicate), the program (object)
Syntactic Analysis • Subject, predicate and object together form a sentence
Syntactic Analysis • Programming language: if ( b <= 0 ) a = b • “b <= 0” is a bool expr, “a = b” is an assignment, and the whole construct is an if-statement
Syntactic Analysis • Syntax errors: int* foo(int i, int j)) { for(k=0; i j; ) fi( i > j ) return j; }
Compiler Construction Sohail Aslam Lecture 11
Syntactic Analysis • int* foo(int i, int j)) { for(k=0; i j; ) fi( i > j ) return j; } • Errors: extra parenthesis after “int j”; missing expression in the for header; “fi” is not a keyword
Semantic Analysis • Grammatically correct: “He wrote the computer” has He (noun, subject), wrote (verb, predicate), the computer (article + noun, object), and forms a sentence
Semantic Analysis • Semantically (meaning) wrong! “He wrote the computer” is well formed as a sentence, but its meaning is not
Semantic Analysis • int* foo(int i, int j) { for(k=0; i < j; j++ ) if( i < j-2 ) sum = sum+i return sum; } • Errors: undeclared var (for example, k); return type mismatch at “return sum” (the function is declared to return int*)
Role of the Parser • Not all sequences of tokens are programs. • The parser must distinguish between valid and invalid sequences of tokens.
Role of the Parser • What we need: • An expressive way to describe the syntax • An acceptor mechanism that determines if the input token stream satisfies the syntax
Study of Parsing • Parsing is the process of discovering a derivation for some sentence
Study of Parsing • Mathematical model of syntax: a grammar G. • Algorithm for testing membership in L(G).
Context Free Grammars A CFG is a four tuple G=(S,N,T,P) • S is the start symbol • N is a set of non-terminals • T is a set of terminals • P is a set of productions
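As a minimal sketch, the four-tuple can be written down directly as a data structure. The Grammar class, its field names and the is_well_formed check are illustrative assumptions, not part of the lecture; the example instance is the SheepNoise grammar used later in these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    """A context-free grammar G = (S, N, T, P). All names are illustrative."""
    start: str               # S: the start symbol
    nonterminals: frozenset  # N: set of non-terminals
    terminals: frozenset     # T: set of terminals
    productions: tuple       # P: (lhs, rhs) pairs; rhs is a tuple of symbols

    def is_well_formed(self) -> bool:
        """Check the basic side conditions implied by the definition."""
        if self.start not in self.nonterminals:
            return False
        if self.nonterminals & self.terminals:
            return False  # N and T must be disjoint
        symbols = self.nonterminals | self.terminals
        return all(
            lhs in self.nonterminals and all(s in symbols for s in rhs)
            for lhs, rhs in self.productions
        )


# SheepNoise -> SheepNoise baa | baa
sheep = Grammar(
    start="SheepNoise",
    nonterminals=frozenset({"SheepNoise"}),
    terminals=frozenset({"baa"}),
    productions=(
        ("SheepNoise", ("SheepNoise", "baa")),
        ("SheepNoise", ("baa",)),
    ),
)
print(sheep.is_well_formed())
```

Storing each production as a (left-hand side, right-hand side) pair keeps P a plain set of rewriting rules, which is all the definition requires.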
Why Not Regular Expressions? Reason: regular languages do not have enough power to express the syntax of programming languages.
Limitations of Regular Languages • A finite automaton can’t remember the number of times it has visited a particular state
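One way to see this limitation: recognizing balanced parentheses requires counting the unmatched "(" symbols, and that count has no upper bound. The sketch below (the function name balanced is an illustrative assumption) uses an integer counter, which is exactly the unbounded memory that an automaton with a fixed, finite set of states lacks.

```python
def balanced(text: str) -> bool:
    """Return True if every '(' in text is closed by a later ')'."""
    depth = 0  # unbounded counter: a finite automaton has no equivalent
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared with no matching '('
                return False
    return depth == 0


print(balanced("((a+b)*c)"))
print(balanced("((a+b)*c"))
```

Nested constructs of this kind (parentheses, braces, nested statements) are common in programming languages, which is why a more powerful formalism than regular expressions is needed.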
Example of CFG • Context-free syntax is specified with a CFG
Example of CFG • Example: SheepNoise → SheepNoise baa | baa • This CFG defines the set of noises sheep make
Example of CFG • We can use the SheepNoise grammar to create sentences • We use the productions as rewriting rules
Example of CFG SheepNoise → SheepNoise baa | baa
Example of CFG And so on ...
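The rewriting process can be sketched in a few lines. Each step replaces the non-terminal SheepNoise using one of the two productions; the function name and the list-of-symbols representation are assumptions made for illustration.

```python
def derive_sheep_noise(n: int) -> list:
    """Derive the sentence with n 'baa's, returning every sentential form."""
    if n < 1:
        raise ValueError("the grammar generates at least one 'baa'")
    form = ["SheepNoise"]
    steps = [form]
    # Apply SheepNoise -> SheepNoise baa exactly n - 1 times ...
    for _ in range(n - 1):
        form = ["SheepNoise", "baa"] + form[1:]
        steps.append(form)
    # ... then finish with SheepNoise -> baa.
    form = ["baa"] + form[1:]
    steps.append(form)
    return steps


for step in derive_sheep_noise(3):
    print(" ".join(step))
```

For n = 3 this prints the forms SheepNoise, SheepNoise baa, SheepNoise baa baa, and finally baa baa baa.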
Example of CFG • While it is cute, this example quickly runs out of intellectual steam • To explore uses of CFGs, we need a more complex grammar
Backus-Naur Form (BNF) • Grammar rules in a similar form were first used in the description of the Algol60 Language.
Backus-Naur Form (BNF) • The notation was developed by John Backus and adapted by Peter Naur for the Algol60 report. • Thus the term Backus-Naur Form (BNF)
Derivation • Let us use the expression grammar to derive the sentence x – 2 * y
Derivation • Such a process of rewrites is called a derivation. • The process of discovering a derivation is called parsing
Derivation • We denote this derivation as: expr →* id – num * id
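The →* relation means "derives in zero or more steps". The sketch below checks that a proposed sequence of sentential forms really is a derivation, that is, each form follows from the previous one by applying a single production. The expression grammar itself does not survive in this transcript, so the grammar in the code is an assumption (a common textbook form), as are all function names. The tokens x, 2 and y are written as id, num and id, and an ASCII "-" stands in for the minus sign.

```python
# Assumed expression grammar (not shown in the slides above):
#   expr -> expr op expr | num | id
#   op   -> + | - | * | /
PRODUCTIONS = {
    "expr": [["expr", "op", "expr"], ["num"], ["id"]],
    "op": [["+"], ["-"], ["*"], ["/"]],
}


def one_step(before: list, after: list) -> bool:
    """True if `after` results from rewriting one non-terminal in `before`."""
    for i, symbol in enumerate(before):
        for rhs in PRODUCTIONS.get(symbol, []):
            if before[:i] + rhs + before[i + 1:] == after:
                return True
    return False


def is_derivation(forms: list) -> bool:
    """True if every sentential form follows from the previous in one step."""
    return all(one_step(a, b) for a, b in zip(forms, forms[1:]))


steps = [
    "expr",
    "expr op expr",
    "expr op expr op expr",
    "id op expr op expr",
    "id - expr op expr",
    "id - num op expr",
    "id - num * expr",
    "id - num * id",
]
forms = [s.split() for s in steps]
print(is_derivation(forms))
```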
Derivations • At each step, we choose a non-terminal to replace • Different choices can lead to different derivations.
Derivations • Two derivations are of interest • Leftmost derivation • Rightmost derivation
Derivations • Leftmost derivation: replace leftmost non-terminal (NT) at each step • Rightmost derivation: replace rightmost NT at each step
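The two strategies can be compared side by side with a small sketch. The grammar below is an assumed textbook expression grammar (the lecture's own grammar is not reproduced in this transcript), and the helper names rewrite and derive are hypothetical. Both derivations produce the same sentence, id - num * id, but pass through different intermediate forms.

```python
# Assumed expression grammar (not shown in the slides above):
#   expr -> expr op expr | num | id
#   op   -> + | - | * | /
PRODUCTIONS = {
    "expr": [["expr", "op", "expr"], ["num"], ["id"]],
    "op": [["+"], ["-"], ["*"], ["/"]],
}


def rewrite(form: list, rhs: list, leftmost: bool) -> list:
    """Replace the leftmost (or rightmost) non-terminal in form with rhs."""
    positions = [i for i, s in enumerate(form) if s in PRODUCTIONS]
    i = positions[0] if leftmost else positions[-1]
    if rhs not in PRODUCTIONS[form[i]]:
        raise ValueError(f"no production {form[i]} -> {' '.join(rhs)}")
    return form[:i] + rhs + form[i + 1:]


def derive(choices: list, leftmost: bool) -> list:
    """Apply the chosen right-hand sides in order, starting from expr."""
    form = ["expr"]
    history = [" ".join(form)]
    for rhs in choices:
        form = rewrite(form, rhs, leftmost)
        history.append(" ".join(form))
    return history


E = ["expr", "op", "expr"]
left = derive([E, ["id"], ["-"], E, ["num"], ["*"], ["id"]], leftmost=True)
right = derive([E, E, ["id"], ["*"], ["num"], ["-"], ["id"]], leftmost=False)

print("leftmost: ", " => ".join(left))
print("rightmost:", " => ".join(right))
```

The only difference between the two runs is which non-terminal rewrite picks at each step; the productions applied are the same set, used in a different order.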