Chapter 3

Chapter 3. Context-Free Grammar and Parsing. The Parsing Process. Parsing is the task of determining the syntax, or structure, of a program; for this reason it is also called syntax analysis. The syntax of a programming language is usually given by the grammar rules of a context-free grammar.

Presentation Transcript


  1. Chapter 3 Context-Free Grammar and Parsing

  2. The Parsing Process
     • Parsing is the task of determining the syntax, or structure, of a program; for this reason it is also called syntax analysis.
     • The syntax of a programming language is usually given by the grammar rules of a context-free grammar.
     • The rules of a context-free grammar are recursive.
     • The data structure used to represent the syntactic structure of a program is called a parse tree or syntax tree.
     • There are two general categories of parsing algorithms: top-down parsing and bottom-up parsing.
     • The parsing process may be viewed as: sequence of tokens → parser → syntax tree
     CS302 Ch3

  3. Context-Free Grammar Terminology
     • Terminals: an alphabet, or set of basic symbols (as in regular expressions, except that the symbols are now whole tokens, not characters).
     • Non-terminals: a set of names for structures (such as statement, expression, definition).
     • Productions: a set of grammar rules expressing the structure of each name.
     • A start symbol: the name of the most general structure (for example, compilation unit in C).

  4. Context-Free Grammars
     • Example:
         exp → exp op exp | ( exp ) | number
         op → + | - | *
     • Names are written in italics.
     • Choice and concatenation are written as in regular expressions.
     • Repetition is represented by recursion.
     • An arrow replaces the equal sign.
     • Grammar rules in this form are said to be in Backus-Naur Form (BNF).

  5. Example
         exp → exp op exp | ( exp ) | number
         op → + | - | *
     • 2 non-terminals, 6 terminals, 6 productions (3 on each line).
     • For exp, the alternatives exp op exp and ( exp ) are the recursive rules; number is the "base" rule.
     • In what way does such a context-free grammar differ from a regular expression?
         digit = 0|1|…|9
         number = digit digit*
     • Recursion!
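
The counts on this slide can be checked mechanically. The sketch below is only one possible encoding, not something the slides prescribe: the dictionary layout and the helper names (`GRAMMAR`, `terminals`, `productions`, `is_recursive`) are assumptions made for illustration.

```python
# Hypothetical encoding of the slide's grammar: each nonterminal maps to a
# list of alternatives, and each alternative is a list of symbols.
GRAMMAR = {
    "exp": [["exp", "op", "exp"], ["(", "exp", ")"], ["number"]],
    "op": [["+"], ["-"], ["*"]],
}


def nonterminals(grammar):
    """Names that appear on a left-hand side."""
    return set(grammar)


def terminals(grammar):
    """Symbols used on a right-hand side that have no rule of their own."""
    used = {sym for alts in grammar.values() for alt in alts for sym in alt}
    return used - set(grammar)


def productions(grammar):
    """Flatten the alternatives into (lhs, rhs) pairs."""
    return [(lhs, tuple(rhs)) for lhs, alts in grammar.items() for rhs in alts]


def is_recursive(lhs, rhs):
    """A production is directly recursive if its own name occurs on the right."""
    return lhs in rhs


summary = (
    len(nonterminals(GRAMMAR)),
    len(terminals(GRAMMAR)),
    len(productions(GRAMMAR)),
)
recursive_rules = [p for p in productions(GRAMMAR) if is_recursive(*p)]

print(summary)               # (2, 6, 6)
print(len(recursive_rules))  # 2
```

The regular definitions for digit and number have no name that refers to itself, which is exactly what `is_recursive` would report for them.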

  6. Derivations
     • A derivation is a sequence of replacements of structure names by choices on the right-hand sides of grammar rules.
     • The arithmetic expression (34 - 3)*42 corresponds to the legal string (number - number)*number:
         (1) exp ⇒ exp op exp                  [exp → exp op exp]
         (2)     ⇒ exp op number               [exp → number]
         (3)     ⇒ exp * number                [op → *]
         (4)     ⇒ (exp) * number              [exp → ( exp )]
         (5)     ⇒ (exp op exp) * number       [exp → exp op exp]
         (6)     ⇒ (exp op number) * number    [exp → number]
         (7)     ⇒ (exp - number) * number     [op → -]
         (8)     ⇒ (number - number) * number  [exp → number]

  7. Abstract the Structure of Derivation to a Parse Tree

  8. Definitions
     • The start symbol is the left-hand side of the first grammar rule of the language; every derivation begins with it.
     • Nonterminals are structure names that must be replaced further in a derivation.
     • Terminals are symbols of the alphabet; they cannot be replaced, so they terminate the derivation.
     • Left recursion: A → A α | β
     • Right recursion: A → α A | β

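  9. Repetition and Recursion
     • Left recursion: A → A x | y
         yxx: A ⇒ A x ⇒ A x x ⇒ y x x (the parse tree grows down its left edge)
     • Right recursion: A → x A | y
         xxy: A ⇒ x A ⇒ x x A ⇒ x x y (the parse tree grows down its right edge)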
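
The two rule shapes generate different tree shapes even though both describe simple repetition. The sketch below builds nested tuples as stand-ins for parse trees; the function names are made up for illustration, and the left-recursive rule is handled with a loop because a direct recursive call on A → A x would never consume input.

```python
def parse_left(tokens):
    """A -> A x | y, handled as one 'y' followed by a loop over 'x'."""
    if not tokens or tokens[0] != "y":
        raise SyntaxError("expected 'y'")
    tree = "y"
    for tok in tokens[1:]:
        if tok != "x":
            raise SyntaxError("expected 'x'")
        tree = (tree, "x")  # the tree built so far becomes the LEFT child
    return tree


def parse_right(tokens):
    """A -> x A | y, handled by direct recursion."""
    if tokens == ["y"]:
        return "y"
    if tokens[:1] == ["x"]:
        return ("x", parse_right(tokens[1:]))  # the rest becomes the RIGHT child
    raise SyntaxError("expected 'x' or a final 'y'")


left_tree = parse_left(list("yxx"))
right_tree = parse_right(list("xxy"))

print(left_tree)   # (('y', 'x'), 'x')
print(right_tree)  # ('x', ('x', 'y'))
```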

  10. Parsing Algorithms
     • Top-down
         • Recursive descent (hand choice)
         • "Predictive" table-driven, "LL"
     • Bottom-up
         • "LR" and its cousin "LALR" (machine-generated choice [Yacc / Bison])
         • Operator-precedence

  11. Languages Generated by Grammars
     1. G: E → ( E ) | a
        L(G) = { a, (a), ((a)), (((a))), … }
        Derivation for the input string ((a)): E ⇒ (E) ⇒ ((E)) ⇒ ((a))
     2. G: E → ( E )
        L(G) = { }: the grammar yields no strings, because no rule lets a derivation terminate.
     3. G: E → E + a | a
        L(G) = { a, a + a, a + a + a, … }
        Derivation for the input string a + a + a: E ⇒ E + a ⇒ E + a + a ⇒ a + a + a
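
The three languages can be enumerated up to a length bound with a breadth-first search over sentential forms. This is a sketch under stated assumptions: the function name `language` and the grammar encoding are hypothetical, and the search relies on no rule having an empty right-hand side, so forms never shrink.

```python
from collections import deque


def language(grammar, start, max_len):
    """Terminal strings of at most max_len symbols derivable from start.

    Assumes no rule has an empty right-hand side, so any sentential form
    longer than max_len can be discarded safely.
    """
    seen = {(start,)}
    queue = deque(seen)
    result = set()
    while queue:
        form = queue.popleft()
        # Always expand the leftmost nonterminal; None means all terminals.
        idx = next((i for i, s in enumerate(form) if s in grammar), None)
        if idx is None:
            result.add(" ".join(form))
            continue
        for rhs in grammar[form[idx]]:
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(result, key=len)


G1 = {"E": [["(", "E", ")"], ["a"]]}
G2 = {"E": [["(", "E", ")"]]}
G3 = {"E": [["E", "+", "a"], ["a"]]}

print(language(G1, "E", 5))  # ['a', '( a )', '( ( a ) )']
print(language(G2, "E", 5))  # []
print(language(G3, "E", 5))  # ['a', 'a + a', 'a + a + a']
```

The second grammar produces forms of every odd length but never a string of terminals only, which is why its language is empty.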

  12. Examples

  13. Parse Tree
     • A parse tree corresponding to a derivation is a labeled tree in which the interior nodes are labeled by nonterminals, the leaf nodes are labeled by terminals, and the children of each interior node represent the replacement of the associated nonterminal in one step of the derivation.
     • Derivation: exp ⇒ exp op exp ⇒ number op exp ⇒ number + exp ⇒ number + number
     • Parse tree:
                   exp
                /   |   \
             exp    op    exp
              |     |      |
           number   +    number

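  14. Rightmost and Leftmost Derivation
     • A leftmost derivation corresponds to a preorder numbering of the interior nodes of the parse tree:
         (1) exp ⇒ exp op exp
         (2)     ⇒ number op exp
         (3)     ⇒ number + exp
         (4)     ⇒ number + number
       Node numbers: the root exp is 1; its children exp, op, exp are 2, 3, 4 from left to right.
     • A rightmost derivation corresponds to the reverse of a postorder numbering:
         (1) exp ⇒ exp op exp
         (2)     ⇒ exp op number
         (3)     ⇒ exp + number
         (4)     ⇒ number + number
       Node numbers: the root exp is 1; its children exp, op, exp are 4, 3, 2 from left to right.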
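
Both derivations can be read off the same parse tree by choosing which interior node to expand next. In the sketch below the tree encoding (a `(label, children)` pair for interior nodes, a plain string for leaves) and the names `TREE`, `label`, and `derivation` are assumptions for illustration.

```python
# Parse tree for "number + number": interior nodes are (label, children),
# leaves are plain strings.
TREE = ("exp", [("exp", ["number"]), ("op", ["+"]), ("exp", ["number"])])


def label(node):
    """Grammar symbol shown for a node in a sentential form."""
    return node if isinstance(node, str) else node[0]


def derivation(tree, leftmost=True):
    """List the sentential forms obtained by expanding one node per step."""
    frontier = [tree]
    forms = [label(tree)]
    while True:
        interior = [i for i, n in enumerate(frontier) if not isinstance(n, str)]
        if not interior:
            return forms
        i = interior[0] if leftmost else interior[-1]
        frontier[i:i + 1] = frontier[i][1]  # replace the node by its children
        forms.append(" ".join(label(n) for n in frontier))


left = derivation(TREE, leftmost=True)
right = derivation(TREE, leftmost=False)

print(left)
# ['exp', 'exp op exp', 'number op exp', 'number + exp', 'number + number']
print(right)
# ['exp', 'exp op exp', 'exp op number', 'exp + number', 'number + number']
```

The two lists differ only in the order of replacements, which is why a parse tree identifies the structure while an arbitrary derivation does not.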

  15. Example
     A leftmost derivation (Slide 6 was a rightmost):
         (1) exp ⇒ exp op exp                  [exp → exp op exp]
         (2)     ⇒ (exp) op exp                [exp → ( exp )]
         (3)     ⇒ (exp op exp) op exp         [exp → exp op exp]
         (4)     ⇒ (number op exp) op exp      [exp → number]
         (5)     ⇒ (number - exp) op exp       [op → -]
         (6)     ⇒ (number - number) op exp    [exp → number]
         (7)     ⇒ (number - number) * exp     [op → *]
         (8)     ⇒ (number - number) * number  [exp → number]

  16. Abstract Syntax Trees
     • An abstract syntax tree, or syntax tree for short, is a tree that represents the structure of the source in a shorthand form, leaving out parse-tree detail that later phases do not need.
     • Grammar:
         statement → if-stmt | other
         if-stmt → if ( exp ) statement | if ( exp ) statement else statement
         exp → 0 | 1
     • Input: if (0) other else other
     • Parse tree (children indented under their parent):
         statement
           if-stmt
             if
             (
             exp
               0
             )
             statement
               other
             else
             statement
               other
     • Syntax tree:
         if
           0
           other
           other
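
The step from parse tree to syntax tree can be sketched as a small tree walk that drops punctuation and single-child chain nodes. The tuple encoding and the names `PARSE_TREE` and `to_syntax_tree` are hypothetical; the function only covers the three nonterminals of this slide's grammar.

```python
# Parse tree for "if (0) other else other": interior nodes are
# (name, children), tokens are plain strings.
PARSE_TREE = ("statement", [
    ("if-stmt", [
        "if", "(", ("exp", ["0"]), ")",
        ("statement", ["other"]),
        "else",
        ("statement", ["other"]),
    ]),
])


def to_syntax_tree(node):
    """Collapse a parse tree of the if-statement grammar into a syntax tree."""
    if isinstance(node, str):
        return node
    name, children = node
    if name == "if-stmt":
        # Keep only the sub-structures; 'if', '(', ')' and 'else' carry no
        # information once the node itself is known to be an if.
        parts = [to_syntax_tree(c) for c in children if not isinstance(c, str)]
        return ("if", *parts)
    # statement and exp always have exactly one child in this grammar.
    (only,) = children
    return to_syntax_tree(only)


syntax_tree = to_syntax_tree(PARSE_TREE)
print(syntax_tree)  # ('if', '0', 'other', 'other')
```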

  17. Examples
     • G:
         stmt-sequence → stmt ; stmt-sequence | stmt
         stmt → s
     • Input string: s ; s ; s
     • Parse tree (children indented under their parent):
         stmt-sequence
           stmt
             s
           ;
           stmt-sequence
             stmt
               s
             ;
             stmt-sequence
               stmt
                 s
     • Syntax tree:
         ;
           s
           ;
             s
             s

  18. Ambiguous Grammars
     • Parse trees and syntax trees uniquely express the structure of syntax, as do leftmost and rightmost derivations, but not derivations in general.
     • A grammar that generates a string with two distinct parse trees is called an ambiguous grammar.
     • Consider again the string number - number * number. Under exp → exp op exp it has two parse trees: one groups it as (number - number) * number, the other as number - (number * number). The second is the correct one.
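
Ambiguity can be made concrete by counting parse trees. The sketch below counts the trees that the grammar exp → exp op exp | ( exp ) | number, op → + | - | * assigns to a token list; `count_parse_trees` is an illustrative name and the memoized span recursion is just one way to do the counting.

```python
from functools import lru_cache

OPS = {"+", "-", "*"}


def count_parse_trees(tokens):
    """Number of distinct parse trees for tokens under exp -> exp op exp | ( exp ) | number."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        """Ways to derive tokens[i:j] from exp."""
        if j - i == 1:
            return 1 if tokens[i] == "number" else 0
        total = 0
        # exp -> ( exp )
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)
        # exp -> exp op exp, trying every operator position as the root
        for k in range(i + 1, j - 1):
            if tokens[k] in OPS:
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens)) if tokens else 0


print(count_parse_trees("number - number * number".split()))      # 2
print(count_parse_trees("( number - number ) * number".split()))  # 1
```

Any string with a count above 1 is a witness that the grammar is ambiguous; parentheses in the input remove the choice.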

  19. Ambiguity
     • Sources of ambiguity
         • Associativity and precedence of operators.
         • Extent of a substructure (dangling else).
     • Dealing with ambiguity
         • Disambiguating rules: state a rule that specifies, in each ambiguous case, which of the parse trees is the correct one.
         • Change the grammar (but not the language): rewrite the grammar into a form that forces the construction of the correct parse tree.

  20. Precedence and Associativity
     • Example: integer arithmetic
         exp → exp addop term | term
         addop → + | -
         term → term mulop factor | factor
         mulop → *
         factor → ( exp ) | number
     • Parse tree with leaves number - number * number (children indented under their parent):
         exp
           exp
             term
               factor
                 number
           addop
             -
           term
             term
               factor
                 number
             mulop
               *
             factor
               number
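
A recursive-descent evaluator shows how this grammar fixes precedence and associativity. This is a sketch, not the slides' method: the left-recursive rules are implemented as loops (the usual hand-written workaround), and `tokenize`, `Parser`, and `evaluate` are illustrative names. The tokenizer silently skips characters it does not recognize.

```python
import re


def tokenize(text):
    """Split into integer literals and the symbols + - * ( )."""
    return re.findall(r"\d+|[-+*()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def exp(self):
        # exp -> exp addop term | term, as a loop: left-associative
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term -> term mulop factor | factor: binds tighter than addop
        value = self.factor()
        while self.peek() == "*":
            self.advance()
            value *= self.factor()
        return value

    def factor(self):
        # factor -> ( exp ) | number
        tok = self.advance()
        if tok == "(":
            value = self.exp()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok is not None and tok.isdigit():
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.exp()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value


print(evaluate("34 - 3 * 42"))    # -92
print(evaluate("(34 - 3) * 42"))  # 1302
print(evaluate("10 - 3 - 2"))     # 5
```

Because `term` is called from inside `exp`, multiplication is grouped before subtraction; because each loop folds its result to the left, 10 - 3 - 2 is read as (10 - 3) - 2.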

  21. Dangling else Ambiguity
     • Example:
         statement → if-stmt | other
         if-stmt → if ( exp ) statement | if ( exp ) statement else statement
         exp → 0 | 1
     • The following string has two parse trees:
         if (0) if (1) other else other

  22. Parse Trees for Dangling else
     • The correct parse tree is chosen using the most closely nested disambiguating rule: each else belongs to the nearest preceding if that does not yet have one.

  23. Changing the Grammar Rule for Dangling else Problem
     The grammar becomes:
         statement → matched-stmt | unmatched-stmt
         matched-stmt → if ( exp ) matched-stmt else matched-stmt | other
         unmatched-stmt → if ( exp ) statement | if ( exp ) matched-stmt else unmatched-stmt
         exp → 0 | 1

  24. Parse Tree for the Solution
     • Input string: if (0) if (1) other else other
     • Parse tree (children indented under their parent):
         statement
           unmatched-stmt
             if
             (
             exp
               0
             )
             statement
               matched-stmt
                 if
                 (
                 exp
                   1
                 )
                 matched-stmt
                   other
                 else
                 matched-stmt
                   other
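
A hand-written parser usually reaches the same grouping without rewriting the grammar: after parsing the statement that follows the condition, it consumes an else immediately if one is present. The sketch below applies that greedy choice to the original ambiguous grammar; `expect` and `parse_statement` are illustrative names, and the result is a syntax tree of the form `("if", condition, then_part, else_part)` with `None` for a missing else.

```python
def expect(tokens, pos, wanted):
    """Return (token, next_pos) if the token at pos is one of `wanted`."""
    if pos >= len(tokens) or tokens[pos] not in wanted:
        raise SyntaxError(f"expected one of {wanted} at position {pos}")
    return tokens[pos], pos + 1


def parse_statement(tokens, pos=0):
    """statement -> other | if ( exp ) statement [ else statement ]"""
    tok, pos = expect(tokens, pos, ("if", "other"))
    if tok == "other":
        return "other", pos
    _, pos = expect(tokens, pos, ("(",))
    cond, pos = expect(tokens, pos, ("0", "1"))
    _, pos = expect(tokens, pos, (")",))
    then_part, pos = parse_statement(tokens, pos)
    else_part = None
    # Greedy choice: an else seen here is attached to this, the nearest, if.
    if pos < len(tokens) and tokens[pos] == "else":
        else_part, pos = parse_statement(tokens, pos + 1)
    return ("if", cond, then_part, else_part), pos


tokens = "if ( 0 ) if ( 1 ) other else other".split()
tree, end = parse_statement(tokens)
print(tree)  # ('if', '0', ('if', '1', 'other', 'other'), None)
```

The inner call sees the else first, so the outer if ends up with no else part, matching the grouping in the parse tree above.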

  25. Extended BNF Notation
     • Extended BNF (EBNF):
         • New metasymbols […] and {…}
         • ε is largely eliminated by these.
     • Repetition:
         A → A α | β   (left recursion)
         A → α A | β   (right recursion)
     • This is equivalent to:
         A → β α*
         A → α* β
     • Using EBNF notation:
         A → β {α}
         A → {α} β

  26. Extended BNF Notation
     • Example: stmt-sequence → stmt ; stmt-sequence | stmt
     • Using EBNF:
         stmt-sequence → { stmt ; } stmt   (right recursion)
         stmt-sequence → stmt { ; stmt }   (left recursion)
     • Optional: using the previous example
         stmt-sequence → stmt [ ; stmt-sequence ]
     • Example: exp → exp addop term | term
       Using EBNF: exp → [ exp addop ] term
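
EBNF maps directly onto control flow in a hand-written parser: braces become a loop and brackets become an if. The sketch below shows both for stmt-sequence, with stmt → s as on slide 17; the function names are made up, and the first version returns a flat list while the second builds a right-nested syntax tree.

```python
def parse_repetition(tokens):
    """stmt-sequence -> stmt { ; stmt }  (braces become a while loop)."""
    pos = 0

    def stmt():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "s":
            raise SyntaxError(f"expected 's' at position {pos}")
        pos += 1
        return "s"

    statements = [stmt()]
    while pos < len(tokens) and tokens[pos] == ";":
        pos += 1
        statements.append(stmt())
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return statements


def parse_optional(tokens, pos=0):
    """stmt-sequence -> stmt [ ; stmt-sequence ]  (brackets become an if)."""
    if pos >= len(tokens) or tokens[pos] != "s":
        raise SyntaxError(f"expected 's' at position {pos}")
    tree, pos = "s", pos + 1
    if pos < len(tokens) and tokens[pos] == ";":
        rest, pos = parse_optional(tokens, pos + 1)
        tree = (";", tree, rest)
    return tree, pos


tokens = "s ; s ; s".split()
print(parse_repetition(tokens))   # ['s', 's', 's']
print(parse_optional(tokens)[0])  # (';', 's', (';', 's', 's'))
```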

  27. Syntax Diagram
     • Example: factor → ( exp ) | number
     • Repetition: A → { B }
     • Optional: A → [ B ]
