
Discussion #3 Grammar Formalization & Parse-Tree Construction

Presentation Transcript


  1. Discussion #3: Grammar Formalization & Parse-Tree Construction

  2. Topics • Grammar Definitions • Parse Trees • Constructing Parse Trees

  3. Formal Definition of a Grammar
     A grammar G is a 4-tuple: G = (VN, VT, S, Φ), where
     • VN, VT are the sets of non-terminal and terminal symbols
     • S ∈ VN is the start symbol
     • Φ is a finite set of relations from (VT ∪ VN)+ to (VT ∪ VN)*
     • an element of Φ, (α, β), is written as α → β and is called a production rule or a rewriting rule
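
     To make the 4-tuple concrete, here is a minimal Python sketch (not part of the original slides) that stores a grammar as its four components, using the identifier grammar that appears on the later slides. The `Grammar` class and its field names are illustrative assumptions; strings of symbols are plain Python strings, which works because every symbol in this grammar is a single character.

```python
from dataclasses import dataclass

@dataclass
class Grammar:
    """A grammar G = (VN, VT, S, Phi), following the slide's 4-tuple."""
    nonterminals: set   # VN
    terminals: set      # VT
    start: str          # S, a member of VN
    productions: list   # Phi: pairs (alpha, beta), read as alpha -> beta

# The identifier grammar used on the later slides:
#   I -> L | ID | IL,   L -> a | ... | z,   D -> 0 | ... | 9
letters = set("abcdefghijklmnopqrstuvwxyz")
digits = set("0123456789")
G = Grammar(
    nonterminals={"I", "L", "D"},
    terminals=letters | digits,
    start="I",
    productions=[("I", "L"), ("I", "ID"), ("I", "IL")]
               + [("L", c) for c in sorted(letters)]
               + [("D", d) for d in sorted(digits)],
)
```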

  4. Examples of Grammars

  5. Definition of a Context-Free Grammar
     A context-free grammar is a grammar with the following restriction:
     • The relation Φ is a finite set of relations from VN to (VT ∪ VN)+
       • i.e. the left-hand side of a production is a single non-terminal
       • i.e. the right-hand side of any production cannot be empty
     Context-free grammars generate context-free languages. With slight variations, essentially all programming languages are context-free languages.
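
     Continuing the hypothetical sketch above, the context-free restriction can be checked mechanically; this helper is an illustrative assumption, not anything from the slides.

```python
def is_context_free(g: Grammar) -> bool:
    """Every left-hand side is a single non-terminal and every
    right-hand side is a non-empty string over VT and VN."""
    return all(
        len(lhs) == 1 and lhs in g.nonterminals and len(rhs) >= 1
        for lhs, rhs in g.productions
    )

print(is_context_free(G))   # True: the identifier grammar is context-free
```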

  6. Examples of Grammars (again) Which are context-free grammars?

  7. Backus-Naur Form (BNF)
     • A traditional metalanguage to represent grammars for programming languages
     • Every non-terminal is enclosed in < and >
     • Instead of the symbol → we use ::=
     • Example:
       I → L | ID | IL
       L → a | b | … | z
       D → 0 | 1 | … | 9
     • BNF:
       <I> ::= <L> | <I><D> | <I><L>
       <L> ::= a | b | … | z
       <D> ::= 0 | 1 | … | 9

  8. Definition: Direct Derivative
     Let G = (VN, VT, S, Φ) be a grammar and α, β ∈ (VN ∪ VT)*. β is said to be a direct derivative of α (written α ⇒ β) if there are strings φ1 and φ2 (possibly empty) such that α = φ1Bφ2, β = φ1ωφ2, B ∈ VN, and B → ω is a production of G.
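
     As a rough illustration of this definition (again building on the hypothetical `Grammar` sketch above), one step of rewriting can be computed by trying every production at every position:

```python
def direct_derivatives(g: Grammar, alpha: str) -> set:
    """All beta with alpha => beta: replace one occurrence of a
    left-hand side B in alpha by the right-hand side omega of
    some production B -> omega."""
    result = set()
    for lhs, rhs in g.productions:
        pos = alpha.find(lhs)
        while pos != -1:
            result.add(alpha[:pos] + rhs + alpha[pos + len(lhs):])
            pos = alpha.find(lhs, pos + 1)
    return result

print(direct_derivatives(G, "ID"))   # includes 'LD', 'IDD', 'ILD', 'I0', ..., 'I9'
```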

  9. Example: Direct Derivatives
     G = (VN, VT, S, Φ), where:
     VN = {I, L, D}
     VT = {a, b, …, z, 0, 1, …, 9}
     S = I
     Φ = { I → L | ID | IL,
           L → a | b | … | z,
           D → 0 | 1 | … | 9 }

  10. Definition: Derivation
      Let G = (VN, VT, S, Φ) be a grammar. A string α produces β (β reduces to α, or β is a derivation of α; written α ⇒+ β) if there are strings φ0, φ1, …, φn (n > 0) such that α = φ0 ⇒ φ1, φ1 ⇒ φ2, …, φn-1 ⇒ φn, φn = β.

  11. Example: Derivation
      Let G = (VN, VT, S, Φ), where:
      VN = {I, L, D}
      VT = {a, b, …, z, 0, 1, …, 9}
      S = I
      Φ = { I → L | ID | IL,
            L → a | b | … | z,
            D → 0 | 1 | … | 9 }
      • I produces abc12:
        I ⇒ ID ⇒ IDD ⇒ ILDD ⇒ ILLDD ⇒ LLLDD ⇒ aLLDD ⇒ abLDD ⇒ abcDD ⇒ abc1D ⇒ abc12
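
      The derivation above can be checked step by step with the hypothetical `direct_derivatives` helper from the earlier sketch: each string in the chain must be a direct derivative of its predecessor.

```python
chain = ["I", "ID", "IDD", "ILDD", "ILLDD", "LLLDD",
         "aLLDD", "abLDD", "abcDD", "abc1D", "abc12"]

# Verify I =>+ abc12 one step at a time.
for prev, cur in zip(chain, chain[1:]):
    assert cur in direct_derivatives(G, prev), (prev, cur)
print("I =>+ abc12 holds")
```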

  12. Definition: Language
      • A sentential form is any derivative of the start symbol S.
      • A language L generated by a grammar G is the set of all sentential forms whose symbols are all terminals; that is,
        L(G) = {ω | S ⇒+ ω and ω ∈ VT*}

  13. Example: Language
      Let G = (VN, VT, S, Φ), where:
      VN = {I, L, D}
      VT = {a, b, …, z, 0, 1, …, 9}
      S = I
      Φ = { I → L | ID | IL,
            L → a | b | … | z,
            D → 0 | 1 | … | 9 }
      • I produces abc12:
        I ⇒ ID ⇒ IDD ⇒ ILDD ⇒ ILLDD ⇒ LLLDD ⇒ aLLDD ⇒ abLDD ⇒ abcDD ⇒ abc1D ⇒ abc12
      • L(G) = {abc12, x, m934897773645, a1b2c3, …}
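
      Under the assumptions of the earlier sketches, short members of L(G) can also be enumerated by a breadth-first search over sentential forms, keeping only the all-terminal ones. This terminates here only because no production in this grammar shrinks a string, so a length bound makes the search finite; it is an illustration, not a general algorithm.

```python
from collections import deque

def sentences_up_to(g: Grammar, max_len: int) -> set:
    """Members of L(G) no longer than max_len, found by BFS over
    sentential forms reachable from the start symbol."""
    seen = {g.start}
    queue = deque([g.start])
    out = set()
    while queue:
        form = queue.popleft()
        if all(sym in g.terminals for sym in form):
            out.add(form)        # an all-terminal sentential form is a sentence
            continue
        for nxt in direct_derivatives(g, form):
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return out

print(sorted(sentences_up_to(G, 1)))   # the 26 one-letter identifiers a..z
```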

  14. Syntax Analysis: Parsing
      • The parse of a sentence is the construction of a derivation for that sentence
      • The parsing of a sentence results in acceptance or rejection and, if accepted, also a parse tree
      • We are looking for an algorithm to parse a sentence (i.e. to parse a program) and produce a parse tree.

  15. Parse Trees • A parse tree is composed of • interior nodes representing syntactic categories (non-terminal symbols) • leaf nodes representing terminal symbols • For each interior node N, the transition from N to its children represents the application of a production.
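
      A parse tree can be represented directly from this description; the following `Node` class is an illustrative sketch, not anything prescribed by the slides.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    """Interior nodes carry a non-terminal; leaves carry a terminal."""
    symbol: str
    children: List["Node"] = field(default_factory=list)

    def leaves(self) -> str:
        """Read the parsed sentence back off the frontier of the tree."""
        if not self.children:
            return self.symbol
        return "".join(child.leaves() for child in self.children)
```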

  16. Parse Tree Construction
      • Top-down
        • Starts with the root (the starting symbol)
        • Proceeds downward to the leaves using productions
      • Bottom-up
        • Starts from the leaves
        • Proceeds upward to the root
      • Although these seem like reasonable approaches to develop a parsing algorithm, we’ll see that neither works well, so we’ll need to find a better way.

  17. Example: Top-Down Parse for 4 * 2 + 3
      • VN = {E, D}
      • VT = {0, 1, …, 9, +, -, *, /, (, )}
      • S = E
      • Φ = { E → D | ( E ) | E + E | E - E | E * E | E / E,
              D → 0 | 1 | … | 9 }
      [Figure: a top-down parse tree for 4 * 2 + 3 whose root E expands to E * E and whose right E expands to E + E, i.e. the grouping 4 * (2 + 3)]
      • Problems:
        • How do we guess which rule applies?
        • Note that we produced the wrong parse tree (precedence is wrong)

  18. Ambiguous Grammar: Two Different Parse Trees for 4*2+3
      • Φ = { E → D | ( E ) | E + E | E - E | E * E | E / E,
              D → 0 | 1 | … | 9 }
      [Figure: two parse trees for 4*2+3, one with + at the root (grouping (4*2)+3) and one with * at the root (grouping 4*(2+3))]
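
      To see why the ambiguity matters, here is a small sketch (reusing the hypothetical `Node` class above) that builds both trees for 4*2+3 and evaluates them bottom-up; the two readings give different values.

```python
def d(digit):                     # build the subtree D -> digit
    return Node("D", [Node(digit)])

# Tree 1: + at the root, grouping (4 * 2) + 3
tree1 = Node("E", [Node("E", [Node("E", [d("4")]), Node("*"), Node("E", [d("2")])]),
                   Node("+"), Node("E", [d("3")])])
# Tree 2: * at the root, grouping 4 * (2 + 3)
tree2 = Node("E", [Node("E", [d("4")]), Node("*"),
                   Node("E", [Node("E", [d("2")]), Node("+"), Node("E", [d("3")])])])

print(tree1.leaves(), tree2.leaves())   # both frontiers read 4*2+3

def value(node):
    """Evaluate a parse tree bottom-up."""
    if not node.children:
        return node.symbol                  # a terminal leaf
    kids = [value(c) for c in node.children]
    if node.symbol == "D":
        return int(kids[0])                 # D -> digit
    if len(kids) == 1:
        return kids[0]                      # E -> D
    left, op, right = kids
    return left * right if op == "*" else left + right

print(value(tree1), value(tree2))           # 11 and 20: one sentence, two meanings
```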

  19. Example: Bottom-Up Parse
      • Grammar:
        A → V | I | (A + A) | (A * A)
        V → L | VL | VD
        I → D | ID
        D → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
        L → x | y | z
      • Input: ( ( z * ( x + y ) ) + 1 2 )
      • Reduction steps (leaves up to the root):
        ( ( z * ( x + y ) ) + 1 2 )
        ( ( L * ( L + L ) ) + D D )
        ( ( V * ( V + V ) ) + I D )     Problem: I ?? D
        ( ( A * ( A + A ) ) + I )
        ( ( A * A ) + A )
        ( A + A )
        A
      • Problem: scanning the entire program repeatedly

  20. So, how do we develop a parsing algorithm?
      • “Fix” the grammar so that we can go top-down, left to right, with no backup
        • LL(1) grammar: Left-to-right scan, Left-most non-terminal expanded first, one symbol of lookahead
      • “Fix” how?
        • Observe grammar properties: determine what’s needed to make them LL(1)
        • Transform grammars to make them LL(1)
      • Note: this works for many grammars, but not all (see the sketch below)
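
      The slides stop before showing a transformed grammar, but a common fix is to layer the expression grammar by precedence, e.g. E → T { + T }, T → F { * F }, F → D | ( E ). Under that assumption, the sketch below is a recursive-descent parser in which one symbol of lookahead picks every rule with no backup; the class and method names are illustrative only.

```python
class LL1ExprParser:
    """Recursive descent for the layered grammar
         E -> T { (+|-) T },  T -> F { (*|/) F },  F -> D | ( E )
       The single lookahead symbol (self.peek) decides each step."""

    def __init__(self, text: str):
        self.toks = list(text.replace(" ", ""))
        self.pos = 0

    @property
    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def eat(self, expected=None):
        tok = self.peek
        if expected is not None and tok != expected:
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def parse_E(self):                      # E -> T { (+|-) T }
        node = self.parse_T()
        while self.peek in ("+", "-"):
            node = (self.eat(), node, self.parse_T())
        return node

    def parse_T(self):                      # T -> F { (*|/) F }
        node = self.parse_F()
        while self.peek in ("*", "/"):
            node = (self.eat(), node, self.parse_F())
        return node

    def parse_F(self):                      # F -> D | ( E )
        if self.peek == "(":
            self.eat("(")
            node = self.parse_E()
            self.eat(")")
            return node
        return ("D", self.eat())            # a single digit

print(LL1ExprParser("4*2+3").parse_E())
# ('+', ('*', ('D', '4'), ('D', '2')), ('D', '3'))
```

      Because * is parsed one layer below +, the tree for 4*2+3 now groups as (4*2)+3 without any guessing, which is exactly the precedence problem the naive top-down example ran into.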
