Discussion #5: LL(1) Grammars & Table-Driven Parsing
Topics
• Approaches to Parsing
• Full backtracking
• Deterministic
• Simple LL(1), table-driven parsing
• Improvements to simple LL(1) grammars
Prefix Expression Grammar
• Consider the following grammar (which yields prefix expressions for binary operators):
    E → N | OEE
    O → + | - | * | /
    N → 0 | 1 | 2 | 3 | 4
• Here, prefix expressions associate an operator with the next two operands:
    * + 2 3 4  is  (* (+ 2 3) 4),  that is,  (2 + 3) * 4 = 20
    * 2 + 3 4  is  (* 2 (+ 3 4)),  that is,  2 * (3 + 4) = 14
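As a rough illustration of how a prefix operator binds to the next two operands, the sketch below evaluates such expressions recursively. The name `eval_prefix` and the choice of integer division for `/` are assumptions made for this example only.

```python
import operator

# Binary operators of the grammar; integer division for "/" is an assumption.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.floordiv}

def eval_prefix(tokens, pos=0):
    """Evaluate the prefix expression starting at tokens[pos].

    Returns (value, next_pos), mirroring the rule E -> N | O E E.
    """
    tok = tokens[pos]
    if tok in OPS:                                  # E -> O E E
        left, pos = eval_prefix(tokens, pos + 1)    # first operand
        right, pos = eval_prefix(tokens, pos)       # second operand
        return OPS[tok](left, right), pos
    return int(tok), pos + 1                        # E -> N

print(eval_prefix("* + 2 3 4".split())[0])  # 20
print(eval_prefix("* 2 + 3 4".split())[0])  # 14
```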
Top-Down Parsing with Backtracking
• Input: * + 3 4 2
• Grammar:
    E → N | OEE
    O → + | - | * | /
    N → 0 | 1 | 2 | 3 | 4
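One way to picture full backtracking is a recursive search that tries every production of a non-terminal in order and gives up on a branch as soon as a terminal fails to match. The following is only a sketch under that reading; `derive`, `parse`, and the counter of tried productions are hypothetical helpers, not part of the original slides.

```python
GRAMMAR = {
    "E": [["N"], ["O", "E", "E"]],
    "O": [["+"], ["-"], ["*"], ["/"]],
    "N": [["0"], ["1"], ["2"], ["3"], ["4"]],
}

def derive(symbols, text, pos, tried):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Non-terminal: try each production in turn (this is the backtracking).
        for rhs in GRAMMAR[head]:
            tried[0] += 1
            yield from derive(rhs + rest, text, pos, tried)
    elif pos < len(text) and text[pos] == head:
        # Terminal that matches the input: consume it and continue.
        yield from derive(rest, text, pos + 1, tried)
    # Otherwise this branch is a dead end and the caller moves on.

def parse(text):
    """Return (accepted, number of productions tried before the answer)."""
    tried = [0]
    accepted = any(end == len(text) for end in derive(["E"], text, 0, tried))
    return accepted, tried[0]

print(parse("*+342"))  # (True, <number of productions tried>)
print(parse("2*3")[0])  # False
```

The second component of the result counts how many productions were attempted, which gives a feel for the wasted work compared with the five productions the final derivation actually uses.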
What are the obvious problems?
• We never know which production to try.
• It appears to be terribly inefficient, and it is.
• Are there grammars for which we can always know which rule to choose? Yes!
• Characteristics:
  • Only single-symbol look-ahead
  • Given a non-terminal and a current symbol, we always know which production rule to apply
LL(1) Parsers
• An LL parser parses the input from Left to right, and constructs a Leftmost derivation of the sentence.
• An LL(k) parser uses k tokens of look-ahead.
• LL(1) parsers, although fairly restrictive, are attractive because they only need to look at the current non-terminal and the next token to make their parsing decisions.
• LL(1) parsers require LL(1) grammars.
Simple LL(1) Grammars
For simple LL(1) grammars, all rules have the form
    A → a₁α₁ | a₂α₂ | … | aₙαₙ
where
• aᵢ is a terminal, 1 <= i <= n
• aᵢ ≠ aⱼ for i ≠ j, and
• αᵢ is a sequence of terminals and non-terminals, or is empty, 1 <= i <= n
Creating Simple LL(1) Grammars
• Why is this not a simple LL(1) grammar?
    E → N | OEE
    O → + | - | * | /
    N → 0 | 1 | 2 | 3 | 4
  Both right sides of E begin with a non-terminal (N and O) rather than a terminal.
• How can we change it to simple LL(1)? By making all production rules of the form
    A → a₁α₁ | a₂α₂ | … | aₙαₙ
  Thus,
    E → 0 | 1 | 2 | 3 | 4 | +EE | -EE | *EE | /EE
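The check and the rewrite described here can be sketched in a few lines. This is an illustrative sketch, not the slides' own procedure: right sides are stored as strings of one-character symbols, and `is_simple_ll1` and `inline_leading` are hypothetical helper names.

```python
def is_simple_ll1(grammar):
    """Check the simple LL(1) form: every alternative starts with a terminal,
    and the alternatives of one non-terminal start with different terminals."""
    for alternatives in grammar.values():
        leading = [rhs[:1] for rhs in alternatives]
        if any(a == "" or a in grammar for a in leading):
            return False          # empty, or begins with a non-terminal
        if len(set(leading)) != len(leading):
            return False          # two alternatives share a leading terminal
    return True

def inline_leading(grammar, target):
    """Replace a leading non-terminal in each `target` alternative
    by every alternative of that non-terminal (one substitution pass)."""
    expanded = []
    for rhs in grammar[target]:
        head, tail = rhs[:1], rhs[1:]
        if head in grammar and head != target:
            expanded += [alt + tail for alt in grammar[head]]
        else:
            expanded.append(rhs)
    return {target: expanded}

ORIGINAL = {
    "E": ["N", "OEE"],
    "O": ["+", "-", "*", "/"],
    "N": ["0", "1", "2", "3", "4"],
}
SIMPLE = inline_leading(ORIGINAL, "E")

print(is_simple_ll1(ORIGINAL))  # False
print(SIMPLE["E"])              # ['0', '1', '2', '3', '4', '+EE', '-EE', '*EE', '/EE']
print(is_simple_ll1(SIMPLE))    # True
```

A single substitution pass is enough for this grammar because N and O already expand only to terminals; a grammar with longer chains of leading non-terminals would need the pass repeated.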
Example: LL(1) Parsing
    E → (1)0 | (2)1 | (3)2 | (4)3 | (5)4 | (6)+EE | (7)-EE | (8)*EE | (9)/EE
• Input * + 2 3 4: Success! Output = 8 6 3 4 5
• Input 2 * 3: Fail!
[Figure: stack and parse-tree trace for the two inputs]
Simple LL(1) Parse Table
A parse table is defined as follows:
    (V ∪ {#}) × (VT ∪ {#}) → {(aα, i), pop, accept, error}
where
• aα is the right side of production number i
• # marks the end of the input string (# ∉ V)
If A ∈ (V ∪ {#}) is the symbol on top of the stack and a ∈ (VT ∪ {#}) is the current input symbol, then:
    ACTION(A, a) =
        pop       if A = a for a ∈ VT (the matched input symbol is consumed)
        accept    if A = # and a = #
        (aα, i)   which means “pop, then push aα and output i” (A → aα is the ith production)
        error     otherwise
Parse Table
    E → (1)0 | (2)1 | (3)2 | (4)3 | (5)+EE | (6)*EE
[Table: rows are the stack symbols V ∪ {#}; columns are the input symbols VT ∪ {#}]
All blank entries are error.
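To make the table and the stack discipline concrete, here is a minimal sketch of a table-driven parser for this six-rule grammar. The dictionary layout, the function names, and the convention that a missing key stands for an error entry are assumptions of this example.

```python
# Numbered productions: rule number -> (left side, right side).
PRODUCTIONS = {1: ("E", "0"), 2: ("E", "1"), 3: ("E", "2"),
               4: ("E", "3"), 5: ("E", "+EE"), 6: ("E", "*EE")}
TERMINALS = set("0123+*")

def build_table(productions, terminals):
    """Build ACTION as a dict keyed by (stack top, input symbol)."""
    table = {}
    for i, (lhs, rhs) in productions.items():
        table[(lhs, rhs[0])] = (rhs, i)   # pop A, push the right side, output i
    for a in terminals:
        table[(a, a)] = "pop"             # matching terminal: pop and advance
    table[("#", "#")] = "accept"
    return table                          # absent keys play the role of "error"

def parse(text, table, start="E"):
    """Return the list of rule numbers used, or None on a parse error."""
    stack = ["#", start]                  # top of the stack is the end of the list
    tokens = list(text) + ["#"]
    pos, output = 0, []
    while True:
        action = table.get((stack[-1], tokens[pos]))
        if action is None:
            return None
        if action == "accept":
            return output
        stack.pop()
        if action == "pop":
            pos += 1
        else:
            rhs, i = action
            stack.extend(reversed(rhs))   # leftmost symbol ends up on top
            output.append(i)

TABLE = build_table(PRODUCTIONS, TERMINALS)
print(parse("*+231", TABLE))  # [6, 5, 3, 4, 2]
print(parse("2*3", TABLE))    # None
```

The second input fails because, after rule 3 consumes the 2, the stack holds only # while the input still shows *, and that table entry is blank.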
Simple LL(1): More Restrictive than Necessary
• Simple LL(1) grammars are very easy and efficient to parse but also very restrictive.
• The good news: we can achieve the same desirable results without being so restrictive.
• How? We only need to retain the restriction that single-symbol look-ahead uniquely determines which rule to use.
Relaxing Simple LL(1) Restrictions
• Consider the following grammar, which is not simple LL(1):
    E → (1)N | (2)OEE
    O → (3)+ | (4)*
    N → (5)0 | (6)1 | (7)2 | (8)3
• What are the problem rules? (1) & (2)
• Observe that it is possible to distinguish between rules 1 and 2:
  • N leads to {0, 1, 2, 3}
  • O leads to {+, *}
  • {0, 1, 2, 3} ∩ {+, *} = ∅
• Thus, if we see 0, 1, 2, or 3 we choose (1), and if we see + or *, we choose (2).
LL(1) Grammars
• FIRST(α) = {a | α ⇒* aβ and a ∈ VT}
• A grammar is LL(1) if for all rules of the form
    A → α₁ | α₂ | … | αₙ
  the sets FIRST(α₁), FIRST(α₂), …, and FIRST(αₙ) are pair-wise disjoint; that is,
    FIRST(αᵢ) ∩ FIRST(αⱼ) = ∅ for i ≠ j
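The disjointness condition can be checked mechanically. The sketch below computes FIRST for grammars without empty right sides, which is an assumption that holds for the grammars in this discussion; `first` and `is_ll1` are hypothetical names, and right sides are strings of one-character symbols.

```python
GRAMMAR = {
    "E": ["N", "OEE"],
    "O": ["+", "*"],
    "N": ["0", "1", "2", "3"],
}

def first(alpha, grammar, seen=frozenset()):
    """FIRST(alpha): the terminals that can begin a string derived from alpha.

    Assumes no empty right sides, so only the first symbol of alpha matters.
    """
    head = alpha[0]
    if head not in grammar:               # terminal
        return {head}
    if head in seen:                      # left-recursive loop adds nothing new
        return set()
    result = set()
    for rhs in grammar[head]:
        result |= first(rhs, grammar, seen | {head})
    return result

def is_ll1(grammar):
    """True if the FIRST sets of each non-terminal's alternatives are pair-wise disjoint."""
    for alternatives in grammar.values():
        covered = set()
        for rhs in alternatives:
            f = first(rhs, grammar)
            if covered & f:
                return False
            covered |= f
    return True

print(sorted(first("N", GRAMMAR)))    # ['0', '1', '2', '3']
print(sorted(first("OEE", GRAMMAR)))  # ['*', '+']
print(is_ll1(GRAMMAR))                # True
```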
For (A, a), we select (α, i) if a ∈ FIRST(α) and α is the right-hand side of rule i.
    E → (1)N | (2)OEE
    O → (3)+ | (4)*
    N → (5)0 | (6)1 | (7)2 | (8)3
[Table: rows are the stack symbols V ∪ {#}; columns are the input symbols VT ∪ {#}]
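The selection rule above can be turned into a table and a parser driver. What follows is a sketch under a few assumptions: one-character symbols, no empty right sides, no left recursion, and a dictionary whose missing keys stand for error entries. The function names are illustrative.

```python
RULES = {1: ("E", "N"), 2: ("E", "OEE"), 3: ("O", "+"), 4: ("O", "*"),
         5: ("N", "0"), 6: ("N", "1"), 7: ("N", "2"), 8: ("N", "3")}
TERMINALS = set("0123+*")

def first(alpha):
    """FIRST(alpha), assuming no empty right sides and no left recursion."""
    head = alpha[0]
    if head in TERMINALS:
        return {head}
    return set().union(*(first(rhs) for lhs, rhs in RULES.values() if lhs == head))

def build_table():
    """ACTION keyed by (stack top, input symbol); missing keys mean error."""
    table = {}
    for i, (lhs, rhs) in RULES.items():
        for a in first(rhs):
            if (lhs, a) in table:
                raise ValueError("grammar is not LL(1)")
            table[(lhs, a)] = (rhs, i)    # select (alpha, i) when a is in FIRST(alpha)
    for a in TERMINALS:
        table[(a, a)] = "pop"
    table[("#", "#")] = "accept"
    return table

def parse(text, table, start="E"):
    """Return the rule numbers in the order they are applied, or None on error."""
    stack, tokens = ["#", start], list(text) + ["#"]
    pos, output = 0, []
    while True:
        action = table.get((stack[-1], tokens[pos]))
        if action is None:
            return None
        if action == "accept":
            return output
        stack.pop()
        if action == "pop":
            pos += 1                      # terminal matched: consume the input symbol
        else:
            rhs, i = action
            stack.extend(reversed(rhs))   # leftmost symbol ends up on top
            output.append(i)

TABLE = build_table()
print(parse("*+123", TABLE))  # [2, 4, 2, 3, 1, 6, 1, 7, 1, 8]
```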
What does 2 4 2 3 1 6 1 7 1 8 mean?
    E → (1)N | (2)OEE
    O → (3)+ | (4)*
    N → (5)0 | (6)1 | (7)2 | (8)3
• 2 4 2 3 1 6 1 7 1 8 defines a parse tree via a preorder traversal:
    E (2)OEE
      O (4)*
      E (2)OEE
        O (3)+
        E (1)N
          N (6)1
        E (1)N
          N (7)2
      E (1)N
        N (8)3
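Because the rule numbers arrive in preorder, the tree can be rebuilt by expanding the leftmost unexpanded non-terminal with the next number in the sequence. The sketch below does exactly that; the tuple layout `(symbol, rule, children)` and the helper names are assumptions of this example.

```python
RULES = {1: ("E", "N"), 2: ("E", "OEE"), 3: ("O", "+"), 4: ("O", "*"),
         5: ("N", "0"), 6: ("N", "1"), 7: ("N", "2"), 8: ("N", "3")}
NONTERMINALS = {"E", "O", "N"}

def build_tree(sequence, start="E"):
    """Rebuild the parse tree from rule numbers listed in preorder."""
    numbers = iter(sequence)

    def expand(symbol):
        if symbol not in NONTERMINALS:
            return symbol                          # terminal leaf
        i = next(numbers)                          # next rule in preorder
        lhs, rhs = RULES[i]
        if lhs != symbol:
            raise ValueError(f"rule {i} does not expand {symbol}")
        # Children are expanded left to right, matching the preorder listing.
        return (symbol, i, [expand(s) for s in rhs])

    return expand(start)

def frontier(node):
    """Concatenate the leaves from left to right to recover the sentence."""
    if isinstance(node, str):
        return node
    return "".join(frontier(child) for child in node[2])

tree = build_tree([2, 4, 2, 3, 1, 6, 1, 7, 1, 8])
print(frontier(tree))  # *+123
```

Reading the leaves back gives the parsed input, which is a quick sanity check that the sequence and the tree describe the same derivation.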