
The most practical Parser:

Learn about two practical parser techniques: Top-Down and Bottom-Up, including LL(1) parsing, left-most derivations, LR(1) parsing, and syntax tree construction.

pantonio

Presentation Transcript


  1. The most practical parser: the predictive parser (no backtracking)
     • Input: token string
     • Stacks, parsing table
     • Output: syntax tree, intermediate code
     IT 327

  2. Two kinds of predictive parsers
     Top-down: the syntax tree is built up from the root. Example: LL(1) parser
     • Left-to-right scanning
     • Leftmost derivations
     • 1 symbol look-ahead
     Bottom-up: the syntax tree is built up from the leaves. Example: LR(1) parser
     • Left-to-right scanning
     • Rightmost derivations (traced in reverse)
     • 1 symbol look-ahead

  3. A left-most derivation
     Grammar (ε is the empty string):
     • S → A S b
     • S → C
     • A → a
     • C → c C
     • C → ε
     Derivation of aaaccbbb:
     S ⇒ ASb ⇒ aSb ⇒ aASbb ⇒ aaSbb ⇒ aaASbbb ⇒ aaaSbbb ⇒ aaaCbbb ⇒ aaacCbbb ⇒ aaaccCbbb ⇒ aaaccbbb

  4. A left-most derivation with an LL(1) grammar
     LL(1) stands for: Left-to-right scanning, Leftmost derivations, 1 symbol look-ahead.
     Grammar:
     • S → A S b
     • S → C
     • A → a
     • C → c C
     • C → ε
     Derivation:
     S ⇒ ASb ⇒ aSb ⇒ aASbb ⇒ aaSbb ⇒ aaASbbb ⇒ aaaSbbb ⇒ aaaCbbb ⇒ aaacCbbb ⇒ aaaccCbbb ⇒ aaaccbbb
     Input: aaaccbbb, followed by the end-of-file symbol.
     (Figure: LL(1) parsing table for this grammar.)

  5. LL(1) Parsing
     Grammar: S → A S b, S → C, A → a, C → c C, C → ε
     Derivation: S ⇒ ASb ⇒ aSb ⇒ aASbb ⇒ aaSbb ⇒ aaASbbb ⇒ aaaSbbb ⇒ aaaCbbb ⇒ aaacCbbb ⇒ aaaccCbbb ⇒ aaaccbbb
     Trace for input aaaccbbb (remaining input, then pending calls):
     aaaccbbb   S();
     aaaccbbb   A(); S(); get(b);
     aaaccbbb   get(a); S(); get(b);
     aaccbbb    S(); get(b);
     aaccbbb    A(); S(); get(b); get(b);
     aaccbbb    get(a); S(); get(b); get(b);
     accbbb     S(); get(b); get(b);
     accbbb     A(); S(); get(b); get(b); get(b);
     accbbb     get(a); S(); get(b); get(b); get(b);
     ccbbb      S(); get(b); get(b); get(b);
     ccbbb      C(); get(b); get(b); get(b);
     ccbbb      get(c); C(); get(b); get(b); get(b);
     cbbb       C(); get(b); get(b); get(b);
     cbbb       get(c); C(); get(b); get(b); get(b);
     bbb        C(); get(b); get(b); get(b);
     bbb        get(b); get(b); get(b);
     bb         get(b); get(b);
     b          get(b);
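
The trace above can also be driven by an explicit stack and a parsing table instead of procedure calls. The following is a minimal sketch, not the course's implementation: the names `TABLE`, `ll1_parse`, and `accepts` are made up for illustration, tokens are assumed to be single characters, and the table entries are read off the switch statements on slides 6 to 8 of this deck.

```python
# LL(1) parsing table for: S -> A S b | C,  A -> a,  C -> c C | ε
# Keys are (nonterminal, look-ahead); "$" is the end-of-file symbol.
# Missing keys are error entries. All names here are illustrative assumptions.
TABLE = {
    ("S", "a"): ["A", "S", "b"],
    ("S", "b"): ["C"],
    ("S", "c"): ["C"],
    ("S", "$"): ["C"],
    ("A", "a"): ["a"],
    ("C", "b"): [],            # C -> ε
    ("C", "c"): ["c", "C"],
    ("C", "$"): [],            # C -> ε
}
NONTERMINALS = {"S", "A", "C"}


def ll1_parse(text):
    """Return the productions used, in leftmost-derivation order."""
    tokens = list(text) + ["$"]
    pos = 0
    stack = ["$", "S"]         # the top of the stack is the end of the list
    used = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                raise SyntaxError(f"no rule for {top} on {look!r} at position {pos}")
            used.append((top, "".join(rhs) or "ε"))
            stack.extend(reversed(rhs))   # leftmost symbol ends up on top
        elif top == look:
            pos += 1                      # match a terminal (or the final $)
        else:
            raise SyntaxError(f"expected {top!r}, found {look!r} at position {pos}")
    return used


def accepts(text):
    """True if the grammar derives text."""
    try:
        ll1_parse(text)
        return True
    except SyntaxError:
        return False
```

Calling `ll1_parse("aaaccbbb")` yields the ten productions of the derivation on this slide, starting with S → ASb and ending with C → ε.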

  6. Recursive-descent Parser
     Grammar: S → A S b, S → C, A → a, C → c C, C → ε
     The cases range over all possible terminal symbols and the end-of-file symbol ($); each case follows the LL(1) parsing table.
     S():
       switch (token) {
         case a: A(); S(); get(b); build S → ASb; break;
         case b: C(); build S → C; break;
         case c: C(); build S → C; break;
         case $: C(); build S → C; break;
       }

  7. Recursive-descent Parser
     Grammar: S → A S b, S → C, A → a, C → c C, C → ε
     A():
       switch (token) {
         case a: get(a); build A → a; break;
         case b: error; break;
         case c: error; break;
         case $: error; break;
       }

  8. Recursive-descent Parser
     Grammar: S → A S b, S → C, A → a, C → c C, C → ε
     C():
       switch (token) {
         case a: error; break;
         case b: build C → ε; break;
         case c: get(c); C(); build C → cC; break;
         case $: build C → ε; break;
       }
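
The three procedures S(), A(), and C() translate almost line for line into a runnable program. Below is a minimal sketch under a few assumptions that are not in the slides: tokens are single characters, "build" is modeled by returning nested tuples as the syntax tree, and the class name `Parser` and the helper `parse` are invented for illustration.

```python
class Parser:
    """Recursive-descent parser for S -> A S b | C, A -> a, C -> c C | ε."""

    def __init__(self, text):
        self.tokens = list(text) + ["$"]   # "$" is the end-of-file symbol
        self.pos = 0

    @property
    def token(self):
        return self.tokens[self.pos]

    def get(self, expected):
        """Consume one terminal, or fail if the look-ahead differs."""
        if self.token != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.token!r}")
        self.pos += 1

    def S(self):
        if self.token == "a":                   # S -> A S b
            a = self.A()
            s = self.S()
            self.get("b")
            return ("S", a, s, "b")
        if self.token in ("b", "c", "$"):       # S -> C
            return ("S", self.C())
        raise SyntaxError(f"unexpected {self.token!r} in S")

    def A(self):
        if self.token == "a":                   # A -> a
            self.get("a")
            return ("A", "a")
        raise SyntaxError(f"unexpected {self.token!r} in A")

    def C(self):
        if self.token == "c":                   # C -> c C
            self.get("c")
            return ("C", "c", self.C())
        if self.token in ("b", "$"):            # C -> ε
            return ("C",)
        raise SyntaxError(f"unexpected {self.token!r} in C")


def parse(text):
    """Parse a whole string and return its syntax tree as nested tuples."""
    parser = Parser(text)
    tree = parser.S()
    parser.get("$")                             # all input must be consumed
    return tree
```

For example, `parse("ab")` returns `("S", ("A", "a"), ("S", ("C",)), "b")`, where the one-element tuple `("C",)` stands for C → ε.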

  10. LL(1) Parser Implementation (programming assignment)
      • E → T E’
      • E’ → + T E’
      • E’ → ε
      • T → F T’
      • T’ → * F T’
      • T’ → ε
      • F → ( E )
      • F → n
      P.S. Let n be any positive integer less than 32767.
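
One way to approach this grammar is one procedure per nonterminal, with the look-ahead token choosing the production. The sketch below is only an illustration, not the assignment's reference solution: it evaluates the expression instead of building a syntax tree, and the names `tokenize`, `ExprParser`, `E_rest`, `T_rest`, and `evaluate` are assumptions introduced here.

```python
import re


def tokenize(text):
    """Split into numbers, parentheses, + and *; append "$" as end of file."""
    tokens = re.findall(r"\d+|[()+*]", text)
    if "".join(tokens) != re.sub(r"\s+", "", text):
        raise SyntaxError("unexpected character in input")
    return tokens + ["$"]


class ExprParser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def get(self, expected):
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.peek()!r}")
        self.pos += 1

    def E(self):                      # E -> T E'
        return self.E_rest(self.T())

    def E_rest(self, left):           # E' -> + T E' | ε
        if self.peek() == "+":
            self.get("+")
            return self.E_rest(left + self.T())
        if self.peek() in (")", "$"):  # look-aheads that select E' -> ε
            return left
        raise SyntaxError(f"unexpected {self.peek()!r}")

    def T(self):                      # T -> F T'
        return self.T_rest(self.F())

    def T_rest(self, left):           # T' -> * F T' | ε
        if self.peek() == "*":
            self.get("*")
            return self.T_rest(left * self.F())
        if self.peek() in ("+", ")", "$"):  # look-aheads that select T' -> ε
            return left
        raise SyntaxError(f"unexpected {self.peek()!r}")

    def F(self):                      # F -> ( E ) | n
        tok = self.peek()
        if tok == "(":
            self.get("(")
            value = self.E()
            self.get(")")
            return value
        if tok.isdigit():
            value = int(tok)
            if not 0 < value < 32767:  # the slide's restriction on n
                raise ValueError(f"n out of range: {value}")
            self.pos += 1
            return value
        raise SyntaxError(f"unexpected {tok!r}")


def evaluate(text):
    parser = ExprParser(text)
    value = parser.E()
    parser.get("$")
    return value
```

Passing the left operand into `E_rest` and `T_rest` keeps + and * left-associative even though the grammar itself is right-recursive.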

  11. LL(1) Grammar
      A grammar having an LL(1) parsing table, i.e., there is no conflict in the parsing table.
      • S → A S b
      • S → C
      • A → a
      • C → c C
      • C → ε
      (Figure: LL(1) parsing table for this grammar.)
      LL(1) grammars allow ε-productions.
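
The "no conflict" condition can be checked mechanically. The sketch below builds the table with the standard FIRST/FOLLOW construction, which the slides do not spell out, so treat it as one possible approach rather than the course's method. The function names and the dictionary encoding of grammars are assumptions made for this example; the second grammar is the if-statement grammar that appears later in the deck.

```python
EPS = "ε"  # marker for the empty string inside FIRST sets


def first_of(seq, first):
    """FIRST of a symbol sequence, given the FIRST sets of the nonterminals."""
    out = set()
    for sym in seq:
        f = first.get(sym, {sym})      # a terminal's FIRST set is itself
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)                       # every symbol can vanish
    return out


def ll1_table(grammar, start):
    """Return (table, conflicts); a grammar is LL(1) when conflicts is empty."""
    first = {nt: set() for nt in grammar}
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")
    changed = True
    while changed:                     # iterate FIRST and FOLLOW to a fixed point
        changed = False
        for nt, rules in grammar.items():
            for rhs in rules:
                new_first = first_of(rhs, first)
                if not new_first <= first[nt]:
                    first[nt] |= new_first
                    changed = True
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue
                    tail = first_of(rhs[i + 1:], first)
                    new_follow = tail - {EPS}
                    if EPS in tail:
                        new_follow |= follow[nt]
                    if not new_follow <= follow[sym]:
                        follow[sym] |= new_follow
                        changed = True
    table = {}
    for nt, rules in grammar.items():
        for rhs in rules:
            f = first_of(rhs, first)
            lookaheads = f - {EPS}
            if EPS in f:               # an ε-deriving rule is chosen on FOLLOW
                lookaheads |= follow[nt]
            for t in lookaheads:
                table.setdefault((nt, t), []).append(rhs)
    conflicts = sorted(cell for cell, rules in table.items() if len(rules) > 1)
    return table, conflicts


# Grammars as {nonterminal: [right-hand sides]}; () is an ε-production.
SLIDE_GRAMMAR = {
    "S": [("A", "S", "b"), ("C",)],
    "A": [("a",)],
    "C": [("c", "C"), ()],
}
IF_GRAMMAR = {
    "S": [("i", "C", "t", "S", "E"), ("a",)],
    "E": [("e", "S"), ()],
    "C": [("b",)],
}
```

With these definitions, `ll1_table(SLIDE_GRAMMAR, "S")` reports no conflicts, while `ll1_table(IF_GRAMMAR, "S")` reports a single conflicting cell, `("E", "e")`.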

  12. Is the following grammar an LL(1) grammar?
      <stmt> ::= <if-stmt> | s1 | s2
      <if-stmt> ::= if <expr> then <stmt> else <stmt>
                  | if <expr> then <stmt>
      <expr> ::= e1 | e2
      Example: if e1 then if e2 then s1 else s2
      The same fragment in C, laid out two ways:
      if (a > 2)
        if (b > 1) b++;
        else a++;

      if (a > 2)
        if (b > 1) b++;
      else a++;

  13. Not every CFG is an LL(1) grammar (reasons?)
      The recursive-descent parser does not work for every CFG.
      • E → E + T
      • E → T
      • T → T * F
      • T → F
      • F → ( E )
      • F → id
      E():
        switch (token) {
          case id: E(); ... ... ...
        }
      Input: id+id*id
      Left-recursions: E() calls itself again before consuming any token, so the recursion never ends.

  14. Left-recursions
      A left-recursive grammar:
      • A → A α
      • A → β
      Remove left-recursion:
      • A → β A’
      • A’ → α A’
      • A’ → ε
      (Figure: parse trees deriving β α … α under each grammar.)

  15. Eliminating left-recursions
      Left-recursive grammar:
      • E → E + T
      • E → T
      • T → T * F
      • T → F
      • F → ( E )
      • F → id
      After eliminating left-recursions:
      • E → T E’
      • E’ → + T E’
      • E’ → ε
      • T → F T’
      • T’ → * F T’
      • T’ → ε
      • F → ( E )
      • F → id

  16. An Algorithm for Eliminating immediate left-recursions
      Given a CFG G, let A be one of its non-terminal symbols such that A → A α is a production.
      1. Add a new non-terminal symbol A’ to G;
      2. For each production A → β such that A is not the 1st symbol in β, add A → β A’ to G;
      3. For each production A → A α, replace it by A’ → α A’;
      4. Add A’ → ε to G.
      Result:
      • A → β A’
      • A’ → α A’
      • A’ → ε
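
The four steps can be sketched as a small function. This is an illustration under assumptions of my own: grammars are dictionaries mapping a nonterminal to a list of right-hand-side tuples, `()` stands for ε, the new symbol is named by appending an apostrophe (assumed not to clash with an existing symbol), and no production has the form A → A.

```python
def eliminate_immediate_left_recursion(grammar, a):
    """Return a new grammar in which nonterminal a is not immediately left-recursive."""
    rules = grammar[a]
    alphas = [rhs[1:] for rhs in rules if rhs[:1] == (a,)]   # A -> A α
    betas = [rhs for rhs in rules if rhs[:1] != (a,)]        # A -> β
    if not alphas:
        return dict(grammar)           # nothing to do
    a_new = a + "'"                    # step 1: the new nonterminal A'
    result = dict(grammar)
    result[a] = [beta + (a_new,) for beta in betas]              # step 2: A -> β A'
    result[a_new] = [alpha + (a_new,) for alpha in alphas]       # step 3: A' -> α A'
    result[a_new].append(())                                     # step 4: A' -> ε
    return result


# The expression grammar from slide 15, as an input example.
EXPR = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

step1 = eliminate_immediate_left_recursion(EXPR, "E")
step2 = eliminate_immediate_left_recursion(step1, "T")
```

After both calls, `step2` holds the transformed grammar from slide 15: E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, with F unchanged.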

  17. Indirect left-recursions
      • S → A a
      • S → b
      • A → S d
      • A → e
      Example: S ⇒ A a ⇒ S d a ⇒ A a d a ⇒ S d a d a ⇒ b d a d a (bdada)

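  18. Indirect left-recursions
      Repeat: find all immediate left-recursions and eliminate them; if any indirect left-recursion remains, remove the last non-terminal symbol Z with a rule Z → X…
      Pattern used at each elimination: A → A α, A → β becomes A → β A’, A’ → α A’, A’ → ε.
      Original grammar:
      • S → A a
      • S → b
      • A → A c
      • A → S d
      • A → e
      After eliminating the immediate left-recursion on A:
      • S → A a
      • S → b
      • A → S d A’
      • A → e A’
      • A’ → c A’
      • A’ → ε
      After removing A:
      • S → S d A’ a
      • S → e A’ a
      • S → b
      • A’ → c A’
      • A’ → ε
      After eliminating the immediate left-recursion on S:
      • S → e A’ a S’
      • S → b S’
      • S’ → d A’ a S’
      • S’ → ε
      • A’ → c A’
      • A’ → ε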

  19. An Algorithm for Eliminating left-recursions (skip this slide)
      Given a CFG G, let A1, A2, ..., An be its nonterminal symbols.
      for i := n down to 1 do {
        for j := 1 to i-1 do {    // find one level of indirection
          for each production Ai → Aj ω do {
            for each production Aj → β, add Ai → β ω to the grammar;
            remove Ai → Aj ω
          }
        }    // end for j
        eliminate the immediate left-recursion caused by Ai
      }    // end for i

  20. A Grammar for if statements
      • S → i C t S E
      • S → a
      • E → e S
      • E → ε
      • C → b
      Is it an LL(1) grammar? Is there an LL(1) parsing table for it? No!

  21. A Grammar for if statements
      • S → i C t S E
      • S → a
      • (3) E → e S
      • (4) E → ε
      • C → b
      Why is there a conflict? Input: ibtibtae……
      Using rule 4 for the inner E:
      S ⇒ ... ⇒ ibtSE… ⇒ ... ⇒ ibt ibtSEE… ⇒ ibt ibtaEE… ⇒ ibt ibta E… ⇒ ibt ibta eS…
      Using rule 3 for the inner E:
      S ⇒ ... ⇒ ibtSE… ⇒ ... ⇒ ibt ibtSEE… ⇒ ibt ibtaEE… ⇒ ibt ibtaeSE…
      After ibtibta, with look-ahead e, both rule 3 and rule 4 apply to the inner E.

  22. A Grammar for if statements
      • S → i C t S E
      • S → a
      • E → e S
      • E → ε
      • C → b
      Can we have an unambiguous equivalent grammar for this grammar? Yes! But in general, inherently ambiguous languages exist.
      Can we write a program to test whether a given grammar is ambiguous? No!
      Can we write a program to get an unambiguous equivalent grammar from any grammar of a language that is known to be not inherently ambiguous? No!

  23. Is there an LL(2) grammar? Yes!
      We need to look two symbols ahead in order to determine which rule should be used.
      Language: { a^m b^n c | m ≥ 1 and n ≥ 0 }
      • S → A B
      • A → a A
      • A → a
      • B → b B
      • B → c
      Input: a a a a a b b b b c
      (Figure: LL(2) parsing table for this grammar.)

  24. LL(2) Parsing
      Grammar: S → A B, A → a A, A → a, B → b B, B → c
      (Figure: LL(2) parsing table for this grammar.)
      Trace for input aaabc (pending calls, then remaining input):
      S();                  a a a b c
      A(); B();             a a a b c
      get(a); A(); B();     a a a b c
      A(); B();             a a b c
      get(a); A(); B();     a a b c
      A(); B();             a b c
      get(a); B();          a b c
      B();                  b c
      get(b); B();          b c
      B();                  c
      get(c);               c
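
The same trace can be reproduced by a recursive-descent parser that inspects two tokens before choosing a rule for A. This is a rough sketch with assumed names (`parse_ll2`, `used`), single-character tokens, and the input padded with two end-of-file symbols so that both look-ahead positions always exist.

```python
def parse_ll2(text):
    """Parse with S -> A B, A -> a A | a, B -> b B | c; return the rules used."""
    tokens = list(text) + ["$", "$"]   # padding for the 2-symbol look-ahead
    pos = 0
    used = []

    def get(expected):
        nonlocal pos
        if tokens[pos] != expected:
            raise SyntaxError(f"expected {expected!r}, found {tokens[pos]!r}")
        pos += 1

    def A():
        first, second = tokens[pos], tokens[pos + 1]
        if first == "a" and second == "a":            # more a's follow: A -> a A
            used.append("A -> aA")
            get("a")
            A()
        elif first == "a" and second in ("b", "c"):   # last a: A -> a
            used.append("A -> a")
            get("a")
        else:
            raise SyntaxError(f"unexpected {first!r} {second!r} in A")

    def B():
        if tokens[pos] == "b":                        # B -> b B
            used.append("B -> bB")
            get("b")
            B()
        elif tokens[pos] == "c":                      # B -> c
            used.append("B -> c")
            get("c")
        else:
            raise SyntaxError(f"unexpected {tokens[pos]!r} in B")

    used.append("S -> AB")
    A()
    B()
    get("$")                                          # all input must be consumed
    return used
```

Only A needs the second look-ahead symbol: with one symbol, an a cannot tell A → aA from A → a, whereas B's two rules already start with different terminals.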

  25. Is there an LL(1) grammar equivalent to the following LL(2) grammar? Yes.
      Language: { a^m b^n c | m ≥ 1 and n ≥ 0 }
      LL(2) grammar:
      • S → A B
      • A → a A
      • A → a
      • B → b B
      • B → c
      Equivalent LL(1) grammar:
      • S → a A B
      • A → a A
      • A → ε
      • B → b B
      • B → c
      Input: a a a a a b b b b c

  26. No left-recursive grammar is an LL(k) grammar, but we can effectively find an equivalent one.
      Left-recursive grammar:
      • S → S A
      • S → a
      • A → b
      Derivation: S ⇒ SA ⇒ SAA ⇒ SAAA ⇒ SAAAA ⇒ … ⇒ aAAAAAA ⇒ … ⇒ abbbbbb
      Equivalent grammar without left-recursion:
      • S → a S’
      • S’ → A S’
      • S’ → ε
      • A → b
      Likewise, the left-recursive grammar
      • E → E + T
      • E → T
      • T → T * F
      • T → F
      • F → ( E )
      • F → id
      becomes
      • E → T E’
      • E’ → + T E’
      • E’ → ε
      • T → F T’
      • T’ → * F T’
      • T’ → ε
      • F → ( E )
      • F → id
      Are we happy with this?

  27. Does every LL(2) grammar have an equivalent LL(1) grammar? No.
      An LL(2) grammar with no equivalent LL(1) grammar:
      • S → a S A
      • S → ε
      • A → a b S
      • A → c
      An LL(k) grammar, k ≥ 2, with no equivalent LL(k-1) grammar (Kurki-Suonio [1969]):
      • S → a S A
      • S → ε
      • A → a^(k-1) b S
      • A → c
      LL(1) ⊂ LL(2) ⊂ LL(3) ⊂ ... ⊂ LL(k) ⊂ LL(k+1) ⊂ ...

  28. LL(k) grammar, k ≥ 2 (Kurki-Suonio [1969])
      • S → a S A
      • S → ε
      • A → a^(k-1) b S
      • A → c
      This grammar is inherently ambiguous.
      Is there an unambiguous CFG that is not an LL(k) grammar? Yes: Stearns [1970].
      There exists a DCFL that is not LL(k): { a^n | n ≥ 0 } ∪ { a^n b^n | n ≥ 0 }
