
Top-Down Parsing

Top-Down Parsing. Identify a leftmost derivation for an input string. Why? By always replacing the leftmost non-terminal via a production rule, we are guaranteed to build the parse tree left to right, consistent with scanning the input.

gisela




Presentation Transcript


  1. Top-Down Parsing • Identify a leftmost derivation for an input string • Why? By always replacing the leftmost non-terminal via a production rule, we are guaranteed to build the parse tree left to right, consistent with scanning the input. • A ⇒ aBc ⇒ adDc ⇒ adec (scan a, scan d, scan e, scan c: accept!) • Recursive-descent parsing concepts • Predictive parsing • Recursive / brute-force technique • Non-recursive / table-driven • Error recovery • Implementation

  2. Top-Down Parsing • From Grammar to Parser, take I

  3. Recursive Descent Parsing • General category of top-down parsing • Choose a production rule based on the input symbol • May require backtracking to correct a wrong choice • Example: S → c A d, A → ab | a, input: cad • [Figure: parse trees built for input cad. S is expanded to c A d; trying A → ab matches a but fails on b, so the parser backs up and tries A → a, which succeeds.] • Problem: backtracking
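
The backtracking behavior on this slide can be sketched in a few lines of Python. This is a minimal illustration for the slide's grammar only, not a general parser; the names `GRAMMAR`, `derive` and `accepts_backtracking` are made up for the example, and the approach loops forever on left-recursive grammars.

```python
# Grammar from the slide: S -> c A d,  A -> a b | a
GRAMMAR = {
    "S": [["c", "A", "d"]],
    "A": [["a", "b"], ["a"]],
}


def derive(symbols, text, pos):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Non-terminal: try each alternative in order. An alternative that
        # fails yields nothing, and the loop moves on: that is the backtrack.
        for alternative in GRAMMAR[head]:
            yield from derive(alternative + rest, text, pos)
    elif pos < len(text) and text[pos] == head:
        # Terminal: it must match the current input character.
        yield from derive(rest, text, pos + 1)


def accepts_backtracking(text):
    """True if the whole input can be derived from S."""
    return any(end == len(text) for end in derive(["S"], text, 0))


print(accepts_backtracking("cad"))   # A -> ab fails on 'd', A -> a succeeds
```

On input `cad` the first alternative `A -> a b` consumes `a` and then fails, so the generator falls through to `A -> a`, which mirrors the wrong-choice-then-retry sequence in the figure.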

  4. Top-Down Parsing • From Grammar to Parser, take II

  5. Predictive Parsing • Backtracking is bad! • To eliminate backtracking, what must be true of the grammar? • No left recursion • Left factoring has been applied • (Frequently) when the grammar satisfies these conditions, the current input symbol together with the current non-terminal uniquely determines the production to apply. • Use transition diagrams. For each non-terminal of the grammar: • 1. Create an initial and a final state • 2. If A → X1 X2 … Xn is a production, add a path from the initial state to the final state with edges X1, X2, …, Xn • Once the transition diagrams have been built, translate them directly into procedures, one per diagram, using recursion where needed.
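
The slide asks for "no left recursion" without showing how to get there. A minimal sketch of the usual textbook rewrite for immediate left recursion (A → Aα | β becomes A → βA’, A’ → αA’ | ε) follows. The function name and the list-of-lists grammar encoding are assumptions made for this example; indirect left recursion and ε-alternatives in the input are not handled.

```python
EPSILON = "ε"


def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A a1 | ... | b1 | ...  as  A -> b A',  A' -> a A' | ε.

    `alternatives` is a list of right-hand sides, each a list of symbols.
    Assumes no alternative is empty; returns a dict of the new rules.
    """
    # Tails of the left-recursive alternatives (the "alpha" parts).
    recursive = [alt[1:] for alt in alternatives if alt[0] == nonterminal]
    # Alternatives that do not start with A (the "beta" parts).
    others = [alt for alt in alternatives if alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: alternatives}
    fresh = nonterminal + "'"   # name of the new non-terminal A'
    return {
        nonterminal: [alt + [fresh] for alt in others],
        fresh: [alt + [fresh] for alt in recursive] + [[EPSILON]],
    }


rules = eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
for head, bodies in rules.items():
    print(head, "->", " | ".join(" ".join(body) for body in bodies))
```

Applied to E → E + T | T, this produces E → T E’ and E’ → + T E’ | ε, the same shape as the expression grammar used on the following slides.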

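  6. Transition Diagrams • Recall the earlier grammar and its associated transition diagrams: E → T E’, E’ → + T E’ | ε, T → F T’, T’ → * F T’ | ε, F → ( E ) | id • [Figure: one transition diagram per non-terminal. E: states 0, 1, 2 with edges T, E’. E’: states 3, 4, 5, 6 with edges +, T, E’ and an ε-edge. T: states 7, 8, 9 with edges F, T’. T’: states 10, 11, 12, 13 with edges *, F, T’ and an ε-edge. F: states 14, 15, 16, 17 with edges (, E, ) and an id edge.] • Unlike their lexical equivalents, each edge represents a token (or a non-terminal) • Transition implies: if the edge is a token, match the input; otherwise call the procedure • How are transition diagrams used? Are ε-moves a problem? Can we simplify transition diagrams? Why is simplification critical?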

  7. How are Transition Diagrams Used?
     main() { TD_E(); }
     TD_E() { TD_T(); TD_E’(); }
     TD_E’() { token = get_token(); if token = ‘+’ then { TD_T(); TD_E’(); } }
     TD_T() { TD_F(); TD_T’(); }
     TD_T’() { token = get_token(); if token = ‘*’ then { TD_F(); TD_T’(); } }
     TD_F() { token = get_token(); if token = ‘(’ then { TD_E(); match(‘)’); } else if token.value <> id then { error + EXIT } else ... }
     • What happened to ε-moves? They are the omitted branch in TD_E’ and TD_T’: “else unget() and terminate”. • NOTE: not all error conditions have been represented.
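
The procedures above translate almost line for line into a runnable recursive-descent recognizer. The sketch below is one possible rendering in Python, assuming the input is already a list of token strings such as `"id"`, `"+"`, `"("`; the class name `PredictiveParser` and the one-token `peek` (used here instead of the slide's `get_token`/`unget` pair) are choices made for this example.

```python
class PredictiveParser:
    """Recognizer for E -> T E', E' -> + T E' | ε, T -> F T', T' -> * F T' | ε,
    F -> ( E ) | id, with one procedure per non-terminal."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks end of input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.peek()!r}")
        self.pos += 1

    def E(self):                 # E -> T E'
        self.T()
        self.E_prime()

    def E_prime(self):           # E' -> + T E' | ε
        if self.peek() == "+":
            self.match("+")
            self.T()
            self.E_prime()
        # else: ε-move, consume nothing and return

    def T(self):                 # T -> F T'
        self.F()
        self.T_prime()

    def T_prime(self):           # T' -> * F T' | ε
        if self.peek() == "*":
            self.match("*")
            self.F()
            self.T_prime()
        # else: ε-move, consume nothing and return

    def F(self):                 # F -> ( E ) | id
        if self.peek() == "(":
            self.match("(")
            self.E()
            self.match(")")
        else:
            self.match("id")


def accepts_predictive(tokens):
    parser = PredictiveParser(tokens)
    try:
        parser.E()
        parser.match("$")        # all input must be consumed
    except SyntaxError:
        return False
    return True


print(accepts_predictive(["id", "+", "id", "*", "id"]))
```

Peeking at the next token without consuming it plays the role of "else unget() and terminate": when the lookahead is neither `+` nor `*`, the procedure simply returns and leaves the token for its caller.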

  8. How can Transition Diagrams be Simplified? [Figure: original diagram for E’ with states 3, 4, 5, 6 and edges +, T, E’, ε.]

  9. How can Transition Diagrams be Simplified? (2) [Figure: the original E’ diagram next to a second version with states 3, 4, 5, 6 and edges +, T, ε.]

  10. How can Transition Diagrams be Simplified? (3) [Figure: the two E’ diagrams from the previous step next to a third version with states 3, 4, 6 and edges +, T, ε.]

  11. How can Transition Diagrams be Simplified? (4) [Figure: the three E’ diagrams from the previous step, plus the diagram for E with states 0, 1, 2 and edges T, E’.]

  12. How can Transition Diagrams be Simplified? (5) [Figure: the diagrams from the previous step, plus a combined diagram for E with states 0, 3, 4, 6 and edges T, +, ε.]

  13. Additional Transition Diagram Simplifications • Similar steps for T and T’ • Simplified transition diagrams: [Figure: T’ with states 10, 11, 13 and edges *, F, ε; T with states 7, 10, 13 and edges F, *, ε; F with states 14, 15, 16, 17 and edges (, E, ), id.] • Why is simplification important? How does the code change?

  14. Top-Down Parsing • From Grammar to Parser, take III

  15. Motivating Table-Driven Parsing • 1. Scan the input left to right • 2. Find a leftmost derivation • Grammar: E → T E’, E’ → + T E’ | ε, T → id • Input: id + id $ ($ is the terminator) • Derivation: E ⇒ • Processing stack:

  16. Non-Recursive / Table Driven • [Figure: the predictive parsing program reads the input (string + terminator, e.g. a + b $), keeps a stack (e.g. X Y Z $, where $ is the empty-stack symbol) of non-terminal and terminal symbols of the CFG, consults the parsing table M[A,a] to decide what action to take based on stack and input, and produces output.] • General parser behavior, with X the top of the stack and a the current input: • 1. When X = a = $: halt, accept, success • 2. When X = a ≠ $: pop X off the stack, advance the input, go to 1 • 3. When X is a non-terminal, examine M[X,a]: • if it is an error, call the recovery routine • if M[X,a] = {X → UVW}, pop X and push W, V, U (so that U ends up on top) • do NOT expend any input

  17. Algorithm for Non-Recursive Parsing
      set ip to point to the first symbol of w$;   /* ip: input pointer */
      repeat
        let X be the top stack symbol and a the symbol pointed to by ip;
        if X is a terminal or $ then
          if X = a then pop X from the stack and advance ip
          else error()
        else /* X is a non-terminal */
          if M[X,a] = X → Y1 Y2 … Yk then begin
            pop X from the stack;
            push Yk, Yk-1, …, Y1 onto the stack, with Y1 on top;
            output the production X → Y1 Y2 … Yk   /* may also execute other code based on the production used */
          end
          else error()
      until X = $ /* stack is empty */

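  18. Example • Our well-worn example: E → T E’, E’ → + T E’ | ε, T → F T’, T’ → * F T’ | ε, F → ( E ) | id • Table M (rows: non-terminals; columns: input symbols; blank entries are errors):

      Non-terminal | id        | +           | *            | (         | )        | $
      E            | E → TE’   |             |              | E → TE’   |          |
      E’           |           | E’ → +TE’   |              |           | E’ → ε   | E’ → ε
      T            | T → FT’   |             |              | T → FT’   |          |
      T’           |           | T’ → ε      | T’ → *FT’    |           | T’ → ε   | T’ → ε
      F            | F → id    |             |              | F → (E)   |          |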

  19. Trace of Example STACK INPUT OUTPUT

  20. Trace of Example • Each OUTPUT entry is the production whose expansion produced that row; rows with no entry follow a match that expends input.

      STACK        INPUT             OUTPUT
      $E           id + id * id$
      $E’T         id + id * id$     E → TE’
      $E’T’F       id + id * id$     T → FT’
      $E’T’id      id + id * id$     F → id
      $E’T’        + id * id$
      $E’          + id * id$        T’ → ε
      $E’T+        + id * id$        E’ → +TE’
      $E’T         id * id$
      $E’T’F       id * id$          T → FT’
      $E’T’id      id * id$          F → id
      $E’T’        * id$
      $E’T’F*      * id$             T’ → *FT’
      $E’T’F       id$
      $E’T’id      id$               F → id
      $E’T’        $
      $E’          $                 T’ → ε
      $            $                 E’ → ε
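
The algorithm from slide 17 and the table from slide 18 can be combined into a short runnable sketch that reproduces the OUTPUT column of this trace. The dictionary encoding of table M, the ASCII spelling `E'` for E’, and the function name `ll1_parse` are choices made for this example; missing table entries stand for error cells, and error recovery is reduced to raising an exception.

```python
EPSILON = "ε"
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

# Parsing table M[A, a] for the expression grammar; absent keys are errors.
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"],
    ("E'", ")"): [EPSILON], ("E'", "$"): [EPSILON],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"],
    ("T'", "+"): [EPSILON], ("T'", ")"): [EPSILON], ("T'", "$"): [EPSILON],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}


def ll1_parse(tokens, start="E"):
    """Return the list of productions used, in order, or raise SyntaxError."""
    buffer = list(tokens) + ["$"]     # input followed by the terminator
    stack = ["$", start]              # top of stack is the end of the list
    ip = 0                            # input pointer
    output = []
    while stack:
        top, current = stack[-1], buffer[ip]
        if top not in NONTERMINALS:   # terminal or $: must match the input
            if top != current:
                raise SyntaxError(f"expected {top!r}, found {current!r}")
            stack.pop()
            ip += 1
        else:                         # non-terminal: consult the table
            body = TABLE.get((top, current))
            if body is None:
                raise SyntaxError(f"no entry M[{top}, {current}]")
            stack.pop()
            # Push the body in reverse so its first symbol ends up on top;
            # an ε body pushes nothing. No input is consumed here.
            stack.extend(s for s in reversed(body) if s != EPSILON)
            output.append(f"{top} -> {' '.join(body)}")
    return output


for production in ll1_parse(["id", "+", "id", "*", "id"]):
    print(production)
```

Run on id + id * id, the sketch emits eleven productions, starting with `E -> T E'` and ending with `E' -> ε`, in the same order as the OUTPUT column above.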

  21. Leftmost Derivation for the Example • The leftmost derivation for the example is as follows: E ⇒ TE’ ⇒ FT’E’ ⇒ id T’E’ ⇒ id E’ ⇒ id + TE’ ⇒ id + FT’E’ ⇒ id + id T’E’ ⇒ id + id * FT’E’ ⇒ id + id * id T’E’ ⇒ id + id * id E’ ⇒ id + id * id

  22. What’s the Missing Puzzle Piece? Constructing the parsing table M! • 1st: Calculate First and Follow for the grammar • 2nd: Apply the construction algorithm for the parsing table (we’ll see this shortly) • Basic tools: • First: Let α be a string of grammar symbols. First(α) is the set that includes every terminal that appears leftmost in α or in any string derived from α. NOTE: if α ⇒* ε, then ε is in First(α). • Follow: Let A be a non-terminal. Follow(A) is the set of terminals a that can appear directly to the right of A in some sentential form (S ⇒* αAaβ for some α and β). NOTE: if S ⇒* αA, then $ is in Follow(A).

  23. Motivation Behind First & Follow • First: used to help find the appropriate production to apply, given the top-of-stack non-terminal and the current input symbol. Example: if A → α and a is in First(α), then when the input is a, replace A with α on the stack (a is one of the first symbols of α, so when A is on the stack and a is the input, pop A and push α). • Follow: used when First has a conflict, to resolve choices, or when First gives no suggestion. When α → ε or α ⇒* ε, what follows A dictates the next choice to be made. Example: if A → α and b is in Follow(A), then when α ⇒* ε and b is the input symbol, we expand A with α, which will eventually expand to ε, after which b follows (α ⇒* ε, i.e., First(α) contains ε).
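
The two uses of First and Follow described above are exactly the two rules of the standard LL(1) table construction, which these slides only announce. A minimal sketch under stated assumptions: the First sets are those of the expression grammar, the Follow sets are hand-computed for that grammar and do not appear in the slides, an empty list encodes an ε body, and `build_table` is a name made up for this example.

```python
EPSILON = "ε"

PRODUCTIONS = [
    ("E", ["T", "E'"]),
    ("E'", ["+", "T", "E'"]), ("E'", []),          # [] encodes an ε body
    ("T", ["F", "T'"]),
    ("T'", ["*", "F", "T'"]), ("T'", []),
    ("F", ["(", "E", ")"]), ("F", ["id"]),
]

FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPSILON}, "T'": {"*", EPSILON},
}

# Assumption: hand-computed Follow sets for this grammar.
FOLLOW = {
    "E": {")", "$"}, "E'": {")", "$"},
    "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
    "F": {"+", "*", ")", "$"},
}


def first_of_string(symbols):
    """First(X1 ... Xn); a symbol without a FIRST entry is a terminal."""
    result = set()
    for symbol in symbols:
        symbol_first = FIRST.get(symbol, {symbol})
        result |= symbol_first - {EPSILON}
        if EPSILON not in symbol_first:
            return result
    result.add(EPSILON)      # every symbol can vanish (or the body is empty)
    return result


def build_table(productions):
    table = {}
    for head, body in productions:
        body_first = first_of_string(body)
        lookaheads = body_first - {EPSILON}        # rule 1: a in First(α)
        if EPSILON in body_first:
            lookaheads |= FOLLOW[head]             # rule 2: b in Follow(A)
        for terminal in lookaheads:
            if (head, terminal) in table:
                raise ValueError(f"conflict at M[{head}, {terminal}]")
            table[(head, terminal)] = body
    return table


table = build_table(PRODUCTIONS)
print(len(table), "entries; M[T', +] =", table[("T'", "+")] or EPSILON)
```

A duplicate entry means two productions compete for the same (non-terminal, input) pair, which is the situation the earlier left-recursion and left-factoring requirements are meant to prevent.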

  24. An example • Grammar: S → a B C d, B → C B | ε | S a, C → b • STACK: $S • INPUT: abbd$ • OUTPUT:

  25. Computing First(X): All Grammar Symbols • 1. If X is a terminal, First(X) = {X} • 2. If X → ε is a production rule, add ε to First(X) • 3. If X is a non-terminal and X → Y1 Y2 … Yk is a production rule: • Place First(Y1) in First(X) • if Y1 ⇒* ε, place First(Y2) in First(X) • if Y2 ⇒* ε, place First(Y3) in First(X) • … • if Yk-1 ⇒* ε, place First(Yk) in First(X) • NOTE: as soon as some Yi does not derive ε, stop. • Repeat the above steps until no more elements are added to any First( ) set. • Checking “Yj ⇒* ε?” essentially amounts to checking whether ε belongs to First(Yj).

  26. Computing First(X): All Grammar Symbols, continued • Informally, suppose we want to compute First(X1 X2 … Xn) = First(X1) “+” First(X2) if ε is in First(X1) “+” First(X3) if ε is also in First(X2) “+” … “+” First(Xn) if ε is also in First(Xn-1) • Note 1: Only add ε to First(X1 X2 … Xn) if ε is in First(Xi) for all i • Note 2: For First(X1), if X1 → Z1 Z2 … Zm, then we need to compute First(Z1 Z2 … Zm)!
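
The rules on these two slides amount to a fixed-point computation: keep applying them until no First set grows. A minimal Python sketch follows, assuming a grammar encoded as a dict from non-terminal to a list of bodies, with an empty list for an ε body; the names `first_of_string` and `compute_first` are made up for this example.

```python
EPSILON = "ε"


def first_of_string(symbols, first):
    """First(X1 X2 ... Xn), using the First sets computed so far."""
    result = set()
    for symbol in symbols:
        # A symbol with no entry in `first` is a terminal: First(a) = {a}.
        symbol_first = first.get(symbol, {symbol})
        result |= symbol_first - {EPSILON}
        if EPSILON not in symbol_first:
            return result          # this Xi cannot vanish: stop here
    result.add(EPSILON)            # Note 1: ε only if every Xi has ε
    return result


def compute_first(grammar):
    """Repeat the rules until no First set changes."""
    first = {nonterminal: set() for nonterminal in grammar}
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for body in alternatives:
                new = first_of_string(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


EXPRESSION_GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}

for nonterminal, symbols in compute_first(EXPRESSION_GRAMMAR).items():
    print(f"First({nonterminal}) =", sorted(symbols))
```

The outer loop is needed because First(E) depends on First(T), which depends on First(F): a single pass in dictionary order would leave First(E) empty, and later passes fill it in.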

  27. Example 1 • Given the production rules: S → i E t S S’ | a, S’ → e S | ε, E → b

  28. Example 1 • Given the production rules: S → i E t S S’ | a, S’ → e S | ε, E → b • Verify that First(S) = { i, a }, First(S’) = { e, ε }, First(E) = { b }

  29. Example 2 • E → T E’, E’ → + T E’ | ε, T → F T’, T’ → * F T’ | ε, F → ( E ) | id • Computing First for:

  30. Example 2 • E → T E’, E’ → + T E’ | ε, T → F T’, T’ → * F T’ | ε, F → ( E ) | id • Computing First for E: First(TE’) = First(T) “+” First(E’), but First(E’) is not included since T does not derive ε, so First(E) = First(T) • Computing First for T: First(FT’) = First(F) “+” First(T’), but First(T’) is not included since F does not derive ε, so First(T) = First(F) • Computing First for F: First((E)) “+” First(id) = “(” and “id” • Overall: First(E) = First(T) = First(F) = { (, id }, First(E’) = { +, ε }, First(T’) = { *, ε }
