Syntactic Analysis Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University
Outline • Overview. • Bottom-Up Parsing. • LR Parsing. • Examples.
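Front-End Components • Scanner: groups the source program (text stream) into tokens; for example, the characters m a i n ( ) { yield the tokens identifier main, symbol (, and so on. • Parser: requests the next token from the scanner and constructs the parse tree. • Semantic Analyzer: checks semantic/contextual constraints on the parse tree and produces the Intermediate Representation (file or in memory). • Symbol Table: used by the front-end components.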
Parsing Techniques • Top-down parsing • LL(1) grammars • Left-to-right scanning, Leftmost derivation, 1 symbol lookahead • Bottom-up parsing • LR(k) grammars • Left-to-right scanning, Rightmost derivation, k symbols lookahead
Top-Down Parsing • Recursive Descent Parser • simple top-down parser with backtracking • Predictive Parser • non-backtracking • uses FIRST and FOLLOW sets
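Basic Terminologies • Sentential form: a string of terminals and non-terminals that can be derived from the start symbol. • Left-sentential form: a sentential form that occurs in a leftmost derivation. • Right-sentential form: a sentential form that occurs in a rightmost derivation.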
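Example: Left Sentential Form • The input m + n * k is scanned as id + id * id. • Leftmost derivation: E => E + T => T + T => F + T => id + T => id + T * F => id + F * F => id + id * F => id + id * id • Grammar: • E → E + T • E → T • T → T * F • T → F • F → ( E ) • F → id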
Example: Right Sentential Form • Rightmost derivation: E => E + T => E + T * F => E + T * id => E + F * id => E + id * id => T + id * id => F + id * id => id + id * id • Grammar: • E → E + T • E → T • T → T * F • T → F • F → ( E ) • F → id • A rightmost derivation is also called a “canonical derivation”.
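The two derivations of id + id * id can be replayed mechanically. The following is a minimal sketch, not part of the original slides: the production numbers 1 to 6 and the helper name `derive` are assumptions made for illustration. At each step it expands either the leftmost or the rightmost non-terminal and records the resulting sentential form.

```python
# Grammar from the slides; the numbering 1..6 is an assumption of this sketch.
PRODUCTIONS = {
    1: ("E", ["E", "+", "T"]),
    2: ("E", ["T"]),
    3: ("T", ["T", "*", "F"]),
    4: ("T", ["F"]),
    5: ("F", ["(", "E", ")"]),
    6: ("F", ["id"]),
}
NONTERMINALS = {"E", "T", "F"}


def derive(order, mode):
    """Apply the productions in `order` to the leftmost or rightmost
    non-terminal and return every sentential form along the way."""
    form = ["E"]
    forms = [" ".join(form)]
    for number in order:
        head, body = PRODUCTIONS[number]
        positions = [i for i, sym in enumerate(form) if sym in NONTERMINALS]
        pos = positions[0] if mode == "leftmost" else positions[-1]
        if form[pos] != head:
            raise ValueError(f"production {number} does not apply to {form[pos]}")
        form = form[:pos] + body + form[pos + 1:]
        forms.append(" ".join(form))
    return forms


leftmost = derive([1, 2, 4, 6, 3, 4, 6, 6], "leftmost")
rightmost = derive([1, 3, 6, 4, 6, 2, 4, 6], "rightmost")
print(" => ".join(leftmost))
print(" => ".join(rightmost))
```

Both calls end in the same sentence; only the order in which the non-terminals are expanded differs. Reading the rightmost list backwards gives the order in which a bottom-up parser performs its reductions.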
Bottom-Up Parsing • Start from the leaves (the bottom) of the parse tree and keep reducing until only the start symbol remains. • Characteristics • Rightmost derivation in reverse. • Find the “handle” and reduce.
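Rightmost Derivation in Reverse • Rightmost derivation: E => E + T => E + T * F => E + T * id => E + F * id => E + id * id => T + id * id => F + id * id => id + id * id • Grammar: • E → E + T • E → T • T → T * F • T → F • F → ( E ) • F → id • During parsing, the same forms appear in the opposite order: id + id * id, F + id * id, T + id * id, E + id * id, E + F * id, E + T * id, E + T * F, E + T, E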
Basic Terminologies • Handle • A substring that matches the right side of a production, and • whose reduction (with that production) is one step of a rightmost derivation in reverse, so that it eventually leads to the start symbol.
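Example: Handle • Rightmost derivation: E => E + T => E + T * F => E + T * id => E + F * id => E + id * id => T + id * id => F + id * id => id + id * id • Grammar: • E → E + T • E → T • T → T * F • T → F • F → ( E ) • F → id • Handle: in E + T * id, the substring id is the handle (reduce with F → id). • Not a handle: in the same form, E + T matches the right side of E → E + T, but reducing it gives E * id, which is not a right-sentential form. • Note: in a right-sentential form, the string to the right of a handle contains only terminals.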
Shift-Reduce Parsing • shift: push the next input symbol onto the stack. • reduce: replace the handle on top of the stack with a non-terminal. • Goal: reduce the whole input to the start symbol.
Model of Shift-Reduce Parsing • Stack + input = current right-sentential form. • Locate the handle during parsing: • shift zero or more input symbols onto the stack until a handle β is on top of the stack. • Replace the handle with the proper non-terminal (handle pruning): • reduce β to A, where A → β
Example: Shift-Reduce Parsing of id + id * id
Stack            Input               Action
$                id + id * id $      Shift
$ id             + id * id $         Reduce (F → id)
$ F              + id * id $         Reduce (T → F)
$ T              + id * id $         Reduce (E → T)
$ E              + id * id $         Shift
$ E +            id * id $           Shift
$ E + id         * id $              Reduce (F → id)
$ E + F          * id $              Reduce (T → F)
$ E + T          * id $              Shift
$ E + T *        id $                Shift
$ E + T * id     $                   Reduce (F → id)
$ E + T * F      $                   Reduce (T → T * F)
$ E + T          $                   Reduce (E → E + T)
$ E              $                   Accept
LR Parsing Algorithms • Use grammar to construct a parsing table. • Three techniques: • Simple LR (SLR) • Canonical LR (LR) • Look Ahead LR (LALR) • Same algorithm but different ways to construct a parsing table.
LR Parsing Tables • Two tables: action and goto. • action[s_m, a_i], for the state s_m on top of the stack and the current input symbol a_i, is one of: • shift s • reduce A → β • accept • error • goto[s_m, X_i] = target state (if action[s_m, a_i] = shift s, then goto[s_m, a_i] = s)
Example: Action Table • Example: • Input = id, state =0 • Next action = Shift • Input = +, state = 3 • Next action = Reduce • Input = $, state = 1 • Next action = Accept • Input = id, state = 1 • Next action = Error
Configuration • Stack contents and unread input: (s_0 X_1 s_1 X_2 s_2 … X_m s_m, a_i a_{i+1} … a_n $) • This represents the right-sentential form: X_1 X_2 … X_m a_i a_{i+1} … a_n
LR Parser Movements • If action[s_m, a_i] = shift s, shift move: (s_0 X_1 s_1 X_2 s_2 … X_m s_m, a_i a_{i+1} … a_n $) becomes (s_0 X_1 s_1 X_2 s_2 … X_m s_m a_i s, a_{i+1} … a_n $)
LR Parser Movements • If action[s_m, a_i] = reduce A → β, reduce move: (s_0 X_1 s_1 X_2 s_2 … X_m s_m, a_i a_{i+1} … a_n $) becomes (s_0 X_1 s_1 X_2 s_2 … X_{m-r} s_{m-r} A s, a_i a_{i+1} … a_n $), where r = |β| and s = goto[s_{m-r}, A]
LR Parser Movements • If action[s_m, a_i] = accept, parsing is complete. • If action[s_m, a_i] = error, the parser reports a syntax error.
Example • E → E + T • E → T • T → T * F • T → F • F → ( E ) • F → id
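The shift, reduce, accept, and error moves can be sketched as a small table-driven driver for this grammar. The ACTION and GOTO entries below are the standard textbook SLR table for this expression grammar; the slide's own table is not reproduced in this text, so treating the two as identical is an assumption (the entries do agree with the four lookups in the Action Table example). The production numbers and the function name `parse` are also assumptions of this sketch. To stay short, the stack holds states only; the grammar symbols X_1 … X_m are implied by the states.

```python
# Productions numbered 1..6 (numbering assumed): (head, body length, text).
PRODUCTIONS = {
    1: ("E", 3, "E -> E + T"),
    2: ("E", 1, "E -> T"),
    3: ("T", 3, "T -> T * F"),
    4: ("T", 1, "T -> F"),
    5: ("F", 3, "F -> ( E )"),
    6: ("F", 1, "F -> id"),
}

# ("s", n): shift and enter state n; ("r", n): reduce by production n;
# ("acc",): accept. A missing entry means error.
ACTION = {
    0: {"id": ("s", 5), "(": ("s", 4)},
    1: {"+": ("s", 6), "$": ("acc",)},
    2: {"+": ("r", 2), "*": ("s", 7), ")": ("r", 2), "$": ("r", 2)},
    3: {t: ("r", 4) for t in "+*)$"},
    4: {"id": ("s", 5), "(": ("s", 4)},
    5: {t: ("r", 6) for t in "+*)$"},
    6: {"id": ("s", 5), "(": ("s", 4)},
    7: {"id": ("s", 5), "(": ("s", 4)},
    8: {"+": ("s", 6), ")": ("s", 11)},
    9: {"+": ("r", 1), "*": ("s", 7), ")": ("r", 1), "$": ("r", 1)},
    10: {t: ("r", 3) for t in "+*)$"},
    11: {t: ("r", 5) for t in "+*)$"},
}
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}


def parse(tokens):
    """Run the LR driver and return the reductions in the order performed."""
    stack = [0]                      # state stack, starting in state 0
    tokens = list(tokens) + ["$"]    # append the end marker
    pos = 0
    reductions = []
    while True:
        state, lookahead = stack[-1], tokens[pos]
        action = ACTION[state].get(lookahead)
        if action is None:           # error entry
            raise SyntaxError(f"unexpected {lookahead!r} in state {state}")
        if action[0] == "s":         # shift: push the new state, consume input
            stack.append(action[1])
            pos += 1
        elif action[0] == "r":       # reduce: pop |body| states, then goto
            head, length, text = PRODUCTIONS[action[1]]
            del stack[-length:]
            stack.append(GOTO[stack[-1]][head])
            reductions.append(text)
        else:                        # accept
            return reductions


for step in parse(["id", "+", "id", "*", "id"]):
    print(step)
```

The reductions come out in the reverse order of the rightmost derivation of id + id * id, which is what "rightmost derivation in reverse" means in practice.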
Conflicts • The parser cannot decide what to do next: • shift/reduce conflict • it can either shift or reduce. • reduce/reduce conflict • more than one production is eligible for the reduction. • Conflicts usually indicate an ambiguous or non-LR grammar.
Example: Shift/Reduce Conflict • stmt → if expr then stmt | if expr then stmt else stmt | ... • STACK: $ … if expr then stmt • INPUT: else … $ • The parser can either reduce if expr then stmt to stmt or shift else.
Example: Reduce/Reduce Conflict • E → T | F • T → id • F → id • STACK: $ id • INPUT: … $ • The id on the stack can be reduced with either T → id or F → id.
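The reduce/reduce situation can be made concrete with a naive check that lists every production whose right side matches the top of the stack. This is only an illustrative sketch; the function name `reduction_candidates` is hypothetical, and real LR table generators detect such conflicts while building the parsing table rather than by matching the stack at parse time.

```python
# Grammar of the reduce/reduce example: (head, body).
PRODUCTIONS = [
    ("E", ["T"]),
    ("E", ["F"]),
    ("T", ["id"]),
    ("F", ["id"]),
]


def reduction_candidates(stack):
    """Return every production whose right side matches the top of the stack."""
    return [
        (head, body)
        for head, body in PRODUCTIONS
        if stack[-len(body):] == body
    ]


candidates = reduction_candidates(["id"])
for head, body in candidates:
    print(head, "->", " ".join(body))
if len(candidates) > 1:
    print("reduce/reduce conflict: more than one production is eligible")
```

With id on top of the stack, both T → id and F → id are returned, so the parser has no basis for choosing between them.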