Top-down Parsing • By Georgi Boychev, Rafal Kala, Ildus Mukhametov
Outline • Introduction. • Directional top-down parsing. • Imitating leftmost derivations. • The pushdown automaton. • Breadth-first top-down parsing. • Depth-first (backtracking) parsers. • Conclusion.
Introduction • Parsers can check whether a word matches a certain grammar, and provide one or more syntactic analyses. • There are two basic types: • Top-down parsing • Directional (goes from left to right). • Non-directional. • Bottom-up parsing
Introduction • Today we will discuss directional top-down parsing.
Directional Top-down Parsing • Begin with the start symbol S. • Apply productions until we arrive at the input string. • We draw the prediction right under the part of the input it predicts.
Imitating Leftmost Derivations • The sentential form (our prediction) consists of both terminals and non-terminals. • If a terminal symbol is in front, we match it with the current input symbol; if a non-terminal is in front, we pick one of its right-hand sides. • This way we always replace the leftmost non-terminal, and in the end, if we succeed, we have imitated a leftmost derivation.
Example • Our grammar includes the rules S → aB | bA and B → b | bS | aBB. • The input sentence is aabb.
Example ctd. • We try to rederive the input aabb from the start symbol S. • The first symbol of our prediction is a non-terminal, so we have to replace it by one of its right-hand sides: S → aB | bA. • We apply the first option, because the terminals match. • Now we have to parse abb against the prediction B. Of the choices B → b | bS | aBB, only aBB starts with a, so we apply it and match terminals again.
Example ctd. • We are now left with the prediction BB for the input bb. • We replace the leftmost B by one of its choices (B → b | bS | aBB); here B → b works, and the remaining B is then replaced by b in the same way. • In the end we obtain the following derivation: S → aB → aaBB → aabB → aabb
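The derivation above can be replayed mechanically. The following is a minimal sketch, assuming single-character symbols with upper-case letters as non-terminals; the function name `leftmost_derivation` and the `GRAMMAR` dictionary are illustrative, and only the rules for S and B quoted in the example are included.

```python
# Rules for S and B as quoted in the example; other rules are not needed here.
GRAMMAR = {
    "S": ["aB", "bA"],
    "B": ["b", "bS", "aBB"],
}

def leftmost_derivation(start, choices):
    """Replay a leftmost derivation.

    Each entry of `choices` is the right-hand side picked for the
    leftmost non-terminal of the current sentential form.
    """
    form = start
    steps = [form]
    for rhs in choices:
        # Locate the leftmost non-terminal (upper-case by our convention).
        pos = next(i for i, sym in enumerate(form) if sym.isupper())
        if rhs not in GRAMMAR[form[pos]]:
            raise ValueError(f"{form[pos]} -> {rhs} is not a rule")
        form = form[:pos] + rhs + form[pos + 1:]
        steps.append(form)
    return steps

steps = leftmost_derivation("S", ["aB", "aBB", "b", "b"])
print(" → ".join(steps))  # S → aB → aaBB → aabB → aabb
```

The parser's real work is choosing the entries of `choices`; the sketch only shows that, once chosen, they always rewrite the leftmost non-terminal.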
Push-down Automaton • A stack is a first-in, last-out (FILO) list. • The PDA operates by popping a symbol off the stack (which holds symbols of the stack alphabet) and reading an input symbol. • These two symbols give us a choice of several lists of stack symbols to be pushed back on the stack. • So there is a mapping from (input symbol, stack symbol) pairs to the possible lists of stack symbols. • The automaton accepts the input sentence when the stack is empty at the end of the input.
Example • Grammar: • Input: aabb • PDA:
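The (input symbol, stack symbol) mapping can be sketched as a dictionary. This is a hedged illustration rather than the automaton from the slide: the entries for S and B follow from S → aB | bA and B → b | bS | aBB, while the entries for A are an assumption (a mirror image of B), since A's rule is not quoted in the text. The names `TRANSITIONS` and `accepts` are hypothetical.

```python
# (input symbol, stack symbol) -> candidate lists of stack symbols to push.
TRANSITIONS = {
    ("a", "S"): [["B"]],
    ("b", "S"): [["A"]],
    ("b", "B"): [[], ["S"]],
    ("a", "B"): [["B", "B"]],
    ("a", "A"): [[], ["S"]],    # ASSUMED: A -> a | aS
    ("b", "A"): [["A", "A"]],   # ASSUMED: A -> bAA
}

def accepts(word, start="S"):
    """Simulate the non-deterministic PDA by tracking every reachable stack."""
    stacks = {(start,)}  # each stack is a tuple, top of stack first
    for ch in word:
        next_stacks = set()
        for stack in stacks:
            if not stack:
                continue  # empty stack but input remains: this run fails
            top, rest = stack[0], stack[1:]
            for pushed in TRANSITIONS.get((ch, top), []):
                next_stacks.add(tuple(pushed) + rest)
        stacks = next_stacks
    # Accept if some run ends with an empty stack at the end of the input.
    return () in stacks

print(accepts("aabb"))  # True
print(accepts("aab"))   # False
```

Keeping a set of stacks is one simple way to handle the non-determinism; it already anticipates the breadth-first strategy discussed next.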
Breadth-first Top-Down Parsing • There are two different strategies for going through the decision tree: breadth-first and depth-first. • In breadth-first parsing we maintain a list of all possible predictions. • We process it in the following way: • If there is a non-terminal on top, we replace the prediction stack by several new prediction stacks, one for each choice for this non-terminal. • If there is a terminal on top, we eliminate all the prediction stacks that do not match the current input symbol.
Example • Grammar: • S → AB | DC • A → a | aA • B → bc | bBc • D → ab | aDb • C → c | cC • Input: aabc
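The predict and match steps can be sketched for this grammar as follows. This is a minimal illustration, assuming single-character symbols, an end marker `#` that does not occur in the input, and a grammar without left recursion (otherwise the prediction step would not terminate). The function name `breadth_first_parse` is hypothetical.

```python
GRAMMAR = {
    "S": ["AB", "DC"],
    "A": ["a", "aA"],
    "B": ["bc", "bBc"],
    "D": ["ab", "aDb"],
    "C": ["c", "cC"],
}

def breadth_first_parse(word, start="S", end="#"):
    """Return every rule sequence (leftmost derivation) that yields `word`."""
    # Each entry pairs the rules applied so far with a prediction stack,
    # written as a string whose first character is the top of the stack.
    predictions = [([], start + end)]
    for ch in word + end:
        # Predict: expand until every stack has a terminal on top.
        expanded, todo = [], predictions
        while todo:
            nxt = []
            for analysis, pred in todo:
                top = pred[:1]
                if top in GRAMMAR:
                    for rhs in GRAMMAR[top]:
                        nxt.append((analysis + [(top, rhs)], rhs + pred[1:]))
                else:
                    expanded.append((analysis, pred))
            todo = nxt
        # Match: keep only the stacks whose top equals the input symbol.
        predictions = [(a, p[1:]) for a, p in expanded if p[:1] == ch]
    return [a for a, p in predictions if p == ""]

print(breadth_first_parse("aabc"))
# [[('S', 'AB'), ('A', 'aA'), ('A', 'a'), ('B', 'bc')]]
```

For the input abc the same function returns two analyses (one through AB, one through DC), which shows that the list of predictions keeps all alternatives alive at once.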
Depth-first (Backtracking) Parsers • The breadth-first method can use a lot of memory, because it stores a list of all possible predictions. • The depth-first method doesn't have this problem, because we look at only one path at a time. • We first examine one path; if it turns out to be a failure, we roll back our actions and continue with the other possibilities.
Backtracking • Sometimes we have multiple right-hand sides and we have to choose one. • But if we choose the wrong one, we come to a dead end. • So, we have to go back to the point where we made the choice, and try an alternative path. • We do this until we succeed, or run out of choices.
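The choose, fail, go back, retry cycle described above can be sketched with recursion, where returning from a failed call restores the earlier input position and prediction. This is an illustrative sketch only: it reuses the grammar S → AB | DC from the breadth-first example, assumes single-character symbols and no left recursion, and the function name `depth_first_parse` is hypothetical.

```python
GRAMMAR = {
    "S": ["AB", "DC"],
    "A": ["a", "aA"],
    "B": ["bc", "bBc"],
    "D": ["ab", "aDb"],
    "C": ["c", "cC"],
}

def depth_first_parse(word, prediction="S", pos=0, analysis=()):
    """Return the first rule sequence that derives `word`, or None."""
    if not prediction:
        # Success only if the whole input has been consumed.
        return list(analysis) if pos == len(word) else None
    top, rest = prediction[0], prediction[1:]
    if top in GRAMMAR:
        # Non-terminal: try its right-hand sides one at a time.
        for rhs in GRAMMAR[top]:
            result = depth_first_parse(
                word, rhs + rest, pos, analysis + ((top, rhs),)
            )
            if result is not None:
                return result
        return None  # every choice was a dead end: backtrack to the caller
    # Terminal: it must match the current input symbol.
    if pos < len(word) and word[pos] == top:
        return depth_first_parse(word, rest, pos + 1, analysis)
    return None

print(depth_first_parse("aabc"))
# [('S', 'AB'), ('A', 'aA'), ('A', 'a'), ('B', 'bc')]
```

On the input aabc the parser first tries A → a, reaches a dead end when B cannot match the second a, and only then retries with A → aA. Only one path is held in memory at any time.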
Example • Backtracking over a terminal is done by moving the vertical line, which marks the current position in the input, backwards.
Conclusion • We always process the leftmost symbol of the prediction. • If this symbol is a terminal, we have no choice: we have to match it with the current input symbol or reject the parse. • If this symbol is a non-terminal, we have to make a prediction: it has to be replaced by one of its right-hand sides. Thus, we always process the leftmost non-terminal first, so we get a leftmost derivation. • As a result, a top-down method recognizes the nodes of the parse tree in pre-order: the parent is identified before any of its children.
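The pre-order property can be checked by rebuilding a parse tree from the rule sequence of a leftmost derivation. The sketch below is illustrative: `build_tree` is a hypothetical helper, symbols are assumed to be single characters with upper-case non-terminals, and the rule list is the derivation of aabb from the earlier example.

```python
def build_tree(rules, start="S"):
    """Rebuild a parse tree from a leftmost-derivation rule sequence.

    Rules are consumed in pre-order: a node takes its rule before any
    of its children do, and children are handled left to right.
    """
    remaining = iter(rules)

    def node(symbol):
        if not symbol.isupper():
            return symbol  # terminal leaf
        lhs, rhs = next(remaining)
        if lhs != symbol:
            raise ValueError(f"expected a rule for {symbol}, got {lhs}")
        return (symbol, [node(child) for child in rhs])

    return node(start)

tree = build_tree([("S", "aB"), ("B", "aBB"), ("B", "b"), ("B", "b")])
print(tree)
# ('S', ['a', ('B', ['a', ('B', ['b']), ('B', ['b'])])])
```

Because each rule is taken exactly when its parent node is visited, the order of rules produced by a top-down parser is the pre-order listing of the tree's internal nodes.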