Parsing Recognition of strings in a language L7Parse
Graph of a Grammar
• Represents leftmost derivations of a CFG.
• A path from node S to a node w is a leftmost derivation.
[Figure: graph of a grammar. Nodes shown: S, aS, bB, a, baB, bbC, aaS, abB, bbS, …]
Properties of the Graph of a Grammar
• Every node has a finite number of children.
  • Simple breadth-first enumeration is feasible.
• The number of leaves is infinite if the language is infinite.
  • Typical case.
• There can be infinitely long paths (derivations).
  • Depth-first traversals can loop forever.
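The first property can be sketched in a few lines: a node has at most one child per rule of its leftmost non-terminal, so a queue-based enumeration always makes progress. The toy grammar below (S → aS | bB | a, B → aB | b) and the names `children` and `first_leaves` are assumptions made for illustration; they are not taken from the slides.

```python
from collections import deque

# Hypothetical toy grammar (an assumption): uppercase letters are
# non-terminals, lowercase letters are terminals.
GRAMMAR = {
    "S": ["aS", "bB", "a"],
    "B": ["aB", "b"],
}

def children(form, grammar):
    """Sentential forms reachable from `form` in one leftmost derivation step."""
    for i, symbol in enumerate(form):
        if symbol in grammar:  # leftmost non-terminal found
            return [form[:i] + rhs + form[i + 1:] for rhs in grammar[symbol]]
    return []  # no non-terminal left: `form` is a leaf (a terminal string)

def first_leaves(grammar, start, count):
    """Breadth-first enumeration of the first `count` leaves of the graph."""
    queue = deque([start])
    leaves = []
    while queue and len(leaves) < count:
        form = queue.popleft()
        kids = children(form, grammar)
        if not kids:
            leaves.append(form)
        queue.extend(kids)
    return leaves

print(children("aS", GRAMMAR))          # ['aaS', 'abB', 'aa']
print(first_leaves(GRAMMAR, "S", 3))    # ['a', 'aa', 'bb']
```

The `count` limit is needed because the language of this toy grammar is infinite, so the set of leaves is infinite as well.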
Directed Acyclic Graph
(Illustrates ambiguity in the grammar.)
[Figure: nodes shown: S, aS, Sb, ab, Sbb, aaS, aab, aSb, abb, …]
Cyclic structure
(Illustrates an ambiguous grammar with cycles.)
[Figure: nodes shown: S, SS, SSS]
Parser
A program that determines whether a string w is in L(G) by constructing a derivation. Equivalently, it searches the graph of G.
• Top-down parsers
  • Construct the derivation tree from root to leaves.
  • Leftmost derivation.
• Bottom-up parsers
  • Construct the derivation tree from leaves to root.
  • Rightmost derivation in reverse.
Derivation Trees
Leftmost derivation
[Figure: successive derivation trees for a leftmost derivation. Node labels shown: S, a, a, b]
Derivation Trees
Rightmost derivation
[Figure: successive derivation trees. Node labels shown: S, b, a, b]
Rightmost Derivation in Reverse
[Figure: successive derivation trees. Node labels shown: S, a, b]
Top-down parsers: Breadth-first vs Depth-first
Search the graph of a grammar breadth-first
• Uses: Queue
• (+) Always terminates with a shortest derivation when one exists
• (-) Inefficient in general
Search the graph of a grammar depth-first
• Uses: Stack
• (-) Can get into infinite loops (e.g., left recursion)
• (+) Efficient in general
Determining when a sentential form cannot derive w
• Number of terminals in the sentential form > length of w.
• Prefix of the sentential form preceding the leftmost non-terminal is not a prefix of w.
• No rules are applicable to the sentential form.
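The three conditions can be sketched as a single predicate. The grammar used here (S → A, A → T | A+T, T → b | (A)) is an assumption read off the parsing examples, and the function name `is_dead_end` is hypothetical. Every symbol is one character, and the keys of the dictionary are the non-terminals.

```python
# Assumed example grammar: S -> A, A -> T | A+T, T -> b | (A)
GRAMMAR = {"S": ["A"], "A": ["T", "A+T"], "T": ["b", "(A)"]}

def is_dead_end(form, w, grammar=GRAMMAR):
    """True if the sentential form `form` cannot lead to the string `w`."""
    if form == w:
        return False  # success, not a dead end
    # Condition 1: more terminals than w has symbols.
    terminals = sum(1 for s in form if s not in grammar)
    if terminals > len(w):
        return True
    # Condition 2: the terminal prefix before the leftmost
    # non-terminal must be a prefix of w.
    idx = next((i for i, s in enumerate(form) if s in grammar), None)
    prefix = form if idx is None else form[:idx]
    if not w.startswith(prefix):
        return True
    # Condition 3: no rule applies (no non-terminal left,
    # or the leftmost non-terminal has no rules).
    return idx is None or not grammar[form[idx]]

print(is_dead_end("A+T+T", "b"))    # True  (two terminals, w has one)
print(is_dead_end("(A)", "b+b"))    # True  ("(" is not a prefix of w)
print(is_dead_end("A+T", "b+b"))    # False (still viable)
```

Condition 1 is safe for any CFG because a derivation step never removes a terminal that is already in the sentential form.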
Parsing Examples
Breadth-first top-down parser
Queue up left sentential forms level by level.
[Figure: search tree. Nodes shown: S, A, T, A+T, b, (A), A+T+T, T+T, (A)+T, T+T+T, (T), (A+T), …, A+T+T+T, …, (T)+T, …, …, (b), ((A)), (b)+T, …, (b)+b]
Parse successful.
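A minimal sketch of a breadth-first top-down parser, assuming the grammar S → A, A → T | A+T, T → b | (A) (inferred from the nodes of the search tree, not stated on the slide). The name `bfs_parse` is hypothetical. Forms with too many terminals or a mismatched terminal prefix are dropped before they are queued.

```python
from collections import deque

# Assumed example grammar: S -> A, A -> T | A+T, T -> b | (A)
GRAMMAR = {"S": ["A"], "A": ["T", "A+T"], "T": ["b", "(A)"]}

def bfs_parse(w, grammar=GRAMMAR, start="S"):
    """Return a leftmost derivation of `w` as a list of forms, or None."""
    queue = deque([[start]])
    while queue:
        derivation = queue.popleft()
        form = derivation[-1]
        if form == w:
            return derivation
        idx = next((i for i, s in enumerate(form) if s in grammar), None)
        if idx is None:
            continue  # terminal string different from w
        for rhs in grammar[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(1 for s in new if s not in grammar)
            nxt = next((i for i, s in enumerate(new) if s in grammar), len(new))
            # Queue only forms that can still derive w.
            if terminals <= len(w) and w.startswith(new[:nxt]):
                queue.append(derivation + [new])
    return None

print(bfs_parse("(b)+b"))
# ['S', 'A', 'A+T', 'T+T', '(A)+T', '(T)+T', '(b)+T', '(b)+b']
print(bfs_parse("b+"))  # None
```

For this grammar the search always terminates: every rule that lengthens a form also adds a terminal, so the terminal-count check bounds the size of the forms that are queued.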
Depth-first top-down parser
Use a stack to pursue an entire path from the left.
Backtrack on failure.
[Figure: search tree. Nodes shown: S, A, T, A+T, b, (A), A+T+T, T+T, (T), (A+T), T+T+T, A+T+T+T, …, …, …, (b), ((A))]
Parse fails.
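A sketch of the depth-first variant, again assuming the grammar S → A, A → T | A+T, T → b | (A). It prunes only on the terminal prefix, so the left-recursive rule A → A+T can send it down an endless path. The `max_steps` budget is a hypothetical safeguard added for this illustration so that the loop becomes visible as an exception instead of a hang.

```python
# Assumed example grammar: S -> A, A -> T | A+T, T -> b | (A)
GRAMMAR = {"S": ["A"], "A": ["T", "A+T"], "T": ["b", "(A)"]}

def dfs_parse(w, grammar=GRAMMAR, start="S", max_steps=2000):
    """Depth-first search with backtracking; rules are tried in listed order."""
    stack = [[start]]
    steps = 0
    while stack:
        steps += 1
        if steps > max_steps:
            raise RuntimeError("step budget exhausted (possible infinite path)")
        derivation = stack.pop()  # popping an older entry is the backtrack
        form = derivation[-1]
        if form == w:
            return derivation
        idx = next((i for i, s in enumerate(form) if s in grammar), None)
        if idx is None:
            continue  # dead end: terminal string other than w
        # Push alternatives in reverse so the first rule is explored first.
        for rhs in reversed(grammar[form[idx]]):
            new = form[:idx] + rhs + form[idx + 1:]
            nxt = next((i for i, s in enumerate(new) if s in grammar), len(new))
            if w.startswith(new[:nxt]):
                stack.append(derivation + [new])
    return None

print(dfs_parse("b"))  # ['S', 'A', 'T', 'b']
try:
    dfs_parse("(b)+b")
except RuntimeError as err:
    print("gave up:", err)
```

On "(b)+b" the search commits to A ⇒ T ⇒ (A) and then keeps expanding (A+T), (A+T+T), and so on, never returning to the alternative A ⇒ A+T that leads to the answer. Adding the terminal-count check from the dead-end conditions would cut that path off.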
Summary
• In the BFTD (breadth-first top-down) version, all leftmost derivations are investigated in parallel.
• In the DFTD (depth-first top-down) version, one specific derivation is pursued to completion. (Incomplete strategy; used by the Prolog interpreter.)
  • Done, if it succeeds.
  • Otherwise, backtrack and investigate another path.
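Bottom-up parsing
[Figure: reduction search starting from (b)+b. Nodes shown: T+b, A+b, A+T, (b)+b, (T)+b, (b)+T, (A)+b, (T)+T, (T)+T, (b)+A, …, (S)+b, (A)+T, …, A, S. Part of the tree is marked "Not allowed".]
Parse successful.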
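A bottom-up search can be sketched as breadth-first reduction: replace an occurrence of a right-hand side by its left-hand side until only the start symbol remains. The grammar (S → A, A → T | A+T, T → b | (A)) is assumed from the example, and `bottom_up_parse` is a hypothetical name. A reduction is taken only when everything to its right is terminal, which is what makes the result a rightmost derivation in reverse.

```python
from collections import deque

# Assumed example grammar, written as (left-hand side, right-hand side) pairs.
RULES = [("S", "A"), ("A", "T"), ("A", "A+T"), ("T", "b"), ("T", "(A)")]
NONTERMINALS = {"S", "A", "T"}

def bottom_up_parse(w, rules=RULES, start="S"):
    """Return the list of forms from `w` reduced to `start`, or None."""
    queue = deque([[w]])
    seen = {w}
    while queue:
        reductions = queue.popleft()
        form = reductions[-1]
        if form == start:
            return reductions
        for lhs, rhs in rules:
            pos = form.find(rhs)
            while pos != -1:
                suffix = form[pos + len(rhs):]
                # Reduce only if no non-terminal occurs to the right.
                if not any(s in NONTERMINALS for s in suffix):
                    new = form[:pos] + lhs + suffix
                    if new not in seen:
                        seen.add(new)
                        queue.append(reductions + [new])
                pos = form.find(rhs, pos + 1)
    return None

print(bottom_up_parse("(b)+b"))
# ['(b)+b', '(T)+b', '(A)+b', 'T+b', 'A+b', 'A+T', 'A', 'S']
```

Read backwards, the returned list is the rightmost derivation S ⇒ A ⇒ A+T ⇒ A+b ⇒ T+b ⇒ (A)+b ⇒ (T)+b ⇒ (b)+b.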
Practical Parsers
• Language/grammar designed to enable deterministic (directed and backtrack-free) searches.
• Use lookahead tokens and/or exploit the context in the sentential form constructed so far.
  • “Look before you leap.” vs “Procrastination principle.”
• Top-down parsers: LL(k) languages
  • E.g., Pascal, Ada, etc.
  • Better error diagnosis and recovery.
• Bottom-up parsers: LALR(1), LR(k) languages
  • E.g., C/C++, Java, etc.
  • Handle left recursion in the grammar.
• Backtracking parsers
  • E.g., Prolog interpreter.
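As an illustration of a deterministic top-down search, here is a sketch of a recursive-descent recognizer that decides each step from one lookahead symbol. It assumes the example grammar S → A, A → T | A+T, T → b | (A) with the left recursion rewritten as A → T ('+' T)*; that rewriting and the function name `accepts` are assumptions for this example.

```python
def accepts(w):
    """LL(1)-style recognizer for A -> T ('+' T)*, T -> b | '(' A ')'."""
    pos = 0

    def peek():
        # One symbol of lookahead; None marks the end of input.
        return w[pos] if pos < len(w) else None

    def expect(ch):
        nonlocal pos
        if peek() != ch:
            raise SyntaxError(f"expected {ch!r} at position {pos}")
        pos += 1

    def parse_a():
        parse_t()
        while peek() == "+":  # lookahead decides whether to continue
            expect("+")
            parse_t()

    def parse_t():
        if peek() == "b":
            expect("b")
        elif peek() == "(":
            expect("(")
            parse_a()
            expect(")")
        else:
            raise SyntaxError(f"unexpected symbol at position {pos}")

    try:
        parse_a()
        return pos == len(w)  # all input must be consumed
    except SyntaxError:
        return False

print(accepts("(b)+b"))  # True
print(accepts("b+"))     # False
```

No queue, stack of alternatives, or backtracking is needed here: the lookahead symbol selects the only rule that can apply, and a mismatch is reported at the exact input position.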