Parse: v. To break (a sentence) down into its component parts of speech with an explanation of the form, function, and syntactical relationship of each part. (American Heritage Dict.)
• the dog loves the cat
• × the loves dog the cat
• × the cat the dog loves
IT 327
The most practical parsers
Predictive parser: no backtracking.
• input (token string)
• stacks, parsing table
• output (syntax tree, intermediate code)
Two kinds of predictive parsers:
• Top-Down: the syntax tree is built up from the root
  Example: LL(1) parser
  Left-to-right scanning, Leftmost derivations, 1 symbol look-ahead
• Bottom-Up: the syntax tree is built up from the leaves
  Example: LR(1) parser
  Left-to-right scanning, Rightmost derivations, 1 symbol look-ahead
LL(1) Grammar
• S → A S b
• S → C
• A → a
• C → c C
• C → ε
A left-most derivation of aaaccbbb:
S ⇒ ASb ⇒ aSb ⇒ aASbb ⇒ aaSbb ⇒ aaASbbb ⇒ aaaSbbb ⇒ aaaCbbb ⇒ aaacCbbb ⇒ aaaccCbbb ⇒ aaaccbbb
[Figure: LL(1) parsing table, with one column per terminal and one for the end-of-file symbol $]
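The stack-and-table mechanism of a predictive parser can be sketched in Python. This is a minimal sketch, not the course's implementation: the table entries are an assumption read off the cases of the S(), A() and C() procedures given for this grammar, and the names TABLE, NONTERMINALS and ll1_parse are hypothetical.

```python
# Hypothetical LL(1) table for S -> A S b | C, A -> a, C -> c C | epsilon.
# Keys are (nonterminal, look-ahead); values are the right-hand side to expand.
TABLE = {
    ("S", "a"): ["A", "S", "b"],
    ("S", "b"): ["C"],
    ("S", "c"): ["C"],
    ("S", "$"): ["C"],
    ("A", "a"): ["a"],
    ("C", "b"): [],              # C -> epsilon
    ("C", "c"): ["c", "C"],
    ("C", "$"): [],              # C -> epsilon
}
NONTERMINALS = {"S", "A", "C"}


def ll1_parse(text):
    """Return the productions used, in leftmost-derivation order."""
    tokens = list(text) + ["$"]          # "$" is the end-of-file symbol
    stack = ["$", "S"]                   # the top of the stack is the list's end
    pos, used = 0, []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                raise SyntaxError(f"no rule for ({top}, {look}) at position {pos}")
            used.append(f"{top} -> {' '.join(rhs) or 'epsilon'}")
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        elif top == look:
            pos += 1                     # match a terminal (or "$")
        else:
            raise SyntaxError(f"expected {top}, saw {look} at position {pos}")
    return used


for rule in ll1_parse("aaaccbbb"):
    print(rule)
```

Only one symbol of look-ahead is consulted per step and nothing is ever undone, which is what "no backtracking" means here.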
Recursive-descent Parser
• S → A S b   • S → C   • A → a   • C → c C   • C → ε
S():
  switch (token) {        // one case for every possible terminal and the end-of-file symbol
    case a: A(); S(); get(b); build S → A S b; break;
    case b: C(); build S → C; break;
    case c: C(); build S → C; break;
    case $: C(); build S → C; break;
  }
The cases correspond to the S row of the LL(1) parsing table.
Recursive-descent Parser
• S → A S b   • S → C   • A → a   • C → c C   • C → ε
A():
  switch (token) {
    case a: get(a); build A → a; break;
    case b: error; break;
    case c: error; break;
    case $: error; break;
  }
The cases correspond to the A row of the LL(1) parsing table.
Recursive-descent Parser
• S → A S b   • S → C   • A → a   • C → c C   • C → ε
C():
  switch (token) {
    case a: error; break;
    case b: build C → ε; break;
    case c: get(c); C(); build C → c C; break;
    case $: build C → ε; break;
  }
The cases correspond to the C row of the LL(1) parsing table.
LL(1) Parsing
• S → A S b   • S → C   • A → a   • C → c C   • C → ε
Input: aaaccbbb
Pending calls at each step:
  S();
  A();S();get(b);
  get(a);S();get(b);
  S();get(b);
  A();S();get(b);get(b);
  get(a);S();get(b);get(b);
  S();get(b);get(b);
  A();S();get(b);get(b);get(b);
  get(a);S();get(b);get(b);get(b);
  S();get(b);get(b);get(b);
  C();get(b);get(b);get(b);
  get(c);C();get(b);get(b);get(b);
  C();get(b);get(b);get(b);
  get(c);C();get(b);get(b);get(b);
  C();get(b);get(b);get(b);
  get(b);get(b);get(b);
  get(b);get(b);
  get(b);
The corresponding left-most derivation:
S ⇒ ASb ⇒ aSb ⇒ aASbb ⇒ aaSbb ⇒ aaASbbb ⇒ aaaSbbb ⇒ aaaCbbb ⇒ aaacCbbb ⇒ aaaccCbbb ⇒ aaaccbbb
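The three procedures S(), A() and C() can be turned into a small runnable Python sketch. The class name Parser, the helper parse and the rules list are hypothetical; "build" is modelled here simply by recording the production, which is an assumption about what the slides' build step does.

```python
class Parser:
    """Recursive-descent parser for S -> A S b | C, A -> a, C -> c C | epsilon."""

    def __init__(self, text):
        self.tokens = list(text) + ["$"]   # "$" marks the end of file
        self.pos = 0
        self.rules = []                    # productions in the order they are built

    @property
    def token(self):
        return self.tokens[self.pos]

    def get(self, expected):
        """Consume one token, which must equal `expected`."""
        if self.token != expected:
            raise SyntaxError(f"expected {expected!r}, saw {self.token!r}")
        self.pos += 1

    def S(self):
        if self.token == "a":
            self.A()
            self.S()
            self.get("b")
            self.rules.append("S -> A S b")
        elif self.token in ("b", "c", "$"):
            self.C()
            self.rules.append("S -> C")
        else:
            raise SyntaxError(f"unexpected {self.token!r}")

    def A(self):
        if self.token == "a":
            self.get("a")
            self.rules.append("A -> a")
        else:
            raise SyntaxError(f"unexpected {self.token!r}")

    def C(self):
        if self.token == "c":
            self.get("c")
            self.C()
            self.rules.append("C -> c C")
        elif self.token in ("b", "$"):
            self.rules.append("C -> epsilon")
        else:
            raise SyntaxError(f"unexpected {self.token!r}")


def parse(text):
    parser = Parser(text)
    parser.S()
    parser.get("$")          # the whole input must have been consumed
    return parser.rules


print(parse("aaaccbbb"))
```

Because each production is recorded after its calls return, the list comes out bottom-up (children before parents), even though the parser itself chooses productions top-down.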
LL(1) Grammar
A grammar having an LL(1) parsing table, i.e., there is no conflict in the parsing table.
• S → A S b   • S → C   • A → a   • C → c C   • C → ε
LL(1) grammars allow ε-productions.
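Whether a grammar has a conflict-free table can be checked mechanically. The sketch below uses the standard FIRST/FOLLOW construction, which the slides do not spell out, so treat it as an assumption about how the table is obtained; the names ll1_table, first_of, EPS and EOF are hypothetical.

```python
EPS, EOF = "<eps>", "$"


def first_of(seq, first, nonterminals):
    """FIRST set of a sequence of grammar symbols."""
    out = set()
    for sym in seq:
        sym_first = first[sym] if sym in nonterminals else {sym}
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)                      # every symbol can vanish (or seq is empty)
    return out


def ll1_table(grammar, start):
    """grammar maps a nonterminal to a list of right-hand sides (lists of
    symbols, [] for epsilon). Returns (table, conflicts)."""
    nts = set(grammar)
    first = {a: set() for a in nts}
    follow = {a: set() for a in nts}
    follow[start].add(EOF)
    changed = True
    while changed:                    # iterate FIRST and FOLLOW to a fixed point
        changed = False
        for a, rhss in grammar.items():
            for rhs in rhss:
                new = first_of(rhs, first, nts)
                if not new <= first[a]:
                    first[a] |= new
                    changed = True
                for i, sym in enumerate(rhs):
                    if sym not in nts:
                        continue
                    tail = first_of(rhs[i + 1:], first, nts)
                    new = tail - {EPS}
                    if EPS in tail:
                        new |= follow[a]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    table, conflicts = {}, []
    for a, rhss in grammar.items():
        for rhs in rhss:
            f = first_of(rhs, first, nts)
            lookaheads = (f - {EPS}) | (follow[a] if EPS in f else set())
            for t in lookaheads:
                if (a, t) in table:
                    conflicts.append((a, t))   # two rules compete for one cell
                else:
                    table[(a, t)] = rhs
    return table, conflicts


grammar = {
    "S": [["A", "S", "b"], ["C"]],
    "A": [["a"]],
    "C": [["c", "C"], []],
}
table, conflicts = ll1_table(grammar, "S")
print(len(table), "entries; conflicts:", conflicts)
```

An empty conflict list is exactly the condition stated above for the grammar to be LL(1).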
Not every CFG is an LL(1) grammar
<stmt> ::= <if-stmt> | s1 | s2
<if-stmt> ::= if <expr> then <stmt> else <stmt>
            | if <expr> then <stmt>
<expr> ::= e1 | e2
Example: if e1 then if e2 then s1 else s2
The else can be attached to either if:
  if (a > 2)
      if (b > 1) b++;
      else a++;

  if (a > 2)
      if (b > 1) b++;
  else a++;
The recursive-descent parser does not work for every CFG
• E → E + T
• E → T
• T → T * F
• T → F
• F → ( E )
• F → id
E():
  switch (token) {
    case id: E(); ... ... ...
  }
Input: id+id*id
Left-recursions: E() calls E() again without consuming any input, so the recursion never ends.
A left-recursive grammar:
• A → A α
• A → β
Remove left-recursion:
• A → β A’
• A’ → α A’
• A’ → ε
[Figure: parse trees for the left-recursive grammar and for the grammar using A’]
Eliminating left-recursions
Left-recursive grammar:
• E → E + T
• E → T
• T → T * F
• T → F
• F → ( E )
• F → id
After eliminating left-recursions:
• E → T E’
• E’ → + T E’
• E’ → ε
• T → F T’
• T’ → * F T’
• T’ → ε
• F → ( E )
• F → id
An Algorithm for Eliminating immediate left-recursions
Given a CFG G, let A be one of its non-terminal symbols such that A → A α.
1. Add a new non-terminal symbol A’ to G;
2. For each production A → β such that A is not the 1st symbol in β, replace it by A → β A’;
3. For each production A → A α, replace it by A’ → α A’;
4. Add A’ → ε to G.
Result:
• A → β A’
• A’ → α A’
• A’ → ε
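The four steps can be sketched as a short Python function. The grammar representation (a dict from nonterminal to a list of right-hand sides, with [] for ε) and the name eliminate_immediate are hypothetical; the sketch also assumes that the name A' is not already in use and that there is no production A → A.

```python
def eliminate_immediate(grammar, a):
    """Return a copy of `grammar` with the immediate left-recursion on `a` removed.

    `grammar` maps a nonterminal to a list of right-hand sides (lists of
    symbols); an empty list stands for epsilon.
    """
    alphas = [rhs[1:] for rhs in grammar[a] if rhs[:1] == [a]]   # A -> A alpha
    betas = [rhs for rhs in grammar[a] if rhs[:1] != [a]]        # A -> beta
    result = dict(grammar)
    if not alphas:
        return result                      # nothing to do
    new = a + "'"                          # step 1: the new nonterminal A'
    result[a] = [beta + [new] for beta in betas]        # step 2: A  -> beta A'
    result[new] = [alpha + [new] for alpha in alphas]   # step 3: A' -> alpha A'
    result[new].append([])                              # step 4: A' -> epsilon
    return result


expr = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}
for nonterminal in ("E", "T"):
    expr = eliminate_immediate(expr, nonterminal)
for lhs, rhss in expr.items():
    for rhs in rhss:
        print(lhs, "->", " ".join(rhs) or "epsilon")
```

Applied to the expression grammar, this reproduces the E’/T’ grammar shown earlier.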
Indirect left-recursions
• S → A a
• S → b
• A → S d
• A → e
S is left-recursive indirectly: S ⇒ A a ⇒ S d a
Example (bdada): S ⇒ A a ⇒ S d a ⇒ A a d a ⇒ S d a d a ⇒ b d a d a
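Indirect left-recursions
Original grammar:
• S → A a
• S → b
• A → A c
• A → S d
• A → e
Step 1: find all immediate left-recursions (here A → A c) and eliminate them:
• S → A a
• S → b
• A → S d A’
• A → e A’
• A’ → c A’
• A’ → ε
Step 2: if any left-recursion remains, remove the last non-terminal symbol Z with rule Z → X… by substituting its productions (here A in S → A a):
• S → S d A’ a
• S → e A’ a
• S → b
• A’ → c A’
• A’ → ε
Step 3: find all immediate left-recursions again (here S → S d A’ a) and eliminate them:
• S → e A’ a S’
• S → b S’
• S’ → d A’ a S’
• S’ → ε
• A’ → c A’
• A’ → ε
Repeat until no left-recursion remains.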
An Algorithm for Eliminating left-recursions
Given a CFG G, let A1, A2, ..., An be its nonterminal symbols.
for i := 1 to n do {
    for j := 1 to i-1 do {    // find one level of indirection
        for each production Ai → Aj ω do {
            for each production Aj → α, add Ai → α ω to the grammar;
            remove Ai → Aj ω;
        }
    } // end for j
    eliminate the immediate left-recursion caused by Ai
} // end for i
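A minimal Python sketch of the general procedure follows. It processes the nonterminals in the order given by the `order` argument and substitutes only those already processed, which is the usual textbook ordering and an assumption here; it also assumes the input grammar has no cycles and no ε-productions. The function name and the dict-of-lists grammar representation are hypothetical.

```python
def eliminate_left_recursion(grammar, order):
    """Remove direct and indirect left-recursion.

    `grammar` maps a nonterminal to a list of right-hand sides (lists of
    symbols, [] for epsilon); `order` lists the nonterminals A1 .. An.
    """
    g = {a: [list(rhs) for rhs in rhss] for a, rhss in grammar.items()}
    for i, ai in enumerate(order):
        for aj in order[:i]:
            expanded = []
            for rhs in g[ai]:
                if rhs[:1] == [aj]:
                    # replace Ai -> Aj w by Ai -> alpha w for every Aj -> alpha
                    expanded.extend(alpha + rhs[1:] for alpha in g[aj])
                else:
                    expanded.append(rhs)
            g[ai] = expanded
        # eliminate the immediate left-recursion caused by Ai
        alphas = [rhs[1:] for rhs in g[ai] if rhs[:1] == [ai]]
        if alphas:
            betas = [rhs for rhs in g[ai] if rhs[:1] != [ai]]
            new = ai + "'"                      # assumed to be an unused name
            g[ai] = [beta + [new] for beta in betas]
            g[new] = [alpha + [new] for alpha in alphas] + [[]]
    return g


grammar = {
    "S": [["A", "a"], ["b"]],
    "A": [["A", "c"], ["S", "d"], ["e"]],
}
result = eliminate_left_recursion(grammar, ["A", "S"])
for lhs, rhss in result.items():
    for rhs in rhss:
        print(lhs, "->", " ".join(rhs) or "epsilon")
```

With the order A, S this yields the same S, S’ and A’ rules as the worked example for S → A a | b, A → A c | S d | e; the A rules remain in the result but are no longer reachable from S.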
A Grammar for if statements
1. S → i C t S E
2. S → a
3. E → e S
4. E → ε
5. C → b
Is it an LL(1) grammar? Is there an LL(1) parsing table for it? No!
A Grammar for if statements
1. S → i C t S E
2. S → a
3. E → e S
4. E → ε
5. C → b
Why is there a conflict?
Input: ibtibtae……
Both derivations start the same way:
S ⇒ ... ⇒ ibtSE… ⇒ ... ⇒ ibtibtSEE… ⇒ ibtibtaEE…
With e as the next token, either rule can be applied to the leftmost E:
• Rule 4 (E → ε): ibtibtaEE… ⇒ ibtibtaE… ⇒ ibtibtaeS…
• Rule 3 (E → e S): ibtibtaEE… ⇒ ibtibtaeSE…
A Grammar for if statements
• S → i C t S E   • S → a   • E → e S   • E → ε   • C → b
• Can we have an unambiguous equivalent grammar for this grammar? Yes!
• Can we always do so in general? No! Some inherently ambiguous languages exist.
• Can we write a program to test whether a given grammar is ambiguous? No!
• Can we write a program to get an unambiguous equivalent grammar from any grammar of a language that is known to be not inherently ambiguous? No!
Is there an LL(2) grammar? Yes!
We need to look two symbols ahead in order to determine which rule should be used.
Language: { a^m b^n c | m ≥ 1 and n ≥ 0 }
• S → A B
• A → a A
• A → a
• B → b B
• B → c
Example input: a a a a a b b b b c
[Figure: LL(2) parsing table]
LL(2) Parsing
• S → A B   • A → a A   • A → a   • B → b B   • B → c
Pending calls            Remaining input
S();                     a a a b c
A();B();                 a a a b c
get(a);A();B();          a a a b c
A();B();                 a a b c
get(a);A();B();          a a b c
A();B();                 a b c
get(a);B();              a b c
B();                     b c
get(b);B();              b c
B();                     c
get(c);                  c
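The two-symbol look-ahead can be sketched as a recursive-descent parser in Python. The function name parse_ll2 and the padding of the input with two end-of-file markers are assumptions made for illustration; the LL(2) table itself is not reproduced here, only the decisions it would encode for this grammar.

```python
def parse_ll2(text):
    """Parse a^m b^n c (m >= 1, n >= 0) with S -> A B, A -> a A | a, B -> b B | c.

    Returns the productions used, in leftmost-derivation order.
    """
    tokens = list(text) + ["$", "$"]   # padded so two symbols can always be inspected
    pos = 0
    rules = []

    def get(expected):
        nonlocal pos
        if tokens[pos] != expected:
            raise SyntaxError(f"expected {expected!r}, saw {tokens[pos]!r}")
        pos += 1

    def S():
        rules.append("S -> A B")
        A()
        B()

    def A():
        look = (tokens[pos], tokens[pos + 1])    # two symbols of look-ahead
        if look == ("a", "a"):
            rules.append("A -> a A")
            get("a")
            A()
        elif look[0] == "a":                     # the last a: next symbol is not a
            rules.append("A -> a")
            get("a")
        else:
            raise SyntaxError(f"unexpected {tokens[pos]!r}")

    def B():
        if tokens[pos] == "b":                   # one symbol is enough for B
            rules.append("B -> b B")
            get("b")
            B()
        elif tokens[pos] == "c":
            rules.append("B -> c")
            get("c")
        else:
            raise SyntaxError(f"unexpected {tokens[pos]!r}")

    S()
    get("$")                                     # all input must be consumed
    return rules


print(parse_ll2("aaabc"))
```

Only A() needs the second symbol: with one symbol, an a cannot tell A → a A from A → a.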
Is there an LL(1) grammar equivalent to the following LL(2) grammar? Yes
Language: { a^m b^n c | m ≥ 1 and n ≥ 0 }
LL(2) grammar:
• S → A B
• A → a A
• A → a
• B → b B
• B → c
Equivalent LL(1) grammar:
• S → a A B
• A → a A
• A → ε
• B → b B
• B → c
Example input: a a a a a b b b b c
No left-recursive grammar is an LL(k) grammar
But we can effectively find an equivalent one.
Left-recursive grammar:
• S → S A
• S → a
• A → b
S ⇒ SA ⇒ SAA ⇒ SAAA ⇒ SAAAA ⇒ aAAAA ⇒ .... ⇒ abbbb
Equivalent grammar without left-recursion:
• S → a S’
• S’ → A S’
• S’ → ε
• A → b
Likewise for the expression grammar:
• E → E + T   • E → T   • T → T * F   • T → F   • F → ( E )   • F → id
becomes
• E → T E’   • E’ → + T E’   • E’ → ε   • T → F T’   • T’ → * F T’   • T’ → ε   • F → ( E )   • F → id
Are we happy with this?
Does every LL(2) grammar have an equivalent LL(1) grammar? No
LL(2) grammar with no equivalent LL(1) grammar:
• S → a S A
• S → ε
• A → a b S
• A → c
LL(k) grammar, k ≥ 2, with no equivalent LL(k-1) grammar (Kurki-Suonio [1969]):
• S → a S A
• S → ε
• A → a^(k-1) b S
• A → c
LL(1) ⊂ LL(2) ⊂ LL(3) ⊂ ..... ⊂ LL(k) ⊂ LL(k+1) ⊂ ...
LL(k) grammar, k ≥ 2 (Kurki-Suonio [1969])
• S → a S A
• S → ε
• A → a^(k-1) b S
• A → c
This grammar is unambiguous, as every LL(k) grammar is.
Is there an unambiguous CFG that is not an LL(k) grammar? Yes: Stearns [1970]
{ a^n | n ≥ 0 } ∪ { a^n b^n | n ≥ 0 }
There exists a DCFL that is not LL(k).
LL(1) Parser Implementation
• E → T E’
• E’ → + T E’
• E’ → ε
• T → F T’
• T’ → * F T’
• T’ → ε
• F → ( E )
• F → n
p.s. Let n be any positive integer less than 32767.
Programming Assignment: details will be announced later.