Lecture 13 Parsing and Ambiguity
Given a string x and a CFG G = (V, Σ, R, S), determine whether x ∈ L(G) and, if x ∈ L(G), find a derivation S ⇒* x. This problem is called parsing. To solve the parsing problem, we first study the parse tree.
The parse tree is the graph representation of a derivation. It can be defined inductively as follows:
(1) A single vertex labeled with a nonterminal symbol is a parse tree.
(2) If A → y1y2 … yn is a rule in R, then the tree with root A and children y1, y2, …, yn (from left to right) is a parse tree.
(3) If A → ε is a rule in R, then the tree with root A and a single child ε is a parse tree.
(4) If a parse tree has a leaf that is the root of another parse tree, then their union is a parse tree.
(5) Nothing else is a parse tree.
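Each derivation has a parse tree. Consider CFG G = ({S}, {a, b, c}, R, S) where R = {S → SbS | ScS | a}. The derivation S ⇒ SbS ⇒ SbScS ⇒ abScS ⇒ abSca ⇒ abaca has the following parse tree, written in bracket form, where X(…) lists the children of X from left to right:
S( S(a), b, S( S(a), c, S(a) ) )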
However, a parse tree may correspond to several derivations. For example, the derivation S ⇒ SbS ⇒ SbScS ⇒ SbSca ⇒ abSca ⇒ abaca has the same parse tree as above.
Leftmost derivation
A derivation S ⇒* y is called a leftmost derivation, written S ⇒*_left y, if y is obtained from S by a sequence of steps, each of which applies a rule to the leftmost nonterminal symbol. For example:
S ⇒_left SbS ⇒_left abS ⇒_left abScS ⇒_left abacS ⇒_left abaca
Each parse tree corresponds to exactly one leftmost derivation.
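The correspondence between a parse tree and its leftmost derivation can be sketched in a few lines of Python. This is only an illustration: the `(label, children)` tuple representation, the helper `leaf`, and the function name `leftmost_derivation` are assumptions made for this example, and an ε leaf would be represented by the empty string.

```python
# A parse tree is a (label, children) pair; a leaf has an empty children list.
# Illustrative convention: an ε leaf would be written as leaf("").

def leaf(symbol):
    return (symbol, [])

def leftmost_derivation(tree):
    """Return the sentential forms of the leftmost derivation encoded by tree."""
    # The current sentential form is kept as a list of subtrees; a subtree
    # that still has children is a nonterminal waiting to be expanded.
    form = [tree]
    steps = ["".join(label for label, _ in form)]
    while True:
        # Index of the leftmost subtree that can still be expanded.
        idx = next((i for i, (_, kids) in enumerate(form) if kids), None)
        if idx is None:
            return steps
        _, kids = form[idx]
        form[idx:idx + 1] = kids  # apply the rule used at that node
        steps.append("".join(label for label, _ in form))

def s_to_a():
    # The subtree for the rule S → a.
    return ("S", [leaf("a")])

# Parse tree S( S(a), b, S( S(a), c, S(a) ) ) for the string abaca.
tree = ("S", [s_to_a(), leaf("b"), ("S", [s_to_a(), leaf("c"), s_to_a()])])
derivation = leftmost_derivation(tree)
print(" => ".join(derivation))
# S => SbS => abS => abScS => abacS => abaca
```

Because the procedure always expands the leftmost expandable node, a given tree can produce only one such sequence.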
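The parse tree for a derivation S ⇒* x of a string x in L(G) has at least |x| leaves; their concatenation, read from left to right, is x. (Leaves labeled ε contribute nothing to the concatenation, which is why the tree may have more than |x| leaves.)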
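As a small illustration of this property, the sketch below collects the leaves of a parse tree and concatenates them. The `(label, children)` representation and the names `leaves` and `tree_00` are assumptions of this example; an ε leaf is represented by the empty string. The example tree uses the rules S → A00, A → AA, and A → ε.

```python
def leaves(tree):
    """Return the leaf labels of a (label, children) parse tree, left to right."""
    label, children = tree
    if not children:
        return [label]
    result = []
    for child in children:
        result.extend(leaves(child))
    return result

# Parse tree S( A( A(ε), A(ε) ), 0, 0 ), with ε written as "".
eps_a = ("A", [("", [])])
tree_00 = ("S", [("A", [eps_a, eps_a]), ("0", []), ("0", [])])

leaf_labels = leaves(tree_00)
word = "".join(leaf_labels)
print(len(leaf_labels), repr(word))
```

Here the tree has four leaves while the derived string 00 has length two, so the bound "at least |x|" is not always tight.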
Ambiguity
A string x in L(G) may have two or more parse trees witnessing S ⇒* x. The grammar G is said to be ambiguous if such a string exists. CFG G = ({S}, {a, b, c}, R, S) where R = {S → SbS | ScS | a} is ambiguous because abaca has two parse trees (in bracket form, where X(…) lists the children of X from left to right):
S( S( S(a), b, S(a) ), c, S(a) )
S( S(a), b, S( S(a), c, S(a) ) )
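One way to check this claim mechanically is to count parse trees by recursion over substrings. The sketch below is not part of the lecture; the names `RULES` and `count_trees` are hypothetical, symbols are assumed to be single characters, and the recursion terminates only for grammars with no ε-rules and no unit rules (such as the example grammar).

```python
from functools import lru_cache

# Example grammar from the text: S → SbS | ScS | a.
RULES = {"S": ["SbS", "ScS", "a"]}

def count_trees(symbol, text):
    """Count the parse trees rooted at symbol whose leaves spell text."""

    @lru_cache(maxsize=None)
    def count_symbol(sym, i, j):
        # Number of parse trees rooted at sym that derive text[i:j].
        if sym not in RULES:  # terminal symbol
            return 1 if text[i:j] == sym else 0
        return sum(count_seq(rhs, i, j) for rhs in RULES[sym])

    @lru_cache(maxsize=None)
    def count_seq(seq, i, j):
        # Number of ways the symbol sequence seq derives text[i:j].
        if not seq:
            return 1 if i == j else 0
        first, rest = seq[0], seq[1:]
        total = 0
        # Without ε-rules every symbol covers at least one character,
        # so the first symbol ends somewhere in i+1 .. j-len(rest).
        for k in range(i + 1, j - len(rest) + 1):
            total += count_symbol(first, i, k) * count_seq(rest, k, j)
        return total

    return count_symbol(symbol, 0, len(text))

print(count_trees("S", "abaca"))  # 2
```

A count greater than one for some string shows that the grammar is ambiguous; a count of one for a few test strings does not prove the opposite.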
How to remove ambiguity is an important issue in compiler theory. However, determining whether a CFG is ambiguous is undecidable. CFG G = ({S, A}, {0,1}, R, S) where R = {S → A00, A → ε | AA | 0 | 1} is ambiguous because 00 has two parse trees (in bracket form, where X(…) lists the children of X from left to right):
S( A(ε), 0, 0 )
S( A( A(ε), A(ε) ), 0, 0 )
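This ambiguity in deriving 00 can be removed by removing the rule A → ε and adding the rule S → 00, so that the language is unchanged: CFG G = ({S, A}, {0,1}, R, S) where R = {S → 00 | A00, A → AA | 0 | 1}. Note that the rule A → AA still allows more than one parse tree for longer strings (for example, A ⇒* 000 can be grouped in two ways, so 00000 has two parse trees); replacing A → AA | 0 | 1 by A → 0A | 1A | 0 | 1 removes that ambiguity as well.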
Parsing Algorithm
A string w in (V ∪ Σ)* is a left sentential form if S ⇒*_left w. The leftmost graph g(G) for CFG G is defined as follows:
(a) The vertex set is the set of all left sentential forms.
(b) There is a directed edge (x, y) if x ⇒_left y.
Usually, g(G) is an infinite digraph.
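If no rule of the form A → ε exists, then g(G) is nondecreasing, that is, every edge (x, y) satisfies |x| ≤ |y|. A search for the input string can therefore discard every left sentential form longer than the input, which leaves only finitely many vertices to visit, and hence a depth-first search or breadth-first search would solve the parsing problem.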
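The search described above can be sketched as a breadth-first search over the leftmost graph. This is a minimal illustration under stated assumptions, not the lecture's own algorithm text: the names `GRAMMAR` and `parse_bfs` are hypothetical, symbols are single characters, the grammar has no ε-rules, and the extra check on the terminal prefix is an optional pruning step added here.

```python
from collections import deque

# Example grammar from the text: S → SbS | ScS | a (no ε-rules).
GRAMMAR = {"S": ["SbS", "ScS", "a"]}

def parse_bfs(rules, start, target):
    """Return a leftmost derivation of target as a list of forms, or None."""
    parent = {start: None}  # visited forms, each mapped to its predecessor
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == target:
            # Walk the predecessor links back to the start symbol.
            path = []
            while form is not None:
                path.append(form)
                form = parent[form]
            return path[::-1]
        # Position of the leftmost nonterminal, if any.
        pos = next((i for i, s in enumerate(form) if s in rules), None)
        if pos is None:
            continue  # a terminal string different from target
        if form[:pos] != target[:pos]:
            continue  # optional pruning: the terminal prefix can never change
        for rhs in rules[form[pos]]:
            nxt = form[:pos] + rhs + form[pos + 1:]
            # Lengths never decrease, so longer forms cannot lead to target.
            if len(nxt) <= len(target) and nxt not in parent:
                parent[nxt] = form
                queue.append(nxt)
    return None

result = parse_bfs(GRAMMAR, "S", "abaca")
print(" => ".join(result))
print(parse_bfs(GRAMMAR, "S", "ab"))  # None
```

The visited set `parent` keeps the search finite even when the same sentential form is reachable along several paths; a depth-first variant would replace the queue with a stack.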