Tomita's Parser
Tihomira Panayotova, Paolina Teneva
Seminar für Sprachwissenschaft, 31.01.2007
A simple overview
• LR(0) conflicts
• Tomita's method summarized
• Complications
• Two optimizations
• A moderately ambiguous grammar
• Stack duplication
• Combining equal states
• Combining equal stack prefixes
• Discussion
• Summary
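LR(0) Conflicts
• An LR parser relies on:
1. a handle-recognizing finite-state (FS) automaton
2. the absence of inadequate states, i.e. states in which more than one action (shift/reduce or reduce/reduce) is possible
• There exist grammars for which the automaton has some inadequate states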
Tomita's method summarized
• Simple definition:
• A breadth-first search over those parsing decisions that are not solved by the LR automaton
• It gives an efficient and effective approach to grammars for which the automaton has some inadequate states
Tomita's method summarized
• How does the parser act when it encounters an inadequate state on the top of the stack?
Step 1. It duplicates the stack and splits the parse into a separate process for each copy:
   one copy is reserved for the REDUCE step
   the other copy is reserved for the SHIFT step
Step 2. Stacks whose right-most (top) state does not allow a shift on the next input token are DISCARDED
Tomita's method summarized
• SHIFT step: pushes the next input symbol and the resulting new state onto the stack
• REDUCE step: removes part of the right end of the stack and replaces it with a non-terminal; using this non-terminal as a move in the automaton, we find a new state to put on the top of the stack
Conclusion: every time we encounter an inadequate state on the top of the stack, the duplication process is repeated until all reduces have been treated.
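The two steps can be sketched on a stack that alternates states and symbols. This is only an illustration: the `GOTO` table and the function names are assumptions, with the state numbers read off the stacks in the trace slides of this deck (grammar S -> E #, E -> E + E, E -> d).

```python
# Hypothetical transition table: (state, symbol) -> new state.
# State numbers follow the stacks shown in the trace slides.
GOTO = {
    (1, "d"): 2, (1, "E"): 3,
    (3, "+"): 4, (3, "#"): 6,
    (4, "d"): 2, (4, "E"): 5,
    (5, "+"): 4,
}

def shift(stack, token):
    """SHIFT: push the input token and the state reached on it."""
    return stack + [token, GOTO[(stack[-1], token)]]

def reduce(stack, lhs, rhs_len):
    """REDUCE: pop rhs_len symbol/state pairs, then move on the non-terminal."""
    base = stack[: len(stack) - 2 * rhs_len]
    return base + [lhs, GOTO[(base[-1], lhs)]]

stack = shift([1], "d")        # 1 d 2
stack = reduce(stack, "E", 1)  # 1 E 3   (E -> d)
```

Both functions return a new list, so the caller can keep the old stack for the other branch when a state is inadequate.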
Complications
• Repeating the duplication process can cause a proliferation of stacks: a great number will be copied and subsequently discarded
• If all stacks are discarded in Step 2, the input was in error
• Grammars with loops (A -> B, B -> A): the process may not terminate
Complications
• Some ideas to cope with the complications:
1. Use look-ahead to decide which reduces can be made in Step 1
2. Grammars with loops:
2.1. upon creating a stack, check whether it is already there (and if so, ignore it)
2.2. check the grammar in advance for loops (and if it has any, reject it)
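Idea 2.2 can be sketched as a reachability check over unit productions. This is a simplified illustration, not the check from the source: it assumes a grammar without empty productions (which could hide further loops), and the function name and the dict representation are made up for the example.

```python
def has_loop(grammar):
    """Return True if some non-terminal can derive itself via unit rules.

    grammar: dict mapping a non-terminal to a list of right-hand sides,
    each right-hand side being a tuple of symbols.
    Assumption: no empty productions.
    """
    # Unit edges: A -> B where the whole right-hand side is one non-terminal.
    unit = {
        a: {rhs[0] for rhs in rhss if len(rhs) == 1 and rhs[0] in grammar}
        for a, rhss in grammar.items()
    }

    def reachable(start):
        seen, todo = set(), list(unit[start])
        while todo:
            b = todo.pop()
            if b not in seen:
                seen.add(b)
                todo.extend(unit[b])
        return seen

    return any(a in reachable(a) for a in grammar)

looping = {"A": [("B",)], "B": [("A",), ("b",)]}
expr = {"S": [("E", "#")], "E": [("E", "+", "E"), ("d",)]}
print(has_loop(looping), has_loop(expr))
```

A grammar for which `has_loop` returns True would be rejected before parsing starts.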
Two optimizations
• Combining equal states
• Combining equal stack prefixes
A moderately ambiguous grammar
S -> E #     (S is the start symbol)
E -> E + E
E -> d
Figure 9.38 A moderately ambiguous grammar
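To see where this grammar gives an LR(0) conflict, the automaton can be written down as two small tables. The tables below are an assumption reconstructed from the state numbers that appear on the stacks in the trace slides; the names `SHIFT`, `REDUCE` and `inadequate_states` are illustrative.

```python
# Shift moves per state: state -> {terminal: next state}.
SHIFT = {1: {"d": 2}, 3: {"+": 4, "#": 6}, 4: {"d": 2}, 5: {"+": 4}}

# Reduce items per state: state -> list of (left-hand side, length of right-hand side).
REDUCE = {
    2: [("E", 1)],  # E -> d
    5: [("E", 3)],  # E -> E + E
    6: [("S", 2)],  # S -> E #
}

def inadequate_states():
    """LR(0) view: a state is inadequate if it has a reduce plus any other action."""
    states = set(SHIFT) | set(REDUCE)
    return sorted(
        s for s in states
        if len(REDUCE.get(s, [])) > 1 or (s in REDUCE and s in SHIFT)
    )

print(inadequate_states())
```

Under these tables only state 5 is inadequate: after E + E the parser can either reduce or shift another +.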
Stack Duplication
step   stack                     input    action
a      1                         d+d+d#   shift
b      1 d 2                     +d+d#    reduce
c      1 E 3                     +d+d#    shift
d      1 E 3 + 4                 d+d#     shift
e      1 E 3 + 4 d 2             +d#      reduce
f      1 E 3 + 4 E 5             +d#      duplicate to g1 and g2
g1     1 E 3 + 4 E 5             +d#      REDUCE; reduce to g1.1
g2     1 E 3 + 4 E 5             +d#      SHIFT; shift to g1.2
g1.1   1 E 3                     +d#      shift to h1
g1.2   1 E 3 + 4 E 5 + 4         d#       shift to h2
h1     1 E 3 + 4                 d#       shift to h1.1
h2     1 E 3 + 4 E 5 + 4 d 2     #        reduce to h1.2
h1.1   1 E 3 + 4 d 2             #        reduce to i
h1.2   1 E 3 + 4 E 5 + 4 E 5     #        duplicate to i1 and i2
i      1 E 3 + 4 E 5             #        duplicate to j1 and j2
Stack Duplication
step   stack                     input    action
i1     1 E 3 + 4 E 5 + 4 E 5     #        REDUCE; reduce to k1
i2     1 E 3 + 4 E 5 + 4 E 5     #        SHIFT; DISCARDED
j1     1 E 3 + 4 E 5             #        REDUCE; reduce to k2
j2     1 E 3 + 4 E 5             #        SHIFT; DISCARDED
k1     1 E 3 + 4 E 5             #        reduce to l1
k2     1 E 3                     #        shift to l2
l1     1 E 3                     #        shift to m1
l2     1 E 3 # 6                          reduce to m2
m1     1 E 3 # 6                          reduce to n
m2     1 S                                ACCEPT
n      1 S                                ACCEPT
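The whole trace can be reproduced with a small breadth-first recognizer that copies stacks. This is a sketch under assumptions, not the presenters' code: the `GOTO` and `REDUCE` tables are reconstructed from the state numbers in the trace, every stack in a reducing state is also tried for a shift (hopeless copies are simply discarded), and the function only counts accepting stacks.

```python
# (state, symbol) -> new state, read off the stacks in the trace above.
GOTO = {
    (1, "d"): 2, (1, "E"): 3,
    (3, "+"): 4, (3, "#"): 6,
    (4, "d"): 2, (4, "E"): 5,
    (5, "+"): 4,
}
# state -> (left-hand side, number of right-hand-side symbols)
REDUCE = {2: ("E", 1), 5: ("E", 3), 6: ("S", 2)}

def parse(tokens):
    """Return how many stacks end in ACCEPT (one per parse of the input)."""
    stacks = [(1,)]
    accepted = 0
    for tok in list(tokens) + [None]:      # None: nothing left after '#'
        work, shiftable = list(stacks), []
        while work:                         # Step 1: treat all reduces
            st = work.pop()
            top = st[-1]
            if top in REDUCE:
                lhs, n = REDUCE[top]
                base = st[: len(st) - 2 * n]
                if lhs == "S":
                    if tok is None:
                        accepted += 1       # 1 S: ACCEPT
                else:
                    work.append(base + (lhs, GOTO[(base[-1], lhs)]))
            if (top, tok) in GOTO:          # Step 2: keep only stacks that can shift
                shiftable.append(st)
        stacks = [st + (tok, GOTO[(st[-1], tok)]) for st in shiftable]
    return accepted

print(parse("d+d+d#"))
```

For d+d+d# the function reports two accepting stacks, matching the two ACCEPT rows (m2 and n) in the trace; an input such as d+# leaves no stack to shift and is rejected.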
Combining equal states
• Examine two stacks that have the same state on top: further actions on both stacks will be identical
• Combine the two stacks to avoid duplicate work
Combining equal states
f)    1. 1 E 3 + 4 d 2             #    both REDUCE to g)
      2. 1 E 3 + 4 E 5 + 4 d 2     #
g)    1. 1 E 3 + 4 E 5             #    duplicate to g') and g'')
      2. 1 E 3 + 4 E 5 + 4 E 5     #
g')   1. 1 E 3 + 4 E 5             #    for REDUCE
      2. 1 E 3 + 4 E 5 + 4 E 5     #
g'')  1. 1 E 3 + 4 E 5             #    copy to h3), for SHIFT
      2. 1 E 3 + 4 E 5 + 4 E 5     #
Combining equal states
g'.1)  1 E 3 + 4 E 5               #    REDUCE to h1)
g'.2)  1 E 3 + 4 E 5 + 4 E 5       #    REDUCE to h2)
h1)    1 E 3                       #    SHIFT
h2)    1 E 3 + 4 E 5               #    REDUCE to h2.1), copy to h2.2)
h2.1)  1 E 3                       #    SHIFT
h2.2)  1 E 3 + 4 E 5               #    SHIFT
h3)    1. 1 E 3 + 4 E 5            #    SHIFT
       2. 1 E 3 + 4 E 5 + 4 E 5
Combining equal states
Now we have five stacks (h1, h2.1, h2.2 and the two stacks bundled in h3).
• h1) and h2.1) carry state 3 on top
• h2.2) and h3) carry state 5 on top
h1)    1 E 3                       #    SHIFT
h2)    1 E 3 + 4 E 5               #    REDUCE to h2.1), copy to h2.2)
h2.1)  1 E 3                       #    SHIFT
h2.2)  1 E 3 + 4 E 5               #    SHIFT
h3)    1. 1 E 3 + 4 E 5            #    SHIFT
       2. 1 E 3 + 4 E 5 + 4 E 5
Combining equal states
We combine the stacks with identical states on top into two bundles h' and h''.
h')    h1)    1 E 3                      #    SHIFT to i)
       h2.1)  1 E 3
h'')   h3)    1. 1 E 3 + 4 E 5           #    discard (state 5 allows no shift on #)
              2. 1 E 3 + 4 E 5 + 4 E 5
       h2.2)  1 E 3 + 4 E 5
Combining equal states
i)     1) 1 E 3       # 6     (shared top for both stacks)
       2) 1 E 3
i')    1 E 3 # 6              REDUCE to j1)
i'')   1 E 3 # 6              REDUCE to j2)
j1)    1 S                    ACCEPT
j2)    1 S                    ACCEPT
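The bundling step can be sketched as grouping stacks by their top state, so that the shift decision is looked up once per bundle instead of once per stack. The function names and the one-entry transition table are assumptions for illustration; the five stacks are h1, h2.1, h2.2 and the two stacks of h3 from the slides.

```python
from collections import defaultdict

def bundle_by_top(stacks):
    """Group stacks (tuples ending in a state) by their top state."""
    bundles = defaultdict(list)
    for st in stacks:
        bundles[st[-1]].append(st)
    return dict(bundles)

def shift_bundles(bundles, token, goto):
    """Decide the shift once per bundle; bundles that cannot shift are discarded."""
    result = {}
    for top, members in bundles.items():
        if (top, token) in goto:
            new_top = goto[(top, token)]
            result.setdefault(new_top, []).extend(
                st + (token, new_top) for st in members
            )
    return result

stacks = [
    (1, "E", 3),                              # h1
    (1, "E", 3),                              # h2.1
    (1, "E", 3, "+", 4, "E", 5),              # h2.2
    (1, "E", 3, "+", 4, "E", 5),              # h3, stack 1
    (1, "E", 3, "+", 4, "E", 5, "+", 4, "E", 5),  # h3, stack 2
]
bundles = bundle_by_top(stacks)               # two bundles: top state 3 and top state 5
after = shift_bundles(bundles, "#", {(3, "#"): 6})
print(sorted(bundles), after)
```

Only the bundle with state 3 on top survives the shift of #, which corresponds to h' moving on to i) while h'' is discarded.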
Combining equal stack prefixes
• When the parser calls for the stack to be copied, there is no actual need to copy the entire stack!
• It is enough to copy the top states
Combining Equal Stack-Prefixes
If we observe the example:
e)     1 E 3 + 4 E 5     +d#
When we duplicate the stack we have two copies of it. REDUCE is applied to only one of the copies, and only as much of the stack is copied as is subject to the reduce:
e')    1 E 3             +d#    SHIFT
e'')   1 E 3 + 4 E 5     +d#    SHIFT
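One way to get this sharing is to store the stack as linked cells, so that a reduce builds a new top on an existing cell and the untouched lower part is shared by both branches. This is a sketch with assumed names (`Node`, `push`, `pop`, `as_list`), not the data structure from the source.

```python
class Node:
    """One stack cell: a grammar symbol, the state above it, and a link downwards."""
    def __init__(self, symbol, state, below=None):
        self.symbol, self.state, self.below = symbol, state, below

def push(top, symbol, state):
    """Add a cell on top without touching the cells below."""
    return Node(symbol, state, top)

def pop(top, n):
    """Follow n links downwards; nothing is destroyed."""
    for _ in range(n):
        top = top.below
    return top

def as_list(top):
    """Flatten a stack to the 'state symbol state ...' form used on the slides."""
    items = []
    while top is not None:
        cell = [top.state] if top.symbol is None else [top.symbol, top.state]
        items[:0] = cell
        top = top.below
    return items

bottom = Node(None, 1)
e2 = push(push(push(bottom, "E", 3), "+", 4), "E", 5)  # e''): 1 E 3 + 4 E 5
e1 = push(pop(e2, 3), "E", 3)                           # e'): after E -> E + E
print(as_list(e1), as_list(e2))
```

After the reduce, `e1` and `e2` are two tops over the same bottom cell, so no part of the original stack had to be copied.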
Discussion
• Table characteristics:
- the method can work with every bottom-up table
- the weaker the table, the more non-determinism will have to be resolved by breadth-first search
• Time requirements:
- in theory: exponential
- in practice: linear or slightly more than linear
SUMMARY
• Breadth-first search over those parsing decisions that are not solved by the LR automaton
• Important notions to remember:
• stack duplication (inadequate states, reduce, shift, discarded)
• combining equal states
• combining equal stack prefixes
References • Dick Grune & Ceriel Jacobs (1990). Parsing Techniques