MA/CSSE 474 Theory of Computation Pushdown Automata (PDA) Intro
Your Questions? • HW10 problems • Anything else • Previous class days' material • Reading Assignments
Recap: Normal Forms for Grammars
Chomsky Normal Form, in which all rules are of one of the following two forms:
● X → a, where a ∈ Σ, or
● X → BC, where B and C are elements of V - Σ.
Advantages:
● Parsers can use binary trees.
● Exact length of derivations is known.
(The slide shows a binary parse tree illustrating such a derivation.)
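As a quick sanity check on the definition, here is a minimal Python sketch of a CNF test. The representation of a grammar as (lhs, rhs-tuple) pairs is an assumption of the sketch, not notation from the slides.

```python
def is_chomsky_normal_form(rules, terminals):
    """Check whether every rule X -> rhs has one of the two CNF shapes:
    X -> a (a single terminal) or X -> B C (two nonterminals).
    `rules` is a list of (lhs, rhs_tuple) pairs (representation chosen for this sketch)."""
    for lhs, rhs in rules:
        if len(rhs) == 1 and rhs[0] in terminals:
            continue                                      # X -> a
        if len(rhs) == 2 and all(s not in terminals for s in rhs):
            continue                                      # X -> B C
        return False
    return True

# A grammar that is in CNF, and one that is not (it has a long rule and an epsilon-rule):
assert is_chomsky_normal_form(
    [("S", ("A", "B")), ("A", ("A", "B")), ("A", ("a",)), ("B", ("b",))], {"a", "b"})
assert not is_chomsky_normal_form(
    [("S", ("a", "S", "b")), ("S", ())], {"a", "b"})
```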
Recap: Normal Forms for Grammars
Greibach Normal Form, in which all rules are of the following form:
● X → a β, where a ∈ Σ and β ∈ (V - Σ)*.
Advantages:
● Every derivation of a string s contains |s| rule applications.
● Greibach normal form grammars can easily be converted to pushdown automata with no ε-transitions. This is useful because such PDAs are guaranteed to halt.
Theorems: Normal Forms Exist
Theorem: Given a CFG G, there exists an equivalent Chomsky normal form grammar GC such that: L(GC) = L(G) - {ε}.
Proof: The proof is by construction.
Theorem: Given a CFG G, there exists an equivalent Greibach normal form grammar GG such that: L(GG) = L(G) - {ε}.
Proof: The proof is also by construction.
Details of the Chomsky conversion are complex but straightforward; I leave them for you to read in Chapter 11 and/or in the next 16 slides. Details of the Greibach conversion are more complex but still straightforward; I leave them for you to read in Appendix D if you wish (not req'd).
The Price of Normal Forms
E → E + E
E → (E)
E → id
Converting to Chomsky normal form:
E → E E'     E' → P E
E → L E''    E'' → E R
E → id
L → (    R → )    P → +
Conversion doesn't change weak generative capacity, but it may change strong generative capacity.
Recognizing Context-Free Languages
Two notions of recognition:
(1) Say yes or no, just like with FSMs.
(2) Say yes or no, AND if yes, describe the structure (e.g., a parse tree for a + b * c).
Definition of a Pushdown Automaton
M = (K, Σ, Γ, Δ, s, A), where:
K is a finite set of states,
Σ is the input alphabet,
Γ is the stack alphabet,
s ∈ K is the initial state,
A ⊆ K is the set of accepting states, and
Δ is the transition relation. It is a finite subset of
  (K × (Σ ∪ {ε}) × Γ*) × (K × Γ*),
i.e., each transition pairs (state, input symbol or ε, string of symbols to pop from the top of the stack) with (state, string of symbols to push on top of the stack).
Σ and Γ are not necessarily disjoint.
Definition of a Pushdown Automaton
• A configuration of M is an element of K × Σ* × Γ*.
• The initial configuration of M is (s, w, ε), where w is the input string.
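To make the later examples concrete, here is a minimal Python sketch of this definition. The six fields mirror (K, Σ, Γ, Δ, s, A); a transition ((q1, c, γ1), (q2, γ2)) is flattened to a 5-tuple, ε is represented by None (for input) or the empty tuple (for stack strings), and a configuration is a (state, remaining input, stack) triple with the top of the stack first. All of this representation is my own choice for the sketch, not notation from the slides.

```python
from collections import namedtuple

# Fields correspond to (K, Sigma, Gamma, Delta, s, A).
PDA = namedtuple("PDA", "states input_alphabet stack_alphabet transitions start accepting")

# A transition ((q1, c, gamma1), (q2, gamma2)) is stored as the 5-tuple
# (q1, c, gamma1, q2, gamma2), where c is an input symbol or None (epsilon)
# and gamma1, gamma2 are tuples of stack symbols, top of stack first.
# (This flattening is a representation choice for the sketch.)

def initial_configuration(m, symbols):
    """The initial configuration (s, w, epsilon): start state, whole input, empty stack."""
    return (m.start, tuple(symbols), ())
```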
Manipulating the Stack
A stack with c on top, then a, then b will be written as the string cab.
If c1c2…cn is pushed onto that stack (so that c1 ends up on top), the result is written c1c2…cncab.
Yields
Let c be any element of Σ ∪ {ε},
let γ1, γ2 and γ be any elements of Γ*, and
let w be any element of Σ*. Then:
  (q1, cw, γ1γ) |-M (q2, w, γ2γ) iff ((q1, c, γ1), (q2, γ2)) ∈ Δ.
Let |-M* be the reflexive, transitive closure of |-M.
C1 yields configuration C2 iff C1 |-M* C2.
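Continuing the sketch from the definition slide, the one-step yields relation can be written directly from this rule (again with None standing for ε and tuples for stack strings, top first):

```python
def step(m, config):
    """All configurations reachable from `config` in one |-M step."""
    state, inp, stack = config
    successors = []
    for (q1, c, pop, q2, push) in m.transitions:
        if q1 != state:
            continue
        # Match the input part: either epsilon (None) or the next input symbol c.
        if c is None:
            rest = inp
        elif inp and inp[0] == c:
            rest = inp[1:]
        else:
            continue
        # gamma1 must sit on top of the stack; it is replaced by gamma2.
        if stack[:len(pop)] != pop:
            continue
        successors.append((q2, rest, push + stack[len(pop):]))
    return successors
```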
Computations
A computation by M is a finite sequence of configurations C0, C1, …, Cn for some n ≥ 0 such that:
● C0 is an initial configuration,
● Cn is of the form (q, ε, γ), for some state q ∈ KM and some string γ in Γ*, and
● C0 |-M C1 |-M C2 |-M … |-M Cn.
Nondeterminism
If M is in some configuration (q1, s, γ), it is possible that:
● Δ contains exactly one transition that matches.
● Δ contains more than one transition that matches.
● Δ contains no transition that matches.
Accepting
A computation C of M is an accepting computation iff:
● C = (s, w, ε) |-M* (q, ε, ε), and
● q ∈ A.
M accepts a string w iff at least one of its computations accepts.
Other paths may:
● Read all the input and halt in a nonaccepting state,
● Read all the input and halt in an accepting state with the stack not empty,
● Loop forever and never finish reading the input, or
● Reach a dead end where no more input can be read.
The language accepted by M, denoted L(M), is the set of all strings accepted by M.
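Using the sketch above, "some computation reads all the input and ends in an accepting state with an empty stack" can be approximated by a bounded breadth-first search over configurations. The bound is needed because ε-transitions can make the set of reachable configurations infinite, so a False answer is only a genuine rejection when the search actually exhausts the reachable configurations.

```python
from collections import deque

def accepts(m, symbols, max_steps=10000):
    """True iff some computation of m on `symbols` reaches (q, epsilon, epsilon)
    with q accepting.  The search is bounded by max_steps, so False means either
    'rejects' or 'gave up' for machines with unbounded epsilon-behavior."""
    start = (m.start, tuple(symbols), ())
    seen = {start}
    frontier = deque([start])
    explored = 0
    while frontier and explored < max_steps:
        config = frontier.popleft()
        explored += 1
        state, inp, stack = config
        if not inp and not stack and state in m.accepting:
            return True
        for nxt in step(m, config):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False
```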
Rejecting
A computation C of M is a rejecting computation iff:
● C = (s, w, ε) |-M* (q, w', α),
● C is not an accepting computation, and
● M has no moves that it can make from (q, w', α).
M rejects a string w iff all of its computations reject.
Note that it is possible that, on input w, M neither accepts nor rejects.
A PDA for Bal
M = (K, Σ, Γ, Δ, s, A), where:
K = {s}          the states
Σ = {(, )}        the input alphabet
Γ = {(}          the stack alphabet
A = {s}          the accepting states
Δ contains:
((s, (, ε), (s, ( ))   **
((s, ), ( ), (s, ε))
** Important: this ε does not mean that the stack is empty (it just means that nothing is popped).
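Here is this machine in the Python sketch built up above (the two transitions are exactly the two listed on the slide; the symbol and state names are carried over directly):

```python
bal = PDA(
    states={"s"},
    input_alphabet={"(", ")"},
    stack_alphabet={"("},
    transitions={
        ("s", "(", (), "s", ("(",)),    # read (, pop nothing, push (
        ("s", ")", ("(",), "s", ()),    # read ), pop a matching (
    },
    start="s",
    accepting={"s"},
)

assert accepts(bal, "(()())")      # balanced: accepted
assert not accepts(bal, "(()")     # unbalanced: stack is not empty at the end
```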
A PDA for {wcw^R : w ∈ {a, b}*}
M = (K, Σ, Γ, Δ, s, A), where:
K = {s, f}        the states
Σ = {a, b, c}     the input alphabet
Γ = {a, b}        the stack alphabet
A = {f}           the accepting states
Δ contains:
((s, a, ε), (s, a))
((s, b, ε), (s, b))
((s, c, ε), (f, ε))
((f, a, a), (f, ε))
((f, b, b), (f, ε))
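The same machine in the sketch, with a couple of test strings:

```python
wcwr = PDA(
    states={"s", "f"},
    input_alphabet={"a", "b", "c"},
    stack_alphabet={"a", "b"},
    transitions={
        ("s", "a", (), "s", ("a",)),   # before c: push each a and b
        ("s", "b", (), "s", ("b",)),
        ("s", "c", (), "f", ()),       # the middle marker c
        ("f", "a", ("a",), "f", ()),   # after c: match each symbol against the stack
        ("f", "b", ("b",), "f", ()),
    },
    start="s",
    accepting={"f"},
)

assert accepts(wcwr, "abcba")        # w = ab, w reversed = ba
assert not accepts(wcwr, "abcab")    # second half is not the reverse of the first
```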
A PDA for PalEven = {ww^R : w ∈ {a, b}*}
A CFG for PalEven:
S → ε
S → aSa
S → bSb
A PDA: (diagram on the slide). This one is nondeterministic.
More on Nondeterminism: Accepting Mismatches
L = {a^m b^n : m ≠ n; m, n > 0}
Start with the case where n = m:
  state 1: a/ε/a (read a, push a)
  1 → 2:   b/a/ε (read b, pop a)
  state 2: b/a/ε (read b, pop a)
More on Nondeterminism: Accepting Mismatches
L = {a^m b^n : m ≠ n; m, n > 0}
Start with the case where n = m (the two-state machine above). Then:
● If stack and input are both empty, halt and reject.
● If input is empty but the stack is not (m > n): accept.
● If the stack is empty but input is not (m < n): accept.
More on Nondeterminism: Accepting Mismatches
L = {a^m b^n : m ≠ n; m, n > 0}
● If input is empty but the stack is not (m > n), accept: add a stack-clearing state 3, reached from state 2 by ε/a/ε, with a self-loop ε/a/ε to pop the leftover a's.
More on Nondeterminism: Accepting Mismatches
L = {a^m b^n : m ≠ n; m, n > 0}
● If the stack is empty but input is not (m < n), accept: add an input-clearing state 4, reached from state 2 by b/ε/ε, with a self-loop b/ε/ε to read the leftover b's.
Putting It Together
L = {a^m b^n : m ≠ n; m, n > 0}
● Jumping to the input clearing state 4: Need to detect the bottom of the stack.
● Jumping to the stack clearing state 3: Need to detect the end of the input.
(A sketch of the combined machine follows.)
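As a check on the reasoning, here is the combined machine in the Python sketch, using the state numbering 1-4 from the last few slides. Treating states 3 and 4 as the accepting states is my reading of the diagrams, not something stated on this slide.

```python
# Sketch of the combined machine for {a^m b^n : m != n, m, n > 0}.
# Accepting states {3, 4} are an assumption of this sketch.
ne = PDA(
    states={1, 2, 3, 4},
    input_alphabet={"a", "b"},
    stack_alphabet={"a"},
    transitions={
        (1, "a", (), 1, ("a",)),     # read a, push a
        (1, "b", ("a",), 2, ()),     # first b: pop an a
        (2, "b", ("a",), 2, ()),     # further b's: pop a's
        (2, None, ("a",), 3, ()),    # a's left over (m > n): clear the stack
        (3, None, ("a",), 3, ()),
        (2, "b", (), 4, ()),         # b's left over (m < n): read them
        (4, "b", (), 4, ()),
    },
    start=1,
    accepting={3, 4},
)

assert accepts(ne, "aaabb") and accepts(ne, "aabbb")   # m > n and m < n
assert not accepts(ne, "aabb")                         # m = n is rejected
```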
The Power of Nondeterminism
Consider AnBnCn = {a^n b^n c^n : n ≥ 0}. PDA for it?
The Power of Nondeterminism
Consider AnBnCn = {a^n b^n c^n : n ≥ 0}.
Now consider L = ¬AnBnCn (the complement of AnBnCn). L is the union of two languages:
1. {w ∈ {a, b, c}* : the letters are out of order}, and
2. {a^i b^j c^k : i, j, k ≥ 0 and (i ≠ j or j ≠ k)} (in other words, unequal numbers of a's, b's, and c's).
Are the Context-Free Languages Closed Under Complement?
¬AnBnCn is context-free. If the CF languages were closed under complement, then ¬¬AnBnCn = AnBnCn would also be context-free. But we will prove that it is not.
L = {a^n b^m c^p : n, m, p ≥ 0 and (n ≠ m or m ≠ p)}
S → NC        /* n ≠ m, then arbitrary c's
S → QP        /* arbitrary a's, then p ≠ m
N → A         /* more a's than b's
N → B         /* more b's than a's
A → a
A → aA
A → aAb
B → b
B → Bb
B → aBb
C → ε | cC    /* add any number of c's
P → B'        /* more b's than c's
P → C'        /* more c's than b's
B' → b
B' → bB'
B' → bB'c
C' → c
C' → C'c
C' → bC'c
Q → ε | aQ    /* prefix with any number of a's
Reducing Nondeterminism
● Jumping to the input clearing state 4: Need to detect the bottom of the stack, so push # onto the stack before we start.
● Jumping to the stack clearing state 3: Need to detect the end of the input, so add a termination character (e.g., $) to the end of every string in L.
Reducing Nondeterminism ● Jumping to the input clearing state 4:
Reducing Nondeterminism ● Jumping to the stack clearing state 3:
More on PDAs
A PDA for {ww^R : w ∈ {a, b}*}:
What about a PDA to accept {ww : w ∈ {a, b}*}?
PDAs and Context-Free Grammars
Theorem: The class of languages accepted by PDAs is exactly the class of context-free languages.
Recall: context-free languages are languages that can be defined with context-free grammars.
Restated: a language can be described with a context-free grammar iff it can be accepted by some PDA.
Going One Way Lemma: Each context-free language is accepted by some PDA. Proof (by construction): The idea: Let the stack do the work. Two approaches: • Top down • Bottom up
Top Down
The idea: Let the stack keep track of expectations.
Example: Arithmetic expressions
E → E + T
E → T
T → T * F
T → F
F → (E)
F → id
(1) ((q, ε, E), (q, E+T))       (7)  ((q, id, id), (q, ε))
(2) ((q, ε, E), (q, T))         (8)  ((q, (, ( ), (q, ε))
(3) ((q, ε, T), (q, T*F))       (9)  ((q, ), ) ), (q, ε))
(4) ((q, ε, T), (q, F))         (10) ((q, +, +), (q, ε))
(5) ((q, ε, F), (q, (E) ))      (11) ((q, *, *), (q, ε))
(6) ((q, ε, F), (q, id))
A Top-Down Parser
The outline of M is: M = ({p, q}, Σ, V, Δ, p, {q}), where Δ contains:
● The start-up transition ((p, ε, ε), (q, S)).
● For each rule X → s1s2…sn in R, the transition ((q, ε, X), (q, s1s2…sn)).
● For each character c ∈ Σ, the transition ((q, c, c), (q, ε)).
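This construction is mechanical enough to write down in the Python sketch. The (lhs, rhs-tuple) grammar representation and the helper name are my own assumptions; the three kinds of transitions are exactly the ones listed above.

```python
def top_down_pda(grammar, start_symbol, terminals):
    """Build the top-down PDA of the lemma from a CFG.
    `grammar` is a list of (lhs, rhs_tuple) rules (representation chosen for this sketch)."""
    delta = {("p", None, (), "q", (start_symbol,))}           # start-up: push S
    for lhs, rhs in grammar:
        delta.add(("q", None, (lhs,), "q", tuple(rhs)))       # expand a nonterminal
    for c in terminals:
        delta.add(("q", c, (c,), "q", ()))                    # match a terminal
    nonterminals = {lhs for lhs, _ in grammar}
    return PDA(
        states={"p", "q"},
        input_alphabet=set(terminals),
        stack_alphabet=set(terminals) | nonterminals,
        transitions=delta,
        start="p",
        accepting={"q"},
    )

# The {a^n b* a^n} grammar from the next slide:
g = [("S", ()), ("S", ("B",)), ("S", ("a", "S", "a")),
     ("B", ()), ("B", ("b", "B"))]
m = top_down_pda(g, "S", {"a", "b"})
assert accepts(m, "aabbaa") and not accepts(m, "aabbba")
```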
Example of the Construction
L = {a^n b* a^n}
Grammar:            Transitions:
(1) S → ε           0  ((p, ε, ε), (q, S))
(2) S → B           1  ((q, ε, S), (q, ε))
(3) S → aSa         2  ((q, ε, S), (q, B))
(4) B → ε           3  ((q, ε, S), (q, aSa))
(5) B → bB          4  ((q, ε, B), (q, ε))
                    5  ((q, ε, B), (q, bB))
input = a a b b a a 6  ((q, a, a), (q, ε))
                    7  ((q, b, b), (q, ε))
trans  state  unread input  stack
       p      a a b b a a   ε
  0    q      a a b b a a   S
  3    q      a a b b a a   aSa
  6    q      a b b a a     Sa
  3    q      a b b a a     aSaa
  6    q      b b a a       Saa
  2    q      b b a a       Baa
  5    q      b b a a       bBaa
  7    q      b a a         Baa
  5    q      b a a         bBaa
  7    q      a a           Baa
  4    q      a a           aa
  6    q      a             a
  6    q      ε             ε
Another Example
L = {a^n b^m c^p d^q : m + n = p + q}
(1) S → aSd    (6) U → bUd
(2) S → T      (7) U → V
(3) S → U      (8) V → bVc
(4) T → aTc    (9) V → ε
(5) T → V
input = a a b c d d
Another Example
L = {a^n b^m c^p d^q : m + n = p + q}
Grammar:           Transitions:
(1) S → aSd        0   ((p, ε, ε), (q, S))
(2) S → T          1   ((q, ε, S), (q, aSd))
(3) S → U          2   ((q, ε, S), (q, T))
(4) T → aTc        3   ((q, ε, S), (q, U))
(5) T → V          4   ((q, ε, T), (q, aTc))
(6) U → bUd        5   ((q, ε, T), (q, V))
(7) U → V          6   ((q, ε, U), (q, bUd))
(8) V → bVc        7   ((q, ε, U), (q, V))
(9) V → ε          8   ((q, ε, V), (q, bVc))
                   9   ((q, ε, V), (q, ε))
                   10  ((q, a, a), (q, ε))
                   11  ((q, b, b), (q, ε))
input = a a b c d d 12 ((q, c, c), (q, ε))
                   13  ((q, d, d), (q, ε))
trans  state  unread input  stack
The Other Way to Build a PDA - Directly
L = {a^n b^m c^p d^q : m + n = p + q}
(1) S → aSd    (6) U → bUd
(2) S → T      (7) U → V
(3) S → U      (8) V → bVc
(4) T → aTc    (9) V → ε
(5) T → V
input = a a b c d d
The Other Way to Build a PDA - Directly
L = {a^n b^m c^p d^q : m + n = p + q}
(1) S → aSd    (6) U → bUd
(2) S → T      (7) U → V
(3) S → U      (8) V → bVc
(4) T → aTc    (9) V → ε
(5) T → V
input = a a b c d d
Direct machine (each a or b read pushes an a; each c or d read pops an a):
  self-loops: state 1: a/ε/a, state 2: b/ε/a, state 3: c/a/ε, state 4: d/a/ε
  arcs: 1→2 (b/ε/a), 2→3 (c/a/ε), 3→4 (d/a/ε), 1→3 (c/a/ε), 1→4 (d/a/ε), 2→4 (d/a/ε)
Notice Nondeterminism
Machines constructed with the algorithm are often nondeterministic, even when they needn't be. This happens even with trivial languages.
Example: AnBn = {a^n b^n : n ≥ 0}
A grammar for AnBn is:    A PDA M for AnBn is:
[1] S → aSb               (0) ((p, ε, ε), (q, S))
[2] S → ε                 (1) ((q, ε, S), (q, aSb))
                          (2) ((q, ε, S), (q, ε))
                          (3) ((q, a, a), (q, ε))
                          (4) ((q, b, b), (q, ε))
But transitions 1 and 2 make M nondeterministic.
A directly constructed machine for AnBn: (see the sketch below)
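The slide's directly constructed machine is a diagram; here is one possible direct machine in the Python sketch. The state names and the choice of accepting states are my reading of the standard two-state construction, not taken verbatim from the slide. It has no ε-transitions and never faces a choice of applicable moves.

```python
# One possible directly constructed machine for AnBn (assumed layout, not the slide's figure).
anbn = PDA(
    states={1, 2},
    input_alphabet={"a", "b"},
    stack_alphabet={"a"},
    transitions={
        (1, "a", (), 1, ("a",)),     # read a, push a
        (1, "b", ("a",), 2, ()),     # first b: pop an a
        (2, "b", ("a",), 2, ()),     # each further b pops an a
    },
    start=1,
    accepting={1, 2},                # state 1 accepts the empty string
)

assert accepts(anbn, "") and accepts(anbn, "aabb")
assert not accepts(anbn, "aab") and not accepts(anbn, "abb")
```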
Bottom-Up
The idea: Let the stack keep track of what has been found.
(1) E → E + T    (4) T → F
(2) E → T        (5) F → (E)
(3) T → T * F    (6) F → id
Reduce Transitions:
(1) ((p, ε, T + E), (p, E))
(2) ((p, ε, T), (p, E))
(3) ((p, ε, F * T), (p, T))
(4) ((p, ε, F), (p, T))
(5) ((p, ε, )E( ), (p, F))
(6) ((p, ε, id), (p, F))
Shift Transitions:
(7) ((p, id, ε), (p, id))
(8) ((p, (, ε), (p, ( ))
(9) ((p, ), ε), (p, ) ))
(10) ((p, +, ε), (p, +))
(11) ((p, *, ε), (p, *))
A Bottom-Up Parser
The outline of M is: M = ({p, q}, Σ, V, Δ, p, {q}), where Δ contains:
● The shift transitions: ((p, c, ε), (p, c)), for each c ∈ Σ.
● The reduce transitions: ((p, ε, (s1s2…sn)^R), (p, X)), for each rule X → s1s2…sn in G.
● The finish-up transition: ((p, ε, S), (q, ε)).
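This construction can also be written in the Python sketch, with the same assumed grammar representation as the top-down version. The reduce transitions pop the reversed right-hand side because the last symbol shifted sits on top of the stack.

```python
def bottom_up_pda(grammar, start_symbol, terminals):
    """Build the bottom-up (shift/reduce) PDA of the lemma from a CFG.
    Same assumed (lhs, rhs_tuple) grammar representation as the top-down sketch."""
    delta = set()
    for c in terminals:
        delta.add(("p", c, (), "p", (c,)))                         # shift
    for lhs, rhs in grammar:
        delta.add(("p", None, tuple(reversed(rhs)), "p", (lhs,)))  # reduce
    delta.add(("p", None, (start_symbol,), "q", ()))               # finish up
    nonterminals = {lhs for lhs, _ in grammar}
    return PDA(
        states={"p", "q"},
        input_alphabet=set(terminals),
        stack_alphabet=set(terminals) | nonterminals,
        transitions=delta,
        start="p",
        accepting={"q"},
    )

# The expression grammar above, with "id" treated as a single terminal symbol:
expr = [("E", ("E", "+", "T")), ("E", ("T",)), ("T", ("T", "*", "F")),
        ("T", ("F",)), ("F", ("(", "E", ")")), ("F", ("id",))]
m = bottom_up_pda(expr, "E", {"id", "+", "*", "(", ")"})
assert accepts(m, ["id", "+", "id", "*", "id"])
```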