Lecture#27-31. Unit-IV. Context Free Languages. A context-free grammar is a simple recursive way of specifying grammar rules by which strings of a language can be generated. All regular languages, and some non-regular languages, can be generated by context-free grammars.
Lecture#27-31 Unit-IV
Context Free Languages
• A context-free grammar is a simple recursive way of specifying grammar rules by which strings of a language can be generated.
• All regular languages, and some non-regular languages, can be generated by context-free grammars.
Context Free Languages
• Regular languages are represented by regular expressions.
• Context-free languages are represented by context-free grammars.
Context Free Languages
• Regular languages are accepted by deterministic finite automata (DFAs).
• Context-free languages are accepted by pushdown automata, which are non-deterministic finite-state automata with a stack as auxiliary memory.
• Note that deterministic pushdown automata can accept some, but not all, of the context-free languages.
Definition
A context-free grammar (CFG) is a 4-tuple G = (V, T, S, P) where V and T are disjoint sets, S ∈ V, and P is a finite set of rules of the form A → x, where A ∈ V and x ∈ (V ∪ T)*.
V = non-terminals or variables
T = terminals
S = start symbol
P = productions or grammar rules
Example
Let G be the CFG having productions: S → aSa | bSb | c
Then G generates the language L = {x c x^R | x ∈ {a, b}*}, where x^R denotes the reverse of x.
This is the language of odd-length palindromes over {a, b} with a single c marking the middle.
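As a rough illustration of how the productions S → aSa | bSb | c generate strings, the following Python sketch expands the single variable S level by level. The function name `derive` and the depth limit are assumptions made for this example only; they are not part of the lecture.

```python
def derive(max_depth):
    """List the terminal strings obtained by applying S -> aSa or S -> bSb
    at most max_depth times and then finishing with S -> c."""
    results = []
    frontier = ["S"]  # sentential forms that still contain the variable S
    for _ in range(max_depth + 1):
        next_frontier = []
        for form in frontier:
            # Finish the derivation with S -> c.
            results.append(form.replace("S", "c"))
            # Or keep going with one of the two recursive productions.
            next_frontier.append(form.replace("S", "aSa"))
            next_frontier.append(form.replace("S", "bSb"))
        frontier = next_frontier
    return results


if __name__ == "__main__":
    for s in derive(2):
        print(s)
```

Every string produced this way has the form x c x^R, so it reads the same forwards and backwards.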
Memory
• What kind of memory do we need to be able to recognize strings in L, using a single left-to-right pass?
• Example: aaabbcbbaaa
• We need to remember what we saw before the c.
• We could push the first part of the string onto a stack and, when the c is encountered, start popping characters off the stack and matching them against each character from the center of the string to the end.
• If everything matches, this string is an odd-length palindrome.
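The push-then-pop idea described above can be sketched in a few lines of Python. This is only an informal illustration, not a formal PDA; the function name `accepts_marked_palindrome` is made up for this example.

```python
def accepts_marked_palindrome(s):
    """Single left-to-right pass over s using only a stack as memory."""
    stack = []
    i = 0
    # Phase 1: push everything seen before the center marker c.
    while i < len(s) and s[i] in "ab":
        stack.append(s[i])
        i += 1
    # The next character must be the center marker.
    if i == len(s) or s[i] != "c":
        return False
    i += 1
    # Phase 2: pop one stack symbol per input character and compare.
    while i < len(s):
        if not stack or stack.pop() != s[i]:
            return False
        i += 1
    # Everything matched only if nothing is left on the stack.
    return not stack


if __name__ == "__main__":
    print(accepts_marked_palindrome("aaabbcbbaaa"))
```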
Counting
• We can use a stack for counting out equal numbers of a’s and b’s on different sides of a center marker.
• Example: L = {a^n c b^n}, e.g. aaaacbbbb
• Push the a’s onto the stack until you see a c, then pop one a off the stack for each b you read.
• If we finish processing the string successfully (and there are no more a’s on our stack), then the string belongs to L.
Definition 7.1: Pushdown Automaton
A nondeterministic pushdown automaton (NPDA) is a 7-tuple M = (Q, Σ, Γ, δ, q0, z, F), where
Q is a finite set of states
Σ is the input alphabet (a finite set)
Γ is the stack alphabet (a finite set)
δ : Q × (Σ ∪ {λ}) × Γ → (finite subsets of Q × Γ*) is the transition function
q0 ∈ Q is the start state
z ∈ Γ is the initial stack symbol
F ⊆ Q is the set of accepting states
Transition rules
So we can fully specify any NPDA like this:
Q = {q0, q1, q2, q3}
Σ = {a, b}
Γ = {1, #}
q0 is the start state
z = # (the starting stack marker)
F = {q3}
δ is the transition function:
Transition rules
δ(q0, a, #) = {(q1, 1#), (q3, λ)}
δ(q0, λ, #) = {(q3, λ)}
δ(q1, a, 1) = {(q1, 11)}
δ(q1, b, 1) = {(q2, λ)}
δ(q2, b, 1) = {(q2, λ)}
δ(q2, λ, #) = {(q3, λ)}
This PDA is nondeterministic. Why?
Transition rules
Note that in an FSA, each rule told us that when we were in a given state and saw a specific character, we moved to a specified state. In a PDA, we also need to know what is on top of the stack before we can decide what new state to move to. After moving to the new state, we also need to decide what to do with the stack.
Working with a stack:
• You can only access the top element of the stack.
• To access the top element of the stack, you have to POP it off the stack.
• Once the top element of the stack has been POPped, if you want to save it, you need to PUSH it back onto the stack immediately.
• Characters from the input string must be read one character at a time. You cannot back up.
• The current configuration of the machine includes: the current state, the remaining characters left in the input string, and the entire contents of the stack.
L = {a^n b^n : n ≥ 0} ∪ {a}
• In the previous example we had two key transitions:
• δ(q1, a, 1) = {(q1, 11)}, which adds a 1 to the stack when an a is read
• δ(q1, b, 1) = {(q2, λ)}, which removes a 1 when a b is encountered
• We also have the rule δ(q0, a, #) = {(q1, 1#), (q3, λ)}, whose second option lets us move directly to the accepting state, q3, if we initially see an a. This is why the single string a is also accepted.
Instantaneous description
Given the transition function δ : Q × (Σ ∪ {λ}) × Γ → (finite subsets of Q × Γ*), a configuration, or instantaneous description, of M is a snapshot of the current status of the PDA. It consists of a triple (q, w, u) where:
q ∈ Q (q is the current state of the control unit)
w ∈ Σ* (w is the remaining unread part of the input string), and
u ∈ Γ* (u is the current stack contents, with the leftmost symbol indicating the top of the stack)
Instantaneous description
To indicate that the application of a transition rule has caused our PDA to move from one configuration to another, we use the following notation:
(q1, aw, bx) |- (q2, w, yx)
This move is possible exactly when (q2, y) ∈ δ(q1, a, b).
To indicate that we have moved from one configuration to another via the application of several rules, we use:
(q1, aw, bx) |-* (q2, w, yx)
or (q1, aw, bx) |-M* (q2, w, yx) to indicate a specific PDA M.
Definition 7.2: Acceptance
If M = (Q, Σ, Γ, δ, q0, z, F) is a pushdown automaton and w ∈ Σ*, the string w is accepted by M if:
(q0, w, z) |-M* (qf, λ, u) for some u ∈ Γ* and some qf ∈ F.
This means that we start at the start state, with only the initial stack symbol z on the stack (z is # in the examples here), and after processing the string w, we end up in an accepting state, with no more characters left to process in the original string. We don’t care what is left on the stack. This is called acceptance by final state.
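To make acceptance by final state concrete, here is a minimal Python sketch that searches all computation paths of the four-state NPDA whose transition rules were listed earlier. The encoding is an assumption of this example: λ is written as the empty string, the stack is a string with its top at the left (as in the instantaneous descriptions), and the names `DELTA`, `moves` and `accepts` are made up here. The search terminates for this machine because its λ-moves only pop; a PDA with λ-moves that keep pushing would need an extra bound.

```python
from collections import deque

# delta[(state, input symbol or "", top of stack)] = set of (new state, pushed string)
DELTA = {
    ("q0", "a", "#"): {("q1", "1#"), ("q3", "")},
    ("q0", "", "#"): {("q3", "")},
    ("q1", "a", "1"): {("q1", "11")},
    ("q1", "b", "1"): {("q2", "")},
    ("q2", "b", "1"): {("q2", "")},
    ("q2", "", "#"): {("q3", "")},
}
FINAL = {"q3"}


def moves(config):
    """Yield every configuration reachable from config in one step."""
    state, rest, stack = config
    if not stack:
        return  # no move is possible with an empty stack
    top, below = stack[0], stack[1:]
    # Lambda-moves: consume no input.
    for new_state, push in DELTA.get((state, "", top), ()):
        yield (new_state, rest, push + below)
    # Moves that read the next input symbol.
    if rest:
        for new_state, push in DELTA.get((state, rest[0], top), ()):
            yield (new_state, rest[1:], push + below)


def accepts(w, start="q0", z="#"):
    """Breadth-first search for a path ending in (qf, lambda, u) with qf final."""
    seen = set()
    queue = deque([(start, w, z)])
    while queue:
        config = queue.popleft()
        if config in seen:
            continue
        seen.add(config)
        state, rest, _stack = config
        if state in FINAL and rest == "":
            return True  # the stack contents do not matter
        queue.extend(moves(config))
    return False


if __name__ == "__main__":
    for w in ["", "a", "ab", "aabb", "aab", "b"]:
        print(repr(w), accepts(w))
```

Trying every path is how a program stands in for the "guessing" of a nondeterministic machine: the string is accepted if at least one path succeeds.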
2 types of acceptance
An alternate type of acceptance is acceptance by empty stack. This means that we start at the start state, with the starting stack symbol, and after processing the string w, we end up with no more characters left to process in the original string, and nothing left on the stack (the stack contents are the empty string, λ).
2 types of acceptance
The two types of acceptance are equivalent; if we can build a PDA to accept language L via acceptance by final state, we can also build a PDA to accept L via acceptance by empty stack, and vice versa.
Definition 7.2: Acceptance
A language L ⊆ Σ* is said to be accepted by M if L is precisely the set of strings accepted by M. In this case, we say that L = L(M).
Determinism/non-determinism:
• A deterministic PDA must have at most one transition for any given combination of state, input symbol, and stack symbol.
• A non-deterministic PDA (NPDA) may have no transition or several transitions defined for a particular combination of state, input symbol, and stack symbol.
• In an NPDA, there may be several paths to follow to process a given input string. Some of the paths may result in accepting the string. Other paths may end in a non-accepting state. An NPDA can “guess” which path to follow through the machine in order to accept a string.
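Example: a^n b c^n
L = {a^n b c^n | n > 0}
Transitions, as read from the state diagram (each label is: input, top of stack / string pushed in its place):
q0 → q0: a, # / a#
q0 → q0: a, a / aa
q0 → q1: b, a / a
q1 → q1: c, a / λ
q1 → q2: λ, # / #
Production rules for a^n b c^n (table not reproduced here)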
Example: aabcc
Remaining input after each move: aabcc, abcc, bcc, cc, c, λ, λ
(q0, aabcc, #) |- (q0, abcc, a#) |- (q0, bcc, aa#) |- (q1, cc, aa#) |- (q1, c, a#) |- (q1, λ, #) |- (q2, λ, #)
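Example: Odd palindrome
L = {x c x^R | x ∈ {a, b}*}
Transitions, as read from the state diagram (each label is: input, top of stack / string pushed in its place):
q0 → q0: a, # / a#   b, # / b#   a, a / aa   b, a / ba   a, b / ab   b, b / bb
q0 → q1: c, # / #   c, a / a   c, b / b
q1 → q1: a, a / λ   b, b / λ
q1 → q2: λ, # / #
Processing abcba (table not reproduced here)
Processing ab (table not reproduced here)
Processing acaa (table not reproduced here)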
Crashing:
What is happening in the last example? We process the first 3 letters of acaa and are in state q1. We have an a left to process in our input string. We have the empty-stack marker, #, as the top character on our stack. Rule 12 says that if we are in state q1 and have # on top of the stack, then we can make a free move (a λ-move) to q2, pushing # back onto the stack. So this is legal. So far, the automaton is saying that it would accept aca. But note that we are now in state q2 and still have the last a in our input string left to process. There are no rules defined for state q2. On the next move, when we try to process the a, the automaton will crash, rejecting acaa.
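The crash can be replayed with a small deterministic simulation. The transition table below is one reading of the odd-palindrome state diagram (12 rules, the last being the λ-move from q1 to q2); treating q2 as the accepting state, writing λ as the empty string, and the names `DELTA` and `run` are assumptions of this sketch.

```python
# DELTA[(state, input symbol or "", top of stack)] = (new state, pushed string)
DELTA = {
    ("q0", "a", "#"): ("q0", "a#"), ("q0", "b", "#"): ("q0", "b#"),
    ("q0", "a", "a"): ("q0", "aa"), ("q0", "b", "a"): ("q0", "ba"),
    ("q0", "a", "b"): ("q0", "ab"), ("q0", "b", "b"): ("q0", "bb"),
    ("q0", "c", "#"): ("q1", "#"), ("q0", "c", "a"): ("q1", "a"),
    ("q0", "c", "b"): ("q1", "b"),
    ("q1", "a", "a"): ("q1", ""), ("q1", "b", "b"): ("q1", ""),
    ("q1", "", "#"): ("q2", "#"),  # the lambda-move (rule 12)
}


def run(w):
    """Return (trace of configurations, accepted?) for input w."""
    state, rest, stack = "q0", w, "#"  # stack top is the leftmost character
    trace = [(state, rest, stack)]
    while True:
        top = stack[:1]
        if (state, "", top) in DELTA:
            # Free move: no input is consumed.
            state, push = DELTA[(state, "", top)]
        elif rest and (state, rest[0], top) in DELTA:
            state, push = DELTA[(state, rest[0], top)]
            rest = rest[1:]
        else:
            break  # no applicable rule: halt (a crash if input remains)
        stack = push + stack[1:]
        trace.append((state, rest, stack))
    return trace, state == "q2" and rest == ""


if __name__ == "__main__":
    for w in ["abcba", "ab", "acaa"]:
        trace, ok = run(w)
        print(w, "accepted" if ok else "rejected")
        for config in trace:
            print("   ", config)
```

For acaa the run stops in the configuration (q2, a, #): the machine is in q2 with one a unread and no rule to apply, which is the crash described above.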
Example: Even palindromes
Consider the following context-free language: L = {w w^R | w ∈ {a, b}*}
This is the language of all even-length palindromes over {a, b}.
Example: Even palindromes
This PDA is non-deterministic. Note moves 7, 8, and 9. Here the PDA is “guessing” where the middle of the string occurs. If it guesses correctly (and if the PDA doesn’t accept any strings that aren’t actually in the language), this is OK.
Example: Even palindromes
(q0, baab, #) |- (q0, aab, b#) |- (q0, ab, ab#) |- (q1, ab, ab#) |- (q1, b, b#) |- (q1, λ, #) |- (q2, λ, #) (accept)
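One way to picture the "guess" is to try every possible middle position, with each position playing the role of one computation path. The Python sketch below does this directly. It does not use the PDA's transition table (which is not reproduced in the text); it only mirrors the push phase, the λ-move at the guessed middle, and the pop phase seen in the trace above. The function name is made up for this example.

```python
def accepts_even_palindrome(w):
    """Accept w if some guess for the middle makes the stack matching succeed."""
    if set(w) - {"a", "b"}:
        return False  # only strings over {a, b} are considered
    for middle in range(len(w) + 1):  # each guess is one computation path
        stack = ["#"]  # the top of the stack is the end of the list
        # Push phase (state q0 in the trace): save the first half.
        for ch in w[:middle]:
            stack.append(ch)
        # Pop phase (state q1): each remaining character must match the top.
        matched = True
        for ch in w[middle:]:
            if stack[-1] != ch:
                matched = False  # this path crashes
                break
            stack.pop()
        # Only the marker # is left: this path may move to the accepting state.
        if matched and stack == ["#"]:
            return True
    return False


if __name__ == "__main__":
    for w in ["baab", "abba", "aba", "ab", ""]:
        print(repr(w), accepts_even_palindrome(w))
```

A wrong guess simply produces a path that fails; the string is rejected only when every guess fails.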
Example: All palindromes
Consider the following context-free language: L = pal = {x ∈ {a, b}* | x = x^R}
This is the language of all palindromes, both odd-length and even-length, over {a, b}.
Transition rules for all palindromes
At each point before we start processing the second half of the string, there are three possibilities:
1. The next input character is still in the first half of the string and needs to be pushed onto the stack to save it.
2. The next input character is the middle symbol of an odd-length string and should be read and thrown away (because we don’t need to save it to match it up with anything).
3. The next input character is the first character of the second half of an even-length string.
Transition rules for all palindromes
Why is this PDA non-deterministic? Note the first 6 rules of this NPDA. This PDA is non-deterministic because in each of these rules, there are two moves that may be chosen.
Transition rules for all palindromes
Each move in a PDA has three pre-conditions: the current state you are in, the next character to be processed from the input string, and the top character on the stack. In rule 1, our current state is q0, the next character in the input string is a, and the top character on the stack is the empty-stack marker. But there are two possible moves for this one set of preconditions:
1) move back to state q0 and push a# onto the stack, or
2) move to state q1 and push # onto the stack.
Whenever we have multiple moves possible from a given set of preconditions, we have nondeterminism.
Definition 7.3:
• Let M = (Q, Σ, Γ, δ, q0, z, F) be a pushdown automaton. M is deterministic if there is no configuration for which M has a choice of more than one move. In other words, M is deterministic if it satisfies both of the following:
• For any q ∈ Q, a ∈ Σ ∪ {λ}, and X ∈ Γ, the set δ(q, a, X) has at most one element.
• For any q ∈ Q and X ∈ Γ, if δ(q, λ, X) ≠ ∅, then δ(q, a, X) = ∅ for every a ∈ Σ.
• A language L is a deterministic context-free language if there is a deterministic PDA (DPDA) accepting L.
Definition 7.3:
If M is deterministic, then multiple moves for a single state/input/stack combination are not allowed. That is:
• Given input = X and top of stack = Y, there cannot be two different moves from the same state with that same input and that same stack symbol.
• There may be λ-moves, BUT if a state has a λ-move for top of stack = X, it cannot also have a move that reads an input symbol with top of stack = X.
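The two conditions of Definition 7.3 can be checked mechanically on a transition table. The sketch below is an illustration under assumed conventions: transitions are stored in a dictionary keyed by (state, input symbol, top of stack), λ is written as the empty string, and the names `is_deterministic`, `ANBN_OR_A` and `ANBCN` are made up here. The first table is the four-state NPDA listed earlier; the second is one reading of the state diagram for a^n b c^n.

```python
def is_deterministic(delta, input_alphabet):
    """Check both conditions of the definition on a transition table."""
    for (state, symbol, top), targets in delta.items():
        # Condition 1: at most one move per (state, input or lambda, top).
        if len(targets) > 1:
            return False
        # Condition 2: a lambda-move excludes every input-reading move
        # from the same state with the same top-of-stack symbol.
        if symbol == "" and targets:
            for a in input_alphabet:
                if delta.get((state, a, top)):
                    return False
    return True


# NPDA for {a^n b^n : n >= 0} together with the string a.
ANBN_OR_A = {
    ("q0", "a", "#"): {("q1", "1#"), ("q3", "")},
    ("q0", "", "#"): {("q3", "")},
    ("q1", "a", "1"): {("q1", "11")},
    ("q1", "b", "1"): {("q2", "")},
    ("q2", "b", "1"): {("q2", "")},
    ("q2", "", "#"): {("q3", "")},
}

# PDA for {a^n b c^n | n > 0}, as read from its state diagram.
ANBCN = {
    ("q0", "a", "#"): {("q0", "a#")},
    ("q0", "a", "a"): {("q0", "aa")},
    ("q0", "b", "a"): {("q1", "a")},
    ("q1", "c", "a"): {("q1", "")},
    ("q1", "", "#"): {("q2", "#")},
}

if __name__ == "__main__":
    print(is_deterministic(ANBN_OR_A, "ab"))
    print(is_deterministic(ANBCN, "abc"))
```

The first table fails both conditions at (q0, #): δ(q0, a, #) offers two moves, and a λ-move competes with the move on a. The second table passes both checks.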
Non-determinism
• Some PDAs which are initially described in a non-deterministic way can also be described as deterministic PDAs.
• However, some CFLs are inherently non-deterministic, e.g.: L = pal = {x ∈ {a, b}* | x = x^R} cannot be accepted by any DPDA.
Example: L = {w ∈ {a, b}* | n_a(w) > n_b(w)}
This is the set of all strings over the alphabet {a, b} in which the number of a’s is greater than the number of b’s. This can be represented by either an NPDA or a DPDA.
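A deterministic, stack-based way to decide this language can be sketched as follows: the stack always holds the current surplus, either only a’s or only b’s. This is an informal sketch, not the DPDA from the lecture. A formal DPDA accepting by final state would also have to record in its state whether the surplus is currently in favor of a, whereas this code simply inspects the stack at the end. The function name is made up for this example.

```python
def more_as_than_bs(w):
    """Deterministic single pass: the stack stores the surplus symbols."""
    stack = []
    for ch in w:
        if ch not in "ab":
            return False  # only strings over {a, b} are considered
        if stack and stack[-1] != ch:
            stack.pop()  # cancel one a against one b
        else:
            stack.append(ch)  # extend the current surplus
    # More a's than b's exactly when a surplus of a's remains.
    return bool(stack) and stack[-1] == "a"


if __name__ == "__main__":
    for w in ["a", "ab", "aab", "baa", "bba", "abaab"]:
        print(w, more_as_than_bs(w))
```

Each input symbol triggers exactly one stack action, so there is never a choice between moves.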
Example (NPDA): L = {w ∈ {a, b}* | n_a(w) > n_b(w)}