Theory of Computation

Theory of Computation Theory of Computation Peer Instruction Lecture Slides by Dr. Cynthia Lee, UCSD are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. Based on a work at www.peerinstruction4cs.org.

Why??? Real-World Applications

Markov Chains Used to Recognize Many Forms of “Semi-Structured Text” • XML file, database file = highly structured text • Each field is clearly labeled according to a strict schema • Easy to write programs to parse and process these: they are designed for the strict, rigid world of computer programs • Resumes, invitations, contact information = semi-structured text • Have a wide variety of formats • A fixed model like a regular expression (RE) would have a very hard time with these formats • But they still have many elements in common, and prevailing norms for formatting • Need a slightly more flexible model

Markov Chains • Markov Chains are Finite Automata (like DFAs and NFAs) where the transitions are probabilistic • Extensive use in real-world artificial intelligence applications, such as tasks having to do with semi-structured text documents • Ex: In an invitation, a block of text following “What:” is 90% likely to be the title of the event, and after that there is a 50% chance of seeing “Date:” and a 50% chance of seeing “When:”, both of which are 90% likely to be followed by a date and/or time.
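The invitation example can be sketched as a small transition table. This is only an illustration: the 90% and 50% figures come from the slide, while the state names and the catch-all "other" state are assumptions added so that each row sums to 1.

```python
import random

# Transition table: state -> list of (next_state, probability).
# The 0.9 and 0.5 values come from the slide; the "other" state and all
# state names are hypothetical fillers so that each row sums to 1.
TRANSITIONS = {
    "What:": [("event title", 0.9), ("other", 0.1)],
    "event title": [("Date:", 0.5), ("When:", 0.5)],
    "Date:": [("date/time", 0.9), ("other", 0.1)],
    "When:": [("date/time", 0.9), ("other", 0.1)],
}


def path_probability(path):
    """Multiply the transition probabilities along a sequence of states."""
    prob = 1.0
    for current, nxt in zip(path, path[1:]):
        prob *= dict(TRANSITIONS.get(current, [])).get(nxt, 0.0)
    return prob


def sample_path(start, rng):
    """Random walk from start until a state with no outgoing transitions."""
    path = [start]
    while path[-1] in TRANSITIONS:
        states, weights = zip(*TRANSITIONS[path[-1]])
        path.append(rng.choices(states, weights=weights)[0])
    return path
```

For instance, path_probability(["What:", "event title", "Date:", "date/time"]) multiplies 0.9, 0.5, and 0.9. A transition that is missing from the table gets probability 0, which is how a DFA would treat it.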

Logicians and mathematicians and computer scientists Famous People

If a language isn’t regular, it just might be one of the… Context-Free Languages PDA

Understanding Pushdown Automata The edge from q0 to q1 does what? • Lets you go to q1 without reading any input (on epsilon) • Lets you go to q1 on epsilon or ‘$’ • Lets you go to q1 on ‘$’

Understanding Pushdown Automata In addition to what we said in the last question, the edge from q0 to q1 does what else? • Pushes one entry onto the stack (epsilon) • Pushes one entry onto the stack (‘$’) • Pushes two entries onto the stack (epsilon and ‘$’) • Pops two entries off the stack (epsilon and ‘$’)

Tracing in a Pushdown Automaton b Input “aabb” into this PDA. After “aa” has been read, what is on the stack? top of stack bottom of stack 

Language of a Pushdown Automaton b Which is the best description of the language of the given PDA? • { w | number of b’s in w >= number of a’s in w } • { w | w = a^n b^{n+1} for some n>=0 } • { w | w = a^n b^{n+2} for some n>=0 } • { w | w = a^n b^{2n} for some n>=0 } • { w | w = 0a^n b^{2n}0 for some n>=0 }

Tracing in a Pushdown Automaton e • Which string is NOT accepted by this PDA? • aabb • abbbc • abbccc • aabcc • None or more than one of the above

Tracing in a Pushdown Automaton c • Which choice depicts a stack state that occurs at some point during the successful* processing of the string “aaabbbc” on the given PDA? top of stack bottom of stack  • None of the above * Ignore all nondeterministic paths that end in rejecting/getting stuck.

Why did we push ‘#’ onto the stack? e • We didn’t have to, because we already “counted” the a’s by pushing them on the stack • It’s something we do because of convention, but it isn’t necessary to correctness • We did it to make sure that we didn’t cause a crash/error by trying to pop something off an empty stack • It is necessary to correctness • None or more than one of the above

CFLs! The Class of Context-Free Languages

Which Venn diagram best represents the classes of languages we have studied? (a) (b) (c) (d) [Venn diagrams relating CFLs and RLs; one of them shows CFLs = RLs] (e) None of the above Note: CFLs = Context-Free Languages, RLs = Regular Languages

What methods can we use to prove/disprove each of these? (a) (b) (c) (d) [Venn diagrams relating CFLs and RLs; one of them shows CFLs = RLs] Note: CFLs = Context-Free Languages, RLs = Regular Languages

Famous People: Noam Chomsky • In this class, you know him as the namesake of “Chomsky Normal Form” • A linguist who has taught at MIT for 55 years • Famous for developing theories of context-free grammars for analysis of human language • A mathematical model of language • Discoveries crossed over into Computer Science • Perhaps now equally well known for his outspoken criticism of US foreign policy and his radical political views • Anarchist • Vehement war critic, war on drugs critic • Co-wrote Manufacturing Consent, which argues that the media in our society is harming democracy and promoting corporations and consumerism

Proving that languages are NOT context-free Context-Free PUMPING LEMMA

Which Venn diagram best represents the classes of languages we have studied? (a) (b) (c) (d) [Venn diagrams relating CFLs and RLs; one of them shows CFLs = RLs] (e) None of the above Note: CFLs = Context-Free Languages, RLs = Regular Languages

Limits of Context-Free Languages • What are the limitations of context-free languages? • What are the limitations of PDAs? • Stack size has no limit (infinite stack) • …BUT, there’s only one of them • Stack can only be accessed at the top • What does that mean intuitively? When would you need more than one stack? [Diagram labels: Not Context-Free, Context-Free, Regular]

PDA Language is 0^n 1^{2n}. Can we change this so it is 0^n 1^{2n} 0^n?
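The slide's PDA diagram is not reproduced here, so the following is only a sketch of one stack discipline that would work, assuming the language is 0^n 1^{2n}: push two markers for every 0 and pop one marker for every 1. The function name and the marker symbols are made up for illustration.

```python
def accepts_0n_1_2n(w):
    """Simulate a one-stack machine for { 0^n 1^(2n) | n >= 0 }."""
    stack = ["$"]               # bottom-of-stack marker
    seen_one = False            # once a 1 is read, no more 0s are allowed
    for ch in w:
        if ch == "0" and not seen_one:
            stack.extend("XX")  # two markers per 0
        elif ch == "1" and stack[-1] == "X":
            seen_one = True
            stack.pop()         # one marker per 1
        else:
            return False        # wrong symbol order or too many 1s
    return stack == ["$"]       # accept only if every marker was matched
```

Note what happens on the way to acceptance: by the time the last 1 is read, the stack holds nothing but the bottom marker, so no record of n is left for counting a trailing block of 0s.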

Classic Not-Context-Free Language • a^n b^n c^n for some n>=0 • INTUITIVELY, the problem is that a PDA recognizing this language would try to: • Push on the stack to count the ‘a’ section, then • Pop off the stack to match the ‘b’ section, then • You’ve “forgotten” n, now you can’t count the ‘c’ section • Aside: what could you do with a second stack? • THIS IS NOT A PROOF!! • Maybe there is a completely different way to approach this problem using a PDA, which we just didn’t think of yet • But, turns out, there is no way to do this, and we can actually prove that

Aside: Recognizing a^n b^n c^n for some n>=0 using a second stack
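One way the second stack could be used is sketched below. It is an assumption about the approach, not a transcription of the slide's figure: each b pops an entry from the first stack and pushes an entry onto the second, so n is still available when the c's arrive.

```python
def accepts_anbncn(w):
    """Recognize { a^n b^n c^n | n >= 0 } with two stacks (a sketch)."""
    first, second = [], []
    phase = "a"                         # which block of letters we are in
    for ch in w:
        if ch == "a" and phase == "a":
            first.append("A")           # count the a's on stack 1
        elif ch == "b" and phase in ("a", "b"):
            phase = "b"
            if not first:
                return False            # more b's than a's
            first.pop()                 # match this b against an a ...
            second.append("B")          # ... and remember it on stack 2
        elif ch == "c" and phase in ("b", "c"):
            phase = "c"
            if first or not second:
                return False            # fewer b's than a's, or too many c's
            second.pop()                # match this c against a b
        else:
            return False                # letters out of order
    return not first and not second     # everything matched
```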

CFL Pumping Lemma

Proving {a^i b^j c^k | 0<=i<=j<=k} is not Context-Free • s = a^p b^p c^p • i = ??? • Can’t solve it with just one i! • For the case analysis in this proof, we need to use i=0 for some cases, and i=2 for other cases • Is that legal??? • Yes. • The Pumping Lemma Game shows us why…
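The need for two different pumping exponents can be checked by brute force for a small p. The sketch below enumerates every way of cutting s = a^p b^p c^p into uvxyz under the lemma's constraints and records which of the exponents 0 and 2 takes the pumped string out of the language. All function names are illustrative, and a check for one small p is a sanity test, not a proof.

```python
import re


def in_language(w):
    """Membership in { a^i b^j c^k | 0 <= i <= j <= k }."""
    m = re.fullmatch(r"(a*)(b*)(c*)", w)
    return bool(m) and len(m.group(1)) <= len(m.group(2)) <= len(m.group(3))


def splits(s, p):
    """Yield every (u, v, x, y, z) with s = uvxyz, |vy| > 0, |vxy| <= p."""
    n = len(s)
    for start in range(n + 1):
        for end in range(start, min(start + p, n) + 1):
            for i in range(start, end + 1):
                for j in range(i, end + 1):
                    v, x, y = s[start:i], s[i:j], s[j:end]
                    if v or y:
                        yield s[:start], v, x, y, s[end:]


def breaking_exponents(p, exponents=(0, 2)):
    """For s = a^p b^p c^p, list which exponents break each split."""
    s = "a" * p + "b" * p + "c" * p
    results = []
    for u, v, x, y, z in splits(s, p):
        broken = {e for e in exponents
                  if not in_language(u + v * e + x + y * e + z)}
        results.append(broken)
    return results
```

With p = 3, every split is broken by at least one of the two exponents, but some splits (for example v = "a", y empty) are broken only by pumping up, and others (for example v = "c", y empty) only by pumping down.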

The REGULAR LANGUAGES Pumping Lemma Game (Your Script alternates with the Pumping Lemma’s Script) • You: “I’m giving you a language L that I’m assuming is regular.” • Pumping Lemma: “Thanks. For the language L that you’ve given me, I pick this nice pumping length I call p.” • You: “Excellent. I’m giving you this string s that I made using your p. It is in L and |s| >= p. I think you’ll really like it.” • Pumping Lemma: “Great string, thanks. I’ve cut s up into parts xyz for you. I won’t tell you what they are exactly, but I will say this: |y| > 0 and |xy| <= p. Also, you can remove y, or copy it as many times as you like, and the new string will still be in L, I promise!” • You: “Hm. I followed your directions for xyz, but when I [copy y N times or delete y], the new string is NOT in L! What happened?” • Pumping Lemma: “Well, then L wasn’t a regular language. Thanks for playing.”

The CONTEXT-FREE LANGUAGES Pumping Lemma Game (Your Script alternates with the Pumping Lemma’s Script) • You: “I’m giving you a language L that I’m assuming is context-free.” • Pumping Lemma: “Thanks. For the language L that you’ve given me, I pick this nice pumping length I call p.” • You: “Excellent. I’m giving you this string s that I made using your p. It is in L and |s| >= p. I think you’ll really like it.” • Pumping Lemma: “Great string, thanks. I’ve cut s up into parts uvxyz for you. I won’t tell you what they are exactly, but I will say this: |vy| > 0 and |vxy| <= p. Also, you can remove v and y, or copy them as many times as you like (as in u v^i x y^i z), and the new string will still be in L, I promise!” • You: “Hm. I followed your directions for uvxyz, but when I [copy v and y N times or delete v and y], the new string is NOT in L! What happened?” • Pumping Lemma: “Well, then L wasn’t a context-free language. Thanks for playing.”
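For reference, the promises made in the context-free game correspond to the usual formal statement of the lemma, written out here in LaTeX. The order of the quantifiers matches the turns of the game: the lemma chooses p, you choose s, the lemma chooses the split, and you choose i.

```latex
\textbf{Pumping lemma for context-free languages.}
If $L$ is context-free, then there is a pumping length $p \ge 1$ such that
every $s \in L$ with $|s| \ge p$ can be written as $s = uvxyz$ where
\begin{enumerate}
  \item $|vy| > 0$,
  \item $|vxy| \le p$,
  \item $u v^i x y^i z \in L$ for every $i \ge 0$.
\end{enumerate}
```

Because you choose i after the split is fixed, picking a different i for different cases of the split is a legal move.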

Review review review… Midterm Review

From last year’s Final Exam

From last year’s midterm: 5. To prove a language is not regular, which method can be used? • Show several DFAs or NFAs that almost, but not quite, recognize the language, and then conclude that no DFA or NFA can recognize the language. • Use the Pumping Lemma for regular languages. • Show a CFG that recognizes the language, and then conclude that the language is context-free, not regular. • (b) and (c) • None of the above.

(a) TRUE or (b) FALSE The following proof is valid: • A = {w | w = a^n for n>=0} is a regular language, and B = {w | w = b^n for n>=0} is a regular language because we can produce DFAs M_A and M_B for A and B, respectively (see drawings). Regular languages are closed under concatenation, therefore language AB = {w | w = a^n b^n for n>=0} is a regular language. (valid in this case means the flow of logic is sound, even though it may not do a great job of stating GIVEN/WANT TO SHOW, etc)
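When checking a proof like this one, it can help to enumerate a few strings of the concatenation AB directly from the definition (any string of A followed by any string of B) and compare them with the set the proof claims. The helper names below are illustrative, and the enumeration is bounded by an arbitrary max_len.

```python
from itertools import product


def concat_language(max_len):
    """Strings of AB where the A part and the B part each have length <= max_len."""
    A = ["a" * n for n in range(max_len + 1)]
    B = ["b" * m for m in range(max_len + 1)]
    return {x + y for x, y in product(A, B)}


def is_anbn(w):
    """Membership in { a^n b^n | n >= 0 }."""
    n = len(w) // 2
    return w == "a" * n + "b" * n


# Strings of AB that are not of the form a^n b^n.
extra = sorted(w for w in concat_language(2) if not is_anbn(w))
```

Here the two exponents in the concatenation are chosen independently, so extra is not empty: it contains strings such as "aab".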

Homework #1 Problem #4

Theory of Computation