
Computability


leola

Presentation Transcript


  1. Computability: Joke. Context-free grammars. Parsing. Chomsky. Homework: design a grammar for a [simple] computer language

  2. Proof by induction • Requires the subject domain to be classified by natural numbers: 0 or 1 or some starting point, and then all numbers following • Prove a starting case, for example, N=1 • Prove either • if it is true for k, then, FOR ALL k, you can prove it for k+1 • or, if it is true for every p<k, then, FOR ALL k, you can prove it for k • Think of the induction step as a shortcut to proving the theorem for 2, 3, 4, 5, … SO, with my screaming capital letters as a hint, what was wrong with the "All horses are the same color" proof?
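The two forms of the induction step can be written out as follows. This is a sketch that assumes the starting case is N=1 and writes P(n) for the statement being proved; the symbol P is introduced here for illustration and does not appear on the slide.

```latex
% Ordinary induction: the step must work for every k >= 1
\bigl( P(1) \;\land\; \forall k \ge 1\, [\, P(k) \Rightarrow P(k+1) \,] \bigr)
  \;\Longrightarrow\; \forall n \ge 1\; P(n)

% Strong induction: the step may use every earlier case p < k
\bigl( P(1) \;\land\; \forall k > 1\, [\, (\forall p,\ 1 \le p < k:\ P(p)) \Rightarrow P(k) \,] \bigr)
  \;\Longrightarrow\; \forall n \ge 1\; P(n)
```

In both forms the quantifier over k sits on the induction step itself, which is the point of the capital letters above.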

  3. Preview on proofs • Another typical form of proof is by construction • build / design the FSM, etc. • Another is by contradiction: assume the result is false and show that this leads to some falsehood • One category is: assume you can make a list of all Xs… then some special example must be on the list, but then …

  4. Hierarchy • Moving from languages defined by FSMs (aka finite state automata), which are equivalent to non-deterministic FSMs and to regular expressions, to • languages defined by context-free grammars, which are equivalent to [non-deterministic] push-down automata (PDAs) • It will turn out that deterministic PDAs are less powerful. • An FSM can be considered a special type of PDA

  5. Grammar • A grammar has a [finite] alphabet A (sometimes called terminals) plus a finite set V of variables. • Starting symbol S is a member of V. • A production rule is a mapping/substitution of strings • A grammar has a finite set of production rules • A context-free grammar has production rules of the form: a single variable V to a string of symbols from A and V • V → string of letters from A and V • A non-context-free production rule would be • aVb → adWb, meaning, when V is in the context of a and b, then you can substitute dW • Can combine production rules that have the same left-hand side using |

  6. Derivations • Applying the rules, starting from S, until there are no more variables, just terminals, is a derivation. • A string is in the language defined by the grammar if there is a derivation of it. • Think of the variables as the parts of speech.

  7. Example • Let the alphabet of terminals be: (, ), +, *, v, w, x, y, z • Let the variables be • E, the starting symbol, think of it as expression • F, factor • OP, operator (I use two letters for readability) • Rules are • E → ( E ) | E OP E | F • F → v | w | x | y | z • OP → + | *

  8. Sample derivation • Start with E and apply, in order: E → E OP E, E → ( E ), E → E OP E, E → F, F → v, OP → +, E → F, F → w, etc. FINISH! • Draw as a tree. Trees in computer science are upside down!

  9. Parsing a string is producing a set of rules, often recorded using a tree, that derive (cover) the string. So for the string (x+y): E → ( E ), E → E OP E, E → F → x, OP → +, E → F → y
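The parse of (x+y) can be replayed mechanically. The sketch below encodes the example grammar as a Python dictionary and applies the rules one at a time, always to the leftmost variable (an ordering chosen here for definiteness). The names `RULES`, `apply_leftmost`, `replay`, and `STEPS` are made up for this illustration and are not part of the slides.

```python
# Example grammar from the slides. Each rule body is a list of symbols so
# that the two-letter variable OP stays a single symbol.
RULES = {
    "E": [["(", "E", ")"], ["E", "OP", "E"], ["F"]],
    "F": [["v"], ["w"], ["x"], ["y"], ["z"]],
    "OP": [["+"], ["*"]],
}

def apply_leftmost(form, variable, body):
    """Replace the leftmost variable in `form`; it must be `variable`."""
    for i, symbol in enumerate(form):
        if symbol in RULES:  # first variable found, scanning left to right
            if symbol != variable:
                raise ValueError(f"leftmost variable is {symbol}, not {variable}")
            if body not in RULES[variable]:
                raise ValueError(f"{variable} -> {body} is not a rule")
            return form[:i] + body + form[i + 1:]
    raise ValueError("no variables left")

def replay(steps, start="E"):
    """Apply the steps in order and return every intermediate form."""
    form = [start]
    history = [form]
    for variable, body in steps:
        form = apply_leftmost(form, variable, body)
        history.append(form)
    return history

# The rules used in the parse of (x+y), in leftmost order.
STEPS = [
    ("E", ["(", "E", ")"]),
    ("E", ["E", "OP", "E"]),
    ("E", ["F"]),
    ("F", ["x"]),
    ("OP", ["+"]),
    ("E", ["F"]),
    ("F", ["y"]),
]

history = replay(STEPS)
for form in history:
    print(" ".join(form))
```

The last line printed contains only terminals, and joining them gives the string (x+y). Changing any step to a rule that is not in the grammar raises an error instead of silently producing a string.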

  10. Parsing • If there isn't a parse tree, then the string isn't in the language, though it may require some proof…

  11. Derivation vs Parsing • Opposite directions • The goal of parsing is to find a derivation that generates the string. • In compiling, parsing produces information that directs the compiler to generate code.

  12. Exercises Produce the tree(s) for • x • x + (v*w) • x + y * z • (x*y)+(v*w) • When are trees the same and when are they different? • ambiguity is when the trees are really different, not just expanded in a different order. This will be made formal next.

  13. Leftmost derivation • A derivation of a string w in a grammar G is a leftmost derivation if at each step the leftmost remaining variable is the one replaced. • A string is derived ambiguously in a CF grammar if it has two or more different leftmost derivations. A grammar G is ambiguous if it generates some string ambiguously.

  14. Compare for ambiguity • Variables E, T (for term), F (for factor), alphabet { a, +, *, (, ) }. Rules: E → E+T | T, T → T*F | F, F → (E) | a • Variables E, alphabet { a, +, *, (, ) }. Rules: E → E+E | E*E | (E) | a • Try each on the strings: a+a*a, a+(a*a), (a+a)*a, a+a+a+a
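One way to compare the two grammars is to count parse trees for each test string, since a string is derived ambiguously exactly when it has more than one parse tree. The following is a minimal sketch, not a general parser: it assumes every symbol is a single character, that no rule has an empty body, and that unit rules such as E → T do not form a cycle, all of which hold for the two grammars above. The names `make_counter`, `count_trees`, `UNAMBIGUOUS`, and `AMBIGUOUS` are invented for this example.

```python
from functools import lru_cache

UNAMBIGUOUS = {"E": ["E+T", "T"], "T": ["T*F", "F"], "F": ["(E)", "a"]}
AMBIGUOUS = {"E": ["E+E", "E*E", "(E)", "a"]}

def make_counter(rules, text):
    """Return count(symbol, i, j): number of parse trees for text[i:j]."""

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        if symbol not in rules:  # terminal: must match the text exactly
            return 1 if text[i:j] == symbol else 0
        return sum(ways(tuple(body), i, j) for body in rules[symbol])

    @lru_cache(maxsize=None)
    def ways(body, i, j):
        """Number of ways the symbols of body jointly derive text[i:j]."""
        if len(body) == 1:
            return count(body[0], i, j)
        # Each symbol derives at least one character, so the first symbol
        # must leave len(body) - 1 characters for the remaining symbols.
        return sum(
            count(body[0], i, k) * ways(body[1:], k, j)
            for k in range(i + 1, j - len(body) + 2)
        )

    return count

def count_trees(rules, text, start="E"):
    return make_counter(rules, text)(start, 0, len(text))

for s in ["a+a*a", "a+(a*a)", "(a+a)*a", "a+a+a+a"]:
    print(s, count_trees(UNAMBIGUOUS, s), count_trees(AMBIGUOUS, s))
```

For a+a*a the first grammar yields one tree and the second yields two, and for a+a+a+a the counts are one and five. The single-variable grammar is therefore ambiguous, while on these strings the E, T, F grammar is not.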

  15. Regular languages • All regular languages are context-free languages! • Proof: Consider the FSM that recognizes a language. Define the following context-free grammar: • the alphabet for the FSM is the terminal alphabet • let each state of the FSM be a variable. Let the initial state be the initial variable. • Rules are: if there is an arrow from state V to state W labeled with letter a, then add the production rule V → aW. If state X is an accepting state, add the rule X → ∊ • So… the strings generated by the grammar are exactly the strings in the language.
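The construction in this proof is short enough to write down directly. The sketch below assumes a deterministic FSM given as a dictionary from (state, letter) to next state; the function names and the sample machine, which accepts strings over {0, 1} with an even number of 1s, are hypothetical examples chosen for illustration.

```python
def fsm_to_grammar(transitions, start, accepting):
    """Build the grammar of the proof: states become variables."""
    rules = {}
    for (state, letter), target in transitions.items():
        rules.setdefault(state, []).append((letter, target))  # V -> aW
        rules.setdefault(target, [])
    rules.setdefault(start, [])
    for state in accepting:
        rules.setdefault(state, []).append(())  # X -> empty string
    return start, rules

def generates(grammar, text):
    """Decide whether the grammar derives text.

    Every sentential form is some terminals followed by one variable, so
    it is enough to track which variables can follow the prefix read so far.
    """
    start, rules = grammar
    current = {start}
    for letter in text:
        current = {
            body[1]
            for variable in current
            for body in rules[variable]
            if body and body[0] == letter
        }
    return any(() in rules[variable] for variable in current)

# Hypothetical sample FSM: accepts strings with an even number of 1s.
TRANSITIONS = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}
grammar = fsm_to_grammar(TRANSITIONS, "even", {"even"})
print(grammar[1])
print(generates(grammar, "0110"), generates(grammar, "010"))
```

The resulting rules have the shape V → aW or X → ∊ only, which is why checking a string against the grammar amounts to running the original FSM.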

  16. CF languages • Each regular language is CF, but not vice versa… • Recall B = {0^i 1^i | i >= 0}. B is the set of strings with some number of 0s followed by the same number of 1s. This was shown to be non-regular. Let the grammar be S → 0S1 | ∊
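The grammar for B translates almost word for word into a recursive membership test that runs each rule backwards. This is a small sketch; the function name `in_B` is made up for the example.

```python
def in_B(s):
    """Decide membership in B by undoing S -> 0S1 until S -> empty applies."""
    if s == "":
        return True  # S -> empty string
    if len(s) >= 2 and s[0] == "0" and s[-1] == "1":
        return in_B(s[1:-1])  # S -> 0S1: strip the outer 0 and 1
    return False

for s in ["", "01", "0011", "001", "0101"]:
    print(repr(s), in_B(s))
```

The recursion is what lets the test match each 0 with a 1, which a finite state machine has no way to do for unbounded i.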

  17. Chomsky normal form • A CF grammar is in Chomsky normal form if each rule is of the form • A → BC • A → a, where A, B, and C are variables, a is any terminal, and B and C are not the start variable S. It is permitted (but not required) to have the rule S → ∊, but no other variable can produce the empty string. • There are several other normal forms.

  18. Outline of proof for Chomsky NF • Any context-free language can be generated by a grammar in Chomsky normal form. • Create a new start variable to prevent the start variable from being on the right • Eliminate A → ∊ rules. If there is a rule R → uAv, add rule R → uv. If R → uAvAw, add R → uvAw | uAvw | uvw • Remove unit rules A → B. If B → u, then add A → u (unless this was a unit rule previously removed) • If A → u1u2…uk and k>=3, add new variables A1, A2, … and replace with A → u1A1, A1 → u2A2, etc. • In a rule such as A → u1u2, replace each terminal ui with a new variable Ui and add Ui → ui, for example A → U1U2, U1 → u1, and U2 → u2 • Read on-line, Sipser text on reserve, videos, etc. for complete proof.

  19. Example: B = {0^i 1^i | i >= 0} • Convert S → 0S1 | ∊ to CNF • add new start and new rule: S0 → S • remove S → ∊ and add S → 01 | 0S1 and S0 → ∊ • replace unit rule (S0 → S): S0 → 01 | 0S1 | ∊ and S → 01 | 0S1 • address other problems by creating new variables: S0 → A0A1 | A0A3 | ∊, S → A0A1 | A0A3, A0 → 0, A1 → 1, A3 → SA1 • Does this work (produce strings in the pattern)? Claim: yes, because notice that an A3 only arises if there was an A0 before it.
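The claim can also be checked by machine for short strings. One standard membership test for grammars in Chomsky normal form is the CYK algorithm, which the slides do not cover; it is brought in here only as a checking tool. The sketch below hard-codes the CNF grammar from this slide, and the names `BINARY`, `TERMINAL`, and `cyk` are invented for the example.

```python
# CNF grammar from the slide, split into its two kinds of rules.
BINARY = {  # rules of the form A -> B C
    "S0": [("A0", "A1"), ("A0", "A3")],
    "S": [("A0", "A1"), ("A0", "A3")],
    "A3": [("S", "A1")],
}
TERMINAL = {"A0": "0", "A1": "1"}  # rules of the form A -> a

def cyk(text, start="S0"):
    """Return True if the CNF grammar above derives text."""
    n = len(text)
    if n == 0:
        return True  # covered by the rule S0 -> empty string
    # table[i][length] holds the variables that derive text[i:i+length]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(text):
        table[i][1] = {v for v, a in TERMINAL.items() if a == ch}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for variable, bodies in BINARY.items():
                    if any(b in left and c in right for b, c in bodies):
                        table[i][length].add(variable)
    return start in table[0][n]

for s in ["", "01", "0011", "000111", "001", "0101", "10"]:
    print(repr(s), cyk(s))
```

The strings of the form 0^i 1^i are accepted and the others in the list are rejected, which is consistent with the claim without replacing the proof.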

  20. Intuition…. • Context free grammars appear to be able to keep track of things…. • Even the leftmost derivation rule still has something like recursion.

  21. Preview • Will define push-down automata, a type of machine equivalent to context-free grammars for defining languages • Pumping lemma

  22. Classwork/Homework • Create a grammar for a simple programming language: • assignment statements • if statements • function calls • expressions can include function calls as well as operators and parentheses • terminals are names and numbers (lexical units) plus operators (+ and *) and parentheses, brackets, and ;
