1 / 25

Lecture 16 Oct 18

Lecture 16 Oct 18. Context-Free Languages (CFL) - basic definitions Examples. Context-Free Languages (Ch. 2). Context-free languages allow us to describe non-regular languages like { 0 n 1 n | n  0}.

bliss
Download Presentation

Lecture 16 Oct 18

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Lecture 16 Oct 18 • Context-Free Languages (CFL) - basic definitions • Examples

  2. Context-Free Languages (Ch. 2) Context-free languages allow us to describe non-regular languages like { 0n1n | n0} General idea: CFL’s are languages that can be recognized by automata that have one stack: { 0n1n | n0} is a CFL { 0n1n0n | n0} is not a CFL

  3. Context-Free Grammars Start symbol S with rewrite rules: 1) S  0S1 2) S  e S yields 0n1n : S  0S1  00S11  …  0nS1n  0n1n

  4. Context-Free Grammars (Def.) • A context free grammar G=(V,,R,S) is defined by • V: a finite set variables (non-terminals) • : finite set terminals (with V=) • R: finite set of substitution rules V  (V)* • S: start symbol V • The language of grammar G is denoted by L(G): • L(G) = { w* | S * w }

  5. Derivation * A single step derivation “” consist of the substitution of a variable by a string accordingto a substitution rule. Example: with the rule “ABB”, we can havethe derivation “01AB0  01BBB0”. A sequence of several derivations (or none)is indicated by “ * ”Same example: “0AA * 0BBBB”

  6. Some Remarks The language L(G) = { w* | S * w } contains only strings of terminals, not variables. Notation: we summarize several rules, like A  B A  01 by A  B | 01 | AA A  AA Unless stated otherwise: topmost rule has the start variable on the left side.

  7. Context-Free Grammars (Ex.) Consider the CFG G=(V,,R,S) with V = {S}  = {0,1} R: S  0S1 | 0Z1 Z 0Z |  Then L(G) = {0i1j | ij and j > 0} S yields0j+k1j according to:S  0S1  …  0jS1j  0jZ1j  0j0Z1j …  0j+kZ1j  0j+k1j = 0j+k1j

  8. Importance of CFL • Model for natural languages (Chomsky) • Specification of programming languages: “parsing of a computer program” • parser for HTML (and some special cases of SGML) • Describes mathematical structures. • Intermediate between regular languages and other language families of Chomsky hierarchy

  9. Set of boolean expressions is a CFL Consider the CFG G=(V,,R,S) with V = {S}  = {0,1,(,),,,} R: S  0 | 1 | (S) | (S)(S) | (S)(S) Some elements of L(G): 0 (((0))(1)) (1)((0)(0)) Note: Parentheses prevent “100” confusion. This language requires full-parenthesizing.

  10. A very small subset of English Rules:<SENTENCE>  <NOUN-PHRASE><VERB-PHRASE> <NOUN-PHRASE>  <CMPLX-NOUN> | <CMPLX-NOUN><PREP-PHRASE><VERB-PHRASE>  <CMPLX-VERB> | <CMPLX-VERB><PREP-PHRASE> <CMPLX-NOUN>  <ARTICLE><NOUN> <CMPLX-VERB>  <VERB> | <VERB><NOUN-PHRASE> … <ARTICLE>  a | the<NOUN>  boy | girl | house<VERB>  sees | ignores A string that can be generated by this grammar: the boy sees the girl

  11. ( )  S ( ) S 0 S ( ) ( ) S 0 1 Parse Trees The parse tree of (0)((0)(1)) via rule S  0 | 1 | (S) | (S)(S) | (S)(S): S

  12. Ambiguity A grammar is ambiguous if some strings are derived ambiguously. A string is derived ambiguously if it has more than one leftmost derivations or more than one parse tree. Typical example: rule S  0 | 1 | S+S | SS S  S+S  SS+S  0S+S  01+S  01+1 versus S  SS  0S  0S+S  01+S  01+1

  13. S S S  S S + S 0 1 S S  + S S 0 1 1 1 Ambiguity and Parse Trees The ambiguity of 01+1 is shown by the two different parse trees:

  14. More on Ambiguity The two different derivations: S  S+S  0+S  0+1and S  S+S  S+1  0+1 do not constitute an ambiguous string 0+1 (they will have the same parse tree) Languages that can only be generated by ambiguous grammars are “inherently ambiguous”

  15. Context-Free Languages Any language that can be generated by a contextfree grammar is a context-free language (CFL). The CFL { 0n1n | n0 } shows us that certain CFLs are nonregular languages. Q1: Are all regular languages context free?Q2: Which languages are outside the class CFL?

  16. Example. A context-free grammar for the set of strings over {0,1} with an equal number of 0’s and 1’s. We need a grammar for the language L = { w | w has an equal number of 0’s and 1’s} Consider the grammar: S  0 S 1 | 1 S 0 | e This does not cover all the strings. Exhibit a string in L that is not generated by G.

  17. We need to add the rule S  SS The complete grammar is: S 0S1 | 1S0 | e | SS How do we get a string like 011010? S  SS  S S S  0S1SS  01SS 011S0S  0110S  01101S0  011010 How do we show that L = L(G)? We need the following Lemma: Let x be a string in L. Then, (exactly) one of the following holds: (a) x = 0 y 1 for some y in L, (b) x = 1 y 0 for some y in L, or (c) x = yz for some y and z (both of them non-null) such that both y and z are in L.

  18. Chomsky Normal Form A context-free grammar G = (V,,R,S) is in Chomsky normal form if every rule is of the form A  BC or A  x with variables AV and B,CV \{S}, and x  For the start variable S we also allow the rule S   Advantage: Grammars in this form are far easier to analyze.

  19. Theorem 2.6 Every context-free language can be describedby a grammar in Chomsky normal form. Outline of Proof:We rewrite every CFG in Chomsky normal form.We do this by replacing, one-by-one, every rulethat is not in Chomsky Normal Form (CNF). We have to take care of: Starting Symbol,  symbol, all other violating rules.

  20. Proof Theorem 2.6 Given a context-free grammar G = (V,,R,S),rewrite it to Chomsky Normal Form by 1) New start symbol S0 (and add rule S0S) 2) Remove A rules (from the tail): before: BxAy and A, after: B xAy | xy 3) Remove unit rules AB (by the head): “AB” and “BxCy”, becomes “AxCy” and “BxCy” 4) Shorten all rules to two: before: “AB1B2…Bk”, after: AB1A1, A1B2A2,…, Ak-2Bk-1Bk 5) Replace ill-placed terminals “a” by Ta with Taa

  21. Careful Removing of Rules Do not introduce new rules that you removedearlier. Example: AA simply disappears When removing A rules, insert all newreplacements: BAaA becomes B AaA | aA | Aa | a

  22. Example of CNF Initial grammar: S aSb |  In Chomsky normal form: S0   | TaTb | TaX X  STbS  TaTb | TaX Ta  a Tb  b

  23. RL  CFL Every regular language can be expressed by a context-free grammar. Proof Idea:Given a DFA M = (Q,,,q0,F), we construct a corresponding CF grammar GM = (V,,R,S)with V = Q and S = q0 Rules of GM: qi x(qi,x) for all qiV and all xqi  for all qiF

  24. 0 1 1 0 q1 q2 q3 0,1 Example RL  CFL The DFA leads to thecontext-free grammarGM = (Q,,R,q1) with the rules q1 0q1 | 1q2 q2 0q3 | 1q2 |  q3 0q2 | 1q2

  25. Picture Thus Far ?? context-free languages Regularlanguages { 0n1n }

More Related