220 likes | 396 Views
CSI 3104 /Winter 2006 : Introduction to Formal Languages Chapter 13: Grammatical Format. Chapter 13: Grammatical Format I. Theory of Automata II. Theory of Formal Languages III. Theory of Turing Machines …. Chapter 13: Grammatical Format.
E N D
CSI 3104 /Winter 2006: Introduction to Formal Languages Chapter 13: Grammatical Format Chapter 13: Grammatical Format I. Theory of Automata II. Theory of Formal Languages III. Theory of Turing Machines … Dr. Nejib Zaguia CSI3104-W06
Chapter 13: Grammatical Format Theorem. All regular languages are context-free languages. Proof. We show that for any FA, there is a CFG such that the language generated by the grammar is the same as the language accepted by the FA. By constructive algorithm. Input: a finite automaton. Output: a CF grammar. Dr. Nejib Zaguia CSI3104-W06
a X X Y a Chapter 13: Grammatical Format • The alphabet of terminals is the alphabet of the FA. • Nonterminals are the state names. (The start state is renamed S.) • For every edge, create a production: X aY X aX • For every final state, create a production: • X Λ Dr. Nejib Zaguia CSI3104-W06
b a,b b a b b a a b a a S → S → M → S → S → M → F → F → F S– F+ a M b Chapter 13: Grammatical Format Example: S aM | bS M aF | bS F aF | bF | Λ babbaaba S bS baM babS babbSbabbaMbabbaaF babbaabF babbaabaF babbaaba Dr. Nejib Zaguia CSI3104-W06
Chapter 13: Grammatical Format Definition: A semiword is a sequence of terminals (possibly none) followed by exactly one nonterminal. (terminal)(terminal)…(terminal)(Nonterminal) Definition: A CFG is a regular grammar if all of its productions have the form: Nonterminal semiword or Nonterminal word (a sequence of terminals or Λ) Dr. Nejib Zaguia CSI3104-W06
wy wq Nx Np Nz + Chapter 13: Grammatical Format Theorem. All languages generated by regular grammars are regular. Proof. By constructive algorithm. We build a transition graph. • The alphabet of the transition graph is the set of terminals. • One state for each nonterminal. The state named S is the start state. We add one final state +. • Transitions: Nx wyNz Np wq Dr. Nejib Zaguia CSI3104-W06
aa L (aa+bb)* – + bb wq Np + Chapter 13: Grammatical Format Example: S aaSS bbSS L Alternative Algorithm: Np wq whenever wq is not Λ For each transition of the form NΛ, we mark the state for N with +.. Dr. Nejib Zaguia CSI3104-W06
aa,bb aa,bb ab ba X – ab L ba + Chapter 13: Grammatical Format Example S aaS | bbS|abX | baX | Λ X aaX | bbX | abS | baS EVEN-EVEN Dr. Nejib Zaguia CSI3104-W06
Chapter 13: Grammatical Format • If the empty word is in the language, a production of the form NΛ(called a Λ-production) is necessary. • The existence of a production of the form N Λ does not necessarily mean that Λ is a part of the language. S aXS aX Λ Theorem. Let L be a language generated by a CFG. There exists a CFG without productions of the form X Λ such that: • If L, L is generated by the new grammar. • If L, all words of L except for are generated by the new grammar. Dr. Nejib Zaguia CSI3104-W06
Proof: For each N Λ and all X smt1 N smt2 include production X smt1 smt2 and eliminate N Λ. • Example: S aSa | bSb | Λ • L-{Λ} generated with • S aSa | bSb | aa | bb Dr. Nejib Zaguia CSI3104-W06
Problem: Creating new Λ productions • Example: S a | Xb | aYa X Y | Λ Y b | X • S a | Xb | b | aYa X Y Y b | X | Λ • Definition: N nullable if N … Λ • In example, X and Y are nullable Y XΛ • Modified replacement rule • Include X smt1 smt2 for each • X smt1 N smt2 and N nullable, • Delete all (old and new) Λ-productions • Ex: S a | Xb | aYa | b | aa X Y Y b | X • How to find all nullable nonterminals in a CFG? Dr. Nejib Zaguia CSI3104-W06
Chapter 13: Grammatical Format Definition. A production of the form: Nonterminal Nonterminalis called a unit production. Theorem. Let L be a language generated by a CFG that has no Λ-productions. Then there is another CFG without Λ -productions and without unit productions that generates L. Proof: A B B S1 | S2 |.. Replace with A S1 | S2 | … Does it always work ? Example: S A | bb A B | b Dr. Nejib Zaguia CSI3104-W06
Modified elimination rule • Whenever A X1 X2 … B introduce • A S1|S2|S3 | …for all B S1|S2|S3 |.. • And eliminate all unit productions • Example: S A | bb A B | b B S | a • S bb | b | a A b | a | bb B a | bb | b Dr. Nejib Zaguia CSI3104-W06
Theorem. Let L be a language generated by a CFG. Then there exists another CFG that generates all words of L (except Λ) such that all the productions are of the form: Nonterminal sequence of Nonterminals Nonterminal one terminal • Proof: Introduce Aa Bb … and replace a, b,… with A, B,.. in all productions • Example: SX|XaY|aSb|b XYY|b YaY|aaX • becomes? Dr. Nejib Zaguia CSI3104-W06
Chapter 13: Grammatical Format Definition. A CFG in is said to be in Chomsky Normal Form (CNF) if all the productions have the form: Nonterminal (Nonterminal)(Nonterminal) Nonterminal terminal Theorem. Let L be a language generated by a CFG. There there is another grammar which is in CNF that generates all the words of L (except Λ). Dr. Nejib Zaguia CSI3104-W06
CNF construction • Proof: following previous theorem, one can: • Eliminate all Λ-productions • Eliminate all unit productions • Reduce to X X1X2…Xn and Xterminal • Replace first type, if n>2, with • XX1R1 R1X2R2 R2X3R3 … Rn-3Xn-2Rn-2 Rn-2 Xn-1Xn Dr. Nejib Zaguia CSI3104-W06
Chapter 13: Grammatical Format Definition. In a sequence of terminals and nonterminals in a derivation (called a working string), if there is at least one nonterminal, then the first one is called the leftmost nonterminal. Definition.A leftmost derivation is a derivation where at each step, a production is applied to the leftmost nonterminal in the working string. Example.S aSX | b X Xb | a S aSX aaSXX aabXX aabXbX aababX aababa Dr. Nejib Zaguia CSI3104-W06
S S X Y X Y X X Y Y X X Y Y X X X X a a a b b a a a b b Chapter 13: Grammatical Format S XY X XX | a Y YY | b Example aaabb Derivation I Derivation II Dr. Nejib Zaguia CSI3104-W06
S X Y X X Y Y X X a a a b b Chapter 13: Grammatical Format S XY XXY aXY aXXY aaXY aaaY aaaYY aaabY aaabb Dr. Nejib Zaguia CSI3104-W06
S X Y X X Y Y X X a a a b b Chapter 13: Grammatical Format S XY XXY XXXY aXXY aaXY aaaY aaaYY aaabY aaabb Dr. Nejib Zaguia CSI3104-W06
S ) ( S S S ) ( p S S S S ~ q p Chapter 13: Grammatical Format Theorem. Any word that is in the language generated by a CFG has a leftmost derivation. Proof: do a preorder traversal in the derivation tree Example.S (S) | S S | ~S | p | q S (S) (S S) (p S) (p (S)) (p (S S)) (p (~S S)) (p (~p S)) (p (~p q)) Dr. Nejib Zaguia CSI3104-W06