
AUTOMATA THEORY

AUTOMATA THEORY. Chapter 05. CONTEXT-FREE GRAMMARS AND LANGUAGES. Introduction. Context-free grammars (CFGs) have played a central role in compiler technology since the 1960s. They turned the implementation of parsers from a time-consuming, ad-hoc implementation task into a routine job.

wynne




Presentation Transcript


  1. AUTOMATA THEORY

  2. Chapter 05 CONTEXT-FREE GRAMMARS AND LANGUAGES

  3. Introduction Context-free grammars (CFGs) have played a central role in compiler technology since the 1960s. They turned the implementation of parsers from a time-consuming, ad-hoc implementation task into a routine job. Parsers: functions that discover the structure of a program.

  4. An informal example Let us consider the language of palindromes. A palindrome is a string that reads the same forward and backward, such as otto or madamimadam. Let us describe only the palindromes over the alphabet {0,1}. Examples: 0110, 11011.

  5. A Context-free Grammar for Palindromes P → ε, P → 0, P → 1, P → 0P0, P → 1P1. Only for binary strings.

  6. Definition of CFG A CFG is a way of describing a language by recursive rules called productions. A CFG consists of: a finite set of terminal symbols (terminals); a finite set of variables (nonterminals); a start symbol (start variable); a finite set of productions (rules).

  7. Definition of CFG (continued) Each production consists of: the head of the production, a variable; the production symbol →; the body of the production, a string of zero or more terminals and variables.

  8. Definition of CFG (continued) The four components of a CFG G can be represented as follows: G = (V, T, P, S), where V is the set of variables, T the set of terminals, P the set of productions, and S the start variable.

  9. A Context-free Grammar for Palindromes The grammar G_pal for the palindromes is represented by G_pal = ({P}, {0,1}, A, P), where A represents the set of five productions: 1. P → ε 2. P → 0 3. P → 1 4. P → 0P0 5. P → 1P1. Only for binary strings.
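The five productions translate almost directly into a recursive membership test. The following is a minimal sketch, not part of the original slides; the function name `derives_palindrome` is an illustrative assumption.

```python
def derives_palindrome(w: str) -> bool:
    """Return True if w can be derived from P using the five productions."""
    # P -> ε | 0 | 1: the three non-recursive productions
    if w in ("", "0", "1"):
        return True
    # P -> 0P0 | 1P1: the first and last symbols match, the middle derives from P
    if len(w) >= 2 and w[0] == w[-1] and w[0] in ("0", "1"):
        return derives_palindrome(w[1:-1])
    return False


print(derives_palindrome("0110"))   # True
print(derives_palindrome("11011"))  # True
print(derives_palindrome("0111"))   # False
```

Each recursive call undoes one application of P → 0P0 or P → 1P1, so the recursion depth is about half the length of the input.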

  10. Example of CFG A CFG for simple expressions in which the operators ‘+’ and ‘*’ are present. Identifiers use only the letters ‘a’ and ‘b’ and the digits ‘0’ and ‘1’. Every identifier must begin with a or b, which may be followed by any string in {a,b,0,1}*. G = ({E,I}, T, P, E), T = {0,1,a,b,+,*,(,)}. Productions: 1. E → I 2. E → E+E 3. E → E*E 4. E → (E) 5. I → a 6. I → b 7. I → Ia 8. I → Ib 9. I → I0 10. I → I1

  11. Derivation using grammar Deriving (ab+ab0); the number in brackets after each step is the production used: E ⇒ (E) [4] ⇒ (E+E) [2] ⇒ (I+E) [1] ⇒ (Ib+E) [8] ⇒ (ab+E) [5] ⇒ (ab+I) [1] ⇒ (ab+I0) [9] ⇒ (ab+Ib0) [8] ⇒ (ab+ab0) [5]. Productions: 1. E → I 2. E → E+E 3. E → E*E 4. E → (E) 5. I → a 6. I → b 7. I → Ia 8. I → Ib 9. I → I0 10. I → I1
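The derivation of (ab+ab0) can be replayed mechanically from its sequence of production numbers. This is a hedged sketch: the names `PRODUCTIONS`, `apply_production` and `replay` are illustrative, and the code assumes every step rewrites the leftmost occurrence of the production's head, which happens to hold for this particular derivation.

```python
# Productions of the expression grammar, numbered 1 to 10 as on the slide.
PRODUCTIONS = {
    1: ("E", "I"), 2: ("E", "E+E"), 3: ("E", "E*E"), 4: ("E", "(E)"),
    5: ("I", "a"), 6: ("I", "b"), 7: ("I", "Ia"), 8: ("I", "Ib"),
    9: ("I", "I0"), 10: ("I", "I1"),
}


def apply_production(form: str, number: int) -> str:
    """Rewrite the leftmost occurrence of the production's head in `form`."""
    head, body = PRODUCTIONS[number]
    i = form.index(head)  # raises ValueError if the head does not occur
    return form[:i] + body + form[i + 1:]


def replay(start: str, numbers: list[int]) -> list[str]:
    """Return every sentential form produced by applying `numbers` in order."""
    forms = [start]
    for n in numbers:
        forms.append(apply_production(forms[-1], n))
    return forms


steps = replay("E", [4, 2, 1, 8, 5, 1, 9, 8, 5])
print(" => ".join(steps))
```

The last sentential form printed is (ab+ab0), the string being derived.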

  12. Example of CFG A CFG for syntactically correct infix algebraic expressions in the variables x, y and z. G=({S},T,P,S) T={x , y, z,-,+,*,/,(,)} productions: S → x S → y S → z S → S + S S → S - S S → S * S S → S / S S → ( S )

  13. Derivation using grammar productions: S → x S → y S → z S → S + S S → S - S S → S * S S → S / S S → ( S )

  14. An informal example

  15. An example of CFG

  16. An example of CFG

  17. LMD and RMD LMD (leftmost derivation): at each step we replace the leftmost variable by one of its production bodies. We indicate that a derivation is leftmost by using the relations ⇒_lm and ⇒*_lm, for one or many steps, respectively. RMD (rightmost derivation): at each step we replace the rightmost variable by one of its production bodies. We indicate that a derivation is rightmost by using the relations ⇒_rm and ⇒*_rm, for one or many steps, respectively.

  18. Left Most Derivation CFG: E → I | E+E | E*E | (E), I → a | b | Ia | Ib | I0 | I1. LMD of a*(a+b00): E ⇒_lm E*E ⇒_lm I*E ⇒_lm a*E ⇒_lm a*(E) ⇒_lm a*(E+E) ⇒_lm a*(I+E) ⇒_lm a*(a+E) ⇒_lm a*(a+I) ⇒_lm a*(a+I0) ⇒_lm a*(a+I00) ⇒_lm a*(a+b00)

  19. Right Most Derivation CFG: E → I | E+E | E*E | (E), I → a | b | Ia | Ib | I0 | I1. RMD of a*(a+b00): E ⇒_rm E*E ⇒_rm E*(E) ⇒_rm E*(E+E) ⇒_rm E*(E+I) ⇒_rm E*(E+I0) ⇒_rm E*(E+I00) ⇒_rm E*(E+b00) ⇒_rm E*(I+b00) ⇒_rm E*(a+b00) ⇒_rm I*(a+b00) ⇒_rm a*(a+b00)
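Leftmost and rightmost derivations differ only in which variable is expanded at each step, which a small sketch can make concrete. The tree encoding below (a variable is a `(label, children)` tuple, a terminal is a plain string) and the function names are assumptions made for illustration only.

```python
def sentential(frontier):
    """Spell out a frontier: variables by their label, terminals as themselves."""
    return "".join(n if isinstance(n, str) else n[0] for n in frontier)


def derivation(tree, leftmost=True):
    """List the sentential forms obtained by always expanding the leftmost
    (or rightmost) variable, following the structure of `tree`."""
    frontier = [tree]
    forms = [sentential(frontier)]
    while True:
        variables = [i for i, n in enumerate(frontier) if not isinstance(n, str)]
        if not variables:
            return forms
        i = variables[0] if leftmost else variables[-1]
        frontier[i:i + 1] = frontier[i][1]  # replace the variable by its children
        forms.append(sentential(frontier))


# Parse tree of a*(a+b00) in the grammar E -> I | E+E | E*E | (E), I -> a | b | Ia | Ib | I0 | I1
I_a = ("I", ["a"])
I_b00 = ("I", [("I", [("I", ["b"]), "0"]), "0"])
tree = ("E", [("E", [I_a]), "*",
              ("E", ["(", ("E", [("E", [I_a]), "+", ("E", [I_b00])]), ")"])])

print(" => ".join(derivation(tree, leftmost=True)))
print(" => ".join(derivation(tree, leftmost=False)))
```

Both calls start at E and end at a*(a+b00); only the order of the intermediate sentential forms differs.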

  20. The Language of a Grammar If G = (V,T,P,S) is a CFG, the language of G, denoted L(G), is the set of terminal strings that have derivations from the start symbol. That is, L(G) = {w in T* | S ⇒*_G w}. If a language L is the language of some context-free grammar, then L is said to be a context-free language, or CFL.
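The definition of L(G) suggests a brute-force way to list its short strings: explore sentential forms breadth-first and keep those made only of terminals. The sketch below is illustrative only. It assumes single-character variables, and it terminates only for grammars (such as the palindrome grammar used here) in which the number of variables in a sentential form stays bounded once the number of terminals is bounded.

```python
from collections import deque


def language_up_to(productions, start, max_len):
    """Return all terminal strings of length <= max_len derivable from `start`.

    `productions` maps each variable (one character) to a list of bodies."""
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        # Position of the leftmost variable, or None if the form is all terminals.
        pos = next((i for i, c in enumerate(form) if c in productions), None)
        if pos is None:
            words.add(form)
            continue
        for body in productions[form[pos]]:
            new = form[:pos] + body + form[pos + 1:]
            terminals = sum(1 for c in new if c not in productions)
            if terminals <= max_len and new not in seen:  # prune forms that are too long
                seen.add(new)
                queue.append(new)
    return words


palindromes = {"P": ["", "0", "1", "0P0", "1P1"]}
words = language_up_to(palindromes, "P", 3)
print(sorted(words, key=lambda w: (len(w), w)))
# ['', '0', '1', '00', '11', '000', '010', '101', '111']
```

Expanding only the leftmost variable loses nothing here, since every terminal string with a derivation also has a leftmost derivation.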

  21. Parse Tree A tree representation for derivations, which shows clearly how the symbols of a terminal string are grouped into substrings. Parse trees are used in compilers as the data structure that represents the source program. In a compiler, the tree structure of the source program facilitates the translation of the source program into executable code by allowing natural, recursive functions to perform this translation process. A parse tree is a graphical representation of a derivation.

  22. Constructing Parse Tree Let us fix on a grammar G = (V,T,P,S). The parse trees for G are trees with the following conditions: Each interior node is labeled by a variable in V. Each leaf is labeled by either a variable, a terminal, or ε; if a leaf is labeled ε, it must be the only child of its parent. If an interior node is labeled A, and its children are labeled X1, X2, …, Xk respectively, from the left, then A → X1X2…Xk is a production in P.

  23. Parse Tree Example A parse tree showing the derivation of I+E from E. Tree (bracket notation, children listed left to right): E[ E[ I ] + E ]

  24. Parse Tree Example (continued) A parse tree showing the derivation P ⇒* 0110, using the productions P → ε | 0 | 1 | 0P0 | 1P1. Tree (bracket notation, children listed left to right): P[ 0 P[ 1 P[ ε ] 1 ] 0 ]

  25. The Yield of a Parse Tree If we look at the leaves of any parse tree and concatenate them from the left, we get a string called the yield of the parse tree, which is always a string that is derived from the root variable. Of special importance are the parse trees in which: the yield is a terminal string, that is, all leaves are labeled either with a terminal or with ε; the root is labeled by the start symbol.
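Computing a yield is a left-to-right traversal of the leaves. The following minimal sketch uses an assumed encoding that is not from the slides: a variable node is a `(label, children)` tuple, a terminal leaf is a one-character string, and an ε leaf is the empty string. The helper `is_parse_tree` checks the condition that every interior node together with its children forms a production.

```python
def tree_yield(node) -> str:
    """Concatenate the leaf labels from left to right."""
    if isinstance(node, str):
        return node  # a terminal, or "" standing for ε
    _label, children = node
    return "".join(tree_yield(child) for child in children)


def is_parse_tree(node, productions) -> bool:
    """Check that every interior node and its children correspond to a production."""
    if isinstance(node, str):
        return True
    label, children = node
    body = "".join(c if isinstance(c, str) else c[0] for c in children)
    return body in productions.get(label, ()) and all(
        is_parse_tree(c, productions) for c in children
    )


# Palindrome grammar: P -> ε | 0 | 1 | 0P0 | 1P1
productions = {"P": ["", "0", "1", "0P0", "1P1"]}
# Parse tree for 0110: P[ 0 P[ 1 P[ ε ] 1 ] 0 ]
tree = ("P", ["0", ("P", ["1", ("P", [""]), "1"]), "0"])

print(is_parse_tree(tree, productions))  # True
print(tree_yield(tree))                  # 0110
```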

  26. Parse tree showing a*(a+b00) Tree (bracket notation, children listed left to right): E[ E[ I[ a ] ] * E[ ( E[ E[ I[ a ] ] + E[ I[ I[ I[ b ] 0 ] 0 ] ] ] ) ] ]

  27. Parse tree showing ( x + y ) * x - z * y / ( x + x )

  28. Parse tree showing The man read this book

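  29. Inference, Derivations, and Parse Trees The following are equivalent ways of showing that a terminal string w belongs to the language of a variable A: recursive inference, derivation, leftmost derivation, rightmost derivation, and a parse tree with root A and yield w.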

  30. Self Study <5.2.4> <5.2.5> <5.2.6> Theorem 5.12, 5.14, 5.18

  31. Ambiguous Grammar So far we have assumed that a grammar uniquely determines a structure for each string in its language. However, not every grammar provides unique structures. When a grammar fails to provide a unique structure, it is known as an ambiguous grammar: some string in its language has more than one parse tree (equivalently, more than one leftmost derivation).

  32. Ambiguous Grammar example Let us consider a CFG: E → I | E+E | E*E | (E), I → a | b | Ia | Ib | I0 | I1. Expression: a + a*a. LMD: E ⇒_lm E+E ⇒_lm I+E ⇒_lm a+E ⇒_lm a+E*E ⇒_lm a+I*E ⇒_lm a+a*E ⇒_lm a+a*I ⇒_lm a+a*a. RMD: E ⇒_rm E*E ⇒_rm E*I ⇒_rm E*a ⇒_rm E+E*a ⇒_rm E+I*a ⇒_rm E+a*a ⇒_rm I+a*a ⇒_rm a+a*a.

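  33. LMD Parse tree of the LMD (bracket notation, children listed left to right): E[ E[ I[ a ] ] + E[ E[ I[ a ] ] * E[ I[ a ] ] ] ]. Fig: tree with yield a+a*a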

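  34. RMD Parse tree of the RMD (bracket notation, children listed left to right): E[ E[ E[ I[ a ] ] + E[ I[ a ] ] ] * E[ I[ a ] ] ]. Fig: tree with yield a+a*a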
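Ambiguity can be checked for a concrete string by counting its parse trees. The sketch below is a hedged illustration for the grammar E → I | E+E | E*E | (E) with the identifier productions for I; the function names are assumptions, and the counting relies on the fact that each identifier has exactly one parse tree.

```python
from functools import lru_cache


def is_identifier(s: str) -> bool:
    # I -> a | b | Ia | Ib | I0 | I1: starts with a or b, then any of a, b, 0, 1
    return (len(s) >= 1 and s[0] in ("a", "b")
            and all(c in ("a", "b", "0", "1") for c in s[1:]))


@lru_cache(maxsize=None)
def count_trees(s: str) -> int:
    """Number of parse trees with root E and yield s."""
    total = 1 if is_identifier(s) else 0        # E -> I
    for i, c in enumerate(s):
        if c in ("+", "*"):                     # E -> E+E or E -> E*E, split at i
            total += count_trees(s[:i]) * count_trees(s[i + 1:])
    if len(s) >= 2 and s[0] == "(" and s[-1] == ")":
        total += count_trees(s[1:-1])           # E -> (E)
    return total


print(count_trees("a+a*a"))    # 2
print(count_trees("(a+a)*a"))  # 1
```

A count above 1 for any string shows that the grammar is ambiguous; here a+a*a has two parse trees, one grouping it as a+(a*a) and one as (a+a)*a.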

  35. Removing Ambiguity from Grammar Two causes of ambiguity in the expression grammar: The precedence of operators is not respected. A sequence of identical operators can group either from the left or from the right.

  36. Two derivation trees for Prof. Busch - LSU

  37. take

  38. Bad tree vs. good tree: compute the expression result using the tree

  39. Removing Ambiguity from Grammar The solution to the problem of enforcing precedence is to introduce several different variables. A factor is an expression that cannot be broken apart by any adjacent operator. The only factors in our expression language are: i. Identifiers: it is not possible to separate the letters of an identifier by attaching an operator. ii. Any parenthesized expression, no matter what appears inside the parentheses. A term is an expression that cannot be broken by the ‘+’ operator; a term is a product of one or more factors. An expression is a sum of one or more terms.

  40. Removing Ambiguity from Grammar Let us consider the CFG: E → I | E+E | E*E | (E), I → a | b | Ia | Ib | I0 | I1. An unambiguous expression grammar: I → a | b | Ia | Ib | I0 | I1, F → I | (E), T → F | T*F, E → T | E+T.

  41. Unambiguous Grammar example CFG: I → a | b | Ia | Ib | I0 | I1, F → I | (E), T → F | T*F, E → T | E+T. Expression: a + a*a. Derivation: E ⇒ E+T ⇒ T+T ⇒ F+T ⇒ I+T ⇒ a+T ⇒ a+T*F ⇒ a+F*F ⇒ a+I*F ⇒ a+a*F ⇒ a+a*I ⇒ a+a*a

  42. Inherent Ambiguity Topic 5.4.4. L = {a^n b^n c^m d^m | n ≥ 1, m ≥ 1} ∪ {a^n b^m c^m d^n | n ≥ 1, m ≥ 1}

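  43. Unambiguous Grammar example Derivation: E ⇒ E+T ⇒ T+T ⇒ F+T ⇒ I+T ⇒ a+T ⇒ a+T*F ⇒ a+F*F ⇒ a+I*F ⇒ a+a*F ⇒ a+a*I ⇒ a+a*a. Parse tree (bracket notation, children listed left to right): E[ E[ T[ F[ I[ a ] ] ] ] + T[ T[ F[ I[ a ] ] ] * F[ I[ a ] ] ] ]. Fig: tree with yield a+a*a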
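Because the grammar with the variables E, T, F and I is unambiguous, a hand-written parser can build the single parse tree of an expression. The following is a sketch under stated assumptions: it is a recursive-descent parser in which the left-recursive productions E → E+T and T → T*F are handled with loops, input contains no whitespace, and the names `parse` and `show` are illustrative.

```python
def parse(text: str):
    """Build the parse tree of `text` for E -> T | E+T, T -> F | T*F, F -> I | (E)."""
    pos = 0

    def peek() -> str:
        return text[pos] if pos < len(text) else ""

    def take(ch: str) -> None:
        nonlocal pos
        if peek() != ch:
            raise SyntaxError(f"expected {ch!r} at position {pos}")
        pos += 1

    def identifier():
        nonlocal pos
        if peek() not in ("a", "b"):
            raise SyntaxError(f"expected identifier at position {pos}")
        tree = ("I", [text[pos]])                 # I -> a | b
        pos += 1
        while peek() in ("a", "b", "0", "1"):     # I -> Ia | Ib | I0 | I1
            tree = ("I", [tree, text[pos]])
            pos += 1
        return tree

    def factor():
        if peek() == "(":                          # F -> (E)
            take("(")
            inner = expression()
            take(")")
            return ("F", ["(", inner, ")"])
        return ("F", [identifier()])               # F -> I

    def term():
        tree = ("T", [factor()])                   # T -> F
        while peek() == "*":                       # T -> T*F, grouping from the left
            take("*")
            tree = ("T", [tree, "*", factor()])
        return tree

    def expression():
        tree = ("E", [term()])                     # E -> T
        while peek() == "+":                       # E -> E+T, grouping from the left
            take("+")
            tree = ("E", [tree, "+", term()])
        return tree

    result = expression()
    if pos != len(text):
        raise SyntaxError(f"unexpected input at position {pos}")
    return result


def show(node) -> str:
    """Render a tree in bracket notation."""
    if isinstance(node, str):
        return node
    return node[0] + "[" + " ".join(show(c) for c in node[1]) + "]"


print(show(parse("a+a*a")))
# E[E[T[F[I[a]]]] + T[T[F[I[a]]] * F[I[a]]]]
```

The tree places a*a below the + node, so multiplication binds tighter than addition, which is the precedence the extra variables were introduced to enforce.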

  44. Example of CFG A CFG that generates prefix expressions with operands x and y and binary operators +, -, *. Productions: E → x, E → y, E → +EE, E → -EE, E → *EE
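The prefix-expression grammar can be recognized with one recursive function, since the first symbol always determines which production applies. This is a minimal sketch; the names `parse_prefix` and `is_prefix_expression` are illustrative assumptions.

```python
def parse_prefix(s: str, pos: int = 0) -> int:
    """Try to read one E starting at `pos`; return the position just after it,
    or -1 if no E starts there."""
    if pos >= len(s):
        return -1
    if s[pos] in ("x", "y"):            # E -> x | y
        return pos + 1
    if s[pos] in ("+", "-", "*"):       # E -> +EE | -EE | *EE
        middle = parse_prefix(s, pos + 1)
        return -1 if middle == -1 else parse_prefix(s, middle)
    return -1


def is_prefix_expression(s: str) -> bool:
    """True if the whole string is exactly one prefix expression."""
    return parse_prefix(s) == len(s)


print(is_prefix_expression("+x*yx"))  # True
print(is_prefix_expression("+x"))     # False
print(is_prefix_expression("xy"))     # False
```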

  45. Example of CFG Design a CFG for the set of all strings with an equal number of a’s and b’s. Productions: S → aSbS | bSaS | ε
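One way to gain confidence in this design is to compare the grammar against a direct count of a's and b's on all short strings. The sketch below is illustrative and exhaustive only up to length 8; it is a sanity check, not a proof, and the function name `derivable` is an assumption.

```python
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def derivable(s: str) -> bool:
    """True if s can be derived from S using S -> aSbS | bSaS | ε."""
    if s == "":
        return True                                  # S -> ε
    partner = {"a": "b", "b": "a"}.get(s[0])
    if partner is None:
        return False
    # S -> aSbS (or bSaS): s[0], then an S, then the partner symbol at j, then an S
    return any(
        s[j] == partner and derivable(s[1:j]) and derivable(s[j + 1:])
        for j in range(1, len(s))
    )


# Compare with the intended language on every string over {a, b} up to length 8.
for n in range(9):
    for chars in product("ab", repeat=n):
        w = "".join(chars)
        assert derivable(w) == (w.count("a") == w.count("b"))

print(derivable("abba"), derivable("aab"))  # True False
```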

  46. Example of CFG Design A CFG in which no string in L(G) has ba as a substring (this can be shown by induction on the string length). Productions: S → aS | Sb | a | b

  47. Example of CFG Design a CFG for the regular expression 0*1(0+1)*. Productions: S → A1B, A → 0A | ε, B → 0B | 1B | ε
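The grammar can be checked against the regular expression on all short binary strings. This sketch assumes the textbook operator + (union) is written as the character class `[01]` in Python's `re` syntax; the helper names are illustrative, and the check is exhaustive only up to length 8.

```python
import re
from itertools import product


def derives_A(s: str) -> bool:
    return all(c == "0" for c in s)            # A -> 0A | ε: zero or more 0s


def derives_B(s: str) -> bool:
    return all(c in ("0", "1") for c in s)     # B -> 0B | 1B | ε: any binary string


def derives_S(s: str) -> bool:
    # S -> A1B: choose the position of the 1 that separates A from B
    return any(
        s[i] == "1" and derives_A(s[:i]) and derives_B(s[i + 1:])
        for i in range(len(s))
    )


PATTERN = re.compile(r"0*1[01]*")

# The grammar and the regular expression agree on every binary string up to length 8.
for n in range(9):
    for chars in product("01", repeat=n):
        w = "".join(chars)
        assert derives_S(w) == (PATTERN.fullmatch(w) is not None)

print(derives_S("0010"), derives_S("000"))  # True False
```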

  48. Example of CFG

  49. Application of CFG CFGs: originally a way to describe natural languages. Two of their uses in computing: 1. Parsers 2. Markup languages (HTML, XML). Parsers: a parse tree is a graphical representation of a derivation. Parsing is the process of determining whether a string of tokens can be generated by a grammar. A compiler may not actually construct a parse tree; however, a parser must be capable of constructing such a tree. A parser can be constructed for any grammar. The CFG is an essential concept for the implementation of parsers.

  50. YACC Parser Generator Tools such as YACC take a CFG as input and produce a parser. Exp : Id {…} | Exp '+' Exp {…} | Exp '*' Exp {…} | '(' Exp ')' {…} ; Id : 'a' {…} | 'b' {…} | Id 'a' {…} | Id 'b' {…} | Id '0' {…} | Id '1' {…} ;
