Module 28


eddy


  1. Module 28 • Context Free Grammars • Definition of a grammar G • Deriving strings and defining L(G) • Context-Free Language definition

  2. Context-Free Grammars Definition

  3. Definition • A context-free grammar G = (V, Σ, S, P) • V: finite set of variables (nonterminals) • Σ: finite set of characters (terminals) • S: start variable • element of V • role is similar to that of q0 for an FSA or NFA • P: finite set of grammar rules or production rules • Syntax of a production • variable --> string of variables and terminals
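
The 4-tuple above can be written down directly as a small data structure. The following is a minimal Python sketch, not part of the original slides: the class name `CFG`, its field names, and the choice to store each right-hand side as a string of single-character symbols are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    """Hypothetical container for a grammar G = (V, Sigma, S, P)."""
    variables: frozenset    # V: finite set of variables (nonterminals)
    terminals: frozenset    # Sigma: finite set of terminals
    start: str              # S: start variable, must be an element of V
    productions: tuple      # P: (variable, right-hand side) pairs

    def __post_init__(self):
        # Assumption of this sketch: V and Sigma are kept disjoint.
        if self.variables & self.terminals:
            raise ValueError("V and Sigma must be disjoint")
        if self.start not in self.variables:
            raise ValueError("start variable must be an element of V")
        symbols = self.variables | self.terminals
        for head, body in self.productions:
            # Left side of a production is a single variable.
            if head not in self.variables:
                raise ValueError(f"{head!r} is not a variable")
            # Right side is a string of variables and terminals.
            if any(sym not in symbols for sym in body):
                raise ValueError(f"{body!r} uses an unknown symbol")


# A grammar with productions S --> aSb | ab.
ABG = CFG(
    variables=frozenset({"S"}),
    terminals=frozenset({"a", "b"}),
    start="S",
    productions=(("S", "aSb"), ("S", "ab")),
)
```

The validation mirrors the definition: the start symbol must be a variable, and each production has a single variable on the left.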

  4. English Context-Free Grammar • ECFG = (V, Σ, S, P) • V = {<sentence>, <noun phrase>, <verb phrase>, ... } • people sometimes use < > to delimit variables • In this course, we generally will use capital letters to denote variables • Σ = {a, b, c, ..., z, ;, ,, ., ...} • S = <sentence> • P = { <sentence> --> <noun phrase> <verb phrase> <pct>, <noun phrase> --> <article> <adj> <noun>, ...}

  5. {a^i b^i | i > 0} CFG • ABG = (V, Σ, S, P) • V = {S} • Σ = {a, b} • S = S • P = {S --> aSb, S --> ab} or S --> aSb | ab • second format saves some space

  6. Context-Free Grammars Deriving strings, defining L(G), and defining context-free languages

  7. Defining -->, ==> notation • First: --> notation • This is used to define the productions of a grammar • S --> aSb | ab • Second: ==>_G notation • This is used to denote the application of a production rule from a grammar G • S ==>_ABG aSb ==>_ABG aaSbb ==>_ABG aaabbb • We say that string S derives string aSb (in one step) • We say that string aSb derives string aaSbb (in one step) • We say that string aaSbb derives string aaabbb (in one step) • We often omit the grammar subscript when the intended grammar is clear from context
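
The one-step relation ==> can be sketched in a few lines of Python. This is an illustration under stated assumptions, not code from the slides: every symbol is a single character, productions are stored in a dict from variable to a list of right-hand sides, and the names `one_step` and `ABG_P` are made up here.

```python
def one_step(form, productions):
    """Return every string that `form` derives in exactly one step.

    One step replaces a single occurrence of a variable by the
    right-hand side of one of its productions.
    """
    results = set()
    for i, symbol in enumerate(form):
        # Terminals have no entry in the dict, so they are skipped.
        for body in productions.get(symbol, []):
            results.add(form[:i] + body + form[i + 1:])
    return results


# Productions of ABG: S --> aSb | ab
ABG_P = {"S": ["aSb", "ab"]}

print(sorted(one_step("S", ABG_P)))      # ['aSb', 'ab']
print(sorted(one_step("aSb", ABG_P)))    # ['aaSbb', 'aabb']
print(one_step("aaabbb", ABG_P))         # set(): no variable left to expand
```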

  8. Defining ==> continued • Third: ==>^k_G notation • This is used to denote k applications of production rules from a grammar G • S ==>^2_ABG aaSbb • We say that string S derives string aaSbb in two steps • aSb ==>^2_ABG aaabbb • We say that string aSb derives string aaabbb in two steps • We often omit the grammar subscript when the intended grammar is clear from context

  9. Defining ==> continued • Fourth: ==>*_G notation • This is used to denote 0 or more applications of production rules from a grammar G • S ==>*_ABG S • We say that string S derives string S in 0 or more steps • S ==>*_ABG aaSbb • We say that string S derives string aaSbb in 0 or more steps • aSb ==>*_ABG aaSbb • We say that string aSb derives string aaSbb in 0 or more steps • aSb ==>*_ABG aaabbb • We say that string aSb derives string aaabbb in 0 or more steps • We often omit the grammar subscript when the intended grammar is clear from context
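
The k-step and 0-or-more-step relations can be checked mechanically for small cases. The sketch below is an illustration only: it assumes single-character symbols and a dict of productions, and the function names and the `max_steps` cutoff are hypothetical. The cutoff matters because ==>* has no step limit in general, so this bounded search can only confirm a derivation, not rule one out.

```python
def one_step(form, productions):
    """All strings derivable from `form` by one production application."""
    return {
        form[:i] + body + form[i + 1:]
        for i, symbol in enumerate(form)
        for body in productions.get(symbol, [])
    }


def derives_in(form, target, k, productions):
    """True if `form` derives `target` in exactly k steps."""
    current = {form}
    for _ in range(k):
        current = {nxt for f in current for nxt in one_step(f, productions)}
    return target in current


def derives_star(form, target, productions, max_steps=10):
    """True if `form` derives `target` in 0..max_steps steps (bounded search)."""
    return any(derives_in(form, target, k, productions)
               for k in range(max_steps + 1))


ABG_P = {"S": ["aSb", "ab"]}

print(derives_in("S", "aaSbb", 2, ABG_P))      # True
print(derives_in("aSb", "aaabbb", 2, ABG_P))   # True
print(derives_star("S", "S", ABG_P))           # True (0 steps)
print(derives_star("aSb", "aaabbb", ABG_P))    # True
```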

  10. Defining derivations * • Derivation of a string x • The complete step by step derivation of a string x from the start variable S • Key fact: each step in a derivation makes only one application of a production rule from G • Example: Derivation of string aaabbb using ABG • S ==>_ABG aSb ==>_ABG aaSbb ==>_ABG aaabbb • Example 2: AG = (V, Σ, S, P) where P = S --> SS | a • Deriving string aaa • S ==> SS ==> Sa ==> SSa ==> aSa ==> aaa

  11. Defining L(G) * • Generating strings • If S ==>*_G x, then grammar G generates string x • Note G generates strings which contain terminals and nonterminals • aSb contains nonterminals and terminals • S contains only nonterminals • aaabbb contains only terminals • L(G) • The set of strings over Σ generated by grammar G • Note we only consider terminal strings generated by G • {a^i b^i | i > 0} = L(ABG) • {a^i | i > 0} = L(AG)
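
To see L(G) concretely, one can enumerate the terminal strings a grammar generates up to some length. The sketch below is illustrative, not from the slides. It assumes single-character symbols and, importantly, a grammar with no production whose right-hand side is empty. Under that assumption sentential forms never shrink, so any form longer than `max_len` can be discarded. The function name `language_up_to` is hypothetical.

```python
def language_up_to(start, productions, max_len):
    """Terminal strings of length <= max_len generated from `start`.

    Assumes no production has an empty right-hand side, which is
    what makes pruning long sentential forms safe.
    """
    variables = set(productions)
    seen = {start}
    frontier = [start]
    terminal_strings = set()
    while frontier:
        form = frontier.pop()
        if not any(sym in variables for sym in form):
            terminal_strings.add(form)      # only terminals: in L(G)
            continue
        for i, sym in enumerate(form):
            for body in productions.get(sym, []):
                nxt = form[:i] + body + form[i + 1:]
                if len(nxt) <= max_len and nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
    return terminal_strings


ABG_P = {"S": ["aSb", "ab"]}    # S --> aSb | ab
AG_P = {"S": ["SS", "a"]}       # S --> SS | a

print(sorted(language_up_to("S", ABG_P, 6), key=len))  # ['ab', 'aabb', 'aaabbb']
print(sorted(language_up_to("S", AG_P, 3), key=len))   # ['a', 'aa', 'aaa']
```

Sentential forms such as aSb are visited during the search but never reported, matching the note that only terminal strings belong to L(G).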

  12. Context-Free Languages * • Context-Free Languages • A language L is a context-free language (CFL) iff there exists a CFG G such that L(G) = L • Results so far • {a^i | i > 0} is a CFL • One CFG G such that L(G) = this language is AG • Note this language is also regular • {a^i b^i | i > 0} is a CFL • One CFG G such that L(G) = this language is ABG • Note this language is NOT regular

  13. Example * • Let BAL = the set of strings over {(,)} in which the parentheses are balanced • Prove that BAL is a CFL • To prove this, you need to come up with a CFG BALG such that L(BALG) = BAL • BALG = (V, Σ, S, P) • V = {S} • Σ = {(, )} • S = S • P = ? • Give derivations of ((( ))) and ( )(( )) with your grammar

  14. Module 29 • Parse/Derivation Trees • Leftmost derivations, rightmost derivations • Ambiguous Grammars • Examples • Arithmetic expressions • If-then-else Statements • Inherently ambiguous CFL’s

  15. Context-Free Grammars Parse Trees Leftmost/rightmost derivations Ambiguous grammars

  16. Parse Tree • Parse/derivation trees are structured derivations • The structure graphically illustrates semantic information about the string • Formalization of concept we encountered in regular languages unit • Note, what we saw before were not exactly parse trees as we define them now, but they were close

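  17. Parse Tree Example • Parse tree for string ( )(( )) and grammar BALG • BALG = (V, Σ, S, P) • V = {S}, Σ = {(, )}, S = S • P = S --> SS | (S) | λ • One derivation of ( )(( )) • S ==> SS ==> (S)S ==> ( )S ==> ( )(S) ==> ( )((S)) ==> ( )(( )) • Parse tree • [Figure: parse tree for ( )(( )); the root S has children S S, the left S expands to ( S ) with S --> λ, and the right S expands to ( S ) whose inner S expands to ( S ) with S --> λ]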

  18. Comments about Example * • Syntax • Draw a unique arrow from each variable to each character that is a direct child of that variable • A line instead of an arrow is ok • The derived string can be read in a left to right traversal of the leaves • Semantics • The tree graphically illustrates the nesting structure of the string of parentheses • [Figure: parse tree for ( )(( ))]

  19. Leftmost/Rightmost Derivations • There is more than one derivation of the string ( )(( )) • S ==> SS ==> (S)S ==> ( )S ==> ( )(S) ==> ( )((S)) ==> ( )(( )) • S ==> SS ==> (S)S ==> (S)(S) ==> ( )(S) ==> ( )((S)) ==> ( )(( )) • S ==> SS ==> S(S) ==> S((S)) ==> S(( )) ==> (S)(( )) ==> ( )(( )) • Leftmost derivation • Leftmost variable is always expanded • Which one of the above is leftmost? • Rightmost derivation • Rightmost variable is always expanded • Which one of the above is rightmost? • [Figure: parse tree for ( )(( ))]
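
Whether a given derivation is leftmost or rightmost can be checked step by step. The sketch below is an illustration, not part of the slides: it assumes single-character symbols, writes each sentential form without spaces, and uses the BALG productions S --> SS | (S) | λ with the empty Python string standing for λ. The function name `always_expands` is hypothetical.

```python
def always_expands(derivation, productions, leftmost=True):
    """True if every step rewrites the leftmost (or rightmost) variable."""
    variables = set(productions)
    for form, nxt in zip(derivation, derivation[1:]):
        positions = [i for i, sym in enumerate(form) if sym in variables]
        if not positions:
            return False                  # nothing left to expand
        i = positions[0] if leftmost else positions[-1]
        # The next form must come from rewriting exactly that variable.
        if not any(nxt == form[:i] + body + form[i + 1:]
                   for body in productions[form[i]]):
            return False
    return True


BALG_P = {"S": ["SS", "(S)", ""]}   # "" plays the role of lambda

d1 = ["S", "SS", "(S)S", "()S", "()(S)", "()((S))", "()(())"]
d2 = ["S", "SS", "(S)S", "(S)(S)", "()(S)", "()((S))", "()(())"]
d3 = ["S", "SS", "S(S)", "S((S))", "S(())", "(S)(())", "()(())"]

for d in (d1, d2, d3):
    print(always_expands(d, BALG_P, leftmost=True),
          always_expands(d, BALG_P, leftmost=False))
# True False
# False False
# False True
```

The middle derivation switches between expanding the left and the right variable, so it is neither leftmost nor rightmost.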

  20. Comments • Fix a string and a grammar • Any derivation corresponds to a unique parse tree • Any parse tree can correspond to many different derivations • Example • The one parse tree corresponds to all three derivations • S ==> SS ==> (S)S ==> ( )S ==> ( )(S) ==> ( )((S)) ==> ( )(( )) • S ==> SS ==> (S)S ==> (S)(S) ==> ( )(S) ==> ( )((S)) ==> ( )(( )) • S ==> SS ==> S(S) ==> S((S)) ==> S(( )) ==> (S)(( )) ==> ( )(( )) • Unique mappings • For any parse tree, there is a unique leftmost derivation and a unique rightmost derivation that it corresponds to • [Figure: parse tree for ( )(( ))]

  21. Example * • S ==> SS ==> SSS ==> (S)SS ==> ( )SS ==> ( )S ==> ( ) • The above is a leftmost derivation of the string ( ) from the grammar BALG • Draw the corresponding parse tree • Give the corresponding rightmost derivation • S ==> (S) ==> (SS) ==> (S(S)) ==> (S( )) ==> (( )) • The above is a rightmost derivation of the string (( )) from the grammar BALG • Draw the corresponding parse tree • Give the corresponding leftmost derivation

  22. Ambiguous Grammars Examples: Arithmetic Expressions If-then-else statements Inherently ambiguous grammars

  23. Ambiguous Grammars • A grammar G is ambiguous if there exists a string x in L(G) with two or more distinct parse trees • (2 or more distinct leftmost/rightmost derivations) • Example • Grammar AG is ambiguous • String aaa in L(AG) has 2 rightmost derivations • S ==> SS ==> SSS ==> SSa ==> Saa ==> aaa • S ==> SS ==> Sa ==> SSa ==> Saa ==> aaa
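
For the grammar AG (S --> SS | a), the number of distinct parse trees of a string a^n can be counted with a short recurrence. The sketch below is an illustration specific to AG, not a general ambiguity test, and the name `count_trees` is hypothetical. A tree for a^1 uses S --> a; a tree for a longer string uses S --> SS at the root and splits the string into a nonempty left part and a nonempty right part.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_trees(n):
    """Number of parse trees for the string a^n under S --> SS | a."""
    if n == 1:
        return 1                      # the single tree S --> a
    # Root uses S --> SS: left child yields a^k, right child a^(n-k).
    return sum(count_trees(k) * count_trees(n - k) for k in range(1, n))


print(count_trees(3))   # 2
print(count_trees(4))   # 5
```

Any count above 1 witnesses ambiguity, and the value 2 for aaa agrees with the two rightmost derivations listed on the slide.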

  24. 2 Simple Examples • Grammar BALG is ambiguous • String ( ) in L(BALG) has >1 leftmost derivation • S ==> (S) ==> ( ) • S ==> (S) ==> (SS) ==> (S) ==> ( ) • Give another leftmost derivation of ( ) from BALG • Grammar ABG is NOT ambiguous • Consider any string x in {a^i b^i | i > 0} • There is a unique parse tree for x

  25. Legal Arithmetic Expressions • Develop a grammar MATHG = (V, Σ, S, P) for the language of legal arithmetic expressions • Σ = {0, 1, +, *, -, /, (, )} • Strings in the language include • 0 • 10 • 10*11111+100 • 10*(11111+100) • Strings not in the language include • 10+ • 11++101 • )(

  26. Grammar MATHG1 • V = {E, N} • Σ = {0, 1, +, *, -, /, (, )} • S = E • P: • E --> N | E+E | E*E | E/E | E-E | (E) • N --> N0 | N1 | 0 | 1

  27. MATHG1 is ambiguous • E --> N | E+E | E*E | E/E | E-E | (E) • N --> N0 | N1 | 0 | 1 • Come up with two distinct leftmost derivations of the string 11+0*11 • E ==> E+E ==> N+E ==> N1+E ==> 11+E ==> 11+E*E ==> 11+N*E ==> 11+0*E ==> 11+0*N ==> 11+0*N1 ==> 11+0*11 • E ==> E*E ==> E+E*E ==> N+E*E ==> N1+E*E ==> 11+E*E ==> 11+N*E ==> 11+0*E ==> 11+0*N ==> 11+0*N1 ==> 11+0*11 • Draw the corresponding parse trees

  28. Corresponding Parse Trees • E ==> E+E ==> N+E ==> N1+E ==> 11+E ==> 11+E*E ==> 11+N*E ==> 11+0*E ==> 11+0*N ==> 11+0*N1 ==> 11+0*11 • E ==> E*E ==> E+E*E ==> N+E*E ==> N1+E*E ==> 11+E*E ==> 11+N*E ==> 11+0*E ==> 11+0*N ==> 11+0*N1 ==> 11+0*11 • [Figure: the two parse trees for 11+0*11, the first with E+E at the root and the second with E*E at the root]

  29. Parse Tree Meanings • Note how the parse trees capture the semantic meaning of string 11+0*11 • More specifically, what number does the first parse tree represent? • What number does the second parse tree represent? • [Figure: the two parse trees for 11+0*11]

  30. Implications • Two interpretations of string 11+0*11 • 11+(0*11) = 11 • (11+0)*11 = 1001 • What if a line in a program is • MSU_Tuition = 11+0*11; • What is MSU_Tuition? • Depends on how the expression 11+0*11 is parsed. • This is not good. • Ambiguity in grammars is undesirable, particularly if the grammar is used to develop a compiler for a programming language like C++. • In this case, there is an unambiguous grammar for the language of arithmetic expressions
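
The two interpretations can be checked with a few lines of Python. This sketch assumes the numerals are binary, which is inferred from the slide's result 1001 rather than stated outright; the helper name `b` and the variable names are made up for illustration.

```python
def b(numeral):
    """Read a string of 0s and 1s as a binary number (assumed encoding)."""
    return int(numeral, 2)


# Parse with + at the root: 11 + (0 * 11)
plus_at_root = b("11") + (b("0") * b("11"))

# Parse with * at the root: (11 + 0) * 11
times_at_root = (b("11") + b("0")) * b("11")

print(format(plus_at_root, "b"))    # 11
print(format(times_at_root, "b"))   # 1001
```

Python itself picks the first reading for `3 + 0 * 3`, because its expression grammar gives `*` higher precedence than `+`.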

  31. If-Then-Else Statements • A grammar ITEG = (V, Σ, S, P) for the language of legal If-Then-Else statements • V = {S, BOOL} • Σ = {adv<80, adv>50, grade=3.5, grade=3.0, if, then, else} • S = S • P: • S --> if BOOL then S else S | if BOOL then S | grade=3.5 | grade=3.0 • BOOL --> adv<80 | adv>50

  32. ITEG is ambiguous • S --> if BOOL then S | grade=3.5 | grade=3.0 | if BOOL then S else S • BOOL --> adv<80 | adv>50 • Come up with two distinct leftmost derivations of the string • if adv<80 then if adv>50 then grade=3.5 else grade=3.0 • S ==> if BOOL then S else S ==> if adv<80 then S else S ==> if adv<80 then if BOOL then S else S ==> if adv<80 then if adv>50 then S else S ==> if adv<80 then if adv>50 then grade=3.5 else S ==> if adv<80 then if adv>50 then grade=3.5 else grade=3.0 • S ==> if BOOL then S ==> if adv<80 then S ==> if adv<80 then if BOOL then S else S ==> if adv<80 then if adv>50 then S else S ==> if adv<80 then if adv>50 then grade=3.5 else S ==> if adv<80 then if adv>50 then grade=3.5 else grade=3.0 • Draw the corresponding parse trees

  33. Corresponding Parse Trees • S ==> if BOOL then S else S ==> if adv<80 then S else S ==> if adv<80 then if BOOL then S else S ==> if adv<80 then if adv>50 then S else S ==> if adv<80 then if adv>50 then grade=3.5 else S ==> if adv<80 then if adv>50 then grade=3.5 else grade=3.0 • S ==> if BOOL then S ==> if adv<80 then S ==> if adv<80 then if BOOL then S else S ==> if adv<80 then if adv>50 then S else S ==> if adv<80 then if adv>50 then grade=3.5 else S ==> if adv<80 then if adv>50 then grade=3.5 else grade=3.0 • [Figure: the two parse trees; in the first the root is S --> if BOOL then S else S, so the else belongs to the outer if; in the second the root is S --> if BOOL then S, so the else belongs to the inner if]

  34. Parse Tree Meanings • If you receive a 90 on advanced points, what is your grade? • By parse tree 1 • By parse tree 2 • [Figure: the two parse trees for if adv<80 then if adv>50 then grade=3.5 else grade=3.0]

  35. Implications • Two interpretations of string • if adv<80 then if adv>50 then grade=3.5 else grade=3.0 • Issue is which if-then does the last ELSE attach to? • This phenomenon is known as the “dangling else” • Answer: Typically, else binds to NEAREST if-then • In this case, there is an unambiguous grammar for handling if-then’s as well as if-then-else’s
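
The two readings of the dangling else can be written as two Python functions, since Python's indentation forces the attachment to be explicit. This is an illustrative sketch; the function names are hypothetical, and `None` is used here to mean that no grade was assigned.

```python
def else_binds_to_outer_if(adv):
    """Reading where the else belongs to `if adv<80`."""
    grade = None
    if adv < 80:
        if adv > 50:
            grade = 3.5
    else:
        grade = 3.0
    return grade


def else_binds_to_inner_if(adv):
    """Reading where the else belongs to `if adv>50` (nearest if-then)."""
    grade = None
    if adv < 80:
        if adv > 50:
            grade = 3.5
        else:
            grade = 3.0
    return grade


for adv in (90, 60, 40):
    print(adv, else_binds_to_outer_if(adv), else_binds_to_inner_if(adv))
# 90 3.0 None
# 60 3.5 3.5
# 40 None 3.0
```

The two readings agree only when both conditions hold; for a score of 90 one reading assigns 3.0 and the other assigns nothing.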

  36. Inherently ambiguous CFL’s • A CFL L is inherently ambiguous iff for all CFG’s G such that L(G) = L, G is ambiguous • Examples so far • None of the CFL’s we’ve seen so far are inherently ambiguous • While some of the CFG’s we’ve seen are ambiguous, there do exist unambiguous CFG’s for those CFL’s. • Later result • There exist inherently ambiguous CFL’s • Example: {a^i b^j c^k | i=j or j=k or i=j=k} • Note i=j=k is unnecessary, but I added it here for clarity

  37. Summary • Parse trees illustrate “semantic” information about strings • Ambiguous grammars are undesirable • This means there are multiple parse trees for some string • These strings can be interpreted in multiple ways • There are some heuristics people use for taking an ambiguous grammar and making it unambiguous, but this is not the focus of this course • There are some inherently ambiguous CFL’s • Thus, the above heuristics do not always work

  38. Module 30 • EQUAL language • Designing a CFG • Proving the CFG is correct

  39. EQUAL language Designing a CFG

  40. EQUAL • EQUAL is the set of strings over {a,b} with an equal number of a’s and b’s • Strings in EQUAL include • aabbab • bbbaaa • abba • Strings in {a,b}* not in EQUAL include • aaa • bbb • aab • ababa
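
Membership in EQUAL is easy to test directly, which is handy later for checking a candidate grammar. A minimal sketch, with the function name `in_equal` chosen here for illustration:

```python
def in_equal(x):
    """True if x is a string over {a, b} with as many a's as b's."""
    return set(x) <= {"a", "b"} and x.count("a") == x.count("b")


print([in_equal(x) for x in ("aabbab", "bbbaaa", "abba")])      # [True, True, True]
print([in_equal(x) for x in ("aaa", "bbb", "aab", "ababa")])    # [False, False, False, False]
```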

  41. Designing a CFG for EQUAL • Think recursively • Base Case • What is the shortest possible string in EQUAL? • Production Rule:

  42. Recursive Case • Recursive Case • Now consider a longer string x in EQUAL • Since x has length > 0, x must have a first character • This must be a or b • Two possibilities for what x looks like • x = ay • What must be true about relative number of a’s and b’s in y? • x = bz • What must be true about relative number of a’s and b’s in z?

  43. Case 1: x=ay • x = ay where y has one extra b • What must y look like? • Some examples • b • babba • aabbbab • aaabbbb • Is there a general pattern that applies to all of the above examples? • More specifically, show how we can decompose all of the above strings y into 3 pieces, two of which belong to EQUAL. • Some of these pieces might be the empty string λ

  44. Decomposing y • y has one extra b • Possible examples • b, babba, aabbbab, aaabbbb • Decomposition • y = ubv where • u and v both have an equal number of a’s and b’s • Decompose the 4 strings above into u, b, v • b = λ·b·λ, babba = λ·b·abba, aabbbab = aabb·b·ab, aaabbbb = aaabbb·b·λ
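
The decomposition y = ubv can be searched for mechanically by trying every b in y as the middle piece. The sketch below is illustrative and the names `in_equal` and `decompositions` are hypothetical. It returns all valid (u, v) pairs, which shows that a string may decompose in more than one way.

```python
def in_equal(x):
    """True if x has as many a's as b's."""
    return x.count("a") == x.count("b")


def decompositions(y):
    """All (u, v) with y = u + 'b' + v and both u, v in EQUAL."""
    return [
        (y[:i], y[i + 1:])
        for i, ch in enumerate(y)
        if ch == "b" and in_equal(y[:i]) and in_equal(y[i + 1:])
    ]


for y in ("b", "babba", "aabbbab", "aaabbbb"):
    print(y, decompositions(y))
# b [('', '')]
# babba [('', 'abba'), ('ba', 'ba')]
# aabbbab [('aabb', 'ab'), ('aabbba', '')]
# aaabbbb [('aaabbb', '')]
```

The empty Python string plays the role of λ in the output.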

  45. Implication • Case 1: x=ay • y has one extra b • Case 1 refined: x=aubv • u, v belong to EQUAL • Production rule for this case?

  46. Case 2: x=bz • Case 2: x=bz • z has one extra a • Case 2 refined: x=buav • u, v belong to EQUAL • Production rule for this case?

  47. Final Grammar • EG = (V, Σ, S, P) • V = {S} • Σ = {a,b} • S = S • P:

  48. EQUAL language Proving CFG is correct

  49. Is our grammar correct? • How do we prove our grammar is correct? • Informal • Test some strings • Review logic behind program (CFG) design • Formal • First, show every string derived by EG belongs to EQUAL • That is, show L(EG) is a subset of EQUAL • Second, show every string in EQUAL can be derived by EG • That is, show EQUAL is a subset of L(EG) • Both proofs will be inductive proofs • Inductive proofs and recursive algorithms go well together
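
The informal check ("test some strings") can be automated for short strings. The slides leave the productions of EG to be filled in, so the sketch below assumes one candidate suggested by the case analysis, S --> λ | aSbS | bSaS; that production set, the function names, and the length bound are all assumptions of this example. It enumerates every terminal string of length at most `max_len` that the candidate generates and compares the result with EQUAL. Agreement on short strings is evidence, not a proof.

```python
from itertools import product

# Assumed candidate productions for EG; "" stands for lambda.
CANDIDATE_BODIES = ["", "aSbS", "bSaS"]


def generated_up_to(max_len):
    """Terminal strings of length <= max_len derivable from S.

    Only the leftmost S is expanded, which loses no terminal strings.
    Forms with more than max_len terminals are pruned, because
    terminals are never removed once they appear.
    """
    seen = {"S"}
    frontier = ["S"]
    result = set()
    while frontier:
        form = frontier.pop()
        i = form.find("S")
        if i == -1:
            result.add(form)
            continue
        for body in CANDIDATE_BODIES:
            nxt = form[:i] + body + form[i + 1:]
            if len(nxt) - nxt.count("S") <= max_len and nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return result


def equal_up_to(max_len):
    """All strings over {a, b} of length <= max_len with equal counts."""
    return {
        "".join(w)
        for n in range(max_len + 1)
        for w in product("ab", repeat=n)
        if w.count("a") == w.count("b")
    }


print(generated_up_to(6) == equal_up_to(6))
```

The comparison covers both directions of the formal proof at once for short strings: no generated string falls outside EQUAL, and no short string in EQUAL is missed.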

  50. L(EG) subset of EQUAL • Let x be an arbitrary string in L(EG) • What does this mean? • S ==>*_EG x • Follows from definition of x in L(EG) • We will prove the following • If S ==>^1_EG x, then x is in EQUAL • If S ==>^2_EG x, then x is in EQUAL • If S ==>^3_EG x, then x is in EQUAL • If S ==>^4_EG x, then x is in EQUAL • ...
