
Context-Free Languages


vicky

Presentation Transcript


  1. CS 3240 – Chapter 5 Context-Free Languages

  2. Where Are We? CS 3240 - Introduction

  3. Topics • 5.1: Context-Free Grammars • Derivations • Derivation Trees • 5.2: Parsing and Ambiguity • 5.3: CFGs and Programming Languages • Precedence • Associativity • Expression Trees CS 3240 - Context-Free Languages

  4. A Curious Grammar • S → aaSa | λ • It is not right-linear or left-linear • so it is not a “regular grammar” • But it is linear • at most one variable on the right-hand side of each rule • What is its language?

  5. A Grammar for a^n b^n • S → aSb | λ • Deriving aaabbb: S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb
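
The derivation on this slide can be replayed mechanically. The following is a minimal sketch, not part of the course materials; the function name `derive_anbn` is made up for illustration. It applies S → aSb n times and then S → λ, recording every sentential form along the way.

```python
def derive_anbn(n):
    """Return the sentential forms of a derivation of a^n b^n.

    Hypothetical helper for illustration: uses only S -> aSb and S -> λ.
    """
    form = "S"
    steps = [form]
    for _ in range(n):
        form = form.replace("S", "aSb")   # apply S -> aSb
        steps.append(form)
    form = form.replace("S", "")          # apply S -> λ (erase S)
    steps.append(form)
    return steps


print(" ⇒ ".join(derive_anbn(3)))
# S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb
```

Each pass through the loop adds one a on the left and one b on the right, which is why the counts always match.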

  6. Context-Free Grammars • Variables • aka “non-terminals” • Letters from some alphabet, Σ • aka “terminals” • Rules (“substitution rules”) • of the form V → s • where s is any string of letters and variables, or λ • Rules are often called productions

  7. Sample CFGs • a^n c b^n • a^n b^(2n) • a^n b^m, where 0 ≤ n ≤ m ≤ 2n • a^n b^m, n ≠ m • Palindrome (start with a recursive definition) • Non-Palindrome • Equal • a^n b^n a^m

  8. A Grammar for Twice: n_b(w) = 2⋅n_a(w) • S → aSbSbS | bSaSbS | bSbSaS | λ • Trace ababbb • When building CFGs, remember that the start variable (S) represents a string in the language. So, for example, if every S stands for a string with twice as many b’s as a’s, then so does aSbSbS, etc.
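
The point that S “represents a string in the language” can be checked by building the language bottom-up. This is a rough sketch under that reading; `twice_language` and `PATTERNS` are names invented here. It starts from λ, repeatedly substitutes already-known strings for each S, and then confirms that everything produced has twice as many b’s as a’s.

```python
from itertools import product

# The three non-empty right-hand sides aSbSbS, bSaSbS, bSbSaS, encoded as the
# terminals that surround the three S slots (an encoding chosen for this sketch).
PATTERNS = ["abb", "bab", "bba"]


def twice_language(max_len):
    """Return every string of length <= max_len that the grammar generates."""
    lang = {""}                           # S -> λ
    while True:
        new = set()
        for x, y, z in product(lang, repeat=3):
            if len(x) + len(y) + len(z) + 3 > max_len:
                continue                  # result would be too long
            for t1, t2, t3 in PATTERNS:
                new.add(t1 + x + t2 + y + t3 + z)
        if new <= lang:                   # fixed point: nothing new appeared
            return lang
        lang |= new


strings = twice_language(6)
print("ababbb" in strings)                                      # True
print(all(w.count("b") == 2 * w.count("a") for w in strings))   # True
```

The string ababbb shows up as a·λ·b·abb·b·λ, which corresponds to the derivation S ⇒ aSbSbS with the middle S expanded to abb.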

  9. Derivations • A derivation is a sequence of applications of grammatical rules, eventually yielding a string in the language • A CFG can have multiple variables on the right-hand side of a rule • Giving a choice of which variable to expand first • By convention, we usually use a leftmost derivation

  10. A Leftmost Derivation
      <S> → <NP> <VP>
      <NP> → the <N>
      <VP> → <V> <NP>
      <V> → sings | eats
      <N> → cat | song | canary
      <S> ⇒ <NP> <VP>
          ⇒ the <N> <VP>
          ⇒ the canary <VP>
          ⇒ the canary <V> <NP>
          ⇒ the canary sings <NP>
          ⇒ the canary sings the <N>
          ⇒ the canary sings the song
      Each intermediate string in the derivation is a “sentential form”.
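
A leftmost derivation is easy to simulate: at every step, find the first variable in the sentential form and replace it with one of its right-hand sides. The sketch below assumes the grammar from this slide; the dictionary layout and the name `leftmost_derivation` are illustrative choices, not anything from the course.

```python
# Grammar from the slide; each variable maps to its list of right-hand sides.
GRAMMAR = {
    "<S>": [["<NP>", "<VP>"]],
    "<NP>": [["the", "<N>"]],
    "<VP>": [["<V>", "<NP>"]],
    "<V>": [["sings"], ["eats"]],
    "<N>": [["cat"], ["song"], ["canary"]],
}


def leftmost_derivation(choices, start="<S>"):
    """Expand the leftmost variable once per entry in `choices`.

    Each entry is the index of the rule to use for the current leftmost
    variable. Returns the list of sentential forms as strings.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Position of the leftmost variable in the current sentential form.
        i = next(k for k, sym in enumerate(form) if sym in GRAMMAR)
        form[i:i + 1] = GRAMMAR[form[i]][choice]
        steps.append(" ".join(form))
    return steps


for step in leftmost_derivation([0, 0, 2, 0, 0, 0, 1]):
    print(step)
```

With the choices shown, the last line printed is "the canary sings the song", and the intermediate lines are the sentential forms from the slide.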

  11. Derivation Trees (aka “Parse Trees”) • A graphical representation of a derivation • The start symbol is the root • Each symbol in the right-hand side of the rule is a child node at the same level • Continue until the leaves are all terminals

  12. A Derivation Tree

  13. Ambiguity (Section 5.2) • Note how there was only one parse tree for the string “the canary sings the song” • And only one leftmost derivation • This is not true of all grammars! • Some grammars allow choices of distinct rules to generate the same string • Or equivalently, where there is more than one parse tree for the same string • Such a grammar is ambiguous • Ambiguous grammars are not easy to process programmatically

  14. An Ambiguous Grammar: Derivation Perspective
      <exp> → <exp> + <exp> | <exp> * <exp> | (<exp>) | a | b | c
      First leftmost derivation of a + b * c:
      <exp> ⇒ <exp> + <exp> ⇒ a + <exp> ⇒ a + <exp> * <exp> ⇒ a + b * <exp> ⇒ a + b * c
      Second leftmost derivation of a + b * c:
      <exp> ⇒ <exp> * <exp> ⇒ <exp> + <exp> * <exp> ⇒ a + <exp> * <exp> ⇒ a + b * <exp> ⇒ a + b * c
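
Ambiguity can also be observed by counting parse trees. The sketch below is one possible way to do that and is not taken from the slides: it writes the variable <exp> as `E`, assumes tokens are separated by spaces, and counts, for every span of the input, how many ways each rule can cover it. It relies on this grammar having no λ rules and no unit rules, so every recursive call works on a strictly shorter span.

```python
from functools import lru_cache

# The ambiguous expression grammar, with <exp> abbreviated to "E".
RULES = {
    "E": [("E", "+", "E"), ("E", "*", "E"), ("(", "E", ")"),
          ("a",), ("b",), ("c",)],
}


def count_parse_trees(tokens, start="E"):
    """Count the distinct parse trees that derive `tokens` from `start`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        # Number of trees rooted at `symbol` that yield tokens[i:j].
        if symbol not in RULES:           # terminal: must match one token
            return 1 if j - i == 1 and tokens[i] == symbol else 0
        return sum(count_seq(rhs, i, j) for rhs in RULES[symbol])

    @lru_cache(maxsize=None)
    def count_seq(rhs, i, j):
        # Ways to split tokens[i:j] among the symbols of `rhs`,
        # giving each symbol at least one token.
        if len(rhs) == 1:
            return count(rhs[0], i, j)
        first, rest = rhs[0], rhs[1:]
        return sum(count(first, i, k) * count_seq(rest, k, j)
                   for k in range(i + 1, j - len(rest) + 1))

    return count(start, 0, len(tokens))


print(count_parse_trees("a + b * c".split()))   # 2
```

The two trees counted for a + b * c correspond to the two leftmost derivations listed on the slide.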

  15. An Ambiguous Grammar: Parse Tree Perspective • Which one is “correct”?

  16. Parsing • The process of determining if a string is generated by a grammar • And often we want the parse tree • So that we know the order of operations • Top-down Parsing • Easiest conceptually • Bottom-up Parsing • Most efficient (used by commercial compilers) • We will use a simple one in Chapter 6

  17. Top-Down Parsing • Try to match a string, w, to a grammar • If there is a rule S → w, we’re done! • Fat chance :-) • Try to find rules that match the first character • A “look-ahead” strategy • This is what we do “in our heads” anyway • Repeat on the rest of the string… • Very “brute force”

  18. Top-Down Parsing: Example • S → SS | aSb | bSa | λ • Parse “aabb”:

  19. Top-Down Parsing: Example • S → SS | aSb | bSa | λ • Parse “aabb”: • Candidate rules: 1) S → SS, 2) S → aSb: • SS ⇒ SSS, SS ⇒ aSbS • aSb ⇒ aSSb, aSb ⇒ aaSbb • Answer: S ⇒ aSb ⇒ aaSbb ⇒ aabb (2) • Not a well-defined algorithm (yet)!
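
One way to turn the brute-force idea into something runnable is a breadth-first search over leftmost derivations. The sketch below is only an illustration under stated assumptions: the name `brute_force_parse` and the `max_steps` cap are invented here, and the cap is needed because S → SS and S → λ let sentential forms grow and shrink without bound. A result of `None` therefore means "not found within the cap", not a proof that the string is outside the language.

```python
from collections import deque

RULES = {"S": ["SS", "aSb", "bSa", ""]}   # "" stands for λ


def brute_force_parse(w, start="S", max_steps=6):
    """Search leftmost derivations breadth-first for one that yields w.

    Returns the list of sentential forms, or None if nothing is found
    within max_steps rule applications (an arbitrary cap for this sketch).
    """
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        form, path = queue.popleft()
        if form == w:
            return path
        if len(path) - 1 >= max_steps:
            continue                      # derivation is already too long
        # Index of the leftmost variable, if any.
        i = next((k for k, ch in enumerate(form) if ch in RULES), None)
        if i is None or form[:i] != w[:i]:
            continue                      # dead end: terminal prefix mismatch
        for rhs in RULES[form[i]]:
            new = form[:i] + rhs + form[i + 1:]
            terminals = sum(ch not in RULES for ch in new)
            if terminals <= len(w) and new not in seen:
                seen.add(new)
                queue.append((new, path + [new]))
    return None


print(brute_force_parse("aabb"))   # ['S', 'aSb', 'aaSbb', 'aabb']
print(brute_force_parse("aab"))    # None
```

The two pruning tests (matching the terminal prefix and limiting the number of terminals) play the role of the "look-ahead" on the slide; the step cap is the part that a well-defined algorithm would have to replace.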

  20. Parsing by Recursive Descent • A top-down parsing technique • Grammar Requirements: • no ambiguity • no lambdas • no left-recursion (e.g., A -> Ab) • … and some other stuff • Create a function for each variable • Check first character to choose a rule • Start by calling S( )

  21. Parsing a^n b^n, n > 0, by Recursive Descent • Grammar: S -> aSb | ab • Function S: • if length == 2, check to see if it is “ab” • otherwise, consume the outer ‘a’ and ‘b’, then call S on what’s left • See parseanbn.py, parseanbn2.py
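
The function S described on this slide might look roughly like the following. This is a sketch written from the bullet points only; it is not the contents of parseanbn.py or parseanbn2.py, which are not reproduced here, and the name `parse_S` is an assumption.

```python
def parse_S(s):
    """Recognize a^n b^n, n > 0, following S -> aSb | ab."""
    if len(s) == 2:
        return s == "ab"                  # base rule S -> ab
    if len(s) > 2 and s[0] == "a" and s[-1] == "b":
        return parse_S(s[1:-1])           # rule S -> aSb: strip outer a and b
    return False                          # empty, too short, or wrong ends


for w in ["ab", "aaabbb", "aab", "abab", ""]:
    print(repr(w), parse_S(w))
```

Because every recursive call removes two characters, the recursion depth is at most half the length of the input.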

  22. Parsing b*a by Recursive Descent • Grammar: A -> BA | a • B -> bB | b • See parsebstara.cpp
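
For this grammar, the "one function per variable" recipe can be sketched as below. This is an illustrative Python version, not a translation of parsebstara.cpp (which is not shown here). Each function takes a position in the input and returns the position after the text it matched, or `None` on failure; B is written to consume as many b’s as it can, which is one assumed way of choosing between B -> bB and B -> b.

```python
def parse_bstar_a(s):
    """Recognize b*a using one function per variable of the grammar."""

    def A(pos):
        if pos < len(s) and s[pos] == "a":
            return pos + 1                # rule A -> a
        if pos < len(s) and s[pos] == "b":
            after_b = B(pos)              # rule A -> BA: first B ...
            return None if after_b is None else A(after_b)   # ... then A
        return None

    def B(pos):
        if pos >= len(s) or s[pos] != "b":
            return None
        pos += 1                          # consume one b
        if pos < len(s) and s[pos] == "b":
            return B(pos)                 # rule B -> bB
        return pos                        # rule B -> b

    return A(0) == len(s)                 # accept only if all input is used


for w in ["a", "ba", "bbba", "b", "ab", ""]:
    print(repr(w), parse_bstar_a(w))
```

The first character decides which rule of A to try, which matches the "check first character to choose a rule" step of recursive descent.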

  23. The Problem with λ • Lambda rules can cause sentential forms to shrink • Then they can grow, and shrink again • And grow, and shrink, and grow, and shrink… • How then can we know if the string isn’t in the language? • That is, how do we know when we’re done so we can stop and reject the string?

  24. Another Problem: “Unit Production Rules” • A rule of the form A → B doesn’t increase the size of the sentential form • Once again, we could spend a long time cycling through unit rules before the sentential form reaches length |w| • We prefer a method where the sentential form always strictly grows toward |w|, so we can stop and answer “yes” or “no” efficiently • So, we will remove lambda and unit rules • In Chapter 6

  25. CFGs and Programming Languages (Section 5.3) • Precedence • Associativity

  26. Fixing Our Expression Grammar: Precedence • It was ambiguous because it treated all operators equally • But multiplication should have higher precedence than addition • So we introduce a new variable for multiplicative expressions • And place it further down in the rules • Because we want it to appear further down in the parse tree

  27. Giving Precedence
      <exp> → <exp> + <mulexp> | <mulexp>
      <mulexp> → <mulexp> * <rootexp> | <rootexp>
      <rootexp> → (<exp>) | a | b | c
      Now only one leftmost derivation for a + b * c:
      <exp> ⇒ <exp> + <mulexp> ⇒ <mulexp> + <mulexp> ⇒ <rootexp> + <mulexp> ⇒ a + <mulexp> ⇒ a + <mulexp> * <rootexp> ⇒ a + <rootexp> * <rootexp> ⇒ a + b * <rootexp> ⇒ a + b * c
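
The effect of the layered grammar on tree shape can be seen with a small hand-written parser. This sketch makes two assumptions that are not in the slides: tokens are separated by spaces, and the left-recursive rules are implemented as loops (a common substitution, since a recursive-descent function cannot call itself on the same input position). Trees are returned as nested tuples `(operator, left, right)`.

```python
def parse_expression(text):
    """Parse space-separated tokens with the exp / mulexp / rootexp layers."""
    tokens = text.split()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def exp():                            # <exp> → <exp> + <mulexp> | <mulexp>
        tree = mulexp()
        while peek() == "+":
            advance()
            tree = ("+", tree, mulexp())  # earlier result becomes left child
        return tree

    def mulexp():                         # <mulexp> → <mulexp> * <rootexp> | <rootexp>
        tree = rootexp()
        while peek() == "*":
            advance()
            tree = ("*", tree, rootexp())
        return tree

    def rootexp():                        # <rootexp> → (<exp>) | a | b | c
        tok = peek()
        if tok == "(":
            advance()
            tree = exp()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return tree
        if tok in ("a", "b", "c"):
            return advance()
        raise SyntaxError(f"unexpected token: {tok!r}")

    tree = exp()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token: {tokens[pos]!r}")
    return tree


print(parse_expression("a + b * c"))   # ('+', 'a', ('*', 'b', 'c'))
print(parse_expression("a + b + c"))   # ('+', ('+', 'a', 'b'), 'c')
```

The multiplication ends up deeper in the tree because `mulexp` is called from inside `exp`, mirroring how <mulexp> sits below <exp> in the grammar. The second example groups to the left because each loop iteration wraps the tree built so far as the left operand.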

  28. Giving Precedence

  29. Associativity • Derive the parse tree for a + b + c … • Note how you get (a + b) + c, in effect • Left-recursion gives left associativity • Analogously for right associativity • Exercise: • Add a right-associative power (exponentiation) operator (^, with variable <powerexp>) to the grammar with the proper precedence
