
LR-Grammars

kerry

Presentation Transcript


  1. LR-Grammars LR(0), LR(1), and LR(K)

  2. Deterministic Context-Free Languages • DCFL • A family of languages that are accepted by a Deterministic Pushdown Automaton (DPDA) • Many programming languages can be described by means of DCFLs

  3. Prefix and Proper Prefix • Prefix (of a string) • Any number of leading symbols of that string • Example: abc • Prefixes: ε, a, ab, abc • Proper Prefix (of a string) • A prefix of a string, but not the string itself • Example: abc • Proper prefixes: ε, a, ab

  4. Prefix Property • A context-free language (CFL) L is said to have the prefix property if, whenever w is in L, no proper prefix of w is in L • Not considered a severe restriction • Why? • Because we can easily convert a DCFL to a DCFL with the prefix property by introducing an endmarker
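The prefix definitions and the endmarker trick can be tried out on a small finite language. The following is only a sketch: the helper names (`prefixes`, `proper_prefixes`, `has_prefix_property`, `add_endmarker`) are made up for illustration, the language is assumed to be a finite set of Python strings, and `$` is assumed not to occur in the alphabet.

```python
def prefixes(w):
    """All prefixes of w, from the empty string up to w itself."""
    return [w[:i] for i in range(len(w) + 1)]


def proper_prefixes(w):
    """All prefixes of w except w itself."""
    return prefixes(w)[:-1]


def has_prefix_property(language):
    """True if no proper prefix of a word in the language is also in it."""
    words = set(language)
    return not any(p in words for w in words for p in proper_prefixes(w))


def add_endmarker(language, marker="$"):
    """Append an endmarker (assumed not to be in the alphabet) to every word."""
    return {w + marker for w in language}


L = {"a", "ab"}                      # "a" is a proper prefix of "ab"
print(has_prefix_property(L))
print(has_prefix_property(add_endmarker(L)))
```

The language {a, ab} lacks the prefix property, while {a$, ab$} has it, which is why the restriction is considered mild.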

  5. Suffix and Proper Suffix • Suffix (of a string) • Any number of trailing symbols • Proper Suffix • A suffix of a string, but not the string itself

  6. Example Grammar • This is the grammar that will be used in many of the examples: • S' → Sc • S → SA | A • A → aSb | ab

  7. LR-Grammar • Left-to-right scan of the input producing a rightmost derivation • Simply: • L stands for Left-to-right • R stands for rightmost derivation

  8. LR-Items • An item (for a given CFG) • A production with a dot anywhere in the right side (including the beginning and end) • In the event of an ε-production B → ε • B → · is an item

  9. Example: Items • Given our example grammar: • S' → Sc, S → SA | A, A → aSb | ab • The items for the grammar are: • S' → ·Sc, S' → S·c, S' → Sc· • S → ·SA, S → S·A, S → SA·, S → ·A, S → A· • A → ·aSb, A → a·Sb, A → aS·b, A → aSb·, A → ·ab, A → a·b, A → ab·
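Enumerating items is mechanical: one item per dot position in each production. A minimal sketch, assuming productions are encoded as `(head, body)` pairs with the body as a tuple of symbols (this encoding and the names `GRAMMAR`, `items`, `show` are illustrative choices, not part of the slides):

```python
# Example grammar from the slides: S' -> Sc, S -> SA | A, A -> aSb | ab
GRAMMAR = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]


def items(grammar):
    """Yield (head, body, dot) for every dot position 0..len(body)."""
    for head, body in grammar:
        for dot in range(len(body) + 1):
            yield head, body, dot


def show(item):
    """Render an item with the dot placed inside the right side."""
    head, body, dot = item
    return f"{head} → {''.join(body[:dot])}·{''.join(body[dot:])}"


for item in items(GRAMMAR):
    print(show(item))
```

A production with a body of length n contributes n + 1 items, so an ε-production (empty body) contributes exactly one item, B → ·.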

  10. Some Notation • ⇒* = zero or more steps in a derivation • ⇒*rm = rightmost derivation (zero or more steps) • ⇒rm = single step in rightmost derivation

  11. Right-Sentential Form • A sentential form that can be derived by a rightmost derivation • A string of terminals and variables α is called a sentential form if S ⇒* α

  12. More terms • Handle • A substring which matches the right-hand side of a production and whose reduction represents one step of the rightmost derivation in reverse • Or more formally: • (of a right-sentential form γ for CFG G) • Is a substring β such that: • S ⇒*rm δAw ⇒rm δβw • δβw = γ • If the grammar is unambiguous and has no useless symbols: • The rightmost derivation of a given right-sentential form is unique, and so is its handle

  13. Example • Given our example grammar: • S' → Sc, S → SA | A, A → aSb | ab • An example rightmost derivation: • S' ⇒rm Sc ⇒rm SAc ⇒rm SaSbc • Therefore we can say that SaSbc is a right-sentential form • Its handle is aSb
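The derivation in this example can be replayed step by step. The sketch below assumes sentential forms are tuples of symbols and uses a hypothetical helper `rightmost_step` that always expands the rightmost variable; the body it inserts in the last step is the handle of the resulting form.

```python
# Productions of the example grammar, keyed by variable.
PRODUCTIONS = {
    "S'": [("S", "c")],
    "S": [("S", "A"), ("A",)],
    "A": [("a", "S", "b"), ("a", "b")],
}


def rightmost_step(form, body):
    """Replace the rightmost variable of `form` by `body` (one ⇒rm step)."""
    body = tuple(body)
    for i in reversed(range(len(form))):
        if form[i] in PRODUCTIONS:
            if body not in PRODUCTIONS[form[i]]:
                raise ValueError(f"{form[i]} has no production with body {body}")
            return form[:i] + body + form[i + 1:]
    raise ValueError("no variable left to expand")


form = ("S'",)
for body in [("S", "c"), ("S", "A"), ("a", "S", "b")]:
    form = rightmost_step(form, body)
    print("".join(form))
```

Each printed string is a right-sentential form; undoing the last step (replacing aSb by A) is exactly the reduction of the handle.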

  14. More terms • Viable Prefix • (of a right-sentential form γ) • Is any prefix of γ ending no farther right than the right end of a handle of γ • Complete item • An item where the dot is the rightmost symbol

  15. Example • Given our example grammar: • S' → Sc, S → SA | A, A → aSb | ab • The right-sentential form abc: • S' ⇒*rm Ac ⇒rm abc • Valid items: • A → ab· for viable prefix ab • A → a·b for viable prefix a • A → ·ab for viable prefix ε • A → ab· is a complete item, ∴ Ac is the previous right-sentential form for abc

  16. LR(0) • Left-to-right scan of the input producing a rightmost derivation with a look-ahead (on the input) of 0 symbols • It is a restricted type of CFG • 1st in the family of LR-grammars • LR(0) grammars define exactly the DCFLs having the prefix property

  17. Computing Sets of Valid Items • The definition of LR(0) and the method of accepting L(G) for LR(0) grammar G by a DPDA depend on: • Knowing the set of valid items for each viable prefix γ • For every CFG G, the set of viable prefixes is a regular set • This regular set is accepted by an NFA whose states are the items for G

  18. Continued • Given an NFA (whose states are the items for G) that accepts the regular set • We can apply the subset construction to this NFA and yield a DFA • The state of this DFA after reading a viable prefix γ is the set of valid items for γ

  19. NFA M • NFA M recognizes the viable prefixes for CFG G • M = (Q, V ∪ T, δ, q0, Q) • Q = set of items for G plus state q0 • G = (V, T, P, S) • Three Rules • δ(q0, ε) = {S → ·α | S → α is a production} • δ(A → α·Bβ, ε) = {B → ·γ | B → γ is a production} • Allows expansion of a variable B appearing immediately to the right of the dot • δ(A → α·Xβ, X) = {A → αX·β} • Permits moving the dot over any grammar symbol X if X is the next input symbol

  20. Theorem 10.9 • The NFA M has the property that δ(q0, γ) contains A → α·β iff A → α·β is valid for γ • This theorem gives a method for computing the sets of valid items for any viable prefix • Note: M is an NFA. It can be converted to a DFA. Then, by inspecting each DFA state, it can be determined whether G is an LR(0) grammar
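The set of valid items for a viable prefix can be computed directly. The sketch below does not build the NFA explicitly; it uses the closure/goto formulation, which should give the same sets as running the subset construction on the NFA (closure follows the ε-moves, goto moves the dot over one symbol). The names `closure`, `goto`, and `valid_items`, and the `(head, body, dot)` item encoding, are assumptions made for this illustration.

```python
GRAMMAR = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]
VARIABLES = {head for head, _ in GRAMMAR}
START = "S'"


def closure(items):
    """Follow ε-moves: add B → ·γ whenever the dot stands before variable B."""
    result, work = set(items), list(items)
    while work:
        _, body, dot = work.pop()
        if dot < len(body) and body[dot] in VARIABLES:
            for head, rhs in GRAMMAR:
                item = (head, rhs, 0)
                if head == body[dot] and item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(h, b, d + 1) for h, b, d in state if d < len(b) and b[d] == symbol}
    return closure(moved)


def valid_items(prefix):
    """Items valid for `prefix`; the empty set means it is not a viable prefix."""
    state = closure({(h, b, 0) for h, b in GRAMMAR if h == START})
    for symbol in prefix:
        state = goto(state, symbol)
    return state


for h, b, d in sorted(valid_items(["a"])):
    print(h, "→", "".join(b[:d]) + "·" + "".join(b[d:]))
```

For the prefix ab this yields the single complete item A → ab·, matching the earlier example, and for a string such as b that is not a viable prefix it yields the empty set.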

  21. Definition of LR(0) Grammar • G is an LR(0) grammar if • The start symbol does not appear on the right side of any production • ∀ viable prefixes γ of G, if A → α· is a complete item valid for γ, then it is unique • i.e., no other complete item, and no item with a terminal to the right of the dot, is valid for γ
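This definition can be checked by inspecting every state of the DFA of item sets. The sketch below assumes the grammar has no useless symbols (so that every reachable non-empty state is the set of valid items of some viable prefix) and reuses a closure/goto construction; all function names and the second, deliberately non-LR(0) grammar are illustrative assumptions.

```python
EXAMPLE = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]
# Hypothetical counterexample: after reading "a", S → a· and S → a·b are both valid.
NOT_LR0 = [("S'", ("S", "c")), ("S", ("a",)), ("S", ("a", "b"))]


def closure(grammar, items):
    variables = {head for head, _ in grammar}
    result, work = set(items), list(items)
    while work:
        _, body, dot = work.pop()
        if dot < len(body) and body[dot] in variables:
            for head, rhs in grammar:
                item = (head, rhs, 0)
                if head == body[dot] and item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)


def goto(grammar, state, symbol):
    moved = {(h, b, d + 1) for h, b, d in state if d < len(b) and b[d] == symbol}
    return closure(grammar, moved)


def dfa_states(grammar, start):
    """All non-empty item sets reachable from the initial state."""
    first = closure(grammar, {(h, b, 0) for h, b in grammar if h == start})
    seen, work = {first}, [first]
    while work:
        state = work.pop()
        for symbol in {b[d] for _, b, d in state if d < len(b)}:
            nxt = goto(grammar, state, symbol)
            if nxt not in seen:
                seen.add(nxt)
                work.append(nxt)
    return seen


def is_lr0(grammar, start):
    variables = {head for head, _ in grammar}
    if any(start in body for _, body in grammar):
        return False  # start symbol appears on a right side
    for state in dfa_states(grammar, start):
        complete = [i for i in state if i[2] == len(i[1])]
        shifts = [i for i in state if i[2] < len(i[1]) and i[1][i[2]] not in variables]
        if complete and (len(complete) > 1 or shifts):
            return False  # complete item is not alone in the required sense
    return True


print(is_lr0(EXAMPLE, "S'"), is_lr0(NOT_LR0, "S'"))
```

Items with a variable to the right of the dot are allowed to share a state with a complete item; only a second complete item or a terminal after the dot violates the condition.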

  22. Facts we now know: • Every LR(0) grammar generates a DCFL • Every DCFL with the prefix property has an LR(0) grammar • Every language with an LR(0) grammar has the prefix property • L is a DCFL iff L$ has an LR(0) grammar (where $ is an endmarker)

  23. DPDA’s from LR(0) Grammars • We trace out the rightmost derivation in reverse • The stack holds a viable prefix (of a right-sentential form) interleaved with the states of the DFA • Viable prefix: X1X2…Xk • States: s0, s1,…,sk • Stack: s0X1s1…Xksk

  24. Reduction • If sk contains A → α· • Then A → α· is valid for X1X2…Xk • α is a suffix of X1X2…Xk • Let • α = Xi+1…Xk • w be such that X1…Xkw is a right-sentential form

  25. Reduction Continued • There is a derivation: • S ⇒*rm X1…XiAw ⇒rm X1…Xkw • To obtain the right-sentential form preceding X1…Xkw in the rightmost derivation, we reduce α to A • Therefore, we pop Xi+1…Xk (and their states) from the stack and push A, followed by the DFA state reached from si on A

  26. Shift • If sk contains only incomplete items • Then the right-sentential form preceding X1…Xkw cannot be formed by reducing a suffix of X1…Xk • Instead we simply “shift” the next input symbol onto the stack
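The reduce and shift moves can be combined into a small recognizer for the example grammar. This is a sketch of the idea only, not the DPDA construction itself: it keeps two parallel Python lists instead of one interleaved stack, computes DFA states on demand with closure/goto, and assumes the grammar is LR(0) so that a state containing a complete item contains nothing that competes with it. All names are illustrative.

```python
GRAMMAR = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]
VARIABLES = {head for head, _ in GRAMMAR}
START = "S'"


def closure(items):
    result, work = set(items), list(items)
    while work:
        _, body, dot = work.pop()
        if dot < len(body) and body[dot] in VARIABLES:
            for head, rhs in GRAMMAR:
                item = (head, rhs, 0)
                if head == body[dot] and item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)


def goto(state, symbol):
    moved = {(h, b, d + 1) for h, b, d in state if d < len(b) and b[d] == symbol}
    return closure(moved)


def parse(tokens):
    """Return the list of reductions (a reversed rightmost derivation), or None."""
    states = [closure({(h, b, 0) for h, b in GRAMMAR if h == START})]
    symbols, pos, reductions = [], 0, []
    while True:
        complete = [(h, b) for h, b, d in states[-1] if d == len(b)]
        if complete:                           # reduce: pop the handle, push its head
            head, body = complete[0]           # unique because the grammar is LR(0)
            del symbols[len(symbols) - len(body):]
            del states[len(states) - len(body):]
            reductions.append((head, body))
            if head == START:                  # start symbol reached: accept
                return reductions if pos == len(tokens) else None
            symbols.append(head)
            states.append(goto(states[-1], head))
        else:                                  # shift the next input symbol
            if pos == len(tokens):
                return None
            nxt = goto(states[-1], tokens[pos])
            if not nxt:                        # stack would not be a viable prefix
                return None
            symbols.append(tokens[pos])
            states.append(nxt)
            pos += 1


print(parse(list("abc")))
```

Reading the returned reductions from last to first gives the rightmost derivation of the input, which is the sense in which the derivation is traced out in reverse.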

  27. Theorem 10.10 • If L is L(G) for an LR(0) grammar G, then L is N(M) for a DPDA M • N(M) = the language accepted by empty stack or null stack

  28. Proof • Construct from G the DFA D • Transition function: recognizes G’s viable prefixes • Stack Symbols of M are • Grammar Symbols of G • States of D • M has start state q and other states used to perform reduction

  29. We know that: • If G is LR(0) then • A reduction is the only way to get the previous right-sentential form when the state of the DFA (on the top of the stack) contains a complete item • When M starts on input w it will construct a rightmost derivation for w in reverse order

  30. What we need to prove: • When a shift is called for and the top DFA state on the stack has only incomplete items then there are no handles • (Note: if there was a handle, then some DFA state on the stack would have a complete item)

  31. Suppose ∃ a state containing A → α· (a complete item) • Each state is on the top of the stack when it is first pushed • So α would immediately be reduced to A • Therefore, a state with a complete item cannot possibly become buried on the stack

  32. Proof continued • Acceptance occurs when the top of the stack contains the start symbol • The start symbol, by definition of LR(0) grammars, cannot appear on the right side of a production • L(G) always has the prefix property if G is LR(0)

  33. Conclusion of Proof • Thus, if w is in L(G), M finds the rightmost derivation of w, reduces w to S, and accepts • If M accepts w, then the sequence of right-sentential forms provides a derivation of w from S • N(M) = L(G)

  34. Corollary of Theorem 10.10 • Every LR(0) grammar is unambiguous • Why? • The rightmost derivation of w is unique • (Given the construction we provided)

  35. LR(1) Grammars • LR grammar with 1 look-ahead • All and only deterministic CFL’s have LR(1) grammars • Are greatly important to compiler design • Why? • Because they are broad enough to include the syntax of almost all programming languages • Restrictive enough to have efficient parsers (that are essentially DPDAs)

  36. LR(1) Item • Consists of an LR(0) item followed by a look-ahead set consisting of terminals and/or the special symbol $ • $ = the right end of the string • General Form: • A → α·β, {a1, a2, …, an} • The set of LR(1) items forms the states of a viable-prefix-recognizing NFA; the sets of valid items are computed by converting this NFA to a DFA

  37. A grammar is LR(1) if • The start symbol does not appear on the right side of any production • Whenever the set of items, I, valid for some viable prefix includes some complete item A → α·, {a1,…,an}, then • No ai appears immediately to the right of the dot in any item of I • If B → β·, {b1,…,bk} is another complete item in I, then ai ≠ bj for any 1 ≤ i ≤ n and 1 ≤ j ≤ k
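The second condition is a per-state check on look-ahead sets. A minimal sketch, assuming an LR(1) item is encoded as `(head, body, dot, lookaheads)` with `lookaheads` a frozenset of terminals or `$`; the function name `lr1_state_ok` and the sample item sets are made up for illustration and are not derived from the example grammar.

```python
def lr1_state_ok(items):
    """Check the LR(1) condition on one set of valid items."""
    items = list(items)
    complete = [las for _, body, dot, las in items if dot == len(body)]
    after_dot = {body[dot] for _, body, dot, _ in items if dot < len(body)}
    for i, las in enumerate(complete):
        if las & after_dot:          # a look-ahead symbol could also be shifted
            return False
        for other in complete[i + 1:]:
            if las & other:          # two reductions share a look-ahead symbol
                return False
    return True


# Items after reading "a" for a hypothetical grammar S → a | ab.
ok_state = [
    ("S", ("a",), 1, frozenset({"$"})),
    ("S", ("a", "b"), 1, frozenset({"$"})),
]
# Same shape, but the complete item's look-ahead collides with the shift on b.
bad_state = [
    ("S", ("a",), 1, frozenset({"b"})),
    ("S", ("a", "b"), 1, frozenset({"$"})),
]
print(lr1_state_ok(ok_state), lr1_state_ok(bad_state))
```

The first state would fail the LR(0) test (a complete item next to a terminal after the dot) but passes here, because the look-ahead `$` separates the reduction from the shift on b.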

  38. Accepting LR(1) language: • Similar to the DPDA used with LR(0) grammars • However, it is allowed to use the next input symbol during its decision making • This is accomplished by appending a $ to the end of the input; the DPDA keeps the next input symbol as part of the state

  39. LR(1) Rules for Reduce/Shift • If the top set of items has a complete item A → α·, {a1, a2, …, an}, where A ≠ S, reduce by A → α if the current input symbol is in {a1, a2, …, an} • If the top set of items has an item S → α·, {$}, then reduce by S → α and accept if the current symbol is $ (i.e., the end of the input is reached) • If the top set of items has an item A → α·aβ, T, and a is the current input symbol, then shift

  40. Regarding the Rules • The LR(1) conditions guarantee that at most one of the rules applies for any input symbol or $ • Often for practicality the information is summarized into a table • Rows: sets of items • Columns: terminals and $
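The three rules amount to a lookup on the top set of items and the current input symbol. The sketch below is one possible reading of them, assuming items are encoded as `(head, body, dot, lookaheads)` tuples and that the start symbol is named `S`; the function `action` and the sample item set (for a hypothetical grammar S → a | ab) are illustrative only. Evaluating `action` for every set of items and every terminal or `$` would fill in one table cell per call.

```python
def action(items, current, start="S"):
    """Decide reduce / accept / shift / error for one item set and input symbol."""
    found = set()
    for head, body, dot, lookaheads in items:
        if dot == len(body):
            if current in lookaheads:
                kind = "accept" if head == start else "reduce"
                found.add((kind, head, body))
        elif body[dot] == current:
            found.add(("shift",))
    if len(found) > 1:
        # Cannot happen for an LR(1) grammar; signals a conflict otherwise.
        raise ValueError(f"conflict on {current!r}: {sorted(found)}")
    return found.pop() if found else ("error",)


# Items valid after reading "a" in the hypothetical grammar S → a | ab.
top = [
    ("S", ("a",), 1, frozenset({"$"})),
    ("S", ("a", "b"), 1, frozenset({"$"})),
]
for symbol in ["b", "$", "c"]:
    print(symbol, action(top, symbol))
```

On b the only applicable rule is the shift, on `$` it is the accepting reduction by S → a, and on any other symbol no rule applies, which corresponds to an empty table entry.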
