
Recursive Descent Parsing (with combinators)

Greg Morrisett. Last time, we saw how to use combinators to build not just a lexer but a parser. The only difference is that parsers are generally recursive, and that recursion can get us into trouble.



  1. Recursive Descent Parsing (with combinators). Greg Morrisett

  2. Last Time
     • We saw how to use combinators to build not just a lexer, but a parser.
     • The only difference is that parsers are generally recursive.
     • And that recursion can get us into trouble.

  3. For Example
     Suppose we have a grammar that looks like this:

       intlist -> INT intlist | <eps>

  4. Using our Combinators

       intlist -> INT intlist | <eps>

       let int_p (ts : token list) =
         match ts with
         | (INT i)::rest -> [(i, rest)]
         | _ -> []

       let rec intlist_p =
         fun ts -> ((int_p $ intlist_p) % cons ++ eps) ts
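The slide relies on combinators defined in an earlier lecture. As a minimal sketch of what they might look like, the block below assumes a list-of-successes representation (a parser returns every `(result, remaining tokens)` pair). The definitions of `$`, `%`, `++`, `eps`, and `cons` here are assumptions made for illustration, not necessarily the course's own code.

```ocaml
type token = INT of int

(* Assumed representation: a parser returns all (result, rest) pairs. *)
type 'a parser = token list -> ('a * token list) list

(* Sequencing: run p1, then p2 on each remainder, pairing the results. *)
let ( $ ) (p1 : 'a parser) (p2 : 'b parser) : ('a * 'b) parser =
  fun ts ->
    List.concat
      (List.map
         (fun (a, ts1) -> List.map (fun (b, ts2) -> ((a, b), ts2)) (p2 ts1))
         (p1 ts))

(* Mapping: apply f to every result of p. *)
let ( % ) (p : 'a parser) (f : 'a -> 'b) : 'b parser =
  fun ts -> List.map (fun (a, rest) -> (f a, rest)) (p ts)

(* Alternation: collect the results of both alternatives. *)
let ( ++ ) (p1 : 'a parser) (p2 : 'a parser) : 'a parser =
  fun ts -> p1 ts @ p2 ts

(* Empty production: succeed with [] and consume nothing. *)
let eps : 'a list parser = fun ts -> [([], ts)]

let cons (x, xs) = x :: xs

let int_p : int parser = function
  | INT i :: rest -> [(i, rest)]
  | _ -> []

(* intlist -> INT intlist | <eps> *)
let rec intlist_p : int list parser =
  fun ts -> ((int_p $ intlist_p) % cons ++ eps) ts
```

With these definitions, `intlist_p [INT 1; INT 2]` returns three results: the full parse `([1; 2], [])` plus the two shorter prefixes, because `eps` can succeed at every position.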

  5. A Manual Parser

       intlist -> INT intlist | <eps>

       let rec intlist_p ts =
         match ts with
         | (INT i)::rest ->
             let (ints, ts') = intlist_p rest in
             (i::ints, ts')
         | _ -> ([], ts)

  6. For Example
     But what if we instead wrote:

       intlist -> intlist INT | <eps>

     Now the grammar is left-recursive, since in one case we run into the non-terminal intlist before we see any terminal.

  7. Using our Combinators

       intlist -> intlist INT | <eps>

       let int_p (ts : token list) =
         match ts with
         | (INT i)::rest -> [(i, rest)]
         | _ -> []

       let rec intlist_p =
         fun ts -> ((intlist_p $ int_p) % cons_end ++ eps) ts

  8. A Manual Parser

       intlist -> intlist INT | <eps>

       let rec intlist_p ts =
         let (ints, ts') = intlist_p ts in
         match ts' with
         | (INT i)::rest -> (ints @ [i], rest)
         | _ -> ([], ts)

     Oops! That's definitely going to loop forever: the first thing intlist_p does is call itself on the same token list. So we want to avoid writing grammars that are left-recursive.
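One way to observe the loop without hanging the program is to give the parser a call budget. The sketch below adds a hypothetical `fuel` parameter and an `Out_of_fuel` exception (neither is part of the original slides) to the left-recursive manual parser; the budget always runs out because the recursive call receives the same `ts` it was given.

```ocaml
type token = INT of int

exception Out_of_fuel

(* The left-recursive parser from the slide, plus a hypothetical budget. *)
let rec intlist_p fuel ts =
  if fuel = 0 then raise Out_of_fuel;
  (* Recursive call on the SAME token list: no progress is ever made. *)
  let (ints, ts') = intlist_p (fuel - 1) ts in
  match ts' with
  | INT i :: rest -> (ints @ [i], rest)
  | _ -> ([], ts)

(* True when the parser exhausts its budget on the given input. *)
let loops ts =
  try ignore (intlist_p 1000 ts); false
  with Out_of_fuel -> true
```

Here `loops` returns `true` for every input, including the empty list, since the `match` is never reached.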

  9. Another Example

       exp -> INT | exp '+' exp

       let rec exp_p ts =
         (int_p ++
          (exp_p $ tok PLUS $ exp_p) % (function ((i,_),j) -> i+j)) ts

  10. Inlining "++"

       let rec exp_p ts =
         (int_p ts) @
         (((exp_p $ tok PLUS $ exp_p) % (function ((i,_),j) -> i+j)) ts)

  11. Inlining "$" and "%"

       let rec exp_p ts =
         (int_p ts) @
         let s1 = exp_p ts in
         fold_right (fun (i, ts1) a -> ...)

  12. Note: infinite loop!

       let rec exp_p ts =
         (int_p ts) @
         let s1 = exp_p ts in
         fold_right (fun (i, ts1) a -> ...)

      Computing s1 calls exp_p on the same ts before any token has been consumed.

  13. Refactoring the Grammar

       Before:  exp -> INT | exp '+' exp
       After:   exp -> INT | INT '+' exp

      This accepts the same strings, but is no longer left-recursive.

  14. With our Combinators

       exp -> INT | INT '+' exp

       let rec exp_p ts =
         (int_p ++
          (int_p $ tok PLUS $ exp_p) % (function ((i,_),j) -> i+j)) ts

  15. Unwinding the definitions

       let rec exp_p ts =
         (int_p ts) @
         let s1 = int_p ts in
         fold_right (function (i, ts2) ->
           match ts2 with
           | PLUS::ts3 -> let s2 = exp_p ts3 in ...

      By the time we do the recursive call, the list of tokens is smaller: an INT and a PLUS have already been consumed.
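The refactored grammar can be run end to end. The following is a minimal sketch, again assuming list-of-successes combinators; the definitions of `$`, `%`, `++`, and `tok` are illustrative assumptions rather than the course's actual library.

```ocaml
type token = INT of int | PLUS

type 'a parser = token list -> ('a * token list) list

let ( $ ) (p1 : 'a parser) (p2 : 'b parser) : ('a * 'b) parser =
  fun ts ->
    List.concat
      (List.map
         (fun (a, ts1) -> List.map (fun (b, ts2) -> ((a, b), ts2)) (p2 ts1))
         (p1 ts))

let ( % ) (p : 'a parser) (f : 'a -> 'b) : 'b parser =
  fun ts -> List.map (fun (a, rest) -> (f a, rest)) (p ts)

let ( ++ ) (p1 : 'a parser) (p2 : 'a parser) : 'a parser =
  fun ts -> p1 ts @ p2 ts

(* Match exactly one given token. *)
let tok (t : token) : token parser = function
  | t' :: rest when t' = t -> [(t', rest)]
  | _ -> []

let int_p : int parser = function
  | INT i :: rest -> [(i, rest)]
  | _ -> []

(* exp -> INT | INT '+' exp
   The recursive call only happens after INT and PLUS are consumed. *)
let rec exp_p : int parser =
  fun ts ->
    (int_p ++
     (int_p $ tok PLUS $ exp_p) % (function ((i, _), j) -> i + j)) ts
```

On the tokens for `1 + 2 + 3`, this terminates and returns one result per prefix that forms an expression; the last one, `(6, [])`, is the parse of the whole input.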

  16. Let's Scale Up

       exp -> INT | exp '+' exp | exp '*' exp

      In addition to the problem with left-recursion, we have the problem that we'll get multiple parse results for an expression like "3 + 2 * 6".
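To make the ambiguity concrete, the sketch below enumerates the value of every parse tree that the ambiguous grammar allows for a complete token list, by trying each operator as the root of the tree. The helper names `values` and `split` are hypothetical and are not a combinator parser; they only illustrate why "3 + 2 * 6" has more than one reading.

```ocaml
type token = INT of int | PLUS | TIMES

(* Values of all parse trees for the whole list under
   exp -> INT | exp '+' exp | exp '*' exp. *)
let rec values (ts : token list) : int list =
  match ts with
  | [INT i] -> [i]
  | _ -> split [] ts

(* Walk the list; 'left' holds the tokens already passed, reversed. *)
and split left right =
  match right with
  | [] -> []
  | t :: rest ->
    let here =
      match t with
      | (PLUS | TIMES) when left <> [] && rest <> [] ->
        (* Treat this operator as the root of the parse tree. *)
        let op = if t = PLUS then ( + ) else ( * ) in
        List.concat
          (List.map
             (fun a -> List.map (fun b -> op a b) (values rest))
             (values (List.rev left)))
      | _ -> []
    in
    here @ split (t :: left) rest
```

For `[INT 3; PLUS; INT 2; TIMES; INT 6]` this yields two values, 15 for `3 + (2 * 6)` and 30 for `(3 + 2) * 6`.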

  17. Getting Rid of Left Recursion

       exp -> INT | INT '+' exp | INT '*' exp

       let rec exp_p ts =
         (int_p ++
          (int_p $ tok PLUS $ exp_p) % ... ++
          (int_p $ tok TIMES $ exp_p) % ...) ts
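This grammar terminates, but it treats '+' and '*' identically, so precedence comes out wrong on some inputs. The sketch below fills in the elided actions with addition and multiplication (an assumption; the slide leaves them as `...`) and uses assumed list-of-successes combinators plus a hypothetical `complete` helper that keeps only parses consuming all input.

```ocaml
type token = INT of int | PLUS | TIMES

type 'a parser = token list -> ('a * token list) list

let ( $ ) (p1 : 'a parser) (p2 : 'b parser) : ('a * 'b) parser =
  fun ts ->
    List.concat
      (List.map
         (fun (a, ts1) -> List.map (fun (b, ts2) -> ((a, b), ts2)) (p2 ts1))
         (p1 ts))

let ( % ) (p : 'a parser) (f : 'a -> 'b) : 'b parser =
  fun ts -> List.map (fun (a, rest) -> (f a, rest)) (p ts)

let ( ++ ) (p1 : 'a parser) (p2 : 'a parser) : 'a parser =
  fun ts -> p1 ts @ p2 ts

let tok (t : token) : token parser = function
  | t' :: rest when t' = t -> [(t', rest)]
  | _ -> []

let int_p : int parser = function
  | INT i :: rest -> [(i, rest)]
  | _ -> []

(* exp -> INT | INT '+' exp | INT '*' exp *)
let rec exp_p : int parser =
  fun ts ->
    (int_p ++
     (int_p $ tok PLUS $ exp_p) % (function ((i, _), j) -> i + j) ++
     (int_p $ tok TIMES $ exp_p) % (function ((i, _), j) -> i * j)) ts

(* Hypothetical helper: results of parses that consume every token. *)
let complete (p : 'a parser) ts =
  List.map fst (List.filter (fun (_, rest) -> rest = []) (p ts))
```

The tokens for `3 + 2 * 6` give 15 as expected, but `3 * 2 + 6` gives 24, that is `3 * (2 + 6)`, because everything to the right of an operator is grouped first regardless of which operator it is.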

  18. Grouping

       exp  -> term | term '+' exp
       term -> INT | INT '*' exp

  19. Grouping

       exp  -> term | term '+' exp
       term -> INT | INT '*' term

       let rec term ts =
         (int_p ++ (int_p $ tok TIMES $ term) % ...) ts
       and exp ts =
         (term ++ (term $ tok PLUS $ exp) % ...) ts
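The grouped grammar can be checked with a runnable version. This is a sketch under the same assumptions as before: the combinator definitions and the `complete` helper are illustrative, and the elided actions are taken to be multiplication and addition.

```ocaml
type token = INT of int | PLUS | TIMES

type 'a parser = token list -> ('a * token list) list

let ( $ ) (p1 : 'a parser) (p2 : 'b parser) : ('a * 'b) parser =
  fun ts ->
    List.concat
      (List.map
         (fun (a, ts1) -> List.map (fun (b, ts2) -> ((a, b), ts2)) (p2 ts1))
         (p1 ts))

let ( % ) (p : 'a parser) (f : 'a -> 'b) : 'b parser =
  fun ts -> List.map (fun (a, rest) -> (f a, rest)) (p ts)

let ( ++ ) (p1 : 'a parser) (p2 : 'a parser) : 'a parser =
  fun ts -> p1 ts @ p2 ts

let tok (t : token) : token parser = function
  | t' :: rest when t' = t -> [(t', rest)]
  | _ -> []

let int_p : int parser = function
  | INT i :: rest -> [(i, rest)]
  | _ -> []

(* term -> INT | INT '*' term
   exp  -> term | term '+' exp *)
let rec term : int parser =
  fun ts ->
    (int_p ++
     (int_p $ tok TIMES $ term) % (function ((i, _), j) -> i * j)) ts
and exp : int parser =
  fun ts ->
    (term ++
     (term $ tok PLUS $ exp) % (function ((i, _), j) -> i + j)) ts

(* Hypothetical helper: results of parses that consume every token. *)
let complete (p : 'a parser) ts =
  List.map fst (List.filter (fun (_, rest) -> rest = []) (p ts))
```

Because a `term` can only contain multiplications, '*' now binds tighter than '+': `3 + 2 * 6` has the single complete parse 15, and `3 * 2 + 6` has the single complete parse 12.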
