1 / 11

Roman Ivantsov, MCSD Software Architect, Tyler Technologies, Eden Division tylertech

Parsing with IRONY. http://www.codeplex.com/irony. Roman Ivantsov, MCSD Software Architect, Tyler Technologies, Eden Division www.tylertech.com roman.ivantsov@tylertech.com. Agenda Introduction and background Demo - Expression Grammar Inside Irony – internal architecture overview

sharicej
Download Presentation

Roman Ivantsov, MCSD Software Architect, Tyler Technologies, Eden Division tylertech

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Parsing with IRONY http://www.codeplex.com/irony Roman Ivantsov, MCSD Software Architect, Tyler Technologies, Eden Division www.tylertech.com roman.ivantsov@tylertech.com

  2. Agenda • Introduction and background • Demo - Expression Grammar • Inside Irony – internal architecture overview • Future plans

  3. Introduction • Name • Motivation - upgrade compiler construction technology. • Goal – make it easier and make it faster • Feature set – whatever Lex/Yacc can do and more • Method • All in c#: embedded Domain-Specific Language (DSL) for grammar specifications • No code generation: one-for-all LALR(1) Parser • Scanner construction from standard and custom token recognizers

  4. Demo • Grammar for Arithmetic Expressions • Grammar Explorer • Sample Grammars

  5. Irony Parsing Architecture Parsing Pipeline Source AST Tree Scanner Tokens Token Filter* Token Filter* Tokens+- Parser Terminals Parser State Graph Language Grammar Grammar Data (at startup) * - optional; standard or custom filter

  6. Grammar • Base class for language grammars • c# operator overloads defined in BnfElement class; Terminal and NonTerminal are derived from BnfElement • Helper methods for Kleene operators * + ?; list constructor with optional delimiters

  7. Scanner • OOP-style micro-framework for building custom scanners from standard and custom Terminals • Terminal is a token recognizer • Highly-optimized for performance • Ignores whitespace; new-line, indent and unindent tokens are created in token filter if necessary Source Tokens Scanner Identifier Language Grammar Number StringLiteral

  8. Token Filters • Sit between Scanner and Parser • Perform language-specific tasks: • Code outlining - create NewLine, Indent, Unindent, EndOfStatement tokens • Commented-out blocks • Macro-expansion • Checking matching pairs of braces/parenthesis • Handling XML documentation

  9. Parser • LALR(1) algorithm controlled by State Graph built at startup • AST node type determined by NodeType property of NonTerminal in language grammar. • Customization • Custom AST nodes • Factory methods for node creation • Public events and overridable methods in grammar

  10. Highlights • It works! • Easier to use, expect shorter implementation time • High performance: 10..20 K lines/second • High code reuse – reusable terminals, token filters, AST nodes • No generated code => higher code quality + easier to maintain

  11. Future Development • Completing features for 1.0 release (est. Feb 2008) • Create a set of standard AST nodes • Extending to LALR(1.5) parser – with extra token preview • Create basic infrastructure for building interpreters • Build Runtime • Extensible Object Model (Piumarta, Warth) • Type model (yes, wrappers!) • Simple scripting language(s) for testing

More Related