
Polyglot: An Extensible Compiler Framework for Java




  1. Polyglot: An Extensible Compiler Framework for Java • Nathaniel Nystrom, Michael R. Clarkson, Andrew C. Myers • Cornell University

  2. Language extension • Language designers often create extensions to existing languages • e.g., C++, PolyJ, GJ, Pizza, AspectJ, Jif, ArchJava, ESCJava, Polyphonic C#, ... • Want to reuse existing compiler infrastructure as much as possible • Polyglot is a framework for writing compiler extensions for Java

  3. Requirements • Language extension • Modify both syntax and semantics of the base language • Not necessarily backward compatible • Goals: • Easy to build and maintain extensions • Extensibility should be scalable • No code duplication • Compilers for language extensions should be open to further extension

  4. Rejected approaches • In-place modification • [Figure: base compiler 1.0 is copied and modified to produce extension compiler 1.0; bug fixes and upgrades lead to base compiler 2.0, which is copied and modified again to produce extension compiler 2.0, with the bug fixes and upgrades applied again] • Macro languages • Limited to syntax extensions • Semantic checks after macro expansion

  5. Polyglot • Base compiler is a complete Java front end • 25K lines of Java • Name resolution, inner class support, type checking, exception checking, uninitialized variable analysis, unreachable code analysis, ... • Can reuse and extend through inheritance

  6. Scalable extensibility • Changes to the compiler should be proportional to changes in the language. • Most compiler passes are sparse. • [Figure: grid of AST nodes against passes]

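  7. Non-scalable approaches • [Table: columns "Easy to add or modify passes" and "Easy to add or modify AST nodes"; rows "Visitors", "Pass as AST node method (naive OO)", and "Using Polyglot"; cell contents were lost in extraction]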

  8. Polyglot architecture • [Figure: three parallel pipelines] • Base Polyglot compiler: Java source → Java parser → AST rewriting passes → Code generator → Java target • Extension: Ext source → Ext parser → AST rewriting passes → Code generator → Java target • Further extension: Ext2 source → Ext2 parser → AST rewriting passes → Code generator → Java target

  9. Architecture details • Parser written using PPG • Adds grammar inheritance to Java CUP • AST nodes constructed using a node factory • Decouples node types from implementation • AST rewriting passes: • Each pass lazily creates a new AST • From naive OO: traverse AST invoking a method at each node • From visitors: AST traversal factored out
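
The "each pass lazily creates a new AST" bullet can be sketched roughly as follows. This is a minimal illustration rather than Polyglot's actual API: the names Rewriter, Expr, Lit, Add, and FoldZero are all hypothetical. A node returns itself when none of its children changed, so a pass copies only the parts of the tree it rewrites, and the traversal lives in the nodes rather than in the pass.

```java
// Hypothetical names throughout; this is not Polyglot's API.
interface Rewriter {
    // Called on each node after its children have been rewritten.
    Expr leave(Expr rewritten);
}

abstract class Expr {
    // Traversal is factored out of the pass: each node visits its own children.
    abstract Expr visit(Rewriter r);
}

final class Lit extends Expr {
    final int value;
    Lit(int value) { this.value = value; }
    Expr visit(Rewriter r) { return r.leave(this); }
}

final class Add extends Expr {
    final Expr left, right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }

    Expr visit(Rewriter r) {
        Expr l = left.visit(r);
        Expr rt = right.visit(r);
        // Lazy copy: allocate a new node only if a child actually changed.
        Expr self = (l == left && rt == right) ? this : new Add(l, rt);
        return r.leave(self);
    }
}

// Example pass (an assumption for illustration): rewrite "e + 0" to "e".
final class FoldZero implements Rewriter {
    public Expr leave(Expr e) {
        if (e instanceof Add) {
            Add a = (Add) e;
            if (a.right instanceof Lit && ((Lit) a.right).value == 0) return a.left;
        }
        return e;
    }
}
```

Running FoldZero over a tree that contains no "+ 0" returns the very same root object, which is the sense in which the new AST is created lazily.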

  10. Example: PAO • Primitive types as subclasses of Object • Changes type system, relaxes Java syntax • Implementation: insert boxing and unboxing code where needed • PAO source: HashMap m; m.put("two", 2); int v = (int) m.get("two"); • Translated Java: HashMap m; m.put("two", new Integer(2)); int v = ((Integer) m.get("two")).intValue();
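
The translated form on this slide is ordinary Java and can be run directly. The sketch below wraps it in a hypothetical class, BoxingDemo. It makes three assumptions beyond the slide: the map is initialized (the slide leaves m unassigned), it is given type parameters to avoid raw-type warnings, and Integer.valueOf stands in for new Integer, whose constructor is deprecated in current JDKs.

```java
// Hypothetical wrapper class; the body mirrors the translated code on the slide.
class BoxingDemo {
    static int roundTrip() {
        java.util.HashMap<String, Object> m = new java.util.HashMap<>();
        // PAO source: m.put("two", 2);
        m.put("two", Integer.valueOf(2));            // boxing inserted by the rewriting pass
        // PAO source: int v = (int) m.get("two");
        int v = ((Integer) m.get("two")).intValue(); // unboxing inserted by the rewriting pass
        return v;
    }
}
```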

  11. PAO implementation • Modify parser and type-checking pass to permit e instanceof int • Parser changes with PPG: include "java.cup" drop { rel_expr ::= rel_expr INSTANCEOF ref_type } extend rel_expr ::= rel_expr:a INSTANCEOF type:b {: RESULT = node_factory.Instanceof(a, b); :} • Add one new pass to insert boxing and unboxing code

  12. Implementing a new pass • Want to extend the Node interface with a rewrite() method • Default implementation: identity translation • Specialized implementations: boxing and unboxing • Mixin extensibility: extensions to a base class should be inherited by subclasses • [Figure: Node with subclasses If and Add]

  13. Inheritance is inadequate • [Figure: class diagram with Node, If, and Add alongside PaoNode, PaoIf, and PaoAdd]

  14. Inheritance is inadequate • [Figure: class diagram with Node and PaoNode]

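  15. Extension objects • Use composition to mix methods and fields into AST node classes • PAO extension objects are installed into all nodes by the node factory • [Figure: Node, with subclasses If and Add, linked to a PaoExt extension object whose ext field is null]

  16. Extension objects • Extension objects have their own ext field to leave the extension open • [Figure: Node, with subclasses If and Add, linked to a PaoExt extension object whose ext field is null]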
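
A minimal sketch of the extension-object idea from the two slides above, under stated assumptions: the classes are simplified stand-ins rather than Polyglot's real interfaces, withExt and PaoAddExt are hypothetical names, and rewrite() returns a descriptive string only to keep the example short.

```java
// Hypothetical, simplified classes; Polyglot's real interfaces are richer.
class Ext {
    Node node;   // the AST node this extension object is attached to
    Ext ext;     // slot for a further extension's object; null if there is none
}

class Node {
    Ext ext;     // installed by the node factory; null in the base compiler

    // Hypothetical helper that attaches an extension object to this node.
    Node withExt(Ext e) {
        e.node = this;
        this.ext = e;
        return this;
    }
}

class If extends Node {}
class Add extends Node {}

// PAO mixes a rewrite() method into every node class without subclassing each one.
class PaoExt extends Ext {
    String rewrite() { return "identity"; }   // default for all nodes
}

// Illustrative specialization for one node type.
class PaoAddExt extends PaoExt {
    @Override String rewrite() { return "box operands"; }
}
```

Because the new behavior lives in PaoExt rather than in a PaoNode subclass, If and Add pick it up without PaoIf or PaoAdd classes, and the null ext slot leaves room for a later extension to chain its own object.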

  17. Method invocation • A method may be implemented in the node or in any one of several extension objects. • Extension should call node.ext.ext.typeCheck() • Base compiler should call: node.typeCheck() • Cannot hardcode the calls • [Figure: Node linked to a PaoExt extension object whose ext field is null]

  18. Delegate objects • Each node & extension object has a del field • Delegate object implements same interface as node or ext • Directs call to appropriate method implementation • Ex: node.del.typeCheck() • Ex: node.ext.del.rewrite() • Run-time overhead < 2% • [Figure: a JavaDel delegate whose methods dispatch to node.ext.ext.typeCheck() and node.codeGen(); Node linked to a PaoExt extension object whose ext field is null]
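
The dispatch described on this slide might look roughly like the sketch below. It is an assumption-laden simplification: NodeOps and PaoDel are hypothetical names, the methods return strings so the routing is visible, and making the node its own default delegate is a choice made for this example.

```java
// Hypothetical, simplified sketch; not Polyglot's actual classes.
interface NodeOps {
    String typeCheck();
    String codeGen();
}

class Node implements NodeOps {
    PaoExt ext;            // extension object, may be null
    NodeOps del = this;    // delegate; by default calls land on the node itself

    public String typeCheck() { return "Java typeCheck"; }
    public String codeGen()   { return "Java codeGen"; }
}

class PaoExt {
    String typeCheck() { return "PAO typeCheck"; }
}

// Delegate installed by the extension: redirects typeCheck to the extension
// object and leaves codeGen with the base implementation.
class PaoDel implements NodeOps {
    private final Node node;
    PaoDel(Node node) { this.node = node; }
    public String typeCheck() { return node.ext.typeCheck(); }
    public String codeGen()   { return node.codeGen(); }
}
```

Callers always write node.del.typeCheck(), so neither the base compiler nor the extension hardcodes where the implementation lives.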

  19. Scalable extensibility • To add a new pass: • Use an extension object to mix in a default implementation of the pass for the Node base class • Use extension objects to mix in specialized implementations as needed • To change the implementation of an existing pass: • Use a delegate object to redirect to the method providing the new implementation • To create an AST node type: • Create a new subclass of Node • Or, mix new fields into an existing node using an extension object

  20. Polyglot family tree • [Figure: tree rooted at Polyglot base (Java), with extensions PAO, parameterized types, JMatch, covariant return, Coffer, PolyJ, Jif, and Jif/split]

  21. Results • Can build small extensions in hours or days • 10% of base code is interfaces and factories

  22. Related work • Other extensible compilers • e.g., CoSy, SUIF • e.g., JastAdd, JaCo • Macros • e.g., EPP, Java Syntax Extender, Jakarta • e.g., Maya • Visitors • e.g., staggered visitors, extensible visitors

  23. Conclusions • Several Java extensions have been implemented with Polyglot • Programmer effort scales well with size of difference with Java • Extension objects and delegate objects provide scalable extensibility • Download from: http://www.cs.cornell.edu/projects/polyglot

  24. Acknowledgments http://www.cs.cornell.edu/projects/polyglot

  25. Questions?

  26. Mixin extensibility • Inheritance does not provide mixin extensibility: when a base class is extended, subclasses should inherit the changes • [Figure: Node with subclasses If and Add]

  27. Other Polyglot features • Quasi-quoting library • Useful for translation from extension language AST to base language or intermediate language AST • Ex: qqStmt("if (%e.relabelsTo(%e)) %s; else %s;", new Object[] { L, Li, then_body, else_body }); • Automatic separate compilation • Serialize type information and store in the AST • Encoded into the class file via javac • Extracted from class file using reflection • Data-flow analysis framework

  28. PAO rewriting • rewrite(ts) is called for each AST node: class PaoExt extends Ext { Node rewrite(PaoTypeSystem ts) { return node(); } } class PaoInstanceofExt extends PaoExt { Node rewrite(PaoTypeSystem ts) { Instanceof e = (Instanceof) node(); Type rtype = e.compareType(); /* e.g., "e instanceof int" → "e instanceof Integer" */ if (rtype.isPrimitive()) return e.compareType(ts.boxedType(rtype)); else return e; } }
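
To make the rewriting step concrete, here is a self-contained sketch that can be compiled on its own. Type, Node, Instanceof, and PaoTypeSystem below are hypothetical stand-ins for the Polyglot types named on the slide, reduced to the few members the rewrite needs; the constructor-based wiring of PaoExt is likewise an assumption.

```java
// Hypothetical stand-ins for the Polyglot types named on the slide.
final class Type {
    final String name;
    final boolean primitive;
    Type(String name, boolean primitive) { this.name = name; this.primitive = primitive; }
    boolean isPrimitive() { return primitive; }
}

class Node {}

final class Instanceof extends Node {
    private final Type compareType;
    Instanceof(Type compareType) { this.compareType = compareType; }
    Type compareType() { return compareType; }
    // Functional update: returns a new node rather than mutating this one.
    Instanceof compareType(Type t) { return new Instanceof(t); }
}

class PaoTypeSystem {
    // Only int is handled here; a real type system would cover every primitive type.
    Type boxedType(Type t) {
        if (t.name.equals("int")) return new Type("Integer", false);
        throw new IllegalArgumentException("no boxed type for " + t.name);
    }
}

class PaoExt {
    private final Node node;
    PaoExt(Node node) { this.node = node; }
    Node node() { return node; }
    Node rewrite(PaoTypeSystem ts) { return node(); }   // default: identity
}

class PaoInstanceofExt extends PaoExt {
    PaoInstanceofExt(Instanceof node) { super(node); }

    @Override Node rewrite(PaoTypeSystem ts) {
        Instanceof e = (Instanceof) node();
        Type rtype = e.compareType();
        // "e instanceof int" becomes "e instanceof Integer"
        return rtype.isPrimitive() ? e.compareType(ts.boxedType(rtype)) : e;
    }
}
```

A reference-typed comparison comes back as the same node object, matching the identity default of the base PaoExt.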

  29. Node factories • Each extension has a node factory (nf) • To create a node of type T, call method nf.T() • T() may return an extension-specific subclass of T • T() attaches the extension and delegate objects to T via a call to extT() • Mixin extensibility: if T is a subclass of S, then extT() will call extS()
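
The extT()/extS() chaining can be sketched as below. The class and method names are illustrative assumptions that follow the slide's naming pattern, not Polyglot's actual factory interface, and the ext field is typed as Object only to keep the example small.

```java
// Hypothetical sketch of the factory pattern on this slide; names are illustrative.
class Node { Object ext; }
class Expr extends Node {}
class Add extends Expr {}

class NodeFactory {
    // To create a node of type T, call the factory method T().
    Expr Expr() { return (Expr) extExpr(new Expr()); }
    Add Add()   { return (Add) extAdd(new Add()); }

    // Base compiler: no extension objects to attach.
    protected Node extNode(Node n) { return n; }
    protected Node extExpr(Node n) { return extNode(n); }   // Expr is a subclass of Node
    protected Node extAdd(Node n)  { return extExpr(n); }   // Add is a subclass of Expr
}

class PaoExt {}

class PaoNodeFactory extends NodeFactory {
    // Overriding extNode alone reaches every node type, because each extT()
    // defers to the extS() of its superclass S.
    @Override protected Node extNode(Node n) {
        n.ext = new PaoExt();
        return n;
    }
}
```

An extension that needs a specialized object for one node type would override only that type's extT() method and leave the rest of the chain alone.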

  30. Results • Can build small extensions in hours or days • 10% of base compiler code is interfaces and factories

  31. Why not output bytecode? • Wanted to be able to read the output • The symmetry is satisfying • Limitations of Java as a target language • Scoping rules sometimes make it difficult to output Java code, especially with inner classes • Lack of goto can make generated control flow inefficient

  32. [Figure labels: Semantic checking, Name resolution, Translation]
