
YACC no more

YACC no more: integrating parsers, interpreters and compilers into your application. Presented by Sriram Srinivasan (“Ram”), one of the core engineers of the WebLogic app server, who wrote the first commercially available EJB implementation and the TP engine in WLS.


Presentation Transcript


  1. YACC no more: Integrating parsers, interpreters and compilers into your application. Sriram Srinivasan (“Ram”). Session # 2221

  2. This is he
     • Sriram Srinivasan
     • One of the core engineers of the WebLogic app server
     • Wrote the first commercially available EJB implementation
     • Wrote the TP engine in the WLS
     • Author: “Advanced Perl Programming” (O’Reilly)

  3. Why this talk?
     • Quest for higher level programming patterns
       • More productive / faster / maintainable etc.
     • Integrating compilers, parsers, interpreters into your application

  4. Embeddable Parsers. Case Study: Configuration Data
     • JDK parsers for configuration data
       • java.util.Properties, XML, regex library
     • java.util.Properties
       • Limited to “property = value” format
       • Takes care of comments, multi-line values, quotes

         #app server properties
         connectionPoolName = testPool
         numThreads = 10
         …

         // load() returns void, so it cannot be chained onto the constructor
         Properties p = new Properties();
         p.load(inputStream);
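
To make the java.util.Properties bullet concrete, here is a minimal sketch that loads the sample file from a string. The class name `ConfigLoader` and its `parse` helper are hypothetical names chosen for illustration; only `Properties.load` and `getProperty` come from the JDK.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper class, not part of the original slides.
final class ConfigLoader {
    /** Parses "key = value" text with java.util.Properties. */
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text)); // skips '#' comments, trims around '='
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    public static void main(String[] args) {
        String text = String.join("\n",
            "# app server properties",
            "connectionPoolName = testPool",
            "numThreads = 10");
        Properties p = parse(text);
        // Every value is a String; numeric conversion is left to the caller.
        int numThreads = Integer.parseInt(p.getProperty("numThreads"));
        System.out.println(p.getProperty("connectionPoolName") + " " + numThreads);
    }
}
```

Note that the parser hands back plain strings only, which is exactly the limitation the following slides set out to remove.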

  5. XML parsers
     • Good for structured, hierarchical data
     • DOM (Document Object Model) parser
       • Converts an entire XML document into a corresponding tree of Nodes.
     • SAX (Simple API for XML)
       • Callback class extends DefaultHandler
       • Supplies methods for startDocument(…), startElement(…), endElement(…) etc.
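
The SAX callback style described above can be sketched as follows. This is an illustrative example, assuming the JDK's built-in `javax.xml.parsers` SAX parser; the class name `ElementLister` and the depth-prefixed output format are inventions for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of the original slides.
final class ElementLister {
    /** Returns "depth:name" for every element, in document order. */
    static List<String> list(String xml) {
        final List<String> out = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                out.add(depth + ":" + qName); // called once per opening tag
                depth++;
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                depth--; // called once per closing tag
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                handler);
        } catch (Exception e) {
            throw new IllegalStateException("XML parse failed", e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(list("<pool><name>testPool</name><threads>10</threads></pool>"));
    }
}
```

Unlike DOM, nothing is kept in memory here except what the handler chooses to record.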

  6. Adding code to data
     • Problem: We want to add macros and expressions to our properties.

         numThreads = numProcessors
         # Ensure that connection pool is smaller than
         # thread pool.
         connectionPoolSize = max(numThreads - 2, 1)

     • This requires an expression evaluator

  7. Embeddable interpreters
     • Plethora of free, high quality interpreters available
       • BeanShell (Java-like syntax)
       • Rhino (JavaScript)
       • Jython (Python in Java)
       • Kawa (Scheme in Java)
     • When embedded, flow of control passes easily from Java to the interpreter and back.
     • Command-line shell always included

  8. BeanShell
     • Expressions identical to Java
     • Types are inferred dynamically

         add( a, b ) { return a + b; }
         sum = add(1, 2); // 3
         str = add("Web", "Logic"); // "WebLogic"

  9. Embedding BeanShell
     • Instead of writing code to parse the properties file, just eval it!
       • Comments should be “// …”, not “# …”
       • Each property definition line should end in “;”

         import bsh.Interpreter;
         import java.io.FileReader;

         Interpreter i = new Interpreter();
         i.set("foo", 5);
         i.eval("bar = foo*10");
         System.out.println("bar = " + i.get("bar"));
         i.eval(new FileReader("config.properties"));
         // get() returns Object, so the result needs a cast
         Integer n = (Integer) i.get("connectionPoolSize");

  10. BeanShell features
      • Strict Java expression syntax
        • no class declarations
      • Loose convenience syntax

          b = new java.awt.Button();
          b.label = "Yo"; // equivalent to b.setLabel("Yo")
          h = new Hashtable();
          h{"spud"} = "potato";

          // Swing stuff
          b = new JButton("My Button");
          f = new JFrame("My Frame");
          f.getContentPane().add(b, "Center");
          f.pack();
          f.show();

  11. Rhino
      • Free ECMAScript interpreter from Mozilla
      • Slightly more cumbersome to embed than BeanShell
      • Contains a bytecode compiler that can be called from within Java
      • Closures
      • Regex support built-in. Good for text manipulation

  12. Case study: Command pattern
      • Undo/Redo in an editor

          function insertCommand(text) {
              this.pos = buf.pos
              buf.insert(text)
              this.len = text.length
              this.undo = function () {
                  buf.moveTo(this.pos)
                  buf.erase(this.len);
              }
              undoStack.push(this);
          }

          new insertCommand("foo")
          undoStack.pop().undo()
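
For comparison, the same undo idea can be approximated in plain Java with lambdas standing in for the JavaScript closure. This is a sketch under stated assumptions: `UndoableBuffer` is a hypothetical stand-in for the slide's `buf` and `undoStack`, and it only supports appending at the end of the text.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the slide's buffer and undo stack.
final class UndoableBuffer {
    private final StringBuilder text = new StringBuilder();
    private final Deque<Runnable> undoStack = new ArrayDeque<>();

    /** Appends s and records a closure that reverses exactly this insert. */
    void insert(String s) {
        final int pos = text.length(); // captured by the lambda below
        final int len = s.length();
        text.append(s);
        undoStack.push(() -> text.delete(pos, pos + len));
    }

    /** Undoes the most recent insert; returns false when nothing is left. */
    boolean undo() {
        Runnable command = undoStack.poll(); // null when the stack is empty
        if (command == null) {
            return false;
        }
        command.run();
        return true;
    }

    String contents() {
        return text.toString();
    }

    public static void main(String[] args) {
        UndoableBuffer buf = new UndoableBuffer();
        buf.insert("foo");
        buf.insert("bar");
        buf.undo();
        System.out.println(buf.contents());
    }
}
```

Each command carries its own position and length, so the undo stack needs no knowledge of what kind of edit it is reversing.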

  13. Python
      • Python (Java implementation is "Jython")
        • powerful high-level language
        • Compiles to bytecode.
      • True scripting language
      • Can extend Java classes
      • Static compilation and standalone execution

  14. More case studies
      • Embedded expressions
        • Spreadsheet formulae
      • Customizable GUIs
        • Macro facility, keyboard mapping
      • Remote agents
      • Monitoring
      • Performance through partial evaluation

  15. Case Study: Remote Agents
      • Example: Test Agents
        • Can upload a script to each agent to launch processes and control them locally.
        • Jython is well-suited for this kind of task
      • Example: Scriptable IMAP mail server
        • "All messages that contain this regex, make a copy in this folder"

  16. Case Study: Monitoring
      • SNMP model: Obtain attributes from each node over the network, then do the calculation
      • Alternatively, upload a script to each node, and let it return the result
        • Conserves network bandwidth
      • Can insert any kind of probe
        • Study application data structures
        • Application-specific profiling

  17. Case Study: Performance
      • Partial evaluation can yield substantial performance benefits
      • Object - RDBMS adaptors
        • Code generator studies class and db schema
        • Omits unnecessary conversions, null checks
      • Vector dot product

          dp = a[0]*b[0] + a[1]*b[1] + a[2]*b[2];
          // But if 'a' is fixed {16,0,4} …
          // (parentheses are required: '+' binds tighter than '<<' in Java)
          dp = (b[0] << 4) + (b[2] << 2);
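
The dot-product specialization can be illustrated with a small sketch that inspects the fixed vector once and builds a function of `b` alone. `DotSpecializer` and `specialize` are hypothetical names; a real partial evaluator would more likely emit Java source or bytecode, since composing lambdas as done here probably costs more than it saves and is only meant to show the structure of the idea.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical illustration of partial evaluation, not the author's code.
final class DotSpecializer {
    /** Generic version: multiplies every pair of elements. */
    static int dot(int[] a, int[] b) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Examines the fixed vector a once and returns a function of b only. */
    static ToIntFunction<int[]> specialize(int[] a) {
        final List<ToIntFunction<int[]>> terms = new ArrayList<>();
        for (int i = 0; i < a.length; i++) {
            final int idx = i;
            final int coeff = a[i];
            if (coeff == 0) {
                continue; // a zero coefficient contributes nothing: drop the term
            }
            if (coeff > 0 && Integer.bitCount(coeff) == 1) {
                // positive power of two: replace the multiply with a shift
                final int shift = Integer.numberOfTrailingZeros(coeff);
                terms.add(b -> b[idx] << shift);
            } else {
                terms.add(b -> coeff * b[idx]);
            }
        }
        return b -> {
            int sum = 0;
            for (ToIntFunction<int[]> term : terms) {
                sum += term.applyAsInt(b);
            }
            return sum;
        };
    }

    public static void main(String[] args) {
        ToIntFunction<int[]> dp = specialize(new int[] {16, 0, 4});
        System.out.println(dp.applyAsInt(new int[] {3, 7, 5}));
    }
}
```

For the slide's vector {16,0,4}, the specialized function keeps two shift terms and never touches `b[1]`.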

  18. Generating Java
      • Moving from embedded interpreters to generating Java source
      • Example: JSP.
        • Convert template to Java, compile and dynamically load
      • BEA/WebLogic's weblogic.dtdc
        • Converts an XML DTD to a high performance SAX parser tuned to that DTD

  19. Generating code with Doclets
      • javadoc is a general purpose parser

          javadoc -doclet ListClass foo.java

      • ListClass.start() called with a hierarchy of *Doc nodes

          import com.sun.javadoc.*;

          public class ListClass {
              public static boolean start(RootDoc root) {
                  ClassDoc[] classes = root.classes();
                  for (int i = 0; i < classes.length; ++i) {
                      System.out.println(classes[i]);
                  }
                  return true;
              }
          }

      • Arbitrary tags can be introduced at any level

  20. Case study: iContract
      • Pattern: doclet expressions converted to annotated Java code

          /**
           * Ensure that argument is always > 0
           * @pre f >= 0.0
           *
           * Ensure that the function produces the sqrt
           * within a
           * @post Math.abs((return * return) - f) < 0.001
           */
          public float sqrt(float f) { ... }

  21. Case Study: EJBGen

          /**
           * @ejbgen:entity
           *   ejb-name = AccountEJB-OneToMany
           *   data-source-name = demoPool
           *   table-name = Accounts
           */
          abstract public class AccountBean implements EntityBean {
              /**
               * @ejbgen:cmp-field column = acct_id
               * @ejbgen:primkey-field
               * @ejbgen:remote-method transaction-attribute = Required
               */
              abstract public String getAccountId();

  22. Generating bytecode
      • Example: WebLogic RMI adaptors
      • Sometimes, some facilities are available only in bytecode (goto's!)
      • Example: fast string matching
        • Given a search string, encode the state machine into bytecode
        • Worth it if the same pattern is going to be used many times
          • Virus scanners
          • Searching genome sequences

  23. Example: String matching
      • Problem: match "10100"
      • Convert to a state machine
      • Each state encodes a successful prefix match
      [Figure: state machine with states S0 to S5; transitions labeled 0 and 1]

  24. String matching (contd.)
      • If only goto were allowed in Java …
      • But, goto's are allowed in bytecode!

          try {
              // buf is the buffer to be searched
              int i = -1;
              s0: i++; if (buf[i] != '1') goto s0;
              s1: i++; if (buf[i] != '0') goto s1;
              s2: i++; if (buf[i] != '1') goto s0;
              s3: i++; if (buf[i] != '0') goto s1;
              s4: i++; if (buf[i] != '0') goto s3;
              s5: i++; return i-5;
          } catch (ArrayIndexOutOfBoundsException e) {
              return -1;
          }
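
Since the goto version above is not legal Java source, the same state machine can be sketched with an explicit state variable and a `switch`. This is an illustrative rewrite, not the bytecode technique itself: `Match10100` is a hypothetical class name, and unlike the slide's code it also sends any character other than '0' or '1' back to the start state.

```java
// Hypothetical Java-source rendering of the slide's goto state machine.
final class Match10100 {
    /** Returns the index of the first occurrence of "10100" in buf, or -1. */
    static int find(char[] buf) {
        int state = 0; // number of pattern characters matched so far
        for (int i = 0; i < buf.length; i++) {
            char c = buf[i];
            switch (state) {
                case 0: // nothing matched yet
                    state = (c == '1') ? 1 : 0;
                    break;
                case 1: // matched "1"
                    state = (c == '0') ? 2 : (c == '1') ? 1 : 0;
                    break;
                case 2: // matched "10"
                    state = (c == '1') ? 3 : 0;
                    break;
                case 3: // matched "101"
                    state = (c == '0') ? 4 : (c == '1') ? 1 : 0;
                    break;
                case 4: // matched "1010"
                    if (c == '0') {
                        return i - 4; // fifth character matched
                    }
                    // on '1' we have seen "10101", whose suffix "101" is state 3
                    state = (c == '1') ? 3 : 0;
                    break;
                default:
                    throw new IllegalStateException("unreachable state " + state);
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(find("1101010100".toCharArray()));
    }
}
```

The `switch` dispatch on every character is the overhead that the generated-bytecode version avoids by jumping directly between states.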

  25. String matching (contd.)
      • Using an assembler like jasmin

              iconst_m1
              istore_1
          S0: ;; i++; if a[i] != '1' goto S0;
              iinc 1 1        ; i++
              aload_0         ; load a[i]
              iload_1
              caload
              bipush 49       ; load '1'
              if_icmpne S0    ; if .. goto S0
          S1: ;; i++; if a[i] != '0' goto S1
              iinc 1 1
              aload_0
              iload_1
              caload
              bipush 48
              if_icmpne S1

  26. Custom languages
      • Craft a language that fits the context you are working in
      • Avoid XML ugliness: SRML (Simple Rule Markup)
        • Instead of "if s.purchaseAmount > 100 …"

          <simpleCondition className="ShoppingCart" objectVariable="s">
            <binaryExp operator="gt">
              <field name="purchaseAmount"/>
              <constant type="float" value="100"/>
            </binaryExp>
          </simpleCondition>

  27. Antlr Introduction
      • Antlr: generates recursive descent parsers with configurable lookahead (LL(k))
      • Much, much simpler than lex/yacc
        • Yacc error messages are cryptic, tough for non-CS types to understand
        • Even the generated code is easy to understand
      • Includes tree building and recognition
        • No such facility in yacc
      • Lexer, parser and tree recognizer phases have similar syntax

  28. Antlr
      • Example: hierarchical property list
        • A list consists of name value pairs
        • Names are identifiers, values are numbers or lists

          ( a 200 b (c 10 d 20) )

  29. Antlr (contd.)

          class LispLexer extends Lexer;
          ID  : ('a' .. 'z')+;
          NUM : ('0' .. '9')+;
          LP  : '(';
          RP  : ')';
          // skip the whitespace that separates tokens in the sample input
          WS  : (' ' | '\t' | '\n' | '\r')+ { $setType(Token.SKIP); } ;

          class LispParser extends Parser;
          list : LP (nameValuePair)+ RP;
          nameValuePair : ID value ;
          value : NUM | list;

  30. Antlr (contd.)
      • Adding code, arguments, return values

          nameValuePair returns [NVP ret=null]
              {Object v;}
              : t:ID v=value {ret = new NVP(t.getText(),v);}
              ;

          value returns [Object ret=null]
              : t:NUM {ret=t.getText();}
              | ret=list
              ;
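
To show the shape of the recursive descent code that such a grammar corresponds to, here is a hand-written sketch for the same property-list language. It is not ANTLR output: `PropertyListParser` is a hypothetical class, it scans characters directly instead of using a separate lexer, and it stores pairs in a `Map` (so a repeated name overwrites the earlier value) rather than in the slide's `NVP` objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical hand-written parser; one method per grammar rule.
final class PropertyListParser {
    private final String src;
    private int pos = 0;

    private PropertyListParser(String src) {
        this.src = src;
    }

    /** Parses "( name value ... )"; values are digit strings or nested maps. */
    static Map<String, Object> parse(String text) {
        PropertyListParser p = new PropertyListParser(text);
        Map<String, Object> result = p.list();
        p.skipSpaces();
        if (p.pos != text.length()) {
            throw p.error("trailing input");
        }
        return result;
    }

    // list : LP (nameValuePair)+ RP ;
    private Map<String, Object> list() {
        expect('(');
        Map<String, Object> map = new LinkedHashMap<>();
        do {
            String name = token('a', 'z', "identifier"); // ID
            map.put(name, value());
            skipSpaces();
        } while (pos < src.length() && src.charAt(pos) != ')');
        expect(')');
        return map;
    }

    // value : NUM | list ;
    private Object value() {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == '(') {
            return list();
        }
        return token('0', '9', "number"); // NUM
    }

    /** Reads one or more characters in the range [lo, hi]. */
    private String token(char lo, char hi, String what) {
        skipSpaces();
        int start = pos;
        while (pos < src.length() && src.charAt(pos) >= lo && src.charAt(pos) <= hi) {
            pos++;
        }
        if (pos == start) {
            throw error("expected " + what);
        }
        return src.substring(start, pos);
    }

    private void expect(char c) {
        skipSpaces();
        if (pos >= src.length() || src.charAt(pos) != c) {
            throw error("expected '" + c + "'");
        }
        pos++;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
    }

    private IllegalArgumentException error(String msg) {
        return new IllegalArgumentException(msg + " at offset " + pos);
    }

    public static void main(String[] args) {
        System.out.println(parse("( a 200 b (c 10 d 20) )"));
    }
}
```

The one-lookahead test in `value()` plays the role of the LL(1) decision between the `NUM` and `list` alternatives.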

  31. Way out there …
      • Configurable hardware
        • New circuits on the fly
      • Intentional programming
        • Code not represented as a stream of characters

  32. Summary
      • Run-time evaluation gives you a lot of power
      • Other languages add features (e.g. closures) to Java
      • Lots of simple, free, quality parsers, interpreters
      • Produce custom Java source or byte code for performance
      • Roll your own domain-specific language with ANTLR or JavaCC.
      • Yacc No More.

  33. References
      • Doclets
        • Doclet tools: www.doclet.com
        • EJBGen: www.beust.com, Cedric Beust
        • iContract: www.reliable-systems.com, Reto Kramer
      • Languages, interpreters
        • BeanShell: www.beanshell.org
        • Rhino: www.mozilla.org/rhino
        • Python: www.python.org, www.jython.org
        • ANTLR: www.antlr.org
        • More … flp.cs.tu-berlin.de/~tolk/vmlanguages.html
        • SRML: xml.coverpages.org/srml.html

  34. References (contd.)
      • Bytecode manipulation:
        • Jasmin: mrl.nyu.edu/~meyer/jasmin/
        • Jikes Bytecode toolkit: www.alphaworks.ibm.com/tech/jikesbt
        • BCEL: bcel.sourceforge.net
      • "Rapid" - Reconfigurable hardware
        • www.cs.washington.edu/research
      • "The death of computer languages, the birth of intentional programming", Charles Simonyi
        • research.microsoft.com/scripts/pubs/trpub.asp
        • Microsoft tech report MSR-TR-95-52
      • Thinking in Patterns with Java, Bruce Eckel
        • www.mindview.net/Books/TIPatterns
