Flexible Data-binding With RelaxNGCC
Kohsuke Kawaguchi, kk@kohsuke.org

What’s RelaxNGCC?
• RelaxNGCC is:
  • Parser generator (e.g., YACC, JavaCC, or ANTLR)
  • Data-binding tool (e.g., JAXB, Castor, or Relaxer)
• Purpose
  • To simplify XML parsing development

Before RelaxNGCC
[Diagram: XML document → XML Parser → Hand-written SAX Handler]
• But writing a SAX handler ...
  • Is hard and tiring
  • Takes time
  • Is routine and not fun
So people turn to data-binding tools

Problems With Data-binding Tools
• Impedance mismatch between XML and the ideal object model (OM)
  • What does (A|B|C)* mean?
• Customization is limited
• Generated code is low in quality
  • Exposes a lot of unnecessary methods

Problems With Data-binding Tools
• Unable to bridge existing code and an existing schema
• Takes time to get used to the generated code
  • Need to know how schemas are mapped

After RelaxNGCC
[Diagram: Annotated RELAX NG Schema; XML document → XML Parser → Generated SAX Handler]
• Reduces development time

How Does RelaxNGCC Work?
• By associating code and schema

The schema:

    <element name="team">
      <oneOrMore>
        <element name="player">
          <attribute name="number">
            number=<data type="int" />
          </attribute>
          <element name="name">
            name=<text />
          </element>
        </element>
      </oneOrMore>
    </element>

The same schema annotated with code:

    <element name="team">
      System.out.println("start");
      <oneOrMore>
        <element name="player">
          <attribute name="number">
            number=<data type="int" />
            System.out.print(number+":");
          </attribute>
          <element name="name">
            name=<text />
            System.out.println(name);
          </element>
        </element>
      </oneOrMore>
      System.out.println("end");
    </element>

Key Concepts
• Anchoring data to variables
  • Values are copied to the specified variables as the document is parsed

    <attribute name="number">
      number=<data type="int" />
    </attribute>

    <player number="1" />

Key Concepts
• Code is also executed at the "right" moment

Input:

    <?xml version="1.0"?>
    <team>
      <player number="1">
        <name>me</name>
      </player>
      <player number="2">
        <name>you</name>
      </player>
    </team>

Output:

    start
    1:me
    2:you
    end
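
For comparison with the "Before RelaxNGCC" approach, here is a minimal sketch of a hand-written SAX handler that produces the same output for the team document. It is not the code RelaxNGCC generates; the class name `TeamHandler` and the `parse` helper are hypothetical, and the sketch performs no validation against the schema. It only illustrates the state tracking and text buffering that the annotated schema makes unnecessary.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical hand-written handler: it has to remember where it is in the
// document and buffer character data by itself.
class TeamHandler extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();  // collected output
    private final StringBuilder text = new StringBuilder(); // buffer for <name> text
    private boolean inName = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (qName.equals("team")) {
            out.append("start\n");
        } else if (qName.equals("player")) {
            int number = Integer.parseInt(atts.getValue("number"));
            out.append(number).append(':');
        } else if (qName.equals("name")) {
            inName = true;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver text in several chunks, so append rather than assign.
        if (inName) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("name")) {
            inName = false;
            out.append(text).append('\n');
        } else if (qName.equals("team")) {
            out.append("end\n");
        }
    }

    // Parses the given XML string and returns everything the handler "printed".
    static String parse(String xml) {
        try {
            TeamHandler handler = new TeamHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Even for this small document the handler needs a flag and a buffer; each additional element in the schema tends to add more of the same bookkeeping.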

Key Concepts
• Pattern blocks work like function calls, passing data down and up across boundaries

    <start>
      <element name="foo">
        result=<ref name="body"/>
      </element>
    </start>
    <define name="body" c:params="int i" c:return-type="int" c:return-value="i">
      <element name="bar">
        j=<text/>
      </element>
    </define>

With an argument and code added:

    <start>
      <element name="foo">
        result=<ref name="body"/>(3);
      </element>
    </start>
    <define name="body" c:params="int i" c:return-type="int" c:return-value="i">
      <element name="bar">
        j=<text/>
        i+=Integer.parseInt(j);
      </element>
    </define>
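
Read as ordinary Java, the annotated grammar behaves roughly like the method call sketched here. This is an analogy only, not the code RelaxNGCC generates: the class name `PatternCallAnalogy` is hypothetical, and the text of `<bar>` is passed in as a plain string argument purely for illustration.

```java
// Hypothetical analogy: each pattern block is pictured as a static method.
class PatternCallAnalogy {

    // Mirrors <define name="body" c:params="int i" c:return-type="int" c:return-value="i">
    static int body(int i, String barText) {
        String j = barText;        // j=<text/>
        i += Integer.parseInt(j);  // i+=Integer.parseInt(j);
        return i;                  // c:return-value="i"
    }

    // Mirrors <element name="foo">result=<ref name="body"/>(3);</element>
    static int foo(String barText) {
        int result = body(3, barText); // the argument 3 is passed "down"
        return result;                 // the return value comes back "up"
    }
}
```

Under this reading, a document such as `<foo><bar>4</bar></foo>` would leave `result` set to 7.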

Code Generation
• Each pattern block gets its own class
• At runtime, a new object is allocated to process each new block

    <grammar>
      <define name="Foo"> ... </define>
      <define name="Bar"> ... </define>
    </grammar>

Generated: class Foo, class Bar

Code Generation
• Aliases become fields
• Additional methods can be defined

    <define name="Foo">
      <cc:java-import> *** 1 *** </cc:java-import>
      <cc:java-body> *** 2 *** </cc:java-body>
      abc = <text/>
    </define>

Generated:

    import x.y.z;
    *** 1 ***
    class Foo {
      *** 2 ***
      String abc;
      ...
    }

Runtime
• Code used to help generated code
  • Just 3 classes
  • No runtime version dependency
• Runtime receives SAX events and coordinates handlers
[Diagram: SAX events → Generated Runtime → Generated SAX Handlers]

Runtime
• Provides services to user-specified code
  • Retrieve Locator object
  • Resolve namespace prefix
  • Redirect sub-tree to another SAX ContentHandler
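
The first two services rest on standard SAX callbacks. The sketch below shows one way a runtime could capture the Locator and the prefix mappings so that user code can query them later. It is a hypothetical illustration, not RelaxNGCC's runtime API: `RuntimeServicesSketch`, `getLocator`, `resolveNamespacePrefix`, and `resolveTypePrefix` are names invented here, and the flat prefix map ignores nested rebinding of the same prefix.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration of runtime services built on plain SAX callbacks.
class RuntimeServicesSketch extends DefaultHandler {
    private Locator locator;                                   // supplied by the parser
    private final Map<String, String> prefixes = new HashMap<>();
    String resolvedUri; // namespace URI behind the last QName-valued "type" attribute
    int line = -1;      // line where that attribute was seen, or -1 if unknown

    @Override
    public void setDocumentLocator(Locator l) { locator = l; }

    @Override
    public void startPrefixMapping(String prefix, String uri) { prefixes.put(prefix, uri); }

    @Override
    public void endPrefixMapping(String prefix) { prefixes.remove(prefix); }

    // Service: retrieve the Locator (may be null if the parser supplies none).
    Locator getLocator() { return locator; }

    // Service: resolve a namespace prefix that is currently in scope.
    String resolveNamespacePrefix(String prefix) { return prefixes.get(prefix); }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // "User code": interpret an unqualified "type" attribute as a QName.
        String type = atts.getValue("", "type");
        int colon = (type == null) ? -1 : type.indexOf(':');
        if (colon > 0) {
            resolvedUri = resolveNamespacePrefix(type.substring(0, colon));
            if (getLocator() != null) {
                line = getLocator().getLineNumber();
            }
        }
    }

    // Parses the XML string and returns the URI bound to the "type" prefix.
    static String resolveTypePrefix(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed for prefix-mapping callbacks
            RuntimeServicesSketch handler = new RuntimeServicesSketch();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.resolvedUri;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Centralizing this bookkeeping in one runtime object means individual handlers do not each have to track the locator or the prefix scope.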

Runtime
• User-defined code can be added
  • Added methods/fields are available to handlers
  • Useful to keep global info
[Diagram: SAX events → Default Runtime → Generated SAX Handlers; the Customized Runtime extends the Default Runtime, and the handlers access it]

Runtime (example)

    <grammar cc:runtime-type="org.acme.foo.MyRuntime">
      <define name="Foo">
        runtime.myFunction();
        ...
      </define>
    </grammar>

    class MyRuntime extends NGCCRuntime {
      public void myFunction() { ... }
    }

Put in Practice
• Reading an XML configuration file
  • Extend the runtime to hold an Options class
  • Fill in the structure as you go through the document

    <element name="config">
      <oneOrMore>
        <element name="param">
          name = <attribute name="name"/>
          value= <text/>
        </element>
        runtime.opt.properties.put(name, value);
      </oneOrMore>
      <attribute name="paramX">
        runtime.opt.paramX = <text/>
      </attribute>
    </element>

    class Options {
      Properties properties;
      String paramX;
    }

    class MyRuntime extends NGCCRuntime {
      public Options opt;
    }

Put in Practice
• Quickly build an Abstract Syntax Tree
  • Just use the generated class hierarchy and its fields
  • Use cc:class to throw in extra classes

    <element name="config">
      <cc:java-body>
        public Set params;
      </cc:java-body>
      <oneOrMore>
        p= <group cc:class="Param">
          <element name="param">
            name = <attribute name="name"/>
            value= <text/>
          </element>
        </group>
        params.add(p);
      </oneOrMore>
      <attribute name="paramX">
        paramX = <text/>
      </attribute>
    </element>

Put in Practice
• Build a full-blown object model
  • RelaxNGCC uses itself to parse RELAX NG
• Design the OM without worrying about syntax
• Then use RelaxNGCC to build a parser for it
Result: a good, efficient parser in a short time

Why RELAX NG?
• With XML Schema, you cannot write an annotation like this

    <xs:sequence>
      <xs:element ref="foo"/>
      <cc:java>...</cc:java>
      <xs:element ref="bar"/>
    </xs:sequence>

• How can I anchor 10 values to 10 different variables?
  • No way!

    <xs:element ref="foo" maxOccurs="10"/>

Why RELAX NG?
• Formal model makes RelaxNGCC simple
  • Simpler state management
  • Simpler schema parsing
  • Uniform treatment of attributes/elements

Why RELAX NG?
• Some XML Schema features don't work well
  • Nillable
  • Type substitution
  • Substitution group
... ironic, because all those features are supposed to be for data-oriented XML

Loosely-coupled Systems
• Type sharing makes systems tightly coupled
  • Some say that's what XML is trying to avoid
• Better to share the syntax without sharing the data model
  • RelaxNGCC allows you to do this!

License
• Compiler
  • GPL
• Generated code, including runtime
  • All yours!

To Get More Information
• Project website
  • http://relaxngcc.sourceforge.net/
• Contact developers
  • http://groups.yahoo.com/group/reldeve/
• RELAX NG info
  • http://relaxng.org/
• This presentation
  • http://www.kohsuke.org/

End
• Acknowledgements
  • Daisuke Okajima, the inventor of RelaxNGCC
  • Sun Microsystems, for allowing me to work on RELAX NG
• Any questions?