Challenges in handling XML: performance and memory usage
15.11.2001
Sami Poikonen, Republica Oy

Republica Oy is Finland’s leading provider of products and services based on XML standards.
Founded: 1996
Employees: 70+ (11/2001)
Offices: Helsinki, Jyväskylä
TOC
• DOM
• SAX
• DOM or SAX or something else...
• Transformations
• Conclusions
Parsing XML: DOM
• Document Object Model
• standard API for accessing and creating XML data
• tree-based
• programming-language independent
• developed by W3C
• whole document is read into memory
• read and write
<?xml version="1.0"?>
<book type="pokkari">
  <title>Tuntematon sotilas</title>
  <author>
    <name first="Väinö" last="Linna"/>
  </author>
</book>

DomNode book
|
|-->DomNode title
|   |
|   |-->DomNode text
|
|-->DomNode author
    |
    |-->DomNode name
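The tree above can be explored with any DOM implementation. As a minimal sketch, assuming Python's standard `xml.dom.minidom` (the slides do not name a particular parser), the following reads the book document fully into memory, navigates it, and then modifies it. The new attribute value `paperback` is only an illustration.

```python
from xml.dom.minidom import parseString

BOOK_XML = """<?xml version="1.0"?>
<book type="pokkari">
  <title>Tuntematon sotilas</title>
  <author>
    <name first="Väinö" last="Linna"/>
  </author>
</book>"""

# DOM: the whole document becomes an in-memory tree before we touch it.
doc = parseString(BOOK_XML)
book = doc.documentElement

# Read: nodes can be visited in any order, as often as needed.
title = book.getElementsByTagName("title")[0].firstChild.data
name = book.getElementsByTagName("name")[0]
author = name.getAttribute("first") + " " + name.getAttribute("last")

# Write: the same tree can be changed and serialized again.
# ("paperback" is an illustrative value, not taken from the slides.)
book.setAttribute("type", "paperback")
updated_xml = doc.toxml()

print(title, "/", author)
```

The convenience comes at a cost: memory use grows with the size of the document, because every node stays in the tree until the document is discarded.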
Parsing XML: SAX
• Simple API for XML
• API for accessing XML data
• event-based
• programming-language independent
• not defined by W3C
• application has to store fragments into memory
• read only
<?xml version="1.0"?>
<poem>
  <line>Roses are red,</line>
  <line>Violets are blue.</line>
  <line>Sugar is sweet,</line>
  <line>and I love you.</line>
</poem>

Start element: poem
Start element: line
End element: line
Start element: line
End element: line
Start element: line
End element: line
Start element: line
End element: line
End element: poem
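The event list above can be reproduced with a small handler. This is a sketch assuming Python's standard `xml.sax` module; the class name `EventLogger` is made up for the example. It also shows what "application has to store fragments into memory" means in practice: the parser keeps nothing, so the handler itself buffers the text of each line.

```python
import xml.sax

POEM_XML = b"""<?xml version="1.0"?>
<poem>
  <line>Roses are red,</line>
  <line>Violets are blue.</line>
  <line>Sugar is sweet,</line>
  <line>and I love you.</line>
</poem>"""


class EventLogger(xml.sax.ContentHandler):
    """Hypothetical handler: records events and collects the text of <line>."""

    def __init__(self):
        super().__init__()
        self.events = []   # "Start element: ..." / "End element: ..." strings
        self.lines = []    # fragments the application chose to keep
        self._buf = None   # text buffer, active only inside <line>

    def startElement(self, name, attrs):
        self.events.append("Start element: " + name)
        if name == "line":
            self._buf = []

    def characters(self, content):
        # Text may arrive in several chunks, so append rather than assign.
        if self._buf is not None:
            self._buf.append(content)

    def endElement(self, name):
        self.events.append("End element: " + name)
        if name == "line":
            self.lines.append("".join(self._buf))
            self._buf = None


handler = EventLogger()
xml.sax.parseString(POEM_XML, handler)
print("\n".join(handler.events))
```

Nothing outside `handler.lines` survives the parse, which is why memory use stays flat no matter how long the input is.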
DOM or SAX or something else?
• DOM:
  • read and write
  • need to move back and forth in data
  • document is human created
• SAX:
  • read only
  • huge data or streams
  • data is machine generated
Best of both worlds? Adaptive parsing!
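The slides do not describe how adaptive parsing works, so the following is not that technique. It is only a sketch of one common middle ground, assuming Python's standard `xml.etree.ElementTree.iterparse`: stream through a large document, build a small tree for one record at a time, and discard it once processed. The `catalog`/`book` structure and the helper names are invented for the example.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source, tag):
    """Yield each completed <tag> element, then free what was already read."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield elem       # a small, fully built subtree (DOM-like access)
            root.clear()     # drop processed records so memory stays bounded


def make_catalog(count):
    """Hypothetical machine-generated input: <catalog> with many <book> records."""
    books = "".join(
        '<book id="%d"><title>Book %d</title></book>' % (i, i) for i in range(count)
    )
    return io.BytesIO(("<catalog>" + books + "</catalog>").encode("utf-8"))


titles = [book.findtext("title") for book in iter_records(make_catalog(1000), "book")]
print(len(titles), titles[0], titles[-1])
```

Each record can be navigated freely like a tiny DOM tree, while the document as a whole is never held in memory at once.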
Transformations
• XSLT: XSL Transformations
• XSLT processors are built to use DOM
• XSLT-to-Java conversion: still uses DOM
• SAX-based custom-made application for transformations
• Adaptive parsing with data binding?
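A SAX-based custom transformation can be as simple as a filter that sits between the parser and a serializer. The sketch below assumes Python's standard `xml.sax.saxutils` (`XMLFilterBase` and `XMLGenerator`); the class `RenameElements`, the function `transform`, and the renaming rule are hypothetical. It rewrites element names as events stream past, without ever building a tree.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class RenameElements(XMLFilterBase):
    """Hypothetical filter: renames elements according to a mapping."""

    def __init__(self, parent, mapping):
        super().__init__(parent)
        self._mapping = mapping

    def startElement(self, name, attrs):
        # Forward the event downstream under the new name (or unchanged).
        super().startElement(self._mapping.get(name, name), attrs)

    def endElement(self, name):
        super().endElement(self._mapping.get(name, name))


def transform(xml_bytes, mapping):
    """Stream xml_bytes through the filter and return the serialized result."""
    out = io.StringIO()
    reader = RenameElements(xml.sax.make_parser(), mapping)
    reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    reader.parse(io.BytesIO(xml_bytes))
    return out.getvalue()


result = transform(
    b"<poem><line>Roses are red,</line></poem>",
    {"poem": "verse", "line": "row"},
)
print(result)
```

This trades the generality of an XSLT stylesheet for constant memory use: the rule set is hard-coded, but the input can be arbitrarily large.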
Conclusions
• When building XML applications, you have to think about how you will handle large chunks of data
• Choosing between SAX and DOM is not always trivial
• There are also smarter ways to parse XML
• Adaptive parsing with data binding gives transformations a lot of much-needed performance
• It is easy to reach the limits of XSLT processing capabilities
• In some cases, problems handling XML streams and large files have led people to assume that handling them is almost impossible
Contact Information
Republica Oy, http://www.republica.fi/
Survontie 9, 40500 Jyväskylä
http://www.x-fetch.com/
Sami Poikonen, Vice President, Solutions
tel. 040 301 1154
sami.poikonen@republica.fi