Transforming XML: The XSLT Language
Michael H. Kay

Topics
• XSLT as a language
• Usability and fitness for purpose
• Academic interest
• Open Source and Investment
• Success in the market
• Technology & Architecture
• SAXON as a product
• Software Engineering

Why did XML happen?
• The Web needed something better than HTML
• Data interchange needed something better than CSV files
• SGML was around, and subsetting it was easier than reinventing it
• It had to be cheap, and it was
• By luck, there was no competition

XSLT: Extensible Stylesheet Language - Transformations
• A declarative language for transforming XML
• Widely used in
  • publishing applications
  • messaging applications
  • anywhere else where XML is found
• XSLT 1.0 (1999) was widely implemented
• XSLT 2.0 (2007) is popular with users, but there are few products

What kind of language is XSLT?
• Declarative, functional
• Rule-based
• Uses XML syntax
• Data model is an “abstract XML” tree
• Type system based on XML Schema
• A two-language system: XSLT + XPath

Template Rules

    <xsl:template match="bibliography">
      <h1><a name="bibl">Bibliography</a></h1>
      <dl>
        <xsl:apply-templates/>
      </dl>
    </xsl:template>

    <xsl:template match="bibl-entry">
      <dt>
        <xsl:value-of select="@ref"/>
      </dt>
      <dd>
        <xsl:apply-templates/>
      </dd>
    </xsl:template>

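
The two rules above are fragments. A minimal sketch of a complete XSLT 1.0 stylesheet around them might look like the following. The `xsl:output` settings and the sample input document are assumptions for illustration and are not taken from the slides.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch: the slide's two template rules wrapped in a complete stylesheet. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: HTML output; the slide does not say how the result is serialized. -->
  <xsl:output method="html" indent="yes"/>

  <!-- Rule for the container: emit a heading, then let other rules handle the children. -->
  <xsl:template match="bibliography">
    <h1><a name="bibl">Bibliography</a></h1>
    <dl>
      <xsl:apply-templates/>
    </dl>
  </xsl:template>

  <!-- Rule for each entry: the ref attribute becomes the term, the content the definition. -->
  <xsl:template match="bibl-entry">
    <dt>
      <xsl:value-of select="@ref"/>
    </dt>
    <dd>
      <xsl:apply-templates/>
    </dd>
  </xsl:template>

</xsl:stylesheet>
```

A hypothetical input that these rules would match:

```xml
<bibliography>
  <bibl-entry ref="XSLT20">XSL Transformations (XSLT) Version 2.0</bibl-entry>
  <bibl-entry ref="XPATH20">XML Path Language (XPath) 2.0</bibl-entry>
</bibliography>
```

Neither rule names the other. The processor picks a rule for each node it visits, and text nodes fall through to the built-in rule that copies them to the result.
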
The XSLT Processing Model
[Diagram: the stylesheet and the source document are each parsed into a tree (stylesheet tree, source tree); the transformation process builds a result tree, which is serialized into the result document.]

Pros and Cons: Declarative, Functional
PRO
• Optimizable
• Safe
• Robust
• Productive
CON
• Script kiddies hate it
• Slow
• Recursion is mind-numbing

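
To make the point about recursion concrete, here is a sketch of how a running total is typically computed in XSLT 1.0, where variables cannot be updated and loops cannot accumulate state. The `order`/`item` vocabulary and the template name `total` are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch: sum of price * qty over all items, written as a recursive named template. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/order">
    <xsl:call-template name="total">
      <xsl:with-param name="items" select="item"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Each call consumes the first item and passes the rest on with the updated sum. -->
  <xsl:template name="total">
    <xsl:param name="items"/>
    <xsl:param name="sum" select="0"/>
    <xsl:choose>
      <xsl:when test="$items">
        <xsl:call-template name="total">
          <xsl:with-param name="items" select="$items[position() &gt; 1]"/>
          <xsl:with-param name="sum"
                          select="$sum + $items[1]/@price * $items[1]/@qty"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$sum"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

For the hypothetical input `<order><item price="2" qty="3"/><item price="5" qty="1"/></order>` this writes `11`. The same logic in an imperative language is a three-line loop, which is the usability cost the slide refers to.
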
Pros and Cons: Rule-based
PRO
• Great for text and semi-structured data
• Potential for change
CON
• Script kiddies hate it
• Makes static analysis hard

Pros and Cons: Uses XML Syntax
PRO
• Templates: “fill in the blanks”
• Common infrastructure (editors, parsers)
• Extensibility
CON
• Verbose
• Ugly

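
A small sketch of the “fill in the blanks” style: because the stylesheet is itself XML, the result markup is written literally and only the variable parts are computed. The `person`, `name`, `email` and `role` names are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch: literal result elements with the blanks filled in from the source. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <xsl:template match="person">
    <!-- Curly braces in an attribute are an attribute value template (an XPath expression). -->
    <p class="{@role}">
      <a href="mailto:{email}">
        <xsl:value-of select="name"/>
      </a>
    </p>
  </xsl:template>

</xsl:stylesheet>
```

The template body looks like the output it produces, which is the PRO. The CON is visible too: inserting a single string takes a whole `xsl:value-of` element.
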
Pros and Cons: XML Tree Data Model
PRO
• Abstracts away from the lexical XML detail
• But it’s still distinctively XML
CON
• Too abstract for some
• In-memory assumption
• Inadequate for complex algorithms

The XPath Axes (1)
[Diagram of axes relative to the context node: parent, preceding-sibling, following-sibling, child, self]

The XPath Axes (2)
[Diagram of axes relative to the context node: ancestor, preceding, following, descendant]

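
As a rough illustration of the axes named on the two slides above, the sketch below counts the elements found on each axis from one context node. The sample document in the comment and the `id` value `s2` are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!--
  Sketch: exercising the XPath axes from a single context node.
  Hypothetical input:
    <doc>
      <section id="s1"><para/></section>
      <section id="s2"><para/><para/></section>
      <section id="s3"/>
    </doc>
-->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Pick the context node; everything below is relative to it. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//section[@id='s2']"/>
  </xsl:template>

  <xsl:template match="section">
    <xsl:text>self: </xsl:text>
    <xsl:value-of select="name(self::*)"/>
    <xsl:text>&#10;parent: </xsl:text>
    <xsl:value-of select="name(parent::*)"/>
    <xsl:text>&#10;child elements: </xsl:text>
    <xsl:value-of select="count(child::*)"/>
    <xsl:text>&#10;preceding siblings: </xsl:text>
    <xsl:value-of select="count(preceding-sibling::*)"/>
    <xsl:text>&#10;following siblings: </xsl:text>
    <xsl:value-of select="count(following-sibling::*)"/>
    <xsl:text>&#10;ancestors: </xsl:text>
    <xsl:value-of select="count(ancestor::*)"/>
    <xsl:text>&#10;descendants: </xsl:text>
    <xsl:value-of select="count(descendant::*)"/>
    <!-- preceding excludes ancestors; following excludes descendants. -->
    <xsl:text>&#10;preceding (document order): </xsl:text>
    <xsl:value-of select="count(preceding::*)"/>
    <xsl:text>&#10;following (document order): </xsl:text>
    <xsl:value-of select="count(following::*)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```
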
Pros and Cons: Use of XML Schema
PRO
• Everyone uses XML Schema
• Type safety
• Optimization
• Better diagnostics
CON
• Everyone hates XML Schema
• Strong typing is for wimps

Success Factors for XSLT
• People needed it
• There wasn’t much competition
• Good open-source implementations appeared early
• High level of spec conformance
• Adequate performance
• Browser support
• Endorsement/credibility

Architecture of an XSLT Processor
[Diagram components: Stylesheet, Parser, Builder, Compiler, Compiled Stylesheet; Source Document, Parser, Builder; XPath; Outputter, Serializer, Result Document]

Factors driving Performance
• Tree model: searching and matching
• Pipelining, streaming
• Static code optimization
• Use of schema
• Code generation
• Basic good programming
• Engineering methodology

XSLT and XQuery
XQuery 1.0 overlaps in capability:
• Much smaller language, less power
• Clean design, easier to learn
• Backed by database vendors
  • who have money
• Backed by academics
  • who want money
• More oriented to data than documents
  • which is where the money is

New in XSLT 2.0
• Grouping
• Regular expressions
• Schema-awareness
• Functions
• Multiple output
• Date/time handling
• ... and much more

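
Two of the features listed above, grouping and user-defined functions, can be sketched together as follows. This needs an XSLT 2.0 processor. The `f` namespace URI, the function name `f:initial` and the `index`/`group`/`entry` output vocabulary are hypothetical, and the sketch assumes every `bibl-entry` has a `ref` attribute.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch: XSLT 2.0 grouping with a typed stylesheet function as the grouping key. -->
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:f="http://example.com/functions"
                exclude-result-prefixes="xs f">

  <xsl:output method="xml" indent="yes"/>

  <!-- A stylesheet function, callable from any XPath expression.
       The "as" attributes use XML Schema built-in types. -->
  <xsl:function name="f:initial" as="xs:string">
    <xsl:param name="s" as="xs:string"/>
    <xsl:sequence select="upper-case(substring($s, 1, 1))"/>
  </xsl:function>

  <xsl:template match="/bibliography">
    <index>
      <!-- Group the entries by the first letter of their ref attribute. -->
      <xsl:for-each-group select="bibl-entry" group-by="f:initial(@ref)">
        <xsl:sort select="current-grouping-key()"/>
        <group letter="{current-grouping-key()}">
          <xsl:for-each select="current-group()">
            <entry ref="{@ref}"/>
          </xsl:for-each>
        </group>
      </xsl:for-each-group>
    </index>
  </xsl:template>

</xsl:stylesheet>
```

In XSLT 1.0 neither construct exists: grouping needs a key-based workaround, and reusable computations need named templates like the recursive style that the declarative pros-and-cons slide complains about.
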
Where are we today?
• XSLT 2.0 came out in Jan 2007
• 2½ implementations (Saxon, Altova)
• Highly popular with users
  • but not with non-users
• The big vendors have failed to produce products
  • not for want of trying

Saxon (XSLT and XQuery)
• 10 years
• 300 emails/month
• 500 downloads per day
• £xxxK revenue
• 1 developer
• 300K test cases
• 180K LOC
• 10 bugs/month

Engineering Model
• “Alpine mountaineering”
• Agile, high-speed, high-risk
• Small teams, fast decision making
• Importance of tooling
  • IDEs
  • test automation
• Ship frequently
  • low half-life for bugs
• Support community

Business Model
• Open Source has driven out the profit
• Only the low-cost operators are able to make money
• IBM, Oracle, Microsoft, Intel are failing to deliver product
• Is this good for users?

Summary: XSLT
• XML concentrates on information interchange
• This creates a need for transformation
• Many processors available
  • Excellent conformance to standards
  • Good performance and reliability
  • Often open source and/or free
• Role of XSLT
  • Styling XML for presentation
  • Transforming XML for application interworking
  • Middle-tier business logic

Wider lessons
• Open source changes everything
• Ultra-low-cost vendors have a significant advantage
• But this is reducing investment and reducing quality (on some measures)
• The future is uncertain...