What's New in XSLT 2.0
Jeni Tennison
http://www.jenitennison.com
Overview
• Grouping
• Function Definitions
• Result Documents
  • Multiple Result Documents
  • Output Serialisation
• Temporary Trees
• Sequences
• Text Parsing
• Typing
Grouping
• Perennial requirement
  • usually use Muenchian Method (keys) in XSLT 1.0
• XSLT 2.0 has <xsl:for-each-group>
  • select attribute identifies items to group
  • grouping by value calculates grouping key
    • group-by groups all items
    • group-adjacent groups adjacent items
  • grouping in sequence identifies start/end of group
    • group-starting-with identifies start of group
    • group-ending-with identifies end of group
• Use current-group() to get members of current group, current-grouping-key() to get value for current group
Grouping by Value
Source:
    <paper>
      <title>XML and PDF in Publishing Workflows</title>
      <author>Myers, Charles</author>
    </paper>
    <paper>
      <title>On the Way to XML</title>
      <author>Parsons, Jonathan</author>
      <author>Caisley, Phil</author>
    </paper>
Result:
    <author>
      <name>Caisley, Phil</name>
      <paper>On the Way to XML</paper>
    </author>
    <author>
      <name>Myers, Charles</name>
      <paper>XML and PDF in Publishing Workflows</paper>
    </author>
    <author>
      <name>Parsons, Jonathan</name>
      <paper>On the Way to XML</paper>
    </author>
Grouping by Value
    <xsl:for-each-group select="paper" group-by="author">
      <xsl:sort select="current-grouping-key()" />
      <author>
        <name>
          <xsl:value-of select="current-grouping-key()" />
        </name>
        <xsl:for-each select="current-group()">
          <paper>
            <xsl:value-of select="title" />
          </paper>
        </xsl:for-each>
      </author>
    </xsl:for-each-group>
Grouping in Sequence
• Can be used to group by position:
    <xsl:for-each-group select="paper"
                        group-starting-with="paper[position() mod 10 = 1]">
      <xsl:result-document href="papers{position()}.html">
        …
        <xsl:apply-templates select="current-group()" />
        …
      </xsl:result-document>
    </xsl:for-each-group>
• Or to "levitate" structure from flat documents
  • group the content of paragraphs into text or block-level elements
Implications for XSLT 2.0 Use
• No more Muenchian Grouping!
  • easier to create indexes
  • easier to create summaries/roll-ups
  • easier to create paginated documents
• Much easier to convert from flat to hierarchical structures
  • processing XHTML to DocBook (or XHTML 2.0)
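The flat-to-hierarchical conversion mentioned above can be sketched with group-starting-with. This is only an illustration: the input (a no-namespace <body> holding a flat run of <h1> and <p> children) and the output element names (article, section, title, para) are assumptions, and the syntax follows the final XSLT 2.0 Recommendation, which may differ from the working draft the slides describe.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed input: <body> with a flat sequence of <h1> and <p> children. -->
  <xsl:template match="body">
    <article>
      <!-- Every h1 opens a new group; the group runs up to the next h1. -->
      <xsl:for-each-group select="*" group-starting-with="h1">
        <section>
          <!-- The context item is the first item of the group. It is an h1
               unless the input had content before its first heading. -->
          <xsl:if test="self::h1">
            <title><xsl:value-of select="." /></title>
          </xsl:if>
          <!-- The rest of the group becomes the body of the section. -->
          <xsl:for-each select="current-group()[not(self::h1)]">
            <para><xsl:value-of select="." /></para>
          </xsl:for-each>
        </section>
      </xsl:for-each-group>
    </article>
  </xsl:template>

</xsl:stylesheet>
```

Nesting deeper levels (h2 inside h1, and so on) would need a second xsl:for-each-group over current-group(), which this sketch leaves out.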
Function Definitions
• Use XSLT code to create new functions
  • no facility to use scripting languages such as JavaScript
  • similar to <func:function> from EXSLT
• Function must be in a namespace
• All parameters are required
  • but can have multiple definitions with different numbers of arguments
  • supports optional arguments, not polymorphic functions
• Need parameters for context item, position
  • can't default to using context node as argument
• Body of function is result of function
  • similar to named templates
  • use <xsl:sequence> instead of <func:result>
Function Definition
    <xsl:function name="str:align">
      <xsl:param name="string" />
      <xsl:param name="padding" />
      <xsl:param name="alignment" />
      …
    </xsl:function>

    <xsl:function name="str:align">
      <xsl:param name="string" />
      <xsl:param name="padding" />
      <xsl:sequence select="str:align($string, $padding, 'left')" />
    </xsl:function>
Implications for XSLT 2.0 Use
• General replacement for named templates
• Particular use where XSLT code can't be used:
  • creating a value to sort by using <xsl:sort>
  • creating a value to index by using <xsl:key>
  • creating a value to group by using <xsl:for-each-group>
  • carrying out complex tests on nodes, for use in match patterns in templates or keys
    <xsl:template match="*[html:is-heading(.)]"> ... </xsl:template>
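The slide uses html:is-heading() in a match pattern without showing its definition. One possible definition is sketched below. The function body, and the choice to treat h1 to h6 in the XHTML namespace as headings, are assumptions made for illustration; the output <title> element is hypothetical. Syntax follows the final XSLT 2.0 Recommendation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:html="http://www.w3.org/1999/xhtml"
                exclude-result-prefixes="xs html">

  <!-- Assumed definition: true for XHTML elements named h1 .. h6. -->
  <xsl:function name="html:is-heading" as="xs:boolean">
    <!-- Functions have no context item, so the node is passed explicitly. -->
    <xsl:param name="node" as="node()" />
    <xsl:sequence
      select="exists($node/self::html:*[matches(local-name(), '^h[1-6]$')])" />
  </xsl:function>

  <!-- The function can then be called from a match pattern. -->
  <xsl:template match="*[html:is-heading(.)]">
    <title><xsl:apply-templates /></title>
  </xsl:template>

</xsl:stylesheet>
```

The same function could equally be called from the select attribute of <xsl:sort>, the use attribute of <xsl:key>, or the group-by attribute of <xsl:for-each-group>.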
Multiple Result Documents
• Many XSLT 1.0 processors have extension elements to create multiple output documents
• XSLT 2.0 has <xsl:result-document>:
    <xsl:for-each select="section">
      <xsl:result-document href="{@id}.html">
        <xsl:apply-templates select="." mode="html" />
      </xsl:result-document>
    </xsl:for-each>
Multiple Result Documents
• Document is associated with href URI
  • accessible via API
  • should enable client-side support
• Makes it easier to create:
  • paginated output
    • page per chapter
    • page per 20 records
  • pages using HTML frames
  • supplementary files
    • CSS stylesheets
    • SVG graphics
Output Serialisation
• Several changes to <xsl:output>:
  • output definitions are named
    • referenced from <xsl:result-document> elements
  • additional xhtml output method
  • extra attributes to control HTML/XHTML serialisation:
    • escape-uri-attributes controls URI-escaping of attributes
    • include-content-type controls addition of <meta> element
  • normalize-unicode attribute provides Unicode normalisation
  • character substitution provides an alternative for disable-output-escaping
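A named output definition referenced from <xsl:result-document> might look like the sketch below. The definition name web-page and the input vocabulary (section elements with an id attribute, title and para children) are assumptions. The format attribute is the one defined in the final XSLT 2.0 Recommendation; the working draft described in the slides may have spelled things differently.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns="http://www.w3.org/1999/xhtml">

  <!-- A named output definition (the name "web-page" is hypothetical). -->
  <xsl:output name="web-page" method="xhtml" indent="yes"
              include-content-type="yes"
              escape-uri-attributes="yes" />

  <xsl:template match="/">
    <!-- One result document per section, serialised using "web-page". -->
    <xsl:for-each select="//section">
      <xsl:result-document href="{@id}.html" format="web-page">
        <html>
          <head><title><xsl:value-of select="title" /></title></head>
          <body><xsl:apply-templates select="para" /></body>
        </html>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

  <xsl:template match="para">
    <p><xsl:apply-templates /></p>
  </xsl:template>

</xsl:stylesheet>
```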
Character Substitution
• Map of characters-to-strings
• On serialisation, each character in a text node or attribute is replaced by the relevant string
• Simple use is to force use of entities
    <xsl:character-map name="html">
      <xsl:output-character character="&#160;" string="&amp;nbsp;" />
      …
    </xsl:character-map>
    <xsl:output use-character-maps="html" />
Result tree:
    <eg>blah&#160;blah</eg>
Serialised output:
    <eg>blah&nbsp;blah</eg>
Character Substitution
• Complex use is to create non-well-formed output
  • use characters from private use areas to represent illegal sequences of characters
    <xsl:character-map name="jsp">
      <!-- any two private-use characters will do; U+E000 and U+E001 are used here -->
      <!-- JSP start -->
      <xsl:output-character character="&#xE000;" string="&lt;%" />
      <!-- JSP end -->
      <xsl:output-character character="&#xE001;" string="%&gt;" />
    </xsl:character-map>
Result tree text:
    &#xE000;@ page language="java" &#xE001;
Serialised output:
    <%@ page language="java" %>
Implications for XSLT 2.0 Use
• Much more control over output
  • control over automatic serialisation
    • addition of <meta> element
    • Unicode normalisation
  • control over character serialisation
    • which entities get used
    • what form of character references
• No more disable-output-escaping (d-o-e)?
  • character substitution is better
    • works in attribute values
    • supported by all (serialising) processors
    • persists through variables
  • d-o-e is still easier to use
Temporary Trees and RTFs
• XSLT 1.0 had Result Tree Fragments
  • created when you use the content of <xsl:variable>
  • a tree that couldn't be accessed with location paths
  • most processors have an xxx:node-set() extension function
    • converts a result tree fragment to a node tree
• XSLT 2.0 has temporary trees
  • can copy in same way as RTFs
  • can access without using extension function
Example Temporary Tree
    <xsl:variable name="menus">
      <menu name="File">
        <menuItem name="New..." shortcut="Ctrl-N" />
        <menuItem name="Open..." shortcut="Ctrl-O" />
        <menuItem name="Save..." shortcut="Ctrl-S" />
        ...
      </menu>
      ...
    </xsl:variable>
XSLT 1.0:
    <xsl:value-of select="exsl:node-set($menus)/menu/menuItem[@shortcut = $shortcut]/@name" />
XSLT 2.0:
    <xsl:value-of select="$menus/menu/menuItem[@shortcut = $shortcut]/@name" />
Implications for XSLT 2.0 Use
• Break up complex transformations into several steps
  • filter, sort, annotate nodes in early steps
  • later steps are easier to write
• Create lookup tables
  • translate from codes or numbers to labels
  • similar to arrays or matrices
• Iteratively process a document until it fulfils some test
  • add content until a document is valid
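A two-step transformation along these lines can be sketched as follows: the first step sorts and annotates, the second formats the annotated tree. The input shape (a <papers> wrapper around the <paper> elements seen earlier), the number attribute and the mode name are all assumptions made for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed input: <papers> containing <paper> elements with a <title>. -->
  <xsl:template match="/">
    <!-- Step 1: sort the papers and annotate each with its new position. -->
    <xsl:variable name="sorted">
      <xsl:for-each select="papers/paper">
        <xsl:sort select="title" />
        <paper number="{position()}">
          <xsl:copy-of select="title" />
        </paper>
      </xsl:for-each>
    </xsl:variable>

    <!-- Step 2: the temporary tree is navigated directly, with no
         node-set() extension function. -->
    <list>
      <xsl:apply-templates select="$sorted/paper" mode="format" />
    </list>
  </xsl:template>

  <xsl:template match="paper" mode="format">
    <entry>
      <xsl:value-of select="@number" />
      <xsl:text>. </xsl:text>
      <xsl:value-of select="title" />
    </entry>
  </xsl:template>

</xsl:stylesheet>
```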
Sequences in XPath 2.0
• New fundamental type in XPath 2.0
  • everything is a sequence
  • similar to node sets, but…
    • ordered
    • allow duplicates
    • can contain atomic values as well as nodes
• Sequences are flat
  • for structured data, use XML
• Singleton sequences are the same as the single value they contain
Using Sequences in XSLT 2.0
• Iterate over a sequence
    <xsl:for-each select="1 to 3">
      <tr><td colspan="4" /></tr>
    </xsl:for-each>
  Result:
    <tr><td colspan="4" /></tr>
    <tr><td colspan="4" /></tr>
    <tr><td colspan="4" /></tr>
• Create a text node from a sequence
    <xsl:value-of select="author/surname" separator=", " />
  Result:
    Thompson, Tobin
Creating Sequences in XSLT 2.0
• Every sequence of instructions creates a sequence of items
• When a sequence is added to a node:
  • atomic values are converted to text nodes
    • spaces added between atomic values
  • nodes are copied to create children
[Diagram: sequence of any items → sequence of new nodes → children of node]
Temporary Trees
• Variables can be set using select attribute or using content
• When setting value using content:
  • if as attribute not present, create temporary tree
  • if as attribute present, create sequence
    <xsl:variable name="tree">
      42
    </xsl:variable>
    <xsl:variable name="sequence" as="xs:integer">
      42
    </xsl:variable>
Creating Sequences in XSLT 2.0
• New instruction <xsl:sequence>
  • adds existing nodes or new atomic values to a sequence
• Select the line item with highest subtotal
    <xsl:variable name="max-expenditure" as="element(lineitem)">
      <xsl:for-each select="lineitem">
        <xsl:sort select="@price * @quantity" order="descending" />
        <xsl:if test="position() = 1">
          <xsl:sequence select="." />
        </xsl:if>
      </xsl:for-each>
    </xsl:variable>
Implications for XSLT 2.0 Use
• Less need for recursive templates
  • use integer sequences to iterate a number of times
  • use <xsl:sequence> to build node sequences by iteration rather than recursion
• Less need for temporary elements
  • atomic values don't need to be wrapped in an element in order to be passed around in a list
Text Parsing
• XPath 2.0 has regular expression support:
  • matches(string, regex, flags?) returns true if a regular expression matches a substring
  • replace(string, regex, replacement, flags?) returns the string with all occurrences of the regular expression replaced using the replacement string
  • tokenize(string, regex, flags?) returns a sequence of strings created by splitting the string on every occurrence of the regular expression
• Can do more complex regular expression processing using XSLT 2.0
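The three functions can be tried out with literal strings, as in the sketch below. The sample strings and the output element names are made up for the example; the function names and signatures are those of the final XPath 2.0 function library.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <results>
      <!-- matches(): is the string a two-digit day/month/year date?
           Gives true here. -->
      <is-date>
        <xsl:value-of select="matches('25/12/03', '^\d{2}/\d{2}/\d{2}$')" />
      </is-date>

      <!-- replace(): collapse each run of whitespace into one space.
           Gives "a b c". -->
      <normalised>
        <xsl:value-of select="replace('a   b  c', '\s+', ' ')" />
      </normalised>

      <!-- tokenize(): split on commas with optional trailing whitespace.
           Gives the three strings red, green and blue. -->
      <xsl:for-each select="tokenize('red, green,blue', ',\s*')">
        <colour><xsl:value-of select="." /></colour>
      </xsl:for-each>
    </results>
  </xsl:template>

</xsl:stylesheet>
```

Because the regular expressions sit in select attributes rather than attribute value templates, the curly braces in \d{2} need no escaping.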
Analysing Strings
• XSLT 2.0 has <xsl:analyze-string> instruction
  • select attribute selects string
  • regex attribute holds regular expression
• String split into a sequence of matching and non-matching substrings
  • processed in turn by:
    • <xsl:matching-substring> for matching
    • <xsl:non-matching-substring> for non-matching
Example String Analysis
Source:
    <poem>
    Mary had a little lamb,
    Its fleece was white as snow;
    And everywhere that Mary went
    The lamb was sure to go.
    </poem>
Stylesheet:
    <xsl:template match="poem">
      <poem>
        <xsl:analyze-string select="." regex="\S.*" flags="m">
          <xsl:matching-substring>
            <line><xsl:value-of select="." /></line>
          </xsl:matching-substring>
        </xsl:analyze-string>
      </poem>
    </xsl:template>
Result:
    <poem>
      <line>Mary had a little lamb,</line>
      <line>Its fleece was white as snow;</line>
      <line>And everywhere that Mary went</line>
      <line>The lamb was sure to go.</line>
    </poem>
More Text Parsing
• Within <xsl:analyze-string>, use regex-group() function to get value of matched subexpression
    <xsl:template match="@date">
      <xsl:attribute name="date">
        <!-- the regular expression is a string, so it is quoted inside select -->
        <xsl:variable name="UK-date-regex"
                      select="'(\d{2})\\(\d{2})\\(\d{2})'" />
        <xsl:analyze-string select="." regex="{$UK-date-regex}">
          <xsl:matching-substring>
            <xsl:sequence select="concat('20', regex-group(3), '-',
                                         regex-group(2), '-',
                                         regex-group(1))" />
          </xsl:matching-substring>
        </xsl:analyze-string>
      </xsl:attribute>
    </xsl:template>
Implications for XSLT 2.0 Use
• XSLT 2.0 also allows access to text files with unparsed-text() function
  • works in a similar way to document()
• Potential to process any text format
  • comma-delimited and fixed-format files
  • CSS files
  • HTML? Java code?
    • these are hard because of matching tags/braces
• XSLT could be used for up-conversion to XML
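Up-conversion of a comma-delimited file can be sketched by combining unparsed-text() with tokenize(). The file name records.csv, the csv-uri parameter, the main entry template and the output element names are all hypothetical, and the sketch assumes a simple file with no quoted fields or embedded commas.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameter: location of the text file to convert,
       resolved relative to the stylesheet. -->
  <xsl:param name="csv-uri" select="'records.csv'" />

  <!-- Named entry point, so that no source XML document is needed. -->
  <xsl:template name="main">
    <records>
      <!-- Split the file into lines and skip the blank ones. -->
      <xsl:for-each
          select="tokenize(unparsed-text($csv-uri), '\r?\n')[normalize-space()]">
        <record>
          <!-- Split each line on commas; quoted fields are not handled. -->
          <xsl:for-each select="tokenize(., ',')">
            <field><xsl:value-of select="normalize-space(.)" /></field>
          </xsl:for-each>
        </record>
      </xsl:for-each>
    </records>
  </xsl:template>

</xsl:stylesheet>
```

How the named template is invoked depends on the processor; with Saxon, for instance, an initial template can be named on the command line.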
Strong Typing in XSLT 2.0
• XPath 2.0 is strongly typed
  • the type of a value determines how it is treated by some operators (e.g. +, =)
  • if the wrong type of value is passed to a function, you will get an error
• Similarly, in XSLT 2.0:
  • the type of a sort key determines how values are sorted
  • if the wrong type of value is passed as a parameter to a template, you will get an error
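The effect of the sort key's type can be seen by sorting the same values twice, once as strings and once as integers. The input vocabulary (items, item, qty) and the output wrapper elements are assumptions made for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="xs">

  <!-- Assumed input:
       <items><item qty="9"/><item qty="10"/><item qty="100"/></items> -->
  <xsl:template match="items">
    <sorted>
      <!-- String sort key: values are compared as text using the
           default collation. -->
      <as-strings>
        <xsl:for-each select="item">
          <xsl:sort select="string(@qty)" />
          <xsl:copy-of select="." />
        </xsl:for-each>
      </as-strings>

      <!-- Integer sort key: values are compared numerically. -->
      <as-integers>
        <xsl:for-each select="item">
          <xsl:sort select="xs:integer(@qty)" />
          <xsl:copy-of select="." />
        </xsl:for-each>
      </as-integers>
    </sorted>
  </xsl:template>

</xsl:stylesheet>
```

With the integer key the items come out as 9, 10, 100. With the string key the order depends on the collation, and will typically place 10 and 100 before 9. A qty value that is not a valid integer makes the xs:integer() cast fail with an error rather than sort silently.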
Declaring Types of Variables
• Declare the type of variables and parameters with as attribute
  • holds a SequenceType
    • item test plus occurrence indicator
    • xs:integer+ means "one or more integers"
    • element()? means "an optional element"
• Error if value doesn't comply with type
    <xsl:function name="math:power">
      <xsl:param name="base" as="xs:decimal" />
      <xsl:param name="power" as="xs:integer" />
      …
    </xsl:function>
Declaring Type of Functions
• Declare the return type of functions and templates with as attribute
  • holds a SequenceType
• Generated sequence will be converted to atomic sequence type if possible
    <xsl:function name="math:power" as="xs:decimal">
      <xsl:param name="base" as="xs:decimal" />
      <xsl:param name="power" as="xs:integer" />
      …
    </xsl:function>

    <xsl:template match="@date" as="attribute(@date, *)">
      <xsl:attribute name="date">…</xsl:attribute>
    </xsl:template>
Node Typing in XSLT 2.0
• In XPath 2.0, every element and attribute has a type
  • can select nodes based on their type
    • //attribute(@*, xs:date) selects all the attributes in the document that hold dates
• Similarly, in XSLT 2.0:
  • can match nodes based on their type
    • attribute(@*, xs:date) matches all attributes that hold dates
  • can create elements and attributes of particular types
Creating Elements and Attributes
• Use [xsl:]type attribute to indicate type of element/attribute
  • can use this to type-annotate documents without schema validation
    <xsl:template match="@date" as="attribute(@date, xs:date)">
      <xsl:attribute name="date" type="xs:date">
        …
      </xsl:attribute>
    </xsl:template>

    <xsl:template match="employee" as="element(person, xs:token)">
      <person xsl:type="xs:token">
        <xsl:value-of select="name" />
      </person>
    </xsl:template>
Importing Schemas
• Need to import schemas to use:
  • user-defined types
  • substitution groups
• Import with <xsl:import-schema>
  • namespace identifies target namespace
  • schema-location locates schema
• Enables validation of result tree
    <xsl:import-schema namespace="http://www.w3.org/1999/xhtml"
                       schema-location="xhtml.xsd" />

    <xsl:template match="element(_inline)">
      …
    </xsl:template>
Implications for XSLT 2.0 Use
• Easy to get errors from a stylesheet unless you're rigorous in keeping track of types
  • declare types of variables and parameters
  • cast elements/attributes to particular types
• Well-designed schemas become a useful tool
  • substitution groups and appropriate types help reduce number of templates
  • check whether the result conforms to the schema while generating it
Conclusions
• XSLT 2.0 introduces a lot of new features
• Many stylesheets can be simpler:
  • multi-step processing with temporary trees
  • grouping using <xsl:for-each-group>
  • user-defined functions for repetitive code
• Stylesheet applications can be simpler:
  • multiple result documents should reduce need for client-side scripting
  • XSLT 2.0 expands into text parsing
• Using schemas/types well will make things easier; not using them will make things harder
For Details…
• XPath 2.0 Working Draft:
  • http://www.w3.org/TR/xpath20
• XSLT 2.0 Working Draft:
  • http://www.w3.org/TR/xslt20
• Saxon implementation:
  • http://saxon.sourceforge.net
• Comments please!
  • public-qt-comments@w3.org