Extending XML Schemas • XML Schemas: Best Practices • A set of guidelines for designing XML Schemas • Created through discussions on xml-dev
Not “All Powerful” • XML Schemas is very powerful • However, it is not "all powerful". There are many constraints that it cannot express. Here are some examples: • Ensure that the value of the aircraft <Elevation> element is greater than the value of the obstacle <Height> element. • Ensure that: • if the value of the attribute, mode, is "air", then the value of the element, <Transportation>, is either airplane or hot-air balloon • if mode="water" then <Transportation> is either boat or hovercraft • if mode="ground" then <Transportation> is either car or bicycle. • Ensure that the value of the <PaymentReceived> is equal to the value of <PaymentDue>, where these elements are in separate documents! • To check all our constraints we will need to supplement XML Schemas with another tool. In this tutorial we will cover each of these examples and show how the constraints may be implemented using either XSLT/XPath or Schematron. At the end we will examine the pros and cons of each approach.
Two Approaches to Extending XML Schemas • XSLT/XPath • The first approach is to supplement the XSD document with a stylesheet • Schematron • The second approach is to embed the additional constraints within <appinfo> elements in the XSD document. Then, a tool (Schematron) will extract and process those constraints.
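Enhancing XML Schemas using XSLT/XPath • [Diagram] Your XML data passes through two checks. A schema validator checks it against the XML Schema, and an XSL processor checks it against an XSLT stylesheet containing code to check the additional constraints. • When both checks report "valid", you know your XML data is valid!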
Enhancing XML Schemas using Schematron • [Diagram] Your XML data and the XML Schema are given to both a schema validator and Schematron. • When both report "valid", you know your XML data is valid!
First Example: Verify that A > B <?xml version="1.0"?> <Demo xmlns="http://www.demo.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.demo.org demo.xsd"> <A>10</A> <B>20</B> </Demo> Constraints which must be implemented: 1. <Demo> contains a sequence of elements: <A> then <B> 2. <A> contains an integer; <B> contains an integer 3. The value of element <A> is greater than the value of element <B> Note: an XML Schema document can implement the first two constraints, but not the last.
<?xml version="1.0"?> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.demo.org" xmlns="http://www.demo.org" elementFormDefault="qualified"> <xsd:element name="Demo"> <xsd:complexType> <xsd:sequence> <xsd:element name="A" type="xsd:integer"/> <xsd:element name="B" type="xsd:integer"/> </xsd:sequence> </xsd:complexType> </xsd:element> </xsd:schema> Demo.xsd Note that this schema implements constraints 1 and 2.
<?xml version="1.0"?> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:d="http://www.demo.org" version="1.0"> <xsl:output method="text"/> <xsl:template match="/"> <!-- Constraint 3 requires A > B, so A = B is also an error --> <xsl:if test="/d:Demo/d:A &lt;= /d:Demo/d:B"> <xsl:text>Error! A is less than B</xsl:text> <xsl:text>&#xA;</xsl:text> <!-- line feed --> <xsl:text>A = </xsl:text><xsl:value-of select="/d:Demo/d:A"/> <xsl:text>&#xA;</xsl:text> <!-- line feed --> <xsl:text>B = </xsl:text><xsl:value-of select="/d:Demo/d:B"/> </xsl:if> <xsl:if test="/d:Demo/d:A > /d:Demo/d:B"> <xsl:text>Instance document is valid</xsl:text> </xsl:if> </xsl:template> </xsl:stylesheet> Demo.xsl (see example01/xslt) This stylesheet implements constraint 3!
Output • Here is the output that results when the stylesheet is run against demo.xml: Error! A is less than B A = 10 B = 20
Introduction to Schematron • Schematron enables you to embed assertions (which are expressed using XPath syntax) within <appinfo> elements. • [Diagram] Schematron extracts the assertions from the <appinfo> elements of the XML Schema, checks the XML data against them, and reports valid/invalid.
Introduction to Schematron (cont.) • State a constraint using an <assert> element • A <rule> consists of one or more <assert> elements • A <pattern> consists of one or more <rule> elements <pattern> <rule> <assert> … </assert> <assert> … </assert> </rule> </pattern>
The <assert> Element • Use the assert element to state a constraint <assert test="d:A > d:B">A should be greater than B</assert> • In the <assert> element you express a constraint using an XPath expression (the value of the test attribute). Additionally, you state the constraint in natural language (the content of the element). The latter helps to make your schemas self-documenting.
The <rule> Element • Use this element to specify the context for the <assert> elements. <rule context="d:Demo"> <assert test="d:A > d:B">A should be greater than B</assert> </rule> "Within the context of the Demo element, we assert that the A element should be greater than the B element." The value of the context attribute is an XPath expression.
The <diagnostic> Element • You can associate an <assert> element with a <diagnostic> element. • Use the <diagnostic> element for printing error messages when the XML data fails the assertion. • In the <diagnostic> element you can use: • Literal strings • The <value-of …> element (from XSLT) to print out the value of elements • The <diagnostic> elements are embedded within a <diagnostics> element. • The diagnostics go after a <pattern> element.
<pattern name="Check A greater than B"> <rule context="d:Demo"> <assert test="d:A > d:B" diagnostics="lessThan"> A should be greater than B </assert> </rule> </pattern> <diagnostics> <diagnostic id="lessThan"> Error! A is less than B A = <value-of select="d:A"/> B = <value-of select="d:B"/> </diagnostic> </diagnostics>
Schematron Schema • Recall that Schematron extracts the assertions, rules, etc. out of the <appinfo> elements. • It builds a Schematron schema from what it extracts. • [Diagram] The assertions, rules, etc. are extracted from the <appinfo> elements of the XML Schema (Demo.xsd) into a Schematron schema (Demo.xml_sch). Schematron then checks the XML data (Demo.xml) against that schema and reports valid/invalid.
Schematron Schema (cont.) • Schematron requires all the Schematron elements (e.g., <assert>, <rule>, etc.) to be qualified with the Schematron namespace, which these examples bind to the prefix sch: • This is how the Schematron engine identifies the Schematron elements.
<sch:pattern name="Check A greater than B"> <sch:rule context="d:Demo"> <sch:assert test="d:A > d:B" diagnostics="lessThan"> A should be greater than B </sch:assert> </sch:rule> </sch:pattern> <sch:diagnostics> <sch:diagnostic id="lessThan"> Error! A is less than B A = <sch:value-of select="d:A"/> B = <sch:value-of select="d:B"/> </sch:diagnostic> </sch:diagnostics>
Need to Inform Schematron of the Namespace of the XML Data • You need to inform Schematron of the namespace of the XML data. • Also, you need to tell it the namespace prefix that you are using in the <appinfo> section to identify the XML data elements. • Use the <ns> element to do this. <sch:ns prefix="d" uri="http://www.demo.org"/>
<?xml version="1.0"?> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.demo.org" xmlns="http://www.demo.org" xmlns:sch="http://www.ascc.net/xml/Schematron" elementFormDefault="qualified"> <xsd:annotation> <xsd:appinfo> <sch:title>Schematron validation</sch:title> <sch:ns prefix="d" uri="http://www.demo.org"/> </xsd:appinfo> </xsd:annotation> <xsd:element name="Demo"> <xsd:annotation> <xsd:appinfo> <sch:pattern name="Check A greater than B"> <sch:rule context="d:Demo"> <sch:assert test="d:A > d:B" diagnostics="lessThan">A should be greater than B</sch:assert> </sch:rule> </sch:pattern> <sch:diagnostics> <sch:diagnostic id="lessThan"> Error! A is less than B. A = <sch:value-of select="d:A"/> B = <sch:value-of select="d:B"/> </sch:diagnostic> </sch:diagnostics> </xsd:appinfo> </xsd:annotation> <xsd:complexType> <xsd:sequence> <xsd:element name="A" type="xsd:integer" minOccurs="1" maxOccurs="1"/> <xsd:element name="B" type="xsd:integer" minOccurs="1" maxOccurs="1"/> </xsd:sequence> </xsd:complexType> </xsd:element> </xsd:schema> See example01 Do Lab1
Example 2 • Constraint: Ensure that an Aircraft’s Elevation is greater than the Height of all Obstacles <?xml version="1.0"?> <Flight xmlns="http://www.aviation.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.aviation.org Aircraft.xsd"> <Aircraft type="Boeing 747"> <Elevation units="feet">3300</Elevation> </Aircraft> <Obstacle type="mountain"> <Height units="feet">26000</Height> </Obstacle> </Flight> Check that the aircraft is higher than all obstacles. In this example it isn’t, so we want to catch this (kind of an important check, don’t you agree?) Aircraft.xml (see example02)
XSLT Implementation of Checking the Constraint <?xml version="1.0"?> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:a="http://www.aviation.org" version="1.0"> <xsl:output method="text"/> <xsl:template match="/"> <xsl:choose> <!-- True if any obstacle is at least as high as the aircraft --> <xsl:when test="/a:Flight/a:Aircraft/a:Elevation &lt;= /a:Flight/a:Obstacle/a:Height"> <xsl:text>Danger! The aircraft is about to collide with a </xsl:text> <xsl:value-of select="/a:Flight/a:Obstacle/@type"/><xsl:text>!</xsl:text> <xsl:text>&#xA;</xsl:text> <xsl:text>aircraft elevation = </xsl:text><xsl:value-of select="/a:Flight/a:Aircraft/a:Elevation"/> <xsl:text>&#xA;</xsl:text> <xsl:value-of select="/a:Flight/a:Obstacle/@type"/><xsl:text> height = </xsl:text> <xsl:value-of select="/a:Flight/a:Obstacle/a:Height"/> </xsl:when> <xsl:otherwise> <xsl:text>Instance document is valid</xsl:text> </xsl:otherwise> </xsl:choose> </xsl:template> </xsl:stylesheet> Aircraft.xsl (see example02)
Output • Here is the output that results when the stylesheet is run against Aircraft.xml: Danger! The aircraft is about to collide with a mountain! aircraft elevation = 3300 mountain height = 26000
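With more than one <Obstacle> in a flight, two XPath 1.0 details matter: a comparison between node-sets is true as soon as any one pair of values satisfies it, and <xsl:value-of> prints only the first node it selects. The following is a minimal sketch of a stylesheet that reports every offending obstacle. It assumes the namespace and element names of Aircraft.xml, a single <Aircraft> per <Flight>, and heights given in the same units as the elevation (the units attribute is not checked).

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:a="http://www.aviation.org"
                version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="/a:Flight">
    <!-- Assumption: exactly one Aircraft per Flight, as in Aircraft.xml -->
    <xsl:variable name="elevation" select="a:Aircraft/a:Elevation"/>
    <!-- Every obstacle that is at least as high as the aircraft -->
    <xsl:variable name="tooHigh" select="a:Obstacle[a:Height >= $elevation]"/>
    <xsl:choose>
      <xsl:when test="$tooHigh">
        <!-- One report line per offending obstacle -->
        <xsl:for-each select="$tooHigh">
          <xsl:text>Danger! The aircraft is about to collide with a </xsl:text>
          <xsl:value-of select="@type"/>
          <xsl:text>! height = </xsl:text>
          <xsl:value-of select="a:Height"/>
          <xsl:text>, aircraft elevation = </xsl:text>
          <xsl:value-of select="$elevation"/>
          <xsl:text>&#xA;</xsl:text>
        </xsl:for-each>
      </xsl:when>
      <xsl:otherwise>
        <xsl:text>Instance document is valid</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Run against Aircraft.xml, this sketch reports the mountain. With additional obstacles it would list each one that reaches the aircraft's elevation.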
Schematron Implementation of Checking the Constraint <xsd:annotation> <xsd:appinfo> <!-- The test is negated so that it must hold for every Obstacle, not just one --> <sch:pattern name="Check Aircraft Elevation is greater than Obstacle Height"> <sch:rule context="a:Flight"> <sch:assert test="not(a:Obstacle/a:Height >= a:Aircraft/a:Elevation)" diagnostics="lessThan"> The Aircraft Elevation should be greater than the Height of every Obstacle. </sch:assert> </sch:rule> </sch:pattern> <sch:diagnostics> <sch:diagnostic id="lessThan"> Danger! The aircraft is about to collide with a <sch:value-of select="a:Obstacle/@type"/>... aircraft elevation = <sch:value-of select="a:Aircraft/a:Elevation"/>... <sch:value-of select="a:Obstacle/@type"/> height = <sch:value-of select="a:Obstacle/a:Height"/> </sch:diagnostic> </sch:diagnostics> </xsd:appinfo> </xsd:annotation> (This annotation is embedded within Aircraft.xsd) Do Lab2
Example 3 • Constraint: Ensure that the value of the <Transportation> element is appropriate for the value specified by the mode attribute (see table below)
Transportation               mode
airplane, hot-air balloon    air
boat, hovercraft             water
car, bicycle                 ground
<?xml version="1.0"?> <JourneyToTibet xmlns="http://www.journey-to-tibet.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.journey-to-tibet.org journeyToTibet.xsd"> <Trip segment="1" mode="air"> <Transportation>boat</Transportation> </Trip> <Trip segment="2" mode="water"> <Transportation>boat</Transportation> </Trip> <Trip segment="3" mode="ground"> <Transportation>car</Transportation> </Trip> <Trip segment="4" mode="water"> <Transportation>hovercraft</Transportation> </Trip> <Trip segment="5" mode="air"> <Transportation>hot-air balloon</Transportation> </Trip> </JourneyToTibet> Need to check each Trip element to ensure that the Transportation element has a legal value for the given mode. In this example the first Trip (segment) is illegal and the remaining segments are legal. journeyToTibet.xml (see example03)
XSLT Implementation of Checking the Constraint <xsl:template match="j:JourneyToTibet"> <xsl:for-each select="j:Trip"> <xsl:choose> <xsl:when test="(@mode='air') and (not((j:Transportation='airplane') or (j:Transportation='hot-air balloon')))"> <xsl:text>Error! When the mode is 'air' then Transportation must be either airplane or hot-air balloon</xsl:text> <xsl:text>&#xA;</xsl:text> <xsl:text> segment = </xsl:text><xsl:value-of select="@segment"/> <xsl:text>&#xA;</xsl:text> <xsl:text> mode = </xsl:text><xsl:value-of select="@mode"/> <xsl:text>&#xA;</xsl:text> <xsl:text> Transportation = </xsl:text><xsl:value-of select="j:Transportation"/> <xsl:text>&#xA;</xsl:text> <xsl:text>&#xA;</xsl:text> </xsl:when> <xsl:when test="(@mode='water') and (not((j:Transportation='boat') or (j:Transportation='hovercraft')))"> <xsl:text>Error! When the mode is 'water' then Transportation must be either boat or hovercraft</xsl:text> <xsl:text>&#xA;</xsl:text> <xsl:text> segment = </xsl:text><xsl:value-of select="@segment"/> <xsl:text>&#xA;</xsl:text> <xsl:text> mode = </xsl:text><xsl:value-of select="@mode"/> <xsl:text>&#xA;</xsl:text> <xsl:text> Transportation = </xsl:text><xsl:value-of select="j:Transportation"/> <xsl:text>&#xA;</xsl:text> <xsl:text>&#xA;</xsl:text> </xsl:when>
XSLT Implementation of Checking the Constraint (cont.) <xsl:when test="(@mode='ground') and (not((j:Transportation='car') or (j:Transportation='bicycle')))"> <xsl:text>Error! When the mode is 'ground' then Transportation must be either car or bicycle</xsl:text> <xsl:text>&#xA;</xsl:text> <xsl:text> segment = </xsl:text><xsl:value-of select="@segment"/> <xsl:text>&#xA;</xsl:text> <xsl:text> mode = </xsl:text><xsl:value-of select="@mode"/> <xsl:text>&#xA;</xsl:text> <xsl:text> Transportation = </xsl:text><xsl:value-of select="j:Transportation"/> <xsl:text>&#xA;</xsl:text> <xsl:text>&#xA;</xsl:text> </xsl:when> <xsl:otherwise> <xsl:text>segment </xsl:text><xsl:value-of select="@segment"/><xsl:text> is valid.</xsl:text> <xsl:text>&#xA;</xsl:text> <xsl:text>&#xA;</xsl:text> </xsl:otherwise> </xsl:choose> </xsl:for-each> </xsl:template> journeyToTibet.xsl (see example03)
Output • Here is the output that results when the stylesheet is run against journeyToTibet.xml: Error! When the mode is 'air' then Transportation must be either airplane or hot-air balloon segment = 1 mode = air Transportation = boat segment 2 is valid. segment 3 is valid. segment 4 is valid. segment 5 is valid.
Schematron Implementation of Checking the Constraint <xsd:annotation> <xsd:appinfo> <sch:pattern name="Check the Transportation value is appropriate for the given mode"> <sch:rule context="j:Trip"> <sch:assert test="((@mode='air') and (j:Transportation='airplane')) or ((@mode='air') and (j:Transportation='hot-air balloon')) or ((@mode='water') and (j:Transportation='boat')) or ((@mode='water') and (j:Transportation='hovercraft')) or ((@mode='ground') and (j:Transportation='car')) or ((@mode='ground') and (j:Transportation='bicycle'))" diagnostics="wrongTransportationForTheMode"> If the mode is air then Transportation should be either airplane or hot-air balloon. If the mode is water then Transportation should be either boat or hovercraft. If the mode is ground then Transportation should be either car or bicycle. </sch:assert> </sch:rule> </sch:pattern> <sch:diagnostics> <sch:diagnostic id="wrongTransportationForTheMode"> Error! Transportation does not have a legal value for the given mode. segment = <sch:value-of select="@segment"/> mode = <sch:value-of select="@mode"/> Transportation = <sch:value-of select="j:Transportation"/> </sch:diagnostic> </sch:diagnostics> </xsd:appinfo> </xsd:annotation> (This annotation is embedded within journeyToTibet.xsd)
Alternate Schematron Implementation <xsd:appinfo> <sch:pattern name="Check the Transportation value is appropriate for the given mode"> <sch:rule context="j:Trip[@mode='air']"> <sch:assert test="(j:Transportation='airplane') or (j:Transportation='hot-air balloon')" diagnostics="wrongTransportationForTheMode"> If the mode is air then Transportation should be either airplane or hot-air balloon. </sch:assert> </sch:rule> <sch:rule context="j:Trip[@mode='water']"> <sch:assert test="(j:Transportation='boat') or (j:Transportation='hovercraft')" diagnostics="wrongTransportationForTheMode"> If the mode is water then Transportation should be either boat or hovercraft. </sch:assert> </sch:rule> <sch:rule context="j:Trip[@mode='ground']"> <sch:assert test="(j:Transportation='car') or (j:Transportation='bicycle')" diagnostics="wrongTransportationForTheMode"> If the mode is ground then Transportation should be either car or bicycle. </sch:assert> </sch:rule> </sch:pattern> <sch:diagnostics> <sch:diagnostic id="wrongTransportationForTheMode"> Error! Transportation does not have a legal value for the given mode. segment = <sch:value-of select="@segment"/>. mode = <sch:value-of select="@mode"/>. Transportation = <sch:value-of select="j:Transportation"/>. </sch:diagnostic> </sch:diagnostics> </xsd:appinfo> (see journeyToTibet_v2.xsd in example03)
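One difference between the two Schematron versions is worth noting. In the alternate implementation, a <Trip> whose mode is none of air, water, or ground matches no rule and so is never checked, whereas the single-assert version rejects it. Within a pattern, Schematron applies only the first rule whose context matches a node, so a final catch-all rule can cover the leftover case. The following is a minimal sketch, assuming journeyToTibet.xsd does not already restrict mode to these three values and that the prefix j is declared with <sch:ns> elsewhere in the schema. The diagnostic id unknownMode is a hypothetical name introduced for illustration.

```xml
<xsd:appinfo xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:sch="http://www.ascc.net/xml/Schematron">
  <sch:pattern name="Check the Transportation value is appropriate for the given mode">
    <sch:rule context="j:Trip[@mode='air']">
      <sch:assert test="j:Transportation='airplane' or j:Transportation='hot-air balloon'"
                  diagnostics="wrongTransportationForTheMode">
        If the mode is air then Transportation should be either airplane or hot-air balloon.
      </sch:assert>
    </sch:rule>
    <sch:rule context="j:Trip[@mode='water']">
      <sch:assert test="j:Transportation='boat' or j:Transportation='hovercraft'"
                  diagnostics="wrongTransportationForTheMode">
        If the mode is water then Transportation should be either boat or hovercraft.
      </sch:assert>
    </sch:rule>
    <sch:rule context="j:Trip[@mode='ground']">
      <sch:assert test="j:Transportation='car' or j:Transportation='bicycle'"
                  diagnostics="wrongTransportationForTheMode">
        If the mode is ground then Transportation should be either car or bicycle.
      </sch:assert>
    </sch:rule>
    <!-- Catch-all: reached only by a Trip that matched none of the rules above -->
    <sch:rule context="j:Trip">
      <sch:assert test="false()" diagnostics="unknownMode">
        The mode should be air, water, or ground.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
  <sch:diagnostics>
    <sch:diagnostic id="wrongTransportationForTheMode">
      Error! Transportation does not have a legal value for the given mode.
      segment = <sch:value-of select="@segment"/>.
      mode = <sch:value-of select="@mode"/>.
      Transportation = <sch:value-of select="j:Transportation"/>.
    </sch:diagnostic>
    <!-- Hypothetical diagnostic for the catch-all rule -->
    <sch:diagnostic id="unknownMode">
      Error! Unrecognized mode.
      segment = <sch:value-of select="@segment"/>.
      mode = <sch:value-of select="@mode"/>.
    </sch:diagnostic>
  </sch:diagnostics>
</xsd:appinfo>
```

Because the catch-all assertion uses false(), it fails for every node that reaches it, including a <Trip> with no mode attribute at all.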
Observation • In this last example there is a definite difference in the complexity of implementations. Schematron provides a much simpler way of expressing the constraints than XSLT. Do Lab3
Example 4 • Constraint: Ensure that the value of <PaymentReceived> equals the value of <PaymentDue>, where these two elements are in different documents. • [Diagram] <PaymentReceived>…</PaymentReceived> is in CheckingAccount.xml and <PaymentDue>…</PaymentDue> is in Invoice.xml. Check that these are equal.
<?xml version="1.0"?> <Invoice xmlns="http://www.invoice.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.invoice.org Invoice.xsd"> <PaymentDue currency="USD">199.00</PaymentDue> </Invoice> Invoice.xml (see example04) <?xml version="1.0"?> <CheckingAccount xmlns="http://www.checking-account.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.checking-account.org CheckingAccount.xsd"> <PaymentReceived currency="USD">79.00</PaymentReceived> </CheckingAccount> CheckingAccount.xml (see example04)
XSLT Implementation of Checking the Constraint <xsl:template match="i:Invoice"> <xsl:variable name="checking-account" select="document('file://localhost/.../CheckingAccount.xml')"/> <xsl:if test="not(i:PaymentDue = $checking-account/c:CheckingAccount/c:PaymentReceived)"> <xsl:text>Payment due is NOT equal to payment received!</xsl:text> <xsl:text>&#xA;</xsl:text> <xsl:text> Payment due = </xsl:text><xsl:value-of select="i:PaymentDue"/> <xsl:text>&#xA;</xsl:text> <xsl:text> Payment received = </xsl:text> <xsl:value-of select="$checking-account/c:CheckingAccount/c:PaymentReceived"/> </xsl:if> <xsl:if test="i:PaymentDue = $checking-account/c:CheckingAccount/c:PaymentReceived"> <xsl:text>Instance document is valid</xsl:text> </xsl:if> </xsl:template> CheckDocuments.xsl (see example04)
Output • Here is the output that results when the stylesheet is run against Invoice.xml and CheckingAccount.xml: Payment due is NOT equal to payment received! Payment due = 199.00 Payment received = 79.00
Schematron Implementation of Checking the Constraint <xsd:appinfo> <sch:pattern name="Check payment due equals payment received"> <sch:rule context="i:Invoice"> <sch:assert test="i:PaymentDue = document('file://.../CheckingAccount.xml')/c:CheckingAccount/c:PaymentReceived" diagnostics="amountsDiffer"> Payment due should equal payment received. </sch:assert> </sch:rule> </sch:pattern> <sch:diagnostics> <sch:diagnostic id="amountsDiffer"> Payment due is NOT equal to payment received! ... Payment due = <sch:value-of select="i:PaymentDue"/>... Payment received = <sch:value-of select="document('file://.../CheckingAccount.xml')/c:CheckingAccount/c:PaymentReceived"/> </sch:diagnostic> </sch:diagnostics> </xsd:appinfo> (This appinfo is embedded within Invoice.xsd)
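A caveat on the equality test: in XPath 1.0, an = comparison between two node-sets compares their string values, so amounts such as 199.00 and 199.0 would be reported as different. The following is a minimal sketch of the same assertion with both sides converted by number(). It assumes each document holds exactly one amount and that the prefixes i and c are declared with <sch:ns> elsewhere in the schema. The file:// location is left elided as in the original, the currency attributes are not compared, and the amountsDiffer diagnostic is the one defined on the slide above.

```xml
<sch:pattern xmlns:sch="http://www.ascc.net/xml/Schematron"
             name="Check payment due equals payment received">
  <sch:rule context="i:Invoice">
    <!-- number() forces a numeric comparison instead of a string comparison -->
    <sch:assert test="number(i:PaymentDue) = number(document('file://.../CheckingAccount.xml')/c:CheckingAccount/c:PaymentReceived)"
                diagnostics="amountsDiffer">
      Payment due should equal payment received.
    </sch:assert>
  </sch:rule>
</sch:pattern>
```

If either element is missing or not numeric, number() yields NaN, which is never equal to anything, so the assertion fails rather than passing silently.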
Advantages of using XSLT/XPath to Implement Additional Constraints • Application Specific Constraint Checking: Each application can create its own stylesheet to check constraints that are unique to the application. Thus, each application can enhance the schema without touching it! • Core Technology: XSLT/XPath is a "core technology" which is well supported, well understood, and with lots of material written on it. • Expressive Power: XSLT/XPath is a very powerful language. Most, if not every, constraint that you might ever need to express can be expressed using XSLT/XPath. Thus you don't have to learn multiple schema languages to express your additional constraints • Long Term Support: XSLT/XPath is well supported, and will be around for a long time.
Disadvantages of using XSLT/XPath to Implement Additional Constraints • Separate Documents: With this approach you write your XML Schema document, then you write a separate XSLT/XPath document to express additional constraints. Keeping the two documents in sync needs to be carefully managed. • Overkill? On the previous slide it was stated that an advantage of using XSLT/XPath is that it is a very rich, expressive language. As a language for expressing constraints perhaps it is too much. As we have seen with these examples, only a few XSLT/XPath elements were needed to express all the constraints. So XSLT/XPath brings a lot of unnecessary overhead.
Advantages of using Schematron to Implement Additional Constraints • Collocated Constraints: We saw how Schematron could be used to express additional constraints. We saw that you embed the Schematron directives within the XML Schema document. There is something very appealing about having all the constraints expressed within one document rather than being dispersed over multiple documents. • Application Specific Constraint Checking: In all of our examples we showed the Schematron elements embedded within the XSD document. However, that is not necessary. Alternatively, you can create a standalone Schematron schema. With this latter approach applications can then create a Schematron schema to express their unique constraints. • Simpler: The Schematron vocabulary is very simple, consisting of a handful of elements. As we have seen, with this handful of elements we are able to express all the desired assertions.
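For the standalone option mentioned above, a minimal sketch of what such a schema could look like for the first example (A greater than B). It assumes the Schematron 1.5 vocabulary in the http://www.ascc.net/xml/Schematron namespace used in the earlier slides, with <sch:schema> as the root element. It is an illustration and is not taken from the example files.

```xml
<?xml version="1.0"?>
<sch:schema xmlns:sch="http://www.ascc.net/xml/Schematron">
  <sch:title>Schematron validation</sch:title>
  <!-- Bind the prefix d, used in the XPath expressions below, to the data's namespace -->
  <sch:ns prefix="d" uri="http://www.demo.org"/>
  <sch:pattern name="Check A greater than B">
    <sch:rule context="d:Demo">
      <sch:assert test="d:A > d:B" diagnostics="lessThan">A should be greater than B</sch:assert>
    </sch:rule>
  </sch:pattern>
  <sch:diagnostics>
    <sch:diagnostic id="lessThan">
      Error! A is less than B.
      A = <sch:value-of select="d:A"/>
      B = <sch:value-of select="d:B"/>
    </sch:diagnostic>
  </sch:diagnostics>
</sch:schema>
```

The content is the same set of elements that the earlier slides placed inside <appinfo>, gathered under one root, so an application can keep its own constraints in a file separate from the shared XSD.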
Disadvantages of using Schematron to Implement Additional Constraints • Multiple Schema Languages may be Required: From the examples shown in this tutorial it appears that Schematron is sufficiently powerful to express any additional constraints that you might have. It is possible, however, that there are constraints that it cannot express, or cannot express easily. Consequently, you may need to supplement Schematron with another language to express all the additional constraints.
Inside Schematron • The “Schematron engine” is actually just an XSL stylesheet! • [Diagram] XSD2Schtrn.xsl extracts the assertions, rules, etc. from the <appinfo> elements of the XML Schema (Demo.xsd) to produce the Schematron schema (Demo.xml_sch). Schematron-basic.xsl and skeleton1-5.xsl convert that schema to a stylesheet (validate.xsl). An XSL processor then runs the stylesheet against the XML data (Demo.xml) and reports valid/invalid.
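To make the idea concrete, here is a hand-written sketch of the kind of stylesheet such a conversion could produce for the A greater than B rule. It is an illustration only, assuming one template per rule and one negated test per assert. The stylesheet actually generated by the Schematron tools is not shown in this tutorial and will differ.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:d="http://www.demo.org"
                version="1.0">
  <xsl:output method="text"/>

  <!-- <sch:rule context="d:Demo"> becomes a template match -->
  <xsl:template match="d:Demo">
    <!-- <sch:assert test="d:A > d:B"> becomes a negated test: report when it fails -->
    <xsl:if test="not(d:A > d:B)">
      <xsl:text>A should be greater than B&#xA;</xsl:text>
      <!-- The referenced <sch:diagnostic> supplies the detail -->
      <xsl:text>Error! A is less than B. A = </xsl:text>
      <xsl:value-of select="d:A"/>
      <xsl:text> B = </xsl:text>
      <xsl:value-of select="d:B"/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:if>
  </xsl:template>

  <!-- Suppress the built-in rule that would copy text nodes to the output -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

The mapping shows why Schematron needs nothing beyond an XSL processor: the context attribute and the test attribute are already XPath, so they carry over into the stylesheet unchanged.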