
XML Schema Validation: Not Exactly a Science?

XML Schema Validation: Not Exactly a Science? Presentation to CIOC XML Working Group, January 21, 2004. Ken Sall (ksall@SiloSmashers.com) [based on xml-dev discussion initiated by Betty Harvey].


Presentation Transcript


  1. XML Schema Validation: Not Exactly a Science? Presentation to CIOC XML Working Group January 21, 2004 Ken Sall (ksall@SiloSmashers.com) [based on xml-dev discussion initiated by Betty Harvey]

  2. Mixed Results with ET Schema
     • Our ET Pilot XML Schema validated with 2 parsers, but failed validation with 3 parsers.
     • It passed another application’s parsing but then the application 'blew up' when trying to validate or parse an instance document of the schema.
     • We were migrating between MS Windows and Linux, as well as using online XSV validator from W3C.
     • All parsers that failed generated different error messages!
     • We never really determined the exact problem.

  3. XML Schema Checking Service
     • XML Schema Validator (W3C): XSV 2.5-2 of 2003/07/09 produced these error messages
     • http://www.eccnet.com/ET-Register/InformationTechnologyComponent.xsd:791:3: Invalid per cvc-complex-type.1.3: undeclared attribute {None}:minOccurs
     • http://www.eccnet.com/ET-Register/InformationTechnologyComponent.xsd:791:3: Invalid per cvc-complex-type.1.3: undeclared attribute {None}:maxOccurs
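
Both messages say that the element at line 791 carries minOccurs and maxOccurs attributes in a position where the schema for schemas does not declare them. The slides do not show line 791, so the actual cause is unknown. One common way to get this message is to put occurrence attributes on a top-level declaration or on a construct that is not a particle. The Python sketch below checks for that case only, as an illustration and not as a diagnosis of the ET schema. The function name `misplaced_occurs` and the element names `Component` and `Title` are hypothetical, and the check does not cover every rule in the Recommendation (for example, a model group directly inside a named group definition).

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
# Constructs that may carry minOccurs/maxOccurs when used as particles (XSD 1.0).
PARTICLES = {"element", "group", "all", "choice", "sequence", "any"}

# Hypothetical schema fragment: occurrence attributes on a global element.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Component" minOccurs="0" maxOccurs="unbounded">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Title" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def misplaced_occurs(xsd_text):
    """Return (construct, name_or_ref, attribute) for each suspicious attribute."""
    root = ET.fromstring(xsd_text)
    findings = []

    def visit(node, parent):
        if node.tag.startswith("{" + XSD_NS + "}"):
            local = node.tag.split("}", 1)[1]
            # A particle must be nested; direct children of xs:schema are
            # global declarations, where occurrence attributes are not allowed.
            is_particle = (
                local in PARTICLES and parent is not None and parent is not root
            )
            if not is_particle:
                for attr in ("minOccurs", "maxOccurs"):
                    if attr in node.attrib:
                        label = node.get("name") or node.get("ref")
                        findings.append((local, label, attr))
        for child in node:
            visit(child, node)

    visit(root, None)
    return findings


if __name__ == "__main__":
    for construct, label, attr in misplaced_occurs(SAMPLE):
        print(f"xs:{construct} {label!r}: {attr} not permitted here")
```

The nested `Title` declaration is left alone because it sits inside a sequence, where occurrence attributes are legal; only the global `Component` declaration is reported.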

  4. XSV.exe Command Line
     C:\KEN> xsv InformationTechnologyComponent.xsd
     <?xml version='1.0'?>
     <xsv xmlns='http://www.w3.org/2000/05/xsv' docElt='{http://www.w3.org/2001/XMLSchema}schema' instanceAssessed='true' instanceErrors='2' rootType='[Anonymous]' schemaErrors='0' target='file:/C:/KEN/InformationTechnologyComponent.xsd' validation='strict' version='XSV 1.6 of 2002/09/23 21:47:52'>
       <invalid char='3' code='cvc-complex-type.1.3' line='791' resource='file:/C:/KEN/InformationTechnologyComponent.xsd'>undeclared attribute {None}:maxOccurs</invalid>
       <invalid char='3' code='cvc-complex-type.1.3' line='791' resource='file:/C:/KEN/InformationTechnologyComponent.xsd'>undeclared attribute {None}:minOccurs</invalid>
     </xsv>
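
Because XSV writes its report as XML, the result can be read programmatically when comparing runs across tools or platforms. The Python sketch below parses the report shown on this slide with the standard library and extracts the counts and the individual errors. The function name `summarize` and the shape of the returned dictionary are assumptions made for illustration; only the element and attribute names come from the report itself.

```python
import xml.etree.ElementTree as ET

XSV_NS = "http://www.w3.org/2000/05/xsv"

# The report from the slide, with the line-wrap breaks removed.
REPORT = """<?xml version='1.0'?>
<xsv xmlns='http://www.w3.org/2000/05/xsv' docElt='{http://www.w3.org/2001/XMLSchema}schema' instanceAssessed='true' instanceErrors='2' rootType='[Anonymous]' schemaErrors='0' target='file:/C:/KEN/InformationTechnologyComponent.xsd' validation='strict' version='XSV 1.6 of 2002/09/23 21:47:52'>
<invalid char='3' code='cvc-complex-type.1.3' line='791' resource='file:/C:/KEN/InformationTechnologyComponent.xsd'>undeclared attribute {None}:maxOccurs</invalid>
<invalid char='3' code='cvc-complex-type.1.3' line='791' resource='file:/C:/KEN/InformationTechnologyComponent.xsd'>undeclared attribute {None}:minOccurs</invalid>
</xsv>"""


def summarize(report_text):
    """Pull the version, error counts, and error details out of an XSV report."""
    root = ET.fromstring(report_text)
    errors = [
        (
            int(item.get("line")),
            int(item.get("char")),
            item.get("code"),
            (item.text or "").strip(),
        )
        for item in root.iter("{%s}invalid" % XSV_NS)
    ]
    return {
        "version": root.get("version"),
        "schemaErrors": int(root.get("schemaErrors")),
        "instanceErrors": int(root.get("instanceErrors")),
        "errors": errors,
    }


if __name__ == "__main__":
    summary = summarize(REPORT)
    print(summary["version"])
    for line, char, code, message in summary["errors"]:
        print(f"{line}:{char} {code}: {message}")
```

The report counts both problems under instanceErrors while schemaErrors is 0, and docElt is the schema element itself. That pattern suggests XSV found the problems while assessing the schema document as an instance of the schema for schemas.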

  5. Altova XML Spy (also Corel XMetaL)

  6. Betty Harvey's xml-dev 11/25/03 Posting
     • What is the "Mother of All Schema Parsers"?
     • What schema parser can we trust? Are there any Gold Standards?
     • Are the differences in the parsers because of the complexity and/or ambiguities of the Schema spec?
     • Most important: How can XML implementers use XML Schema in a collaborative, multi-tool environment?

  7. Henry Thompson's Replies
     • "Without being specific in a way which would be inappropriate given my W3C position, I would say that my experience over the last six months is a significant decrease in interop problems among the three or four best schema processors. Unfortunately one of the ones _not_ in that category is also one of the most widely used/distributed, which skews people's perceptions."
     • "I have seen very few examples in the last six months where XSV, SQC and Xerces [W3C, IBM's XML Schema Quality Checker, and Apache, respectively] don't agree, and most of those arise because of areas of the REC which XSV has never implemented."
     • "I definitely would not want this comment to be understood as a positive or negative assessment of any other product."
     Yes, but which products aren't mentioned?

  8. U.S. Government Recommends XML Schema (and DTDs)
     "Only ISO 8879 Document Type Definitions and W3C Schema Part 1:Structures and W3C Schema Part 2:Datatypes SHALL be used to define XML document structures. Developers of data-oriented schemas in DTD syntax SHOULD migrate to XML Schemas. Developers MAY elect to use DTDs for markup of data that is strictly document-oriented (sentence, paragraph, chapter, appendix, etc.). However, the XML Schema language is the preferred method."
     -- CIOC XML Developers Guide, April 2002

  9. Joe Chiusano and G. Ken Holman
     • JC: "I would also like to emphasize that the Federal XML Developer's Guide is guidance, and not government policy."
     • GKH: "Regardless, it gets *treated* as policy (and they should have realized that, much of what the government calls guidance gets treated de facto as policy) and the ripple effect has been felt far and wide by such guidance not to consider alternatives."
     • Claude (Len) Bullard: "A procurement official can usually waiver this stuff, but when it shows up in some documents, it can become a mandatory."
     • GKH: "My understanding is that non-technical influences impacted on the decision to use W3C Schema and not even regard available technologies such as RELAX-NG as viable candidates."

  10. Elliotte Rusty Harold
     • "I do not consider DTDs at all deprecated or legacy. I know the opposite opinion is out there, but it's wrong. I wrote about this in Chapter 24 of Effective XML (not online yet, sorry), and I have been careful in all my talks about schemas to make sure that everyone knows they are not a replacement for DTDs."
     • "I also see a lot of evidence that the W3C XML Schema Language is losing the schema wars to RELAX NG. Many high profile groups have chosen to adopt RELAX NG for their schema needs rather than the W3C XML Schema Language. The prime reason for the W3C XML Schema Language's current and, I think, transitory prominence is merely the W3C imprimatur. Among developers who realize they have a choice, the choice is increasingly likely to be RELAX NG."
     • "The real innovation of XML was not making the DTDs simpler. It was making them optional. Documents that do not have document type declarations are incredibly interoperable, with almost no room for parser differences in interpretation."

  11. Dare Obasanjo
     • "Do you mean that it validates according to the rules at http://www.w3.org/TR/xmlschema-1/ or the rules at http://www.w3.org/TR/xmlschema-1/ + http://www.w3.org/2001/05/xmlschema-errata ?"
     • Ken Sall: The issue here is that some processors may go strictly by the XSD 1.0 W3C Recommendation while others may have already incorporated changes indicated by the errata. But since there's no version number differentiating the two, there's no way to tell what a parser supports, except by contacting the product's techies.
     • Note: xmlschema-errata is 110 pages, as of 1/15/04!

  12. Possible Conclusions
     • Hard to recommend any one XML Schema tool. XML Spy, although the most popular by far, has flaws. Henry Thompson’s implicit recommendation might be XSV, SQC or Xerces, but these all lack GUIs.
     • XML Schema is not an exact science. Too complex, too ambiguous, and has lots of errata external to the XML Schema 1.0 spec.
     • DTDs and even no schema at all are viable alternatives for some applications that don’t require strong datatyping.
     • RELAX-NG is worth investigating. Why do so many developers prefer it to XSD? RELAX-NG became an ISO standard in 12/03.
     • For the next version of the XML Developers Guide, the CIOC should consider re-wording the strict use of XSD (i.e., might permit RELAX-NG [OASIS RELAX TC]).
