Josh Lubell, lubell@nist

Josh Lubell, lubell@nist.gov National Institute of Standards and Technology Manufacturing Systems Integration Division. A Tool Kit for Implementing XML Schema Naming and Design Rules OASIS Symposium: The Meaning of Interoperability May 9, 2006. XML Exchange Schemas are Bridges.


Presentation Transcript


  1. Josh Lubell, lubell@nist.gov • National Institute of Standards and Technology, Manufacturing Systems Integration Division • A Tool Kit for Implementing XML Schema Naming and Design Rules • OASIS Symposium: The Meaning of Interoperability • May 9, 2006

  2. XML Exchange Schemas are Bridges

  3. But Bridges Must Be Designed Properly

  4. A Solution: Naming and Design Rules • Encode XML schema best practices • Enforce a particular modeling methodology • Ensure common naming conventions • Use of camel case • Allowable acronyms • … • But NDRs can be difficult to apply
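
A naming convention such as camel case with a list of allowable acronyms can be checked mechanically. The following is a minimal sketch, not taken from the presentation: the acronym list, the function names, and the choice of upper camel case as the default are all assumptions made for illustration.

```python
import re

# Hypothetical acronym list; a real NDR would publish its own.
ALLOWED_ACRONYMS = {"ID", "URI", "XML"}

def split_words(name):
    """Split a camel-case name into words, keeping acronym runs together."""
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)

def check_name(name, upper_first=True):
    """Return a list of problems found in one element, attribute, or type name.

    upper_first=True expects UpperCamelCase, False expects lowerCamelCase
    (which case applies to which construct is an assumption left to the caller).
    """
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9]*", name):
        return ["contains separators or other illegal characters"]
    problems = []
    if name[0].isupper() != upper_first:
        expected = "upper" if upper_first else "lower"
        problems.append(f"first letter should be {expected} case")
    for word in split_words(name):
        # A run of two or more capitals is treated as an acronym.
        if len(word) > 1 and word.isupper() and word not in ALLOWED_ACRONYMS:
            problems.append(f"acronym {word!r} is not on the allowed list")
    return problems
```

For example, `check_name("PartyID")` returns an empty list, while `check_name("party_id")` reports the illegal separator.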

  5. Barriers to NDR Usefulness • Proliferation • How do I decide which NDR set to adopt? • Should I develop my own NDR? • Lack of structure • NDR documents usually in proprietary word processor formats • Inhibits rule reuse • Limited versioning and traceability • Ambiguity • Rules written in English rather than computer-interpretable language • NDR enforcement not automatic

  6. Schematron as an NDR Implementation Method • Advantages • XML-native (based on XPath) • Rule-based • Can test for co-occurrence constraints • User-configurable diagnostic messages • ISO standard • Disadvantage • Less versatile than a general purpose programming language

  7. Example from Universal Business Language NDR [ELD1] Each UBL:DocumentSchema MUST identify one and only one global element declaration that defines the document ccts:AggregateBusinessInformationEntity being conveyed in the Schema expression. That global element MUST include an xsd:annotation child element which MUST further contain an xsd:documentation child element that declares “This element MUST be conveyed as the root element in any instance document based on this Schema expression.”
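
To make the structural part of [ELD1] concrete, here is a rough Python sketch using only the standard library (the presentation itself implements rules in Schematron, so this is a swapped-in technique). It checks that exactly one global element carries the required documentation sentence. It does not attempt the ccts:AggregateBusinessInformationEntity part of the rule, and the function names and sample schema are hypothetical.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"
ROOT_NOTE = ("This element MUST be conveyed as the root element in any "
             "instance document based on this Schema expression.")

def root_candidates(schema_text):
    """Return names of global elements documented as the root element."""
    schema = ET.fromstring(schema_text)
    names = []
    # Direct children of xsd:schema are the global element declarations.
    for element in schema.findall(XSD + "element"):
        docs = element.findall(f"{XSD}annotation/{XSD}documentation")
        for doc in docs:
            # Normalize whitespace before comparing with the required sentence.
            text = " ".join("".join(doc.itertext()).split())
            if ROOT_NOTE in text:
                names.append(element.get("name"))
                break
    return names

def check_eld1(schema_text):
    """Pass only if exactly one global element is documented as the root."""
    names = root_candidates(schema_text)
    if len(names) == 1:
        return "pass"
    return f"fail: expected exactly one documented root element, found {len(names)}"

# Hypothetical sample schema, for illustration only.
SAMPLE = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Order" type="OrderType">
    <xsd:annotation>
      <xsd:documentation>This element MUST be conveyed as the root element
        in any instance document based on this Schema expression.</xsd:documentation>
    </xsd:annotation>
  </xsd:element>
  <xsd:element name="Note" type="xsd:string"/>
</xsd:schema>"""

print(check_eld1(SAMPLE))
```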

  8. Implementation Observations • Slide annotates the [ELD1] rule text (quoted in the previous slide) with callouts: rule label, namespace dependence (twice), subrule 1, subrule 2, context 1, context 2

  9. UBL Lessons Learned • Implementation non-trivial even for a seemingly simple rule • Some rules require a general purpose programming language for implementation • [GNR1] UBL XML element, attribute and type names MUST be in the English language, using the primary English spellings provided in the Oxford English Dictionary. • [GNR7] UBL XML element, attribute and type names MUST be in singular form unless the concept itself is plural. • Some rules cannot be implemented at all • [NMS6] UBL published namespaces MUST never be changed. • [VER10] UBL Schema and schema module minor version changes MUST not break semantic compatibility with prior versions. • MUST versus SHOULD versus MAY • More on MAY later…

  10. Dept. of Navy (DON) NDR Case Study • 128 rules • Based on UBL NDR • Why choose the DON NDR? • Help developers write better schemas for Federal government applications • Gain insight into best practices for NDR development (particularly reuse of existing NDRs) • Publicly available • A Navy standard

  11. DON NDR Testability (using Schematron)

  12. Issue: Use of MAY • A rule saying that something MAY occur, strictly speaking, will always pass • But this may not be the rule creator’s intent • Example: [CTD8] Code and ID ccts:BBIE Property complex types MAY use the xsd:choice element to reference global elements defined in standardized ID Scheme or Code List Schema modules. • Approaches • Consider rule as guidance only (don’t implement) • Interpret MAY as discouragement, e.g. “warning: referencing global element using xsd:choice”
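
The second approach, treating MAY as mild discouragement, could look roughly like the sketch below. It is written in Python rather than Schematron, it only inspects global complex types, and it does not verify that a type is a Code or ID ccts:BBIE Property type; the function name and the message wording beyond the slide's example are assumptions.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

def choice_warnings(schema_text):
    """Report, as warnings rather than errors, xsd:choice groups that
    reference global elements (elements carrying a ref attribute)."""
    schema = ET.fromstring(schema_text)
    messages = []
    for ctype in schema.findall(XSD + "complexType"):   # global types only
        for choice in ctype.iter(XSD + "choice"):
            refs = [e.get("ref") for e in choice.findall(XSD + "element")
                    if e.get("ref")]
            if refs:
                messages.append((
                    "warning",
                    f"{ctype.get('name')}: referencing global element "
                    f"using xsd:choice ({', '.join(refs)})",
                ))
    return messages
```

Returning a severity alongside the message lets a report separate MUST violations from MAY-style hints without changing the rule logic.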

  13. Issue: Requirement for External Resources [GNR1] UBL XML element, attribute and type names MUST be in the English language, using the primary English spellings provided in the Oxford English Dictionary. • Implementation requires access to electronic OED • And the DON adaptation of this rule has additional requirements: [GNR1] XML element, attribute, and type names MUST be in the English language, using the Oxford English Dictionary for Writers and Editors (Latest Ed.). Where both American and English spellings of the same word are provided, the American spelling MUST be used. • Electronic OED must be fully up to date
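
The dependence on an external resource shows up as soon as one tries to code the rule. In the sketch below, a tiny hard-coded word set stands in for the electronic dictionary and a one-entry map stands in for its American/English spelling pairs; both are placeholders invented for illustration, as is the function name.

```python
import re

# Stand-in for the electronic dictionary; a real check needs the actual resource.
KNOWN_WORDS = {"color", "colour", "code", "order", "line"}
# Stand-in for the dictionary's English-to-American spelling pairs.
AMERICAN_SPELLING = {"colour": "color"}

def spelling_problems(name):
    """Split a camel-case name into words and check each against the word list."""
    words = [w.lower() for w in
             re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", name)]
    problems = []
    for word in words:
        if word not in KNOWN_WORDS:
            problems.append(f"{word!r} not found in word list")
        elif word in AMERICAN_SPELLING:
            problems.append(
                f"{word!r} should use American spelling "
                f"{AMERICAN_SPELLING[word]!r}")
    return problems
```

The checking logic is short; the hard part is keeping `KNOWN_WORDS` and `AMERICAN_SPELLING` in step with the latest edition of the dictionary, which is the point the slide makes.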

  14. Issue: Rule Proliferation • Illustrated by UBL rule GNR1 versus DON rule GNR1 • DON rule same as UBL rule, but with added constraints • American spelling favored • Latest OED edition required • But no explicit relationship specified in DON NDR! • Both rules have same ID, even though they are different rules • Improved traceability and reusability would reduce the confusion

  15. Issue: Ambiguous Terminology • More rigor needed in NDR definitions • Example: “xsd:SchemaExpression” • Not defined in W3C XML Schema recommendation • Used but not defined in DON NDR • Defined in UBL NDR to mean “a concept”

  16. Issue: Mixed Content • Essential for representing semi-structured data • But allowing it makes the NDR more complicated • UBL NDR forbids mixed content • DON NDR allows it, but only if defined by a namespace from a Navy-approved standard (e.g. XHTML) • But XHTML element and attribute names violate rule GNR1!

  17. Quality of Design (QoD) Tool • Contains rules based on naming and design guidelines (NDRs) from a number of sources • Stores executable test cases written in Schematron and Java Expert System Shell (Jess) • Executes tests against user-provided schemas and reports results • Rules grouped into test profiles

  18. Why QoD? • Addresses proliferation of NDRs • Overlapping NDR standards • Supports reusability of rules • Highlights ambiguous rules • Provides an explicit structure for rules in NDRs • Automates rule enforcement • Enables versioning and traceability of rules

  19. Characteristics of Rules • Coverage: full, partial, none • Applicability: indicates type of schema (document, low, or aggregate) the rule applies to • Rationale: reason for rule from a list of justifications • Requirement: text from the NDR document • Implementation File: URI of the file containing the implementation of the rule

  20. Example XML Description of a Rule using QoD Exchange Schema

      <testProfile>
        <source id="ubl">
          <organization>OASIS</organization>
          <orgURL>http://www.oasis-open.org</orgURL>
          <title>Universal Business Language (UBL) Naming and Design Rules</title>
          <version>1.0</version>
          <date>2004-11-15</date>
          <docURL>http://docs.oasis-open.org/ubl/cd-UBL-NDR-1.0.1</docURL>
        </source>
        <ruleSet id="ELD">
          <name>Element Declaration Rules</name>
          <rule id="ELD1">
            <coverage>full</coverage>
            <schema>D</schema>
            <rationale>structural clarity</rationale>
            <requirement>Each UBL:DocumentSchema MUST identify one and ... </requirement>
            <implementation file="example.scmt#eld1" type="schematron"/>
          </rule>
          ...
      </testProfile>
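
A test profile of this shape is straightforward to read programmatically. The sketch below parses a trimmed copy of the slide's example with Python's standard library and lists each rule with its coverage and implementation file. The closing `</ruleSet>` tag is an assumption (the slide elides that part), and `list_rules` is a hypothetical helper, not part of the QoD tool.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the slide's example; </ruleSet> is assumed, since the
# slide elides everything between </rule> and </testProfile>.
PROFILE = """<testProfile>
  <source id="ubl">
    <organization>OASIS</organization>
    <title>Universal Business Language (UBL) Naming and Design Rules</title>
    <version>1.0</version>
  </source>
  <ruleSet id="ELD">
    <name>Element Declaration Rules</name>
    <rule id="ELD1">
      <coverage>full</coverage>
      <schema>D</schema>
      <rationale>structural clarity</rationale>
      <requirement>Each UBL:DocumentSchema MUST identify one and ...</requirement>
      <implementation file="example.scmt#eld1" type="schematron"/>
    </rule>
  </ruleSet>
</testProfile>"""

def list_rules(profile_text):
    """Return one dictionary per rule found in a test profile."""
    profile = ET.fromstring(profile_text)
    rules = []
    for rule_set in profile.findall("ruleSet"):
        for rule in rule_set.findall("rule"):
            impl = rule.find("implementation")
            rules.append({
                "id": rule.get("id"),
                "ruleSet": rule_set.get("id"),
                "coverage": rule.findtext("coverage"),
                "file": impl.get("file") if impl is not None else None,
                "type": impl.get("type") if impl is not None else None,
            })
    return rules

for rule in list_rules(PROFILE):
    print(rule["id"], rule["coverage"], rule["file"])
```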

  21. QoD Test Profile Exchange

  22. Application to Developing XML Schemas • Currently a limited set of rules is implemented • Recently implemented a subset of the DON NDR in Schematron • Tested with a small but varied set of sample schemas • Navy – IETM Schema Q70:IETM (Interactive Electronic Technical Manual) • Grants.gov • AEX (building and construction industry) • US Dept. of Defense • Provided meaningful results to schema developers

  23. Examples of types of warnings found in developing XML Schemas • Global elements declared in non-desirable places • Anonymous/local types defined in non-desirable places • “Global” schemas that do not declare a default namespace • Document/Transaction level schemas that define multiple global elements • Re-declaration of elements and types (e.g. programType) in different namespaces
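
Two of these warning types, multiple global elements and anonymous types, can be detected with a few lines of code. The following is an illustrative Python sketch rather than the tool's actual Schematron tests; the function name and message wording are assumptions, and it flags every anonymous type without judging whether its location is "non-desirable".

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

def schema_warnings(schema_text):
    """Flag multiple global elements and anonymous (inline) type definitions."""
    schema = ET.fromstring(schema_text)
    warnings = []
    # Global elements are the direct children of xsd:schema.
    global_elements = [e.get("name") for e in schema.findall(XSD + "element")]
    if len(global_elements) > 1:
        warnings.append(
            f"multiple global elements: {', '.join(global_elements)}")
    # A type defined directly inside an element declaration has no name.
    for element in schema.iter(XSD + "element"):
        for kind in ("complexType", "simpleType"):
            if element.find(XSD + kind) is not None:
                warnings.append(
                    f"anonymous {kind} defined inside element "
                    f"{element.get('name')!r}")
    return warnings
```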

  24. Lessons Learned in Coding NDRs • NDR documents need to be regarded as rigorous technical documentation • More review needed • Better authoring tools needed • Rules that cannot be implemented are non-enforceable • Definition of NDRs is non-trivial • Many rules cannot be tested • Many rules are more difficult to implement than thought • Difficult to reuse rules due to namespace definitions • Often rules are ambiguous or unclear • Implementation of rules is non-trivial • Testing of rules is complex • All boundary conditions need to be thought of and covered • Legacy data and 3rd party schemas need to be addressed in NDRs

  25. What’s Next • Continue to expand our NDR rule-base • Continue to enhance software based on user requirements • Produce a tool kit for NDR developers • Enhance QoD schema to represent entire NDR document • Provide authoring templates • Identify collaborators for future work • If interested, contact me!

  26. Summary • A process for XML schema development is necessary • Tools can automate the process, thereby reducing labor and deployment time • Definition and implementation of NDRs is non-trivial but necessary to support reuse of schemas • Enforcing NDRs will ultimately make XML schemas more interoperable
