A comprehensive set of tools for developing and validating XML schemas, including schema validation, instance validation, and quality of design analysis.
Tools for Developing and Validating XML Schemas
KC Morris
US/DoC/NIST/MEL/MSID
National Institute of Standards and Technology
Outline of the Talk
• Who we are
• Process for developing content standards: the Model Development Life Cycle (MDLC)
• Overview of our XML-related tools
• Quality of Design (QOD) Tool
• Experience with QOD and Naming and Design Rules (NDRs)
Who We Are: US/DoC/NIST/MEL/MSID
• United States Government
• Department of Commerce
• National Institute of Standards and Technology
• Manufacturing Engineering Laboratory
• Manufacturing Systems Integration Division
A government resource that provides technical solutions to advance system integration capabilities.
NIST Interoperability Testbed
• Legacy Migration Through Semantic Mapping
• Automotive Inventory Visibility
• CAD/CAM Integration
• Simulation System Integration
• Manufacturing Metrology Interoperability (Inspection)
• Process Plant Construction Information Integration
• Manufacturing B2B Integration
• Integrated Circuit Manufacturing
• Government Systems Integration
• “Generic” Testing Infrastructure and Tool Development (e.g., XML, ebXML, schema quality, test case generation)
• Semantic Web R&D
Background
• NIST B2B Interoperability Testbed
  • XML-based interoperability project with the automotive and aerospace industries
• NIST AEX Testbed
  • XML-based interoperability project with the building construction industry
• Product Data Exchange and Validation Testing activities
  • Data exchange-based integration project, not XML-based (ISO 10303: STEP)
• Common characteristics
  • Specifications are segmented
  • Data exchange specifications evolve as integration projects proceed
  • Also true for standards development efforts
Content Standards are Bridges
We Need a Process for Building Interoperability Bridges
Model Development Life Cycle
A guideline for building industrial-strength data exchange bridges that provides:
• Detailed analysis of the development process for content standards
• Architecture for implementing that process
• Outline of requirements for tools
• Roadmap to the standards landscape
Some Specific Problems in Model Development
• Unbounded specification growth
• Semantically duplicate terms, components, and documents are created
• Poor documentation reduces reuse
• Classic interoperability problems are revisited
• Large-scale harmonization (using a common/canonical model) is hard to achieve
[Figure: Decomposition of the Model Development Life Cycle. The diagram shows activities A1 through A5 with their inputs, outputs, and supporting tools, including a registry and repository, Schematron and XSLT engines, rule-based engines, and semantic annotation tools.]
Overview of NIST’s XML Schema Tools
XML Schema Validation Web Service
• Objective: Ensure that schemas are compatible with a selected set of parsers
• Core Functionality: Validate one or more schemas or schema extensions with multiple selected parsers and schema files stored in a repository
• Status: on-line at http://www.mel.nist.gov/msid/validation/
XML Instance Validation Web Service
• Objective: Ensure that schemas are compatible with a selected set of parsers, a set of sample data, and a previous set of sample data (when the schemas have gone through changes)
• Core Functionality: Automatically validate one or more instance files against associated schemas with multiple selected parsers
• Status: on-line at http://www.mel.nist.gov/msid/validation/
Quality of Design Tool
• Objective: Ensure that XML schemas conform to a selected set of design practices, e.g., use of common and valid terms, NDR conformance, and use of XML Schema structures that enhance reusability, maintainability, clarity, and interoperability
• Core Functionality: A flexible environment for specifying and executing best-practice rules against the schemas
• Status: Beta – Access available upon request. Contact Serm (serm@nist.gov)
XML Schema Naming Assister
• Objective: Ensure that type, element, and attribute names used in schemas are consistent within the schema and conform to the ISO 11179 naming convention
• Core Functionality: Decompose names into Object Class, Property, and Representation Term tokens, validate them using a table of terms, and suggest alternate names
• Status: Prototype. Source code available at http://www.nist.gov/msid/Naming_Assister.html
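The decomposition step described above can be sketched in Python. This is a minimal illustration and not the Naming Assister's actual code: the term table, the suggestion mapping, and the function names `tokenize` and `check_name` are hypothetical, and only the trailing Representation Term is checked.

```python
import re

# Hypothetical table of approved Representation Terms (illustrative only).
REPRESENTATION_TERMS = {"Amount", "Code", "Date", "Identifier", "Name", "Quantity", "Text"}

# Hypothetical mapping from common non-standard endings to approved terms.
SUGGESTED_TERMS = {"ID": "Identifier", "Id": "Identifier", "Qty": "Quantity"}


def tokenize(name):
    """Split a camel-case name into tokens, keeping acronyms such as XML together."""
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z][a-z]*|[a-z]+|\d+", name)


def check_name(name):
    """Check that the last token of a name is an approved Representation Term."""
    tokens = tokenize(name)
    if len(tokens) < 2:
        return {"name": name, "valid": False, "suggestion": None}
    *leading, representation = tokens  # leading = Object Class and Property tokens
    if representation in REPRESENTATION_TERMS:
        return {"name": name, "valid": True, "suggestion": None}
    suggestion = None
    if representation in SUGGESTED_TERMS:
        suggestion = "".join(leading) + SUGGESTED_TERMS[representation]
    return {"name": name, "valid": False, "suggestion": suggestion}


if __name__ == "__main__":
    for candidate in ("OrderIssueDate", "PartID", "OrderQty"):
        print(check_name(candidate))
```

A real implementation would also need a term table for Object Class and Property tokens; the sketch leaves those unchecked.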
Content Checking Tool
• Objective: Capture, codify, and execute business rules that are not captured in the XML Schema
• Core Functionality: Store, publish, and execute business rules for checking instance data for conformance with those rules
• Status: on-line at http://syseng.nist.gov/b2bTestbed/projects/semanticChecking
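As an illustration of business rules that sit outside an XML Schema, the sketch below runs two cross-field checks against an instance document. The document structure, the rule identifiers BR1 and BR2, and the function names are assumptions made for this example; they are not taken from the Content Checking Tool.

```python
import xml.etree.ElementTree as ET
from datetime import date
from decimal import Decimal

# Hypothetical instance document; element names are illustrative only.
ORDER_XML = """<Order>
  <OrderDate>2005-03-01</OrderDate>
  <ShipDate>2005-02-20</ShipDate>
  <Line><Quantity>5</Quantity><UnitPrice>2.00</UnitPrice></Line>
  <Line><Quantity>1</Quantity><UnitPrice>4.50</UnitPrice></Line>
  <Total>14.50</Total>
</Order>"""


def ship_not_before_order(order):
    """The ship date must be on or after the order date."""
    shipped = date.fromisoformat(order.findtext("ShipDate"))
    ordered = date.fromisoformat(order.findtext("OrderDate"))
    return shipped >= ordered


def total_matches_lines(order):
    """The stated total must equal the sum of quantity times unit price."""
    line_sum = sum(
        Decimal(line.findtext("Quantity")) * Decimal(line.findtext("UnitPrice"))
        for line in order.findall("Line")
    )
    return line_sum == Decimal(order.findtext("Total"))


# Each rule: (identifier, description, predicate returning True when satisfied).
RULES = [
    ("BR1", "ShipDate must not precede OrderDate", ship_not_before_order),
    ("BR2", "Total must equal the sum of the line amounts", total_matches_lines),
]


def check_instance(xml_text):
    """Return (identifier, description) for every rule the instance violates."""
    order = ET.fromstring(xml_text)
    return [(rule_id, text) for rule_id, text, holds in RULES if not holds(order)]


if __name__ == "__main__":
    for rule_id, text in check_instance(ORDER_XML):
        print(f"{rule_id} violated: {text}")
```

Keeping each rule as data (identifier, description, predicate) is one way to make rules storable and publishable separately from the code that executes them.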
Schematron Editor Tool
• Objective: Assist a user in creating Schematron rules
• Core Functionality: Create Schematron rules with little or no knowledge of XPath/XSLT syntax, through expression wizards that allow drag-and-drop of elements from an imported XML schema business document
• Status: Prototype – available on the Sourceforge site http://www.sf.net/projects/cs-wizard
Semantic Lookup Assistant
• Objective: Assist the user when searching for a reusable business document or components that support data exchange requirements
• Core Functionality: Match data exchange requirements from the user with existing schemas and provide quantitatively measured results
• Status: Conceptual design started
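One simple way to attach a number to such a match is token overlap between the required terms and the element names in each schema. The sketch below uses Jaccard similarity purely as an illustration; the talk does not say which measure the assistant would use, and the catalog contents and function names are hypothetical.

```python
import re


def tokens(name):
    """Split a camel-case name into a set of lowercase tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z][a-z]*|[a-z]+|\d+", name)
    return {part.lower() for part in parts}


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_schemas(required_terms, catalog):
    """Rank schemas by token overlap with the required terms, best match first."""
    required = set().union(*(tokens(term) for term in required_terms))
    scored = []
    for schema_name, element_names in catalog.items():
        available = set().union(*(tokens(name) for name in element_names))
        scored.append((schema_name, jaccard(required, available)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical catalog: schema name -> element names it declares.
    catalog = {
        "Invoice": ["InvoiceDate", "TotalAmount"],
        "PurchaseOrder": ["OrderDate", "OrderQuantity", "ShipDate"],
    }
    for name, score in rank_schemas(["ShipDate", "OrderQuantity"], catalog):
        print(f"{name}: {score:.2f}")
```

Token overlap ignores synonyms and structure, so it is best read as a baseline against which a semantic measure could be compared.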
Classification Assistant
• Objective: Provide a quantitative measure suggesting a suitable classification to register a component within a classification scheme
• Core Functionality: Given a data exchange specification (and documentation), propose a ranked set of appropriate classification nodes
• Status: Initial requirements analysis
Semantic Alignment Tool
• Objective: Provide quantitative analysis and suggestions for model harmonization
• Core Functionality: Analyze a newly registered data exchange specification against existing ones for semantically duplicative and overlapping structures and suggest alternatives
• Status: Initial requirements analysis
Quality of Design Tool
What is Schema Qualification?
Tests that a schema follows an NDR:
• Uses terms/names correctly and consistently
• Consistently represents similar concepts
• Uses constructs that enhance reusability, maintainability, clarity, and interoperability
As well as:
• Works with relevant tools
• Uses preexisting schemas correctly
Schema Quality of Design Testing Tool
• Contains rules based on naming and design guidelines (NDRs) from a number of sources
• Stores executable test cases written in Schematron and Java Expert System Shell (JESS)
• Executes tests against user-provided schemas and reports results
Why QOD?
• Addresses proliferation of NDRs
  • Overlapping NDR standards
• Supports reusability of rules
• Highlights ambiguous rules
• Provides an explicit structure for rules in NDRs
• Automates rule enforcement
• Enables versioning and traceability of rules
Candidate NDRs
• OASIS Universal Business Language (UBL)
• US Department of the Navy (DON)
• Korean Institute for Electronic Commerce
• Open Applications Group (OAGIS)
• US Air Force
• US Federal CIO Council XML Working Group
• ASC X12 (CICA)
• FIATECH (capital facilities industry)
Characteristics of Rules
• Coverage: full, partial, none
• Applicability: indicates type of schema (document, low, or aggregate) the rule applies to
• Rationale: reason for rule from a list of justifications
• Requirement: text from the NDR document
• Implementation File: URI of the file containing the implementation of the rule
Example XML Description of a Rule using QOD Schema
<testProfile>
  <source id="ubl">
    <organization>OASIS</organization>
    <orgURL>http://www.oasis-open.org</orgURL>
    <title>Universal Business Language (UBL) Naming and Design Rules</title>
    <version>1.0</version>
    <date>2004-11-15</date>
    <docURL>http://docs.oasis-open.org/ubl/cd-UBL-NDR-1.0.1</docURL>
  </source>
  <ruleSet id="ELD">
    <name>Element Declaration Rules</name>
    <rule id="ELD1">
      <coverage>full</coverage>
      <schema>D</schema>
      <rationale>structural clarity</rationale>
      <requirement>Each UBL:DocumentSchema MUST identify one and ... </requirement>
      <implementation file="example.scmt#eld1" type="schematron"/>
    </rule>
    ...
</testProfile>
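A rule profile of this shape can be loaded into simple records for processing. The sketch below is an assumption about how a consumer might read it, not the QOD tool's own loader: the `Rule` class and `load_rules` function are hypothetical, and the embedded profile is a shortened, well-formed variant of the example with the elided parts left out.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Shortened, well-formed variant of the example profile (elided parts omitted).
PROFILE_XML = """<testProfile>
  <source id="ubl">
    <organization>OASIS</organization>
  </source>
  <ruleSet id="ELD">
    <name>Element Declaration Rules</name>
    <rule id="ELD1">
      <coverage>full</coverage>
      <schema>D</schema>
      <rationale>structural clarity</rationale>
      <requirement>(requirement text omitted)</requirement>
      <implementation file="example.scmt#eld1" type="schematron"/>
    </rule>
  </ruleSet>
</testProfile>"""


@dataclass
class Rule:
    """One rule entry; field names are hypothetical, values come from the XML."""
    rule_set: str
    rule_id: str
    coverage: str
    schema_type: str
    rationale: str
    impl_file: str
    impl_type: str


def load_rules(profile_text):
    """Parse a testProfile document and return its rules as Rule records."""
    root = ET.fromstring(profile_text)
    rules = []
    for rule_set in root.findall("ruleSet"):
        for rule in rule_set.findall("rule"):
            impl = rule.find("implementation")
            rules.append(Rule(
                rule_set=rule_set.get("id"),
                rule_id=rule.get("id"),
                coverage=rule.findtext("coverage"),
                schema_type=rule.findtext("schema"),
                rationale=rule.findtext("rationale"),
                impl_file=impl.get("file"),
                impl_type=impl.get("type"),
            ))
    return rules


if __name__ == "__main__":
    for rule in load_rules(PROFILE_XML):
        print(rule)
```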
Architecture of QOD
Application to Developing XML Schemas
• Currently a limited set of rules is implemented
• Actively working on more rules based on the DON NDR
• Tested with a small but varied set of sample schemas
  • Navy – IETM Schema Q70:IETM (Interactive Electronic Technical Manual)
  • Grants.gov
  • AEX (building and construction industry)
  • DLA
• Provided meaningful results to schema developers
Examples of types of warnings found in developing XML Schemas
• Global elements declared in non-desirable places
• Anonymous/local types defined in non-desirable places
• “Global” schemas that do not declare a default namespace
• Document/Transaction level schemas that define multiple global elements
• Non-determinism
• Redeclaration of elements and types (e.g., programType) in different namespaces
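Two of these warning types can be detected with a few lines of code over the schema document itself. The sketch below is illustrative only: the QOD tool expresses its tests in Schematron and JESS, whereas this uses Python's standard library, and the sample schema, message wording, and `find_warnings` function are assumptions.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema with two global elements and one anonymous type.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Id" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="Invoice" type="InvoiceType"/>
  <xs:complexType name="InvoiceType">
    <xs:sequence>
      <xs:element name="Id" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""


def find_warnings(schema_text):
    """Report multiple global elements and anonymous (inline) type definitions."""
    root = ET.fromstring(schema_text)
    warnings = []

    # Global elements are the xs:element children directly under xs:schema.
    global_elements = root.findall(XSD + "element")
    if len(global_elements) > 1:
        names = ", ".join(e.get("name", "?") for e in global_elements)
        warnings.append(f"multiple global elements: {names}")

    # An anonymous type is a type definition nested directly inside an element.
    for element in root.iter(XSD + "element"):
        for kind in ("complexType", "simpleType"):
            if element.find(XSD + kind) is not None:
                warnings.append(f"anonymous {kind} in element {element.get('name')}")
    return warnings


if __name__ == "__main__":
    for message in find_warnings(SCHEMA):
        print("WARNING:", message)
```

Whether either finding is actually a problem depends on the NDR in force and on the schema's role, so a rule engine would apply such checks selectively.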
Lessons Learned in Coding NDRs
• NDR documents need to be regarded as rigorous technical documentation
• Rules that cannot be implemented are non-enforceable
• Definition of NDRs is non-trivial
• Many rules cannot be tested
• Many rules are more difficult to implement than expected
• Difficult to reuse rules due to namespace definitions
• Often rules are ambiguous or unclear
• Implementation of rules is non-trivial
• Testing of rules is complex
• All boundary conditions need to be thought of and covered
• Legacy data and third-party schemas need to be addressed in NDRs
What’s Next
• Provide feedback to DON NDR authors
• Provide input into Federal NDRG effort
• Continue to expand our NDR rule-base
• Continue to enhance software based on user requirements
• Produce a Toolkit for XML Schema developers
• Identify collaborators for future work
Summary
• A process for XML schema development is necessary
• Tools can automate the process, thereby reducing labor and deployment time
• Definition and implementation of NDRs is non-trivial but necessary to support reuse of schemas
• Use of NDRs will ultimately make XML schemas more acceptable to a wide audience
Contacts
• KC Morris – kcm@nist.gov
• Simon Frechette – simon.frechette@nist.gov
• Serm Kulvatunyou – serm@nist.gov
• Josh Lubell – lubell@nist.gov
• Puja Goyal – pgoyal@cme.nist.gov
References
• http://www.mel.nist.gov/msid/xml_related.htm
• Morris, KC; Kulvatunyou, Boonserm; Frechette, Simon; Lubell, Joshua; Goyal, P. XML Schema Validation Process for CORE.GOV, NISTIR 7187 (2004)
Schema Quality of Design Testing Tool Screen Shots