Validation of HL7 v3 instances: A post-mortem from the caCIS Implementation
Dan Kokotov, Todd Parnell, 5AM Solutions
Who We Are / Acknowledgements
• Enterprise Service Development (ESD) for the caCIS project
• 5AM is one of three companies involved in ESD
• Other disciplines on caCIS: A&A, QA, Deployment, Documentation, …
• The content of this presentation is authored by 5AM; any errors or omissions are our sole responsibility
• Acknowledgements
  • Architecture & Analysis (A&A): John Koisch, Paul Boyes, Jean-Henri Duteau, Lorraine Constable and others
  • Enterprise Service Development: SemanticBits and Agilex teams
Context – Architecture and Methodology
• HL7 v3 using R2 datatypes
  • CDA (and possibly R1) in scope for the project but out of scope for our solution
  • Roughly 70 RMIMs
  • Project-specific datatype specification with roughly 50 custom datatype flavors
  • Terminology Worksheet with a mix of explicit and referential definitions for roughly 35 vocabulary types
  • XML ITS used at the implementation layer
• SOA infrastructure: SOAP services with WS-*
  • Roughly 40 WSDL interfaces in 13 functional areas
  • Project-specific contract/fault specification, governing reporting of business and system exceptional behavior, including contract tracing
• SAIF specification methodology
  • A&A delivered CIM/PSM specs, in the form of RMIM models, interface descriptions, and accompanying documentation
  • Also included XSDs and WSDLs (non-normative but implied by V3 tooling and choice of ITS)
Context - Technology
• Tech Stack
  • JSE 1.6, JEE 1.5 (JTA only)
  • JAX-WS, as implemented by CXF
  • JAXB
  • JPA, as implemented by Hibernate
  • Spring 3.0
  • Tolven
Challenge
• Validate incoming messages for compliance with the model
  • Structural
  • Base datatype rules
  • Flavors (added later in the project)
  • Vocabulary (mostly beyond the scope of this presentation)
• QA generated test cases based on PIM specs/models to validate compliance
Initial approach
• Architecture: AP, AO, CO, CS
  • “RIM-inspired” application data model
  • AP<->AO: ORM (JPA2)
  • AO<->CO: bean mapping (Dozer)
  • CO<->CS: XML serialization (JAXB)
• Validation
  • Schema validation: implicit from the ITS, enforced via a CXF interceptor
  • JPA Bean validation: “message-independent” invariants, enforced by JPA
    • e.g. “a patient must have a name”
  • “External” Bean validation: “message-specific” constraints, enforced by a custom AOP interceptor
    • e.g. “order must have an identifier in the REPC_MT000001US RMIM”
Initial approach - problems
• No clear criterion for deciding when a constraint could be promoted to a “message-independent invariant”
• Occasional duplication between JPA Bean validation and “External” bean validation
• Basic R2/ISO 21090 datatype validation
  • Would require an extensive bean validation implementation
• Vocabulary compliance required defining explicit enumerated lists from the worksheet
  • This was not sufficient for referential definitions
R2/ISO 21090 datatypes - solution
• Leverage the schematron definitions of constraints embedded in the official iso_21090_types.xsd schema
• To do so, we had to overcome several roadblocks and challenges:
  • Embedded schematron did not have any context, as the XML ITS / ISO 21090 schema only defines XSD complexTypes for each datatype, not a standard element
    • XSLT2 supports a schema type axis, but no open source Java XSLT processor implements this
    • Therefore, we have to define the context as an explicit OR of the possible paths to the datatype from any message of a given SOAP service
    • Potential recursion in datatypes makes this very tricky
  • Embedded schematron did not use prefixes for element names, so they were not bound to the HL7 namespace, and schematron does not permit binding the empty prefix to a namespace
    • Had to use a regular expression to inject a prefix into element names in schematron XPath expressions
  • Embedded schematron had a variety of typos/bugs
    • Fixed directly in the schema
  • Miscellaneous (ANY type, inheritance, bugs in Xerces’ XS Schema reader)
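The prefix injection can be illustrated with a much smaller pattern than the one the project used. The sketch below is a simplification under stated assumptions, not the ExtractSchematron code: the class PrefixInjector, its method, and its pattern are hypothetical, it assumes no element is named like an XPath operator, and it leaves names that follow an explicit axis (such as child::low) untouched.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the project's ExtractSchematron code.
final class PrefixInjector {
    // XPath 1.0 operator names. Assumption: no element shares one of these names.
    private static final List<String> OPERATORS = Arrays.asList("and", "or", "div", "mod");

    // Either a quoted literal (copied through unchanged) or a bare name that is
    // not an attribute, variable, or already-prefixed local name (lookbehind) and
    // is not a function call, a prefix, or an axis name (lookahead). The possessive
    // *+ keeps the engine from shortening a name just to satisfy the lookahead.
    private static final Pattern TOKEN = Pattern.compile(
        "'[^']*'|\"[^\"]*\"|(?<![@:$.\\w-])([A-Za-z_][\\w.-]*+)(?!\\s*\\(|:)");

    static String injectPrefix(String xpath, String prefix) {
        Matcher m = TOKEN.matcher(xpath);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1); // null when the match is a quoted literal
            boolean isElementName = name != null && !OPERATORS.contains(name);
            String replacement = isElementName ? prefix + ":" + name : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(injectPrefix("not(@nullFlavor) and (low|high)", "hl7"));
    }
}
```

Skipping quoted literals first is what keeps a string such as 'text/plain' from being mistaken for element names; the project's own pattern handles that case with a dedicated lookahead instead.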
Meet ExtractSchematron.java
• Part of the build-time toolchain to generate schematron for the ISO 21090 datatypes
• Pseudocode:
  • Walk the iso-21090 schema, extract schematron annotations
  • “Fix” the schematron by injecting the hl7: prefix into element names
  • Write the “abstract” schematron rule file with all the extracted schematron rules
    • Single sch:pattern called “abstract rules”
    • One sch:rule per datatype rule
  • Walk the service schemas, determine possible paths to a datatype
    • Must include paths to a datatype’s supertype, and account for abstract types which can have xsi:type declarations at runtime
  • Write the “concrete” schematron rule file which references the “abstract” rules
    • One sch:pattern per datatype rule, whose context is the OR of all possible paths to an element which is of that datatype
• For the win: the regexp for injecting the hl7: prefix
  (^|or |and |::|/|\\(|\\|)([^@naocmspxt()&\\.\\[\\\\=*+>!\\-0-9]|n(?!ot[ \\(])|a(?!nd[ \\(])|o(?!r[ \\(])|c(?!ount\\()|m(?!atches\\()|s(?!tring-length\\(|elf|tarts-with\\()|t(?!ext\\()|p(?!lain')|x(?!si:))
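The first pseudocode step, walking a schema and pulling out the embedded schematron, can be sketched with the JDK's DOM parser. This is an illustration only: the class SchematronHarvester, the inline schema fragment, and its rule text are made up, and the schematron namespace is passed in as a parameter because the namespace used by the real ISO 21090 schema is an assumption here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch; not the project's ExtractSchematron implementation.
final class SchematronHarvester {
    static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";
    // Assumption: the annotations use the ISO Schematron namespace.
    static final String SCH_NS = "http://purl.oclc.org/dsdl/schematron";

    // Made-up fragment standing in for a datatype schema with embedded rules.
    static final String SAMPLE_XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " xmlns:sch='http://purl.oclc.org/dsdl/schematron'>"
        + "<xs:complexType name='IVL_PQ'><xs:annotation><xs:appinfo>"
        + "<sch:pattern name='null or value'>"
        + "<sch:rule abstract='true' id='IVL_PQ-0'>"
        + "<sch:assert test='@nullFlavor or low or high'>null rules</sch:assert>"
        + "</sch:rule></sch:pattern>"
        + "</xs:appinfo></xs:annotation></xs:complexType>"
        + "</xs:schema>";

    /** Maps each rule id found inside an xs:complexType to its assertion tests. */
    static Map<String, List<String>> harvest(String xsd, String schNamespace) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the *NS lookups below
            doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xsd)));
        } catch (Exception e) {
            throw new IllegalArgumentException("schema is not well-formed XML", e);
        }
        Map<String, List<String>> testsByRuleId = new LinkedHashMap<>();
        NodeList types = doc.getElementsByTagNameNS(XSD_NS, "complexType");
        for (int i = 0; i < types.getLength(); i++) {
            Element type = (Element) types.item(i);
            NodeList rules = type.getElementsByTagNameNS(schNamespace, "rule");
            for (int j = 0; j < rules.getLength(); j++) {
                Element rule = (Element) rules.item(j);
                List<String> tests = testsByRuleId.computeIfAbsent(
                        rule.getAttribute("id"), id -> new ArrayList<>());
                NodeList asserts = rule.getElementsByTagNameNS(schNamespace, "assert");
                for (int k = 0; k < asserts.getLength(); k++) {
                    tests.add(((Element) asserts.item(k)).getAttribute("test"));
                }
            }
        }
        return testsByRuleId;
    }
}
```

The harvested tests are still unprefixed at this point, which is why the prefix "fix" is the next step in the pseudocode.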
ExtractSchematron – the output
• “Abstract”
<sch:rule abstract="true" id="IVL_PQ-0">
  <sch:assert test="(@nullFlavor and not(hl7:any|hl7:low|hl7:high|hl7:width)) or (not(@nullFlavor) and (hl7:any|hl7:low|hl7:high|hl7:width))">
    null rules
  </sch:assert>
</sch:rule>
• “Concrete”
<sch:pattern name="concrete rules">
  <sch:rule context="ns0:buildTemplateResponse/responseEnvelope/hl7:subject2/hl7:sequenceNumber/hl7:uncertainty[@xsi:type and fn:resolve-QName(@xsi:type, self::node())=fn:QName('urn:hl7-org:v3', 'PQ')]
      | ns0:buildTemplateResponse/responseEnvelope/hl7:subject2/hl7:priorityNumber/hl7:uncertainty[@xsi:type and fn:resolve-QName(@xsi:type, self::node())=fn:QName('urn:hl7-org:v3', 'PQ')]
      | ns0:buildTemplate/templateParameter/hl7:parameterItem/hl7:value[@xsi:type and fn:resolve-QName(@xsi:type, self::node())=fn:QName('urn:hl7-org:v3', 'QTY')][@xsi:type and fn:resolve-QName(@xsi:type, self::node())=fn:QName('urn:hl7-org:v3', 'PQ')]
      | ns0:buildTemplate/templateParameter/hl7:parameterItem/hl7:value[@xsi:type and fn:resolve-QName(@xsi:type, self::node())=fn:QName('urn:hl7-org:v3', 'PQ')]">
    <sch:extends rule="PQ-0"/>
  </sch:rule>
</sch:pattern>
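A context like the one in the “concrete” rule can be assembled mechanically once the paths are known. The following is a minimal sketch, assuming the path discovery has already happened; ConcreteContext and TypedPath are hypothetical names, and the rule that a path gets the xsi:type predicate only when the element is declared with a supertype is an interpretation of the slides, not confirmed project behavior.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of building a schematron rule context; names are illustrative.
final class ConcreteContext {
    static final String HL7_NS = "urn:hl7-org:v3";

    /** A path to an element, flagged when its declared type is only a supertype of the target. */
    static final class TypedPath {
        final String path;
        final boolean declaredAsSupertype;

        TypedPath(String path, boolean declaredAsSupertype) {
            this.path = path;
            this.declaredAsSupertype = declaredAsSupertype;
        }
    }

    /** Predicate that keeps only elements whose runtime xsi:type is the given HL7 type. */
    static String xsiTypePredicate(String typeName) {
        return "[@xsi:type and fn:resolve-QName(@xsi:type, self::node())=fn:QName('"
                + HL7_NS + "', '" + typeName + "')]";
    }

    /** Joins all paths with the XPath union operator, adding the predicate where needed. */
    static String contextFor(String typeName, List<TypedPath> paths) {
        return paths.stream()
                .map(p -> p.declaredAsSupertype ? p.path + xsiTypePredicate(typeName) : p.path)
                .collect(Collectors.joining(" | "));
    }

    public static void main(String[] args) {
        System.out.println(contextFor("PQ", Arrays.asList(
                new TypedPath("ns0:buildTemplate/templateParameter/hl7:parameterItem/hl7:value", true),
                new TypedPath("ns0:example/hl7:quantity", false)))); // second path is made up
    }
}
```

The hard part the slides describe, enumerating the paths in the presence of recursive datatypes, is outside this sketch.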
Integrating into the SOAP Stack
• Applying the generated schematron at runtime
  • CXF interceptor to apply the schematron
  • CXF interceptor to detect whether errors occurred and raise a fault
  • Need two interceptors because they work at different spots in the CXF processing chain
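To make concrete what applying one generated assertion to a message amounts to, the IVL_PQ-0 test shown earlier can be evaluated with the JDK's built-in XPath 1.0 engine. This is not the CXF interceptor and not a schematron processor: NullRuleCheck is a hypothetical class, and the sample elements in the usage note are made up.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch: evaluates a single assertion, nothing more.
final class NullRuleCheck {
    static final String HL7_NS = "urn:hl7-org:v3";

    // The prefixed assertion from the "abstract" IVL_PQ-0 rule.
    static final String TEST =
          "(@nullFlavor and not(hl7:any|hl7:low|hl7:high|hl7:width)) or "
        + "(not(@nullFlavor) and (hl7:any|hl7:low|hl7:high|hl7:width))";

    /** Returns true when the document element of the given XML satisfies the assertion. */
    static boolean passes(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind the injected hl7: prefix; unprefixed names would match no-namespace elements only.
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override public String getNamespaceURI(String prefix) {
                    return "hl7".equals(prefix) ? HL7_NS : XMLConstants.NULL_NS_URI;
                }
                @Override public String getPrefix(String uri) { return null; }
                @Override public Iterator<String> getPrefixes(String uri) {
                    return Collections.emptyIterator();
                }
            });
            return (Boolean) xpath.evaluate(TEST, doc.getDocumentElement(), XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate assertion", e);
        }
    }
}
```

For instance, `NullRuleCheck.passes("<value xmlns='urn:hl7-org:v3' nullFlavor='UNK'><low value='1'/></value>")` fails the rule, because a null interval must not carry bounds. In the project the same kind of check runs inside the interceptor chain, with a second interceptor turning failures into faults.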
Results
• Were now able to successfully validate the built-in ISO 21090 datatype constraints
• With the shiny new schematron facility, decided to start using it for custom validation as well
• But not 100% rosy
  • Slow(ish)
  • Memory intensive
  • Can cause problems with Xalan/Saxon on the classpath
The monkey-wrench
• Architecture change: switch to a Tolven backend
  • Now RP, RO, CO, CS (kind of), still using conversion for RO <-> CO
  • No more JPA Bean validation
  • Still use some “External” bean validation
• Datatype Specification added; project RMIMs start using flavored datatypes
  • One sprint later, we had 200 QA bugs for flavor validation
How to validate datatype flavors?
• Fully MIF-driven
  • We did not have time to build this
  • Some off-the-shelf tooling was available, but not on our platform
• Write all the rules by hand
  • Seemed painful
• What if we could leverage ExtractSchematron?
  • XML ITS does not have explicit types for flavors
  • But if we add them, we could annotate them with schematron rules and use ExtractSchematron to harvest them
  • So this is the approach we took
The approach
• Each flavor derives from the base datatype by restriction
  • And we have to do it for container types as well
• Flavor definitions go into flavors.xsd, which imports iso-21090.xsd
• Each flavor type is then annotated with schematron, just like iso-21090.xsd
  • Still have to write the actual schematron rules by hand, based on the Datatype specification
  • MIF representation not available, and OCL-to-schematron translation would be beyond our scope
• RMIM schema modified to reference the flavor XSD types
  • Because the derivation is by restriction, this is fully backwards compatible: valid instances look the same
  • For abstract types and flavors, can use either xsi:type or flavorId (for backward compatibility) to specify the flavor
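The backwards-compatibility claim can be checked on a toy schema with the JDK's XSD validator. Everything in this sketch is hypothetical: the flavor name II.EXAMPLE, its schematron rule, the two element declarations, and the schematron namespace are made up for illustration and do not come from the project's flavors.xsd. Note that XSD validation ignores the embedded rule; it only shows that the restriction-derived type accepts the same instances.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical sketch of a flavor derived by restriction and annotated with schematron.
final class FlavorByRestriction {
    static final String FLAVORS_XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " xmlns:sch='http://purl.oclc.org/dsdl/schematron'" // assumed namespace
        + " xmlns='urn:hl7-org:v3' targetNamespace='urn:hl7-org:v3'"
        + " elementFormDefault='qualified'>"
        // Simplified stand-in for a base datatype.
        + "<xs:complexType name='II'>"
        + "<xs:attribute name='root' type='xs:string'/>"
        + "<xs:attribute name='extension' type='xs:string'/>"
        + "</xs:complexType>"
        // Made-up flavor: structurally identical, its constraint lives in the annotation.
        + "<xs:complexType name='II.EXAMPLE'>"
        + "<xs:annotation><xs:appinfo>"
        + "<sch:rule abstract='true' id='II.EXAMPLE-0'>"
        + "<sch:assert test='@root'>root is required</sch:assert>"
        + "</sch:rule>"
        + "</xs:appinfo></xs:annotation>"
        + "<xs:complexContent><xs:restriction base='II'/></xs:complexContent>"
        + "</xs:complexType>"
        + "<xs:element name='id' type='II.EXAMPLE'/>" // model references the flavor type
        + "<xs:element name='legacyId' type='II'/>"   // model still uses the base type
        + "</xs:schema>";

    /** Returns true when the instance is schema-valid against the toy flavors schema. */
    static boolean isValid(String instance) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(FLAVORS_XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("sample schema failed to compile", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(instance)));
            return true;
        } catch (SAXException e) {
            return false; // the instance was rejected by the schema
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

An instance written before flavors existed, such as `<id xmlns='urn:hl7-org:v3' root='1.2.3' extension='42'/>`, still validates when the element is declared with the flavor type, and an element declared with the base type can opt in with `xsi:type='II.EXAMPLE'`.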
Putting it into practice
• Modify the V3 Generator
  • StaticMifToXsd.xsl modified to use flavor names in RMIM schemas
  • RimInfrastructureRootToXsd.xsl modified to add a reference to flavors.xsd
  • Both changes conditional on a build-time parameter, so backward-compatible
• Modify the implementation
  • JAXB now generates Java beans for flavor types
  • Have to update Dozer rules and other code accordingly
• Remaining challenges
  • Permanent home for the V3 Generator changes
  • Possible divergence from the official ITS spec
  • JAXB flavor beans cause a lot of overhead and over-tight coupling
Lessons Learned
• Schematron is a powerful tool, but it is complex and has limits
  • Cannot do vocabulary
  • Suffers from the lack of full implementations of XPath2 and XSLT2
• XML ITS for datatypes would be better off with a separate namespace and explicit element names
• In the end, the HDF and HL7 modeling approach strongly require an MDA-oriented implementation, with full MIF awareness
  • Everything else is a band-aid
  • Therefore must invest in high-quality MIF-based toolchains
• Validation must distinguish between the object model, document, and message perspectives of HL7 v3
  • The same constructs are used to address all three, but the intent and semantics are different
  • Validation strategies should adapt accordingly
Resources
• Source code: http://caehrorg.jira.com/svn/ESD/trunk
• Contact info
  • Dan Kokotov: dkokotov@5amsolutions.com
  • Todd Parnell: tparnell@5amsolutions.com