ACORD’s Experiences using W3C Schemas
Dan Vint, Senior Architect
dvint@acord.org
Background
• ACORD
  • Not-for-profit insurance industry standards body; we have standards for Life, P&C and RLC lines.
  • Used in the US, Canada, England/London Market, and Europe. Growing interest in Japan/China.
• Three standards started in three different ways:
  • P&C: based upon an earlier EDI standard. Originally designed for DTD use, but added schema support when it became available (currently using both).
  • Life: started as an MS initiative, then brought into ACORD. Originally DTD based, now using schema only.
  • RLC: started in a European standards group, also based upon an EDI standard. Currently using DTDs and trying to add schema support.
• Business issue: how to bring the three standards into one design
  • For ACORD the best thing would be a common model, design and naming conventions.
  • Can’t get there because we are on different cycles: Life and RLC cannot break backwards compatibility, while P&C is currently redesigning.
  • Each has a level of adoption and entrenchment.
Things that we like:
• Having the schema format be an XML document has been very useful
  • We generate schemas and DTDs from a meta-format that is maintained in a database
  • We slice the schema into smaller subsets via XSLT
• Support for basic data types was crucial for our needs.
• Having a built-in documentation standard is very nice.
• Support for namespaces and import/include/redefine is useful, but rough around the edges.
• Can we get this sort of support reverse-engineered into XML and DTDs?
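Because a schema is itself an XML document, slicing it is ordinary XML processing. The deck does this with XSLT; the sketch below swaps in Python's standard-library ElementTree purely for illustration. The miniature schema and every element and type name in it are invented, and only `type` and `base` references to unprefixed, locally defined types are followed (no `ref`, groups, or imports).

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical miniature schema; all element and type names are invented.
FULL_SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="PolicyRq" type="PolicyRqType"/>
  <xsd:element name="ClaimRq" type="ClaimRqType"/>
  <xsd:complexType name="PolicyRqType">
    <xsd:sequence>
      <xsd:element name="Insured" type="PartyType"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="ClaimRqType">
    <xsd:sequence>
      <xsd:element name="LossDt" type="xsd:date"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="PartyType">
    <xsd:sequence>
      <xsd:element name="Name" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>"""


def slice_schema(schema_text, root_element):
    """Return a new schema holding one global element plus the named types it reaches."""
    schema = ET.fromstring(schema_text)
    elements = {e.get("name"): e for e in schema.findall(XSD + "element")}
    types = {t.get("name"): t
             for tag in ("complexType", "simpleType")
             for t in schema.findall(XSD + tag)}

    keep = [elements[root_element]]
    pending = [elements[root_element]]
    seen = set()
    while pending:
        node = pending.pop()
        for child in node.iter():  # iter() includes the node itself
            ref = child.get("type") or child.get("base")
            # Unprefixed references point at types defined in this schema;
            # prefixed ones such as xsd:date are built-ins here and need no copy.
            if ref and ":" not in ref and ref in types and ref not in seen:
                seen.add(ref)
                keep.append(types[ref])
                pending.append(types[ref])

    subset = ET.Element(XSD + "schema")
    subset.extend(keep)
    return subset


subset = slice_schema(FULL_SCHEMA, "PolicyRq")
kept_names = sorted(node.get("name") for node in subset)
print(kept_names)
```

The subset for the hypothetical `PolicyRq` message keeps its own type and the `PartyType` it depends on, and drops everything that only `ClaimRq` needs.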
Things we don’t like about W3C Schema
• Took a simple 30 page standard and blew it out of proportion.
  • XML took a simplifying approach; schema reversed that and put in experimental features and complexity
• Fallout:
  • You need tools to manage a schema
  • No two tools agree on their implementations
  • Incomplete implementations
• Should have left out xsd:all; DTDs did, and the Schema implementation is not generally usable anyway.
• I can teach DTD syntax in an hour or less; it takes a day for schema
• Entities require the use of DTDs.
  • “Yuck! DTDs, that’s old technology and a kludge.” Surely if the schema guys had wanted entities to be used they would have defined a new method (or they would have at least documented this use).
Things we don’t like about W3C Schema (cont)
• No way to clearly specify the Schema
  • Hint attributes!
  • Confused world of support and no support for the hints; inconsistent on what should happen when they appear
• No way to version a schema that is supported by tools
  • xsd:version attribute has no use or support
  • Namespaces are the only way to force recognition of a change
• Namespace designation of extensions: want a different namespace rather than the original ACORD one (namespace pollution)
• Extensions: did content really have to be added outside the original choice or sequence?
• Extensibility of enumerated lists
  • Probably 1/3 of the elements in ACORD are based upon code lists, and we can’t extend them.
  • Need a way to identify the extended code values. We have our type based upon QName, but this doesn’t work for external lists.
Things we don’t like about W3C Schema (cont)
• No easy way to either validate or ignore extensions
  • We need extensions, but not all trading partners will know what the extensions are, so they need to at least be able to validate the basic ACORD information and structure
  • We have two designs within our standard
    • One uses global elements and groups with restriction and extension
    • Another uses an <Extension> element that is of type xsd:any and is added at the end of our aggregates
  • How about a processContents="strict" or processContents="lax"
• We could use co-constraints and/or “business rules”
• We publish a huge schema: 700 messages, 29,000 lines. We need help in modularizing it to make the smallest schema possible
• Business level validation: just because a date is in the wrong format, should I reject a $50 million transaction?
• Don’t like substitutionGroups because it is not easy to determine the final content model
  • I like the old SGML notion of “no surprises”
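The "validate the basic structure, ignore what you don't know" behaviour asked for above can be sketched outside the schema language. This is a minimal illustration in Python, not an ACORD mechanism: the `CORE_REQUIRED` table, the message, and the partner namespace are all hypothetical, and only the <Extension> element name comes from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical core model: aggregate name -> children the base standard requires.
CORE_REQUIRED = {
    "Policy": ["PolicyNumber", "EffectiveDt"],
}

# Hypothetical message carrying one partner-specific extension.
MESSAGE = """<Policy>
  <PolicyNumber>ABC-123</PolicyNumber>
  <EffectiveDt>2004-01-01</EffectiveDt>
  <Extension>
    <PartnerRiskScore xmlns="urn:example:partner">42</PartnerRiskScore>
  </Extension>
</Policy>"""


def check_core(aggregate):
    """Check required core children; skip whatever sits inside <Extension>."""
    problems = []
    present = [child.tag for child in aggregate if child.tag != "Extension"]
    for name in CORE_REQUIRED.get(aggregate.tag, []):
        if name not in present:
            problems.append(f"{aggregate.tag}: missing {name}")
    for child in aggregate:
        if child.tag == "Extension":
            continue  # unknown partner content is tolerated, not inspected
        problems.extend(check_core(child))
    return problems


print(check_core(ET.fromstring(MESSAGE)))
```

A trading partner that has never seen the extension still gets a verdict on the core content, which is the behaviour a lax wildcard is meant to give.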
Where we need help
• Need an integrated approach that supports all tools
  • Code generators are my biggest headache currently: they don’t support all features, don’t expect to, and have decided what subset of XML they think is appropriate
    • Groups
    • Redefine
    • Attributes on data level elements
    • Too many levels of inheritance in classes, which creates complex code
    • ‘abstract’
  • Possible solution: profiles
    • XML DTD level support plus data types (import, include, namespaces)
    • DTD+ with base types (abstracts, redefine)
    • Schema extreme (key, unique, substitution groups)
• Should the W3C consider specifying a language that uses XML as a data type? Scripting or compiled
  • XML should become a native data type
• Consider the adoption of a catalog approach to schema location, general processing, and versioning
• One standard has a wide content model for messages and uses a type code to designate the type of message being sent
  • Everything is optional in the core design of elements
  • We need a very rich method for managing co-constraints based upon several hundred type codes
  • Would really like to somehow get auto-magic schema validation of these real message requirements
• We want to standardize on the resulting data stream when it comes to extensions
  • Need a way to prevent the use of features we don’t want to support.
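The type-code co-constraint problem above (everything optional in the schema, with the real requirements depending on a code value) is the kind of rule that ends up in application code today. A minimal sketch of such a rule table in Python follows; the `TransType` element, the codes "1" and "2", and the required-element lists are all invented for illustration and are not taken from any ACORD standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table: type code -> elements that become mandatory.
REQUIRED_BY_TYPE_CODE = {
    "1": ["Applicant", "Coverage"],  # invented code for a submission-style message
    "2": ["PolicyNumber"],           # invented code for an inquiry-style message
}


def missing_for_type_code(message):
    """List elements required by the message's type code but absent from it."""
    code = message.findtext("TransType")
    if code not in REQUIRED_BY_TYPE_CODE:
        return [f"unknown type code: {code}"]
    return [name for name in REQUIRED_BY_TYPE_CODE[code]
            if message.find(name) is None]


inquiry = ET.fromstring(
    "<Request><TransType>2</TransType><PolicyNumber>ABC-123</PolicyNumber></Request>")
submission = ET.fromstring(
    "<Request><TransType>1</TransType><Applicant/></Request>")

print(missing_for_type_code(inquiry))
print(missing_for_type_code(submission))
```

With several hundred type codes the table grows large, which is why having the schema validator enforce it would be preferable to hand-written checks like this one.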
Where we need help (cont)
• Reuse of other schemas
  • Good intentions, but impossible to do unless designed at the same time
    • Design principles and modeling techniques
    • Based upon different datatypes, with/without id attributes
  • I want to share part C of an outside schema. The outside schema has a definition for Party that is used in C; I also have a Party definition, and I need to substitute it in part C to make it useful
  • No easy way to take a part of the other schema unless it is in a subset file
    • All I want is codelist A
    • All I want is the Party definition
• Need to promote a standard way to define codelists and encourage standards producers to publish them in XML based upon that standard. See previous bullet.
• Want extensions to be identified against the extender, not the originator
  • Propose a new attribute "hideBase" to manage “namespace pollution”
Why ask why?
• Why do xml special attributes require definition in schema, but not DTD?
  • xml:space requires an include file, but xmlns is automagically there!
  • How will the xml:id attribute be implemented?
• Why is a simpleType not always simple?
  • Why does adding an attribute make a simpleType become complex with a simple content?
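The last question can be made concrete with two declarations. The sketch below embeds a small hypothetical schema in Python (the `Amount` and `AmountWithCurrency` names are invented) and reports which kind of declaration each name needs: the bare decimal value is a simpleType, while the same value carrying one attribute has to be declared as a complexType with simpleContent.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical declarations: an amount as plain text, then the same amount
# with a currency attribute added.
SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="Amount">
    <xsd:restriction base="xsd:decimal"/>
  </xsd:simpleType>
  <xsd:complexType name="AmountWithCurrency">
    <xsd:simpleContent>
      <xsd:extension base="Amount">
        <xsd:attribute name="currency" type="xsd:string"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>
</xsd:schema>"""

schema = ET.fromstring(SCHEMA)
# Map each declared name to the kind of declaration it required.
kinds = {node.get("name"): node.tag.replace(XSD, "xsd:") for node in schema}
print(kinds)
```

Code generators that map complexType to a class and simpleType to a field therefore produce a whole class for what a business user sees as a single value.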