ISO Healthcare Data Types: Interactions with Metadata Registries Metadata Open Forum 22-05-2008 Grahame Grieve
Overview • History & Overview of ISO 21090 • Using ISO 21090 data types with 11179 registries
History of ISO 21090 • Project chartered long ago (> 10 years) • Held up repeatedly by disagreements between stakeholder SDOs: • ISO • CEN • HL7
HL7 • Health Level 7 (from ISO OSI) • US-based ANSI accredited standards developer • Membership: • Vendors • Providers • Government Programs • Consultants • Academics
Scope of HL7 • Administrative Healthcare Data • Healthcare financial information • Clinical/EHR data • Clinical Trials & Clinical Safety • Clinical Decision Support • Enterprise architecture • Technology bindings (layers 4-6) • Whatever the members want!
HL7 v3 • HL7 is in the process of completely updating its approach • V2 (old): • ad hoc modelling • single technology binding • widely used • V3 (new): • rigorous consistency / not semantically ambiguous • high level of reuse / multiple technology bindings • uptake starting to gain critical mass
Overview of V3 • RIM (Reference Information Model): solid ontological foundation • Constraint Models (DMIM, DIM, RMIM, HMD, MT, CIM, LIM, TIM): adaptation to a context of use • Wire Formats (XML ITS): how to make it work • Data Types • Concept • Implementation • Terminology • Definitions (a la ISO 704)
CEN 13606 • RIM (Reference Model), Part #1: solid ontological foundation • Archetypes, Part #2: adaptation to a context of use • Exchange Format, Part #5: how to make it work • Data Types • Concept • Implementation • Terminology • Definitions (a la ISO 704)
HL7 / ISO data types • The data types are ubiquitous • All actual data is stored in data types • Plus content model bindings • RIM is a general model for everything • The data types are where the pedal hits the metal • The data types are designed with rigor for robust re-use in multiple contexts
Data Types Requirements • Massive requirements gathering: • v2 semantics must all be supported • Years of requirements gathering • many MBs of archived meeting minutes, wiki content, ballot reconciliation, emails, skype chats • review by other standards organisations, experts • most available publicly, but not gathered together
What’s the Political Discord? • HL7 V3 Data Types highly controversial • Improper UML diagrams • Modeling by constraint • Gap between specification and implementation • Open war about null flavors • CEN rejected V3 Data types • Strong input from OpenEHR • Several candidate standards proposed • ISO: Several candidate standards proposed • A mess: personal, and a disaster
Specification and Implementation • Different wire encodings • XML • ASN.1 • Different layers • XML • Object Model • Different Paradigms • Web 2 • SOA
Specification and Implementation • Defining a specification in terms of any one layer creates problems: • instability • binding/bridging problems • Implementer choice • OMG • PIM: Platform Independent Model • PSM: Platform Specific Model
Abstract & XML: Gap • HL7 Data types are specified in an abstract form (defines everything from scratch, except true and false) • The abstract form is very useful as a rigorous semantic statement • The abstract form is completely useless as a basis for implementation • An XML form was defined for implementation
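The split between an abstract definition and its wire forms can be sketched in a few lines. This is only an illustration: `PhysicalQuantity`, the `quantity` element name, and the attribute names are made up for the example and are not taken from the HL7 XML ITS, and JSON stands in here for a second binding (the slides name XML and ASN.1).

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalQuantity:
    """Platform-independent definition: says nothing about encoding."""
    value: float
    unit: str


def to_xml(q: PhysicalQuantity) -> str:
    # One platform-specific binding (hypothetical element and attribute names).
    el = ET.Element("quantity", {"value": repr(q.value), "unit": q.unit})
    return ET.tostring(el, encoding="unicode")


def from_xml(text: str) -> PhysicalQuantity:
    el = ET.fromstring(text)
    return PhysicalQuantity(float(el.get("value")), el.get("unit"))


def to_json(q: PhysicalQuantity) -> str:
    # A second binding of the same abstract value.
    return json.dumps({"value": q.value, "unit": q.unit})


dose = PhysicalQuantity(2.5, "mg")
round_tripped = from_xml(to_xml(dose))
```

Both bindings carry the same value; only the abstract class states what the value means.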
RM-ODP Viewpoints • Enterprise - purpose, scope and policies of system • Information – kinds of information, and constraints & use of it • Computational - functional decomposition into objects that interact • Engineering - infrastructure required to support system distribution • Technology - technology to support system distribution
RM-ODP Mappings • Enterprise • Information • Computational • Engineering • Technology • Abstract Data Types • XML ITS Data Types
RM-ODP Mappings • Enterprise • Information • Computational • Engineering • Technology • Abstract Data Types • XML ITS Data Types Missing!
11179 • Data Element Concept is abstract • Data Element is partially technology bound – neither one nor the other • 11179 Attributes specify “representation”, not semantics • Representation is not sufficiently technology bound to allow semantic interoperability • Representation is not sufficiently technology bound to allow technical interoperability
11179 • Example
ISO Data Types • A computable/implementable standard • Full UML specification (PIM) • Rigorous XML representation (PSM) • Proper implementation of abstract data types • Mappings to other major alternatives • Acceptable to everyone • Grudging acceptance! • The ISO data types are still full of compromise
ISO Data Types - Progress • Work began in 2005 (conceived in UK) • Combined data types are currently undergoing simultaneous ballot by • CEN • ISO (DIS) • HL7 • Combined resolution in Istanbul in October • Implementation trials are under way
In the data types... • Infrastructural things (Lists, Sets, Booleans) • Text & multimedia content • Coded Concepts / Terminology References • Identification • Names and Addresses • Quantities • Intervals and Complex Set Expressions • Uncertain Values
CD: Concept Reference • Code – unambiguous reference to a concept defined in a terminology • ValueSet – reference to a value set from which the concept was allowed to come • Display & Original Text • Coding Information – information about how this was coded • Translations – alternate concept references
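As a rough sketch of the shape described on this slide, a concept reference could be modelled as below. The field names are assumptions rather than the normative ISO 21090 attribute names, `code_system` is added on the assumption that a code needs to name its terminology, the "coding information" part is left out, and every code value is invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class CD:
    """Hypothetical concept reference; not the normative definition."""
    code: Optional[str] = None           # reference to a concept
    code_system: Optional[str] = None    # terminology that defines the code
    value_set: Optional[str] = None      # set the concept was allowed to come from
    display_name: Optional[str] = None   # human-readable rendering
    original_text: Optional[str] = None  # what the user actually saw or typed
    translations: Tuple["CD", ...] = ()  # alternate concept references


def all_codings(cd: CD) -> List[Tuple[Optional[str], Optional[str]]]:
    # The primary coding first, then each translation, as (system, code) pairs.
    return [(c.code_system, c.code) for c in (cd, *cd.translations)]


local = CD(code="X1", code_system="local-codes")
primary = CD(
    code="A100",
    code_system="example-terminology",
    value_set="example-findings",
    display_name="Example finding",
    original_text="pt reports example finding",
    translations=(local,),
)
```

A receiver that does not understand the primary terminology can walk `all_codings(primary)` looking for a system it does know, and fall back to `original_text` if none matches.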
Relationship with 11404 • 11404 defines General Purpose Data types (GPDs) • 21090 builds on this to define healthcare specific data types (HDTs) • Why does healthcare need data types? • Why does it define data types with the same name & function as GPD data types?
What’s a Data Type? • Multiple Answers • In this context: • What you use for an attribute type • Has no identity, so cannot have an association • does not have associations (actually all associations are compositions)
What’s a Data Type? • Only has a value (No identity) • Defined in terms of possible operations • Operations yield new values – the original value does not change (Immutable) • Does not have state – the value is unchanging in all contexts (No life cycle) • Is a single unit of semantic coherency
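These properties are what most languages call a value type. A minimal sketch, assuming a made-up `Quantity` type that is not part of the standard:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Quantity:
    """Hypothetical value type: compared by value, never by identity."""
    value: float
    unit: str

    def plus(self, other: "Quantity") -> "Quantity":
        # An operation yields a new value; neither operand changes.
        if self.unit != other.unit:
            raise ValueError("units differ")
        return Quantity(self.value + other.value, self.unit)


a = Quantity(5.0, "mg")
b = Quantity(5.0, "mg")
total = a.plus(b)
```

Here `a == b` holds although they are separate objects, `total` is a third value, and any attempt to assign to `a.value` raises `FrozenInstanceError`.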
Why Healthcare Data Types? • Records do not have the desired semantic rigour • Need objects with defined life cycles • Defining life cycles is inappropriate for some basic concepts • Define them as a black box • Specify their operations • Treat them as re-usable value domains • Data Types!
Why Healthcare Data Types? • 11404 concepts: • Strings, Integers, Reals, Booleans • Sets, Lists, Bags, Intervals • Higher Level Concepts: • Terminology Bindings • Physical Values/Measurements • Inclusion of externally defined content • Specification of mathematical sets • Names, Addresses, Identities
Why Redefine 11404 Types? • 21090 defines types that represent the same concept as 11404 types • SET • LIST • BAG • INT • REAL • Why do this?
NullFlavor • Incomplete / Partial Data is common in healthcare • Patient is unconscious • Patient was unwilling to provide information • Patient isn’t sure when the operation was • Data is not (yet?) available • User does not have access to the information • Code system doesn’t contain concept • A trace detected, but not enough to quantify
nullFlavor • Every data element has a nullFlavor attribute • Semantically, the nullFlavor values are part of the value domain of every type or class • Data elements that have a nullFlavor may also have other information • nullFlavor is not the same as null in technologies such as OCL and SQL • But behaves a lot like them
Primitive Type Pattern • Use types in 11404 • Need to give them all nullFlavor • nullFlavor defined on base type ANY • Define a class with a standard name • specialize ANY • give it an attribute “value” • “value” type is the appropriate 11404 type
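The pattern can be sketched as follows. This is an illustration under assumptions, not the normative model: the Python field names are invented, and `"UNK"` is used only as a sample nullFlavor code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ANY:
    # nullFlavor is defined once on the base type, so every type inherits it.
    null_flavor: Optional[str] = None

    @property
    def is_null(self) -> bool:
        return self.null_flavor is not None


@dataclass(frozen=True)
class INT(ANY):
    # "value" wraps the general-purpose integer; INT itself adds only the wrapper.
    value: Optional[int] = None


known = INT(value=42)
unknown = INT(null_flavor="UNK")  # sample code meaning "unknown"
```

`INT` does not redefine what an integer is; it specializes `ANY` and delegates the number itself to the wrapped `value`.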
Why redefine 11404 types? • 21090 defines types with similar names and functionality to 11404 • 21090 does not redefine 11404 types • It wraps them using the primitive type pattern • Causes much confusion
11179 Registries & Data Types • Is it appropriate to use the health care data types as a type in a 11179 registry?
11179 Attributes of interest • Datatype • Representation Category • Minimum Size • Maximum Size • Layout of Representation (n(3).n(3)E2) • Permissible Data Element Values
Datatype • The format used for the collection of letters, digits, and/or symbols, to depict values of a data element, determined by the operations that may be performed on the data element. • The data type does not define the operations, only the representation • What defines the operations?
11179 Attributes of interest • Datatype • Representation Category • Minimum Size • Maximum Size • Layout of Representation (n(3).n(3)E2) • Permissible Data Element Values
Representation Category • “Type of symbol, character or other designation used to represent a data element.” • The representation category shall be specified by the relevant standard
Representation Category • Representation Class: An informational list of representation class terms is provided: • Code / Text • Count / Quantity / Currency • Date / Time • Graphic / Icon / Picture • “By using representation class, enhanced semantic control over the contents of value domains can be maintained” • 11179 definitions are loose: core things are not nailed down
Maximum Size • The maximum number of storage units (of the corresponding datatype) to represent the data • This is confusing • Name has type character string, and a size of 80 • So you can have 80 names?
Health Care Data Types • The Health Care Data Types subsume these several concepts • Define other attributes that are required • Cardinality • Concept bindings • Min & Max Length • Mandatory / Required / other constraint levels
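One way to picture these extra attributes is to keep cardinality (how many values) separate from length (how long each value is), which is the distinction a bare "size" leaves open. The record and checker below are hypothetical; none of the names come from ISO 21090 or 11179, and a plain set of strings stands in for a concept binding.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Sequence


@dataclass(frozen=True)
class AttributeConstraint:
    """Hypothetical constraint record; field names are illustrative only."""
    min_occurs: int = 0
    max_occurs: Optional[int] = 1                    # None means unbounded
    min_length: int = 0
    max_length: Optional[int] = None                 # None means no limit
    allowed_values: Optional[FrozenSet[str]] = None  # stand-in for a concept binding


def violations(values: Sequence[str], c: AttributeConstraint) -> List[str]:
    problems: List[str] = []
    # Cardinality: how many values are present.
    if len(values) < c.min_occurs:
        problems.append("too few values")
    if c.max_occurs is not None and len(values) > c.max_occurs:
        problems.append("too many values")
    # Length and binding: checked per value.
    for v in values:
        if len(v) < c.min_length:
            problems.append(f"value too short: {v!r}")
        if c.max_length is not None and len(v) > c.max_length:
            problems.append(f"value too long: {v!r}")
        if c.allowed_values is not None and v not in c.allowed_values:
            problems.append(f"value not in bound set: {v!r}")
    return problems


# Exactly one name, at most 80 characters long.
name_rule = AttributeConstraint(min_occurs=1, max_occurs=1, max_length=80)
```

With `name_rule`, one 80-character name passes, while two short names fail on cardinality rather than on size.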
Health Care Data Types • 11179 allows other attribute types that can define these other things • Render existing mandatory and conditional attributes irrelevant or misleading • Can use health care data types without loss of meaning since there isn’t much in the first place • i.e. Are the types composite types?
11179 & External Data Types • Same issues apply to use of 11404 General Purpose Data types • And also W3C Schema types • Editor’s draft allows for referencing a particular data type specification • Need a profile for each specification
Describing Data Types in 11179 • Is this even appropriate? • Attributes that make up data elements are not re-usable in other contexts • Intent and definition is completely context bound • General Issue: HL7 specifications define re-usability at a much higher level than data element level
Describing Data Types in 11179 • General Issue: Need to make formal (computable) statements about • Operations • Life Cycle • Constraints • Conformance Requirements • 11179 allows some of this, but provides no formal framework • HL7 rolls its own XML syntax
OWL & Healthcare Data Types • There is ongoing interest in representing the healthcare data types in OWL to facilitate formal logic • The health care data types are part of a larger web: • Terminology & Value Sets • Data Types & Reference Classes • Constrained Models for use cases • Process models (Services etc) • No comprehensive approach yet