Evolving Health Informatics Semantic Frameworks and Metadata-Driven Architectures. Jim Davies, Jeremy Gibbons, Steve Harris and Denise Warzel.
It commonly happens that there is no one conclusive study, but 'meta-analysis' gives you a combined result from every randomised trial that has been done around the world. For example, the drug Tamoxifen, an oestrogen blocker that may prevent breast cancer cells growing, was the object of forty-two studies world-wide, of which only four or five had shown significant benefits. But this did not mean that Tamoxifen did not protect against breast cancer. 'When we put all the studies together it was blindingly obvious that it does; you don't have to be a medical statistician to see that. Nor do you need to be an economist to see the advantages of saving tens of thousands of lives with this inexpensive drug.' Prof. Richard Gray, University of Birmingham
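The arithmetic behind the quote can be sketched as follows. This is a generic inverse-variance fixed-effect pooling, not necessarily the method the trialists used, and the five trial results are invented for illustration: each trial alone falls short of significance, while the pooled estimate does not.

```python
import math

def pooled_fixed_effect(estimates, std_errors):
    """Inverse-variance fixed-effect pooling of per-study effect estimates."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

def z_score(estimate, std_error):
    """Standardised effect; |z| > 1.96 is significant at the usual 5% level."""
    return estimate / std_error

# Hypothetical data: five small trials, each reporting a log odds ratio
# of -0.20 with standard error 0.15 (not real Tamoxifen results).
log_odds_ratios = [-0.20] * 5
std_errors = [0.15] * 5

single_z = z_score(log_odds_ratios[0], std_errors[0])
pooled, pooled_se = pooled_fixed_effect(log_odds_ratios, std_errors)
pooled_z = z_score(pooled, pooled_se)

print(f"single trial z = {single_z:.2f}, pooled z = {pooled_z:.2f}")
```

Pooling only works when every study records the same variables in the same way, which is the point the following slides make about agreed data models.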
early breast cancer trialists’ collaborative group • Initiated in 1983 • Hundreds of institutions worldwide • Consensus on 30 variables, a data model and submission format • Analyse data every 5 years • Computable data and follow-up for 200,000 cases in the 2000 review • Rock-solid evidence base for the use of Tamoxifen in early breast cancer
cisplatin in ovarian cancer • 80 studies • 4000 patients • no conclusions could be drawn • simply not enough metadata
persistence of data • Richard Doll published the first statistical analysis proving smoking causes cancer in 1956 • in 1958 Kilby and Noyce filed patents on integrated circuits, ushering in the information age • today is the 40th anniversary of the mouse
Semantic Frameworks • tools and approaches facilitating the • creation of semantically well characterised data • creation and configuration of software to collect and analyse such data • architectures that can reflect on the metadata and act accordingly • prospective and retrospective
getting more value out of data: metadata registries
metadata registries • stores definitions of common observations and attributes • attaches value domains and measurement protocols to • concepts from relevant vocabularies • measurable properties • gives value domains meaning • allowed ranges • value partitions • enumeration lists
ISO/IEC 11179 part 3 • a standard for metadata registries • metadata registry -> a collection of metadata elements • metadata element -> a template or recipe for an observation or measurement; a specification for a class attribute; database or spreadsheet column • identifier; names; informal semantics; navigational links; valid values; terminological/conceptual analysis; data type; units • can hide context inside informal semantics: density in kg/m³ at STP • US NCI caDSR; METeOR; US EPA
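A minimal sketch of a metadata element with an attached value domain, assuming a much simplified structure rather than the ISO/IEC 11179 meta-model itself. The class names, field names, identifiers and concept references are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class ValueDomain:
    """Gives recorded values meaning: type, units, range or enumeration."""
    data_type: type
    unit: Optional[str] = None
    allowed_range: Optional[Tuple[float, float]] = None
    enumeration: Optional[Tuple[str, ...]] = None

    def accepts(self, value) -> bool:
        if not isinstance(value, self.data_type):
            return False
        if self.allowed_range is not None:
            low, high = self.allowed_range
            if not (low <= value <= high):
                return False
        if self.enumeration is not None and value not in self.enumeration:
            return False
        return True

@dataclass(frozen=True)
class MetadataElement:
    """A recipe for one observation; all example values are made up."""
    identifier: str
    name: str
    definition: str
    concept_reference: str
    value_domain: ValueDomain

temperature = MetadataElement(
    identifier="example-registry:0001:1",
    name="body temperature",
    definition="Body temperature measured at enrolment.",
    concept_reference="example-vocabulary:body-temperature",
    value_domain=ValueDomain(float, unit="degrees Celsius", allowed_range=(30.0, 45.0)),
)

menopausal_status = MetadataElement(
    identifier="example-registry:0002:1",
    name="menopausal status",
    definition="Menopausal status at randomisation.",
    concept_reference="example-vocabulary:menopausal-status",
    value_domain=ValueDomain(str, enumeration=("pre", "post", "unknown")),
)

print(temperature.value_domain.accepts(37.2))
print(menopausal_status.value_domain.accepts("peri"))
```

Keeping units and ranges in the value domain, rather than in the free-text definition, is what lets software check submitted data without a human reading the informal semantics.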
office plugins • allow user access to knowledge resources at the point of need • Word: terminology • Excel: terminology and metadata elements • UML modelling: terminology, metadata elements and model fragments • InfoPath: terminology, metadata elements and document fragments
US National Marrow Donor Program (NMDP) • blood stem cells from Bone Marrow or Cord Blood treat life-threatening diseases: Leukemia and Lymphoma • sources for cells: • Marrow, Peripheral blood stem cells (PBSC), umbilical cord blood • over 13 million donors worldwide • 90,000 cord blood units • 6,000 searches per day
donor registry • conformance testing • match based on HLA antigens A, B, DRB1; good: 5 of 6 • plus antigens C, DQ; match: 9 of 10 • choose source that best matches patient • 70% of patients won’t find a suitable match in their family
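The "5 of 6" and "9 of 10" counts on this slide can be illustrated with a small sketch. It assumes two typed antigens per locus and simple string equality, which is a deliberate simplification of real HLA matching; the antigen labels are placeholders, not real typings.

```python
from typing import Dict, Iterable, Tuple

Typing = Dict[str, Tuple[str, str]]

def locus_matches(patient: Tuple[str, str], donor: Tuple[str, str]) -> int:
    """Count matched antigens at one locus, ignoring the order of the pair."""
    p1, p2 = patient
    d1, d2 = donor
    straight = (p1 == d1) + (p2 == d2)
    crossed = (p1 == d2) + (p2 == d1)
    return max(straight, crossed)

def match_score(patient: Typing, donor: Typing, loci: Iterable[str]) -> Tuple[int, int]:
    """Return (matched antigens, antigens compared) over the given loci."""
    loci = list(loci)
    matched = sum(locus_matches(patient[locus], donor[locus]) for locus in loci)
    return matched, 2 * len(loci)

# Placeholder typings: the donor differs from the patient at one B antigen.
patient = {"A": ("a1", "a2"), "B": ("b1", "b2"), "DRB1": ("r1", "r2"),
           "C": ("c1", "c2"), "DQ": ("q1", "q2")}
donor = {"A": ("a2", "a1"), "B": ("b1", "b9"), "DRB1": ("r1", "r2"),
         "C": ("c1", "c2"), "DQ": ("q1", "q2")}

six_locus = match_score(patient, donor, ("A", "B", "DRB1"))
ten_locus = match_score(patient, donor, ("A", "B", "DRB1", "C", "DQ"))
print(six_locus, ten_locus)
```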
data sharing and curation • the NMDP wants to annotate its forms with the NCI resources • 100 complex forms need to be mapped to metadata elements and concepts from caDSR and the EVS • form data elements described in Excel • Excel plug-in is used for annotation rather than data entry • XML representation of mapping recovered from the spreadsheet once complete
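A sketch of the last step on this slide, recovering an XML mapping from an annotated spreadsheet. The sheet is stood in for by CSV text; the column names, XML element names and identifiers are all assumptions, not the NMDP or caDSR formats.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical export of an annotated sheet; the last row is not yet mapped.
SHEET = """form,field,data_element_id,concept_code
example-form,recipient_birth_date,DE-0001,CONCEPT-0001
example-form,donor_type,DE-0002,CONCEPT-0002
example-form,free_text_comment,,
"""

def mapping_to_xml(sheet_text):
    """Build a mapping document and report fields still lacking annotation."""
    root = ET.Element("formMapping")
    unmapped = []
    for row in csv.DictReader(io.StringIO(sheet_text)):
        if not row["data_element_id"]:
            unmapped.append(row["field"])
            continue
        ET.SubElement(root, "field", {
            "form": row["form"],
            "name": row["field"],
            "dataElement": row["data_element_id"],
            "concept": row["concept_code"],
        })
    return root, unmapped

root, unmapped = mapping_to_xml(SHEET)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
print("still to annotate:", unmapped)
```

Reporting the unmapped fields gives a simple check that the annotation is complete before the mapping is exported.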
federation of metadata registries • registry adds value but it’s the metadata elements that really matter • ISO/IEC 11179 identifier is globally unique • common registry meta-model = registry interoperability • demonstrated between caDSR and the CancerGrid MDR (cgMDR) • metadata should accompany data wherever possible • metadata elements are shared across a community by subscription • a standard for metadata element messaging? • granularity: whole metadata element; value domain only? • are formal semantics local? • is the standard sufficient, or are there other considerations? • trust relationships between registries • can we express them? • how do we represent them?
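One possible way to express the ideas on this slide, offered as a sketch only. It assumes an identifier made of a registering authority, an item number and a version, and it models trust as a plain set of authority names; the `Registry` class and all names in it are hypothetical and are not the caDSR or cgMDR interfaces.

```python
from typing import Dict, NamedTuple, Optional, Set

class ElementId(NamedTuple):
    authority: str  # who registered the element
    item: str       # identifier within that authority
    version: str

class Registry:
    def __init__(self, authority: str, trusted: Set[str] = frozenset()):
        self.authority = authority
        self.trusted = set(trusted)
        self.elements: Dict[ElementId, str] = {}

    def register(self, item: str, version: str, definition: str) -> ElementId:
        element_id = ElementId(self.authority, item, version)
        self.elements[element_id] = definition
        return element_id

    def resolve(self, element_id: ElementId,
                peers: Dict[str, "Registry"]) -> Optional[str]:
        """Answer locally, or from a peer only if its authority is trusted."""
        if element_id.authority == self.authority:
            return self.elements.get(element_id)
        if element_id.authority not in self.trusted:
            return None
        peer = peers.get(element_id.authority)
        return peer.elements.get(element_id) if peer else None

local = Registry("registry-a", trusted={"registry-b"})
remote = Registry("registry-b")
untrusted = Registry("registry-c")
peers = {r.authority: r for r in (remote, untrusted)}

shared_id = remote.register("0007", "1", "Tumour size in millimetres")
other_id = untrusted.register("0007", "1", "Unreviewed definition")

print(local.resolve(shared_id, peers))
print(local.resolve(other_id, peers))
```

Because the authority is part of the identifier, two registries can both use item "0007" without colliding, which is the property that makes federation workable.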
[Diagram: cgMDR and caDSR]
accelerating clinical research: generating software
meta-models of experiments • standardise models/designs • support parameterisation of services or transformation into software • combining meta-models with metadata elements removes a major source of variability from experimental designs • support experiment designing software
[Diagram: trial model overview, showing common data elements and concept references]
integration – InfoPath 2007 • in clinical trial designs • incorporate metadata element into forms • qualify the meaning of sections on forms • define eligibility criteria/stratification variables • in data capture forms • large/dynamic enumerated value domains: drug names; identifiers for clinical centres; ontology sub-trees
data sharing service • reusable tools and approach for data sharing resources • derive meta-models of experiments of interest • Parameterised by data elements and concepts from terminologies and ontologies • use these to create metadata records for the catalogued experiments • users can search records, navigate with data-types and concepts locating data of interest
[Diagram: XForm generation pipeline. Labels: metadata registry; UML profile (EAP -> XMI -> XML config); XML Schema; instance and submission; XQuery + XSLT generator; raw XForm; XSLT (dates, text boxes, id lookups); registered view transformations; installed XForm; viewed XForm]
<xforms:input ref="//cgMDR:change_description">
  <xforms:label>change_description</xforms:label>
</xforms:input>

<xsl:template match="xforms:input[ends-with(./@ref,'cgMDR:change_description')]">
  <xforms:textarea ref="//cgMDR:change_description">
    <xforms:label>Change Description</xforms:label>
  </xforms:textarea>
</xsl:template>

<xforms:textarea ref="//cgMDR:change_description">
  <xforms:label>Change Description</xforms:label>
</xforms:textarea>
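The same rewrite, from a generated input control to a text area with a readable label, can be sketched outside XSLT. This Python version only illustrates the rule; the `apply_view_rule` helper and the wrapping `xforms:group` element are assumptions made so the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

XFORMS_NS = "http://www.w3.org/2002/xforms"
ET.register_namespace("xforms", XFORMS_NS)

# Raw generated control, wrapped in a group so it is a complete document.
RAW = (
    '<xforms:group xmlns:xforms="http://www.w3.org/2002/xforms">'
    '<xforms:input ref="//cgMDR:change_description">'
    '<xforms:label>change_description</xforms:label>'
    '</xforms:input>'
    '</xforms:group>'
)

def apply_view_rule(root, ref_suffix, new_control, new_label):
    """Swap matching input controls for another control and relabel them."""
    changed = 0
    # list() so that renaming tags does not disturb the iteration.
    for control in list(root.iter(f"{{{XFORMS_NS}}}input")):
        if control.get("ref", "").endswith(ref_suffix):
            control.tag = f"{{{XFORMS_NS}}}{new_control}"
            label = control.find(f"{{{XFORMS_NS}}}label")
            if label is not None:
                label.text = new_label
            changed += 1
    return changed

root = ET.fromstring(RAW)
changed = apply_view_rule(root, "cgMDR:change_description",
                          "textarea", "Change Description")
print(ET.tostring(root, encoding="unicode"))
```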
ACWY • information system support for meningitis vaccination trial • metadata elements registered in local registry • registry schema generated to provide semantically annotated XML Schema elements • form schemas authored in XML editor • XForms generated from schema • deployed to eXist using Ant • taken forward in SharePoint/Silverlight
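A sketch of what a semantically annotated XML Schema element might look like when generated from a registered metadata element. Carrying the registry identifier in `xs:appinfo` is an assumption about the approach, and the element name, identifier and definition are invented.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)

def schema_element(name, xsd_type, registry_id, definition):
    """Build an xs:element whose annotation points back at the registry."""
    element = ET.Element(f"{{{XS}}}element", {"name": name, "type": xsd_type})
    annotation = ET.SubElement(element, f"{{{XS}}}annotation")
    appinfo = ET.SubElement(annotation, f"{{{XS}}}appinfo")
    appinfo.text = registry_id  # machine-readable link to the metadata element
    documentation = ET.SubElement(annotation, f"{{{XS}}}documentation")
    documentation.text = definition  # human-readable definition
    return element

element = schema_element(
    "vaccination_date",
    "xs:date",
    "example-registry:0042:1",  # hypothetical registry identifier
    "Date on which the trial vaccine was given.",
)
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```

A form generator can then read the annotation to look up labels, value domains and help text from the registry instead of hard-coding them in each form.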
Infectious Disease Control • during the SARS outbreak in 2003 the infectious agent was identified and sequenced in 5 days • with the tools now becoming available we will need to communicate instrumental parameters, dataset definitions, new terminologies and ontologies automatically • these tools could allow systems to be developed when little is known about the precise threat
conclusions • working on a range of tools to allow the scientist user to manage their metadata • these tools provide a framework for the characterisation and persistence of meaning within a data community • tools work best when they are embedded in users’ working environment
acknowledgements • Oxford: Andrew Tsui, Charlie Crichton, Tianyi Zang, Aadya Shukla, Peter Wong • Cambridge: Lorna Morris, Irene Papatheodorou, James Brenton, Carlos Caldas • UCL: Igor Toujilov, Sylvia Nagl • NCI: Christophe Ludet, Denis Advic, Charles Griffin • Scottish Government: Peter Winstanley • NHS: Nicholas Oughtibridge • English Government: Paul Davidson