Phenotype Capture in Genetic Variant Databases. Peng Chen School of Computer and Information Science chepy049@mymail.unisa.edu.au Supervisor: Dr Jan Stanek Research Fields: Health Informatics
Phenotype Capture in Genetic Variant Databases • Peng Chen, School of Computer and Information Science, chepy049@mymail.unisa.edu.au • Supervisor: Dr Jan Stanek • Research Fields: Health Informatics, Health Computer Science, Health Information System
Outline • Motivation • Research Question • Literature • Methodology • Phenotype Data Review Result • The openEHR Archetypes Review Result • Phenotype Capture Experiment Result • Conclusion
Motivation • 1950s health computer science, EHR (Electronic Health Record) • Slow development • Bio-medical research & EHR systems • Genotype – Phenotype correlation
Research Question • Can the existing openEHR standard be used to capture and store phenotype (clinical) data? • Hypothesis one: most of the phenotype data in genetic variant databases is not coded, contains little clinical detail, and is not stored in a consistent manner. • Hypothesis two: openEHR is potentially suitable as a standard for storing phenotype data.
Literature • Claustres et al. (2002) ‘Time for a Unified System of Mutation Description and Reporting: A Review of Locus-Specific Mutation Databases’ • Mitropoulou et al. (2010) ‘Locus-specific database domain and data content analysis: evolution and content maturation toward clinical use’ • Spath & Grimson (2011) ‘Applying the archetype approach to the database of a biobank information management system’ • Chen et al. (2009) ‘Archetype-based conversion of EHR content models: pilot experience with a regional EHR system’
Methodology • Criteria form for phenotype review
Methodology • The openEHR phenotype capture model
Methodology • Data integration workflow towards a proposed health care EHR integration architecture
Phenotype Data Review Result • Database: reviewed 1224 databases; 978 collect phenotype data, all of which store it internally. Of these 978: • 40 (4.1%) use a formal terminology and 30 (3.1%) use formal coding. • 959 (98%) store low-granularity phenotype data. • 604 (62%) are curated by experts. • 534 (54.6%) store single-phenotype data and 444 (45.4%) store multiple-phenotype data. • 757 (77.4%) store phenotypes on a case basis and 221 (22.6%) on a variant basis.
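The percentages on this slide appear to use the 978 databases that collect phenotype data as the denominator, not all 1224 reviewed. A minimal sketch for recomputing them from the reported counts, under that assumption (the dictionary keys and the `share` helper are illustrative names, not part of the study):

```python
# Counts are copied from the slide; using 978 as the denominator is an assumption.
TOTAL_WITH_PHENOTYPE = 978

counts = {
    "formal terminology": 40,
    "formal coding": 30,
    "low granularity": 959,
    "expert curated": 604,
    "single phenotype": 534,
    "multiple phenotype": 444,
    "case basis": 757,
    "variant basis": 221,
}


def share(count, total=TOTAL_WITH_PHENOTYPE):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


percentages = {label: share(n) for label, n in counts.items()}

for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```

Complementary categories (single vs. multiple phenotype, case vs. variant basis) should each sum to the 978 total, which is a quick consistency check on the counts.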
Phenotype Data Review Result • Phenotype samples: • Sample 1: ‘MRX’, ‘ARRP’, ‘AMD’, ‘arCRD’, ‘CIPA or HSN IV (H406Y + G613V are polymorphisms)’, ‘Type I, type II, non syndromic recessive’ • Sample 2: ‘Failure to thrive; Pneumocystis carinii pneumonia; Diarrhea; Marked lymphopenia’ • Sample 3:
The openEHR Archetypes Review Result • Reviewed 283 existing openEHR archetypes • Multilingual translation mechanism • Term binding mechanism
The openEHR Archetypes Review Result • Multilingual translation mechanism - example (ADL display)
ontology
    terminologies_available = <"SNOMED-CT", ...>
    term_definitions = <
        …
        ["zh-cn"] = <
            items = <
                ...
                ["at0004"] = <
                    text = <"收缩压">
                    description = <"一个血液循环周期中,系统性动脉血压高峰值。 收缩期血压">
        …
        ["de"] = <
            items = <
                ...
                ["at0004"] = <
                    text = <"Systolisch">
                    description = <"Der höchste arterielle Blutdruck eines Zyklus - gemessen in der systolischen oder Kontraktionsphase des Herzens.">
        …
        ["en"] = <
            items = <
                ...
                ["at0004"] = <
                    text = <"Systolic">
                    description = <"Peak systemic arterial blood pressure - measured in systolic or contraction phase of the heart cycle.">
                >
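The idea behind the multilingual mechanism is that stored data references a language-neutral code such as at0004, and the display text is resolved per language at presentation time. A minimal sketch of that lookup, assuming the term_definitions section has already been loaded into a nested dictionary (the `TERM_DEFINITIONS` structure and `label_for` function are hypothetical illustrations, not an openEHR API):

```python
# Hypothetical in-memory mirror of the term_definitions excerpt on the slide.
# Only the at0004 entries shown there are included.
TERM_DEFINITIONS = {
    "zh-cn": {"at0004": {"text": "收缩压"}},
    "de": {"at0004": {"text": "Systolisch"}},
    "en": {"at0004": {"text": "Systolic"}},
}


def label_for(code, language, fallback="en"):
    """Return the display text for an archetype code in the requested language.

    Falls back to the fallback language when the requested one is not defined.
    """
    items = TERM_DEFINITIONS.get(language, TERM_DEFINITIONS[fallback])
    return items[code]["text"]


# The same stored code renders differently per language.
for lang in ("zh-cn", "de", "en"):
    print(lang, label_for("at0004", lang))
```

Because only the code is stored with the data, adding a new translation would not require touching any existing records.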
The openEHR Archetypes Review Result • Multilingual translation mechanism - compare
The openEHR Archetypes Review Result • Term binding mechanism (ADL display)
term_bindings = <
    ["SNOMED-CT"] = <
        items = <
            ["at0000"] = <[SNOMED-CT(2003)::163020007]>
            ["at0004"] = <[SNOMED-CT(2003)::163030003]>
            ["at0005"] = <[SNOMED-CT(2003)::163031004]>
            ["at0013"] = <[SNOMED-CT(2003)::246153002]>
        >
    >
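Term binding ties each local archetype code to a concept in an external terminology, so two systems that use different local labels can still agree on meaning through the shared external code. A minimal sketch of resolving a binding, assuming the term_bindings section above has been loaded into a dictionary (the `TERM_BINDINGS` structure and `bound_code` function are hypothetical names for illustration only):

```python
# Hypothetical in-memory mirror of the term_bindings excerpt on the slide.
TERM_BINDINGS = {
    "SNOMED-CT": {
        "at0000": "163020007",
        "at0004": "163030003",
        "at0005": "163031004",
        "at0013": "246153002",
    }
}


def bound_code(at_code, terminology="SNOMED-CT"):
    """Return the external concept code bound to an archetype code, or None."""
    return TERM_BINDINGS.get(terminology, {}).get(at_code)


# A bound code resolves to its external concept; an unbound code yields None.
print(bound_code("at0004"))
print(bound_code("at9999"))
```

Returning None for an unbound code, instead of raising, lets a caller decide whether a missing binding is an error or simply an uncoded term.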
Phenotype Capture Experiment Result • The chosen sample: • The mapping of concepts:
Phenotype Capture Experiment Result • The openEHR archetypes mapping: • Evaluation: Diagnosis • Observation: Symptom • Action: Treatment
Phenotype Capture Experiment Result • Phenotype capture snapshots (four slides)
Conclusion • The research results support both hypotheses and match the expected outcomes. • The openEHR standard is potentially suitable for storing clinical data, and even for integrating health information systems. • The multilingual translation mechanism and the term binding mechanism are strong evidence for semantic interoperability between heterogeneous systems. • International cooperation is needed to manage the archetypes and to complete a full set of archetypes for health concepts. • International agreement is needed on choosing terminologies and on enhancing them to resolve semantic conflicts.
Conclusion • The philosophy and the future: a health care EHR integration architecture • Archetype-ontology • Cognitive IS • Human friendly • Robust, scalable, integrated • Semantic interoperability • Syntactic consistency • Data modelling neutral • Start from learning terms and concepts • IS essentially for communication • Ubiquitous information computing