200 likes | 369 Views
SHARPn Data Normalization. November 18, 2013. Data-driven Healthcare. Big Data . Analytics. Domain Pragmatics. Research. Practice. Experts. Knowledge. A framework for clinical data reuse. Production Systems. Production Databases. Replicate. Replicate. Query. Data Analytics
E N D
SHARPn Data Normalization November 18, 2013
Data-driven Healthcare Big Data Analytics Domain Pragmatics Research Practice Experts Knowledge
A framework for clinical data reuse Production Systems Production Databases Replicate Replicate Query Data Analytics NLP and Data Normalization Enterprise Repository/ Data Warehouse Workflow or goal specific Workgroup Datamarts Query Query
SHARPn Data Normalization • Goals • To conduct the science for realizing semantic interoperability and integration of diverse data sources • To develop tools and resources enabling the generation of normalized EMR data for portable and scalable secondary uses
Data Normalization Target Value Sets Information Models Normalization Targets Tooling Raw EMR Data Normalized EMR Data Normalization Process
Normalization Targets • Clinical Element Models • Based on Intermountain Healthcare/GE Healthcare’s detailed clinical models • Terminology/value sets associated with the models • Using standards where possible
Normalization Process • Configuration of Model (Syntactic) and Terminology (Semantic) Mapping • UIMA Pipeline to transform raw EMR data to normalized EMR data based on mappings
Four Subprojects • Clinical Information Modeling • Value Sets Management • End-to-End Pipeline • Normalized Data Representation and Store
Secondary Use Clinical Element Models http://www.clinicalelement.com GenericStatement GenericComponent Links Core CEMs AdministrativeGender, … Severity, Status SecondaryUse CEMs Embracing the fact that data may not be able to be normalized and enabling bottom-up and top-down
Status of Secondary Use CEMs • Model specification is final • CEM Browser is in production • Manuscript is in preparation Future: Secondary Use CEMs and CEM Browser will be maintained through Clinical Information Modeling Initiative (CIMI)
NLP in data normalization • A large amount of clinical information is in clinical narratives, NLP is a critical component in data normalization • cTAKES has been wrapped into the data normalization pipeline to normalize data in clinical narratives
Data Normalization version 2 http://sourceforge.net/p/sharpn/datan/code/HEAD/tree/
DN activities after SHARPn (1) –Clinical Information Model Initiatives
DN activities after SHARPn (2) –Open Health Natural Language Processing (OHNLP) • Use of the Data Normalization information model as the base to define a Common Type System to capture basic clinical information models • Use of the Data Normalization pipeline to improve interoperability of various clinical information models
DN activities after SHARPn (3) –Clinical decision support and phenotyping • The use of NLP and Big Data for Late Binding Data Normalization • Practical implementation of Late Binding Data Normalization and Drools for real-time clinical decision support