Largest US Healthcare Dataset in Hadoop enables Patient-level Analytics in Near Real Time

Learn how IMS Health utilizes the largest US healthcare dataset in Hadoop to enable patient-level analytics in near real time. Discover the opportunities, challenges, and lessons learned in making a greater difference in patient healthcare.

reynaldoh

Presentation Transcript


  1. Largest US Healthcare Dataset in Hadoop enables Patient-level Analytics in Near Real Time. September 28, 2016. Navdeep Alam, Director of Data Warehousing, nalam@us.imshealth.com

  2. Agenda • Who is IMS Health • Health care data ecosystem at IMS • Opportunity and Challenges: Make a Greater Difference in Patient Healthcare • Solution – Anonymous Patient Longitudinal Analysis • Lessons Learned

  3. Who is IMS Health?

  4. Health Care Data Ecosystem IMS Health – Where Does Our Data Come From

  5. Future Data Growth is Exponential • Social Media, IoT, Genomics • Billions More Transactions • Billions of Anonymous Patients

  6. Make a Greater Difference in Patient Healthcare Precision Medicine, Better Outcomes, Propel Research towards Cures • Longitudinal Studies • Find Patterns Across All Patients • Predict and Influence Outcomes • Help Reduce Healthcare Costs • Clinical Trials and Drug Research Improvements • Improve Provider Care

  7. Challenges Obstacles to Realizing the Greater Opportunity • Data Silos • Reduced Data Currency • Analytics Away from the Data • Analytics Too Time Consuming and Expensive • Cost High on Current Systems

  8. Solution - Patient Longitudinal Records: Organized for Fast Access and Reduced Data Shuffle • Diagram panels: Traditional Warehoused Data; Big Data Factory • Legend: Each color = Unique de-identified patient ID. Each shape = A type of patient data. Filled shapes = Data of interest • Diagram label: Complex Nested Data Type
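
The idea on this slide, keeping all of a de-identified patient's data together in one nested record so that a per-patient query does not have to gather rows scattered across tables, can be illustrated with a minimal Python sketch. The field names (`patient_id`, `kind`, `date`, `code`), the sample rows, and the function `build_longitudinal_records` are hypothetical and are not taken from the presentation or from any IMS Health schema.

```python
from collections import defaultdict

# Hypothetical flat, warehouse-style rows: one row per event,
# with different patients and data types interleaved.
flat_rows = [
    {"patient_id": "p1", "kind": "rx", "date": "2016-01-05", "code": "drugA"},
    {"patient_id": "p2", "kind": "dx", "date": "2016-01-07", "code": "diag1"},
    {"patient_id": "p1", "kind": "dx", "date": "2016-01-02", "code": "diag2"},
    {"patient_id": "p1", "kind": "rx", "date": "2016-02-01", "code": "drugB"},
]


def build_longitudinal_records(rows):
    """Group flat rows into one nested record per patient.

    Each patient maps to a dict keyed by data type ("rx", "dx"),
    and each data type holds that patient's events sorted by date.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        grouped[row["patient_id"]][row["kind"]].append(
            {"date": row["date"], "code": row["code"]}
        )
    # ISO-formatted date strings sort chronologically as plain strings.
    return {
        pid: {
            kind: sorted(events, key=lambda e: e["date"])
            for kind, events in kinds.items()
        }
        for pid, kinds in grouped.items()
    }


nested = build_longitudinal_records(flat_rows)
# All data for one patient is now a single lookup, with no join needed.
p1_prescriptions = nested["p1"]["rx"]
```

In a Hadoop setting the grouping step would run once during ETL, so later per-patient reads touch a single nested record instead of shuffling rows between nodes.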

  9. Solution - Different Storage Engines: Storage to Match the Access Pattern • Solr: Aggregates/Counts, Web Speed (ms), Faceted Search • HBase: Fast lookup of longitudinal Entity (i.e. Patient) • Hive Nested Bucketed: Deep Learning Analytics, Longer Running Queries (min vs. days) • Hive Partitioned: BI/DW Workloads, SQL • Other diagram labels: Complex Nested Type, Web Applications, HUE, RDBMS, ETL Process, REST, JDBC/SQL

  10. Hadoop Storage Engines Parquet/Hive vs. HBase vs. Kudu

  11. Evolution of Different Storage Engines: Storage to Match the Access Pattern with Kudu • Solr: Aggregates/Counts, Web Speed (ms), Faceted Search • Fast lookup of longitudinal Entity (i.e. Patient) • Deep Learning Analytics, Longer Running Queries (min vs. days) • BI/DW Workloads, SQL • Other diagram labels: Complex Nested Type, Web Applications, HUE, RDBMS, ETL Process, REST, JDBC/SQL, Kudu

  12. Anonymous Patient Longitudinal Analysis Rx (Prescriptions) and Dx (Medical Claims) Longitudinal Analysis
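
As a rough illustration of what an Rx/Dx longitudinal analysis can look like for a single patient, the sketch below merges prescription and diagnosis events into one time-ordered timeline and measures the gap between the first diagnosis and the first prescription on or after it. The sample record, the codes, and the functions `timeline` and `days_to_first_rx_after_dx` are hypothetical; the slide does not describe the actual analyses performed.

```python
from datetime import date

# Hypothetical nested record for one de-identified patient:
# each event is a (date, code) pair.
patient = {
    "dx": [(date(2016, 1, 2), "diag2")],
    "rx": [
        (date(2015, 12, 20), "drugC"),
        (date(2016, 1, 5), "drugA"),
        (date(2016, 2, 1), "drugB"),
    ],
}


def timeline(record):
    """Merge Rx and Dx events into one list ordered by date."""
    events = [(d, "dx", code) for d, code in record["dx"]]
    events += [(d, "rx", code) for d, code in record["rx"]]
    return sorted(events)


def days_to_first_rx_after_dx(record):
    """Days from the first diagnosis to the first prescription on or after it.

    Returns None when the patient has no diagnosis or no later prescription.
    """
    if not record["dx"]:
        return None
    first_dx = min(d for d, _ in record["dx"])
    later_rx = [d for d, _ in record["rx"] if d >= first_dx]
    if not later_rx:
        return None
    return (min(later_rx) - first_dx).days


events = timeline(patient)
gap = days_to_first_rx_after_dx(patient)
```

Because every event for the patient already sits in one record, this kind of calculation can run independently per patient and in parallel across the whole population.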

  13. What does this do for us? Value Proposition • See Patterns in Data • Explore the Data Before Analysis • Variety of Analysis in Parallel • Time-to-Value Greatly Increased • Reduced Cost • Innovation

  14. Lessons Learned Technology, Cultural, and Process Management Changes • Rethink Everything!

  15. Thank You. Navdeep Alam, Director of Data Warehousing, nalam@us.imshealth.com
