
Connecticut is Data Rich but Information Poor

PATH Presentation, CT Data Collaborative, June 2014. Topics: Our Vision: Connecting the Silos; How PATH Works; Example of PATH installed as P20WIN; PATH vs Desktop Integrator; Virtual Data Warehouse.

craig

Presentation Transcript


  1. Connecticut is Data Rich but Information Poor

  2. Our Vision: Connecting the Silos

  3. PATH Presentation, CT Data Collaborative, June 2014
     • How PATH Works
     • Example of PATH installed as P20WIN
     • PATH vs Desktop Integrator

  4. How PATH Works
     • Virtual Data Warehouse
     • Identity Resolution across multiple sources that don’t share a Gold Standard Identifier
     • HIPAA and FERPA Compliant
     • Always transfers Fact data separately from Demographic data or Personally Identifiable Information
     • Data Owners control which data is exported to a location outside of their data center
     • Data Owners approve all queries

  5. PATH History
     Completed Phases
     • 2007 - Established in Statute - Public Act 07-02
     • 2008 - Initial Development as CHIN, inclusion of 4 initial data sources
     • 2009 - Implemented advanced record linkage in a virtual data warehouse
     • 2011 - Scalability to 1M+ individuals, ability to add additional data sources and manage metadata without code modifications, unlimited data sources
     • 2014 - Implemented for P20WIN, 40M Records, 1.6B Data Elements
     Now Available to CT Agencies and Organizations as PATH

  6. Data Categories
     • People Records
       • Demographic Information such as Name, Address, SSN, DOB, etc.
       • Also known as PII – Personally Identifiable Information
     • Fact Records
       • Education, Health, Labor, etc. Information about a person BUT without the PII
       • De-Identified or Anonymized Data
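The split between People Records and Fact Records can be sketched as below. This is only an illustration under assumptions: the field names, the `PII_FIELDS` set, and the `split_record` helper are hypothetical and do not come from the presentation, which names Name, Address, SSN and DOB only as examples of PII.

```python
from dataclasses import dataclass

# Hypothetical PII field names, chosen to mirror the examples on the slide.
PII_FIELDS = {"name", "address", "ssn", "dob"}


@dataclass(frozen=True)
class SplitRecord:
    record_index: int  # opaque local key; the only value both halves share
    people: dict       # People Record: PII fields only
    fact: dict         # Fact Record: every non-PII field


def split_record(record_index: int, raw: dict) -> SplitRecord:
    """Separate one raw agency row into its PII half and its fact half."""
    people = {k: v for k, v in raw.items() if k in PII_FIELDS}
    fact = {k: v for k, v in raw.items() if k not in PII_FIELDS}
    return SplitRecord(record_index, people, fact)


raw = {
    "name": "Pat Doe",
    "dob": "1990-01-31",
    "ssn": "000-00-0000",
    "degree": "BS",
    "grad_year": 2012,
}
rec = split_record(17, raw)
print(rec.fact)  # contains no key from PII_FIELDS
```

Keeping the two halves joined only by an opaque index is what allows fact data to travel without the demographic data.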

  7. A Walk Through of How PATH Works: P20WIN Example

  8. Step 1
     PATH Remote Software installed at each Participating Agency
     • Agency Data Steward uses the PATH Metadata Editor to identify:
       • Table/Record Schema of Agency Data
       • Data at the Field or Table Level marked Available or Unavailable for Download
       • Common Data Element fields used for linking records, which provide Identity Resolution across the different sources
     [Diagram: agencies SDE, CCC, CSU and DOL, each with Agency Data and a Metadata Editor & ETL]
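The three things a Data Steward records in Step 1 (schema, download availability, and linking fields) could be represented roughly as follows. This is a sketch under assumptions, not the PATH Metadata Editor's actual format: `FieldMeta`, `TableMeta`, and all table and column names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FieldMeta:
    name: str                             # column name in the agency's own schema
    available: bool = False               # marked Available for Download?
    common_element: Optional[str] = None  # Common Data Element it maps to, if any


@dataclass
class TableMeta:
    table: str
    fields: List[FieldMeta] = field(default_factory=list)

    def downloadable(self) -> List[str]:
        """Columns the data owner has marked Available for Download."""
        return [f.name for f in self.fields if f.available]

    def linking_fields(self) -> Dict[str, str]:
        """Map each Common Data Element to the local column that holds it."""
        return {f.common_element: f.name for f in self.fields if f.common_element}


# Hypothetical metadata for one agency table.
students = TableMeta("students", [
    FieldMeta("stu_last_name", common_element="last_name"),
    FieldMeta("stu_birth_dt", common_element="dob"),
    FieldMeta("grad_year", available=True),
    FieldMeta("discipline_notes"),  # stays Unavailable by default
])
print(students.downloadable())
print(students.linking_fields())
```

Defaulting `available` to `False` reflects one possible design choice: nothing leaves the agency unless the steward opts it in.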

  9. Step 2
     During Remote Initialization, the Extract/Transform/Load function of PATH builds a Record Index of the People Records from each Data Source
     [Diagram: agencies SDE, CCC, CSU and DOL, each with Agency Data, a Metadata Editor & ETL, and a Record Index]
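The slides do not say what a Record Index contains. One plausible reading, sketched here purely as an assumption, is a local lookup from a sequential index to each People Record's key in the agency's own database. The `build_record_index` helper and the sample rows are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Mapping


def build_record_index(people_rows: Iterable[Mapping], key_field: str) -> Dict[int, Hashable]:
    """Assign each People Record a sequential index that maps to its local key."""
    return {i: row[key_field] for i, row in enumerate(people_rows)}


# Hypothetical People Records held at one agency.
rows = [
    {"student_id": "S-100", "name": "Pat Doe"},
    {"student_id": "S-205", "name": "Lee Roe"},
]
index = build_record_index(rows, "student_id")
print(index)
```

Under this reading, only the integer index would ever need to be referenced outside the agency; the mapping back to the local key stays on site.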

  10. Step 3
      PATH Software installed at a Main Location - for P20WIN this location is DAS/BEST
      [Diagram: agencies SDE, CCC, CSU and DOL, each with Agency Data, a Metadata Editor & ETL, and a Record Index; Main @ DAS/BEST with Probabilistic Integrator - Pi and UI, Security, Workflow, Query Engine]

  11. Step 4
      During Main Initialization:
      • Using each Agency’s Record Index, Extracts Common Data Elements from People Records
      [Diagram: agencies SDE, CCC, CSU and DOL, each with Agency Data, a Metadata Editor & ETL, and a Record Index; Main @ DAS/BEST with Probabilistic Integrator - Pi and UI, Security, Workflow, Query Engine]

  12. Step 4
      During Main Initialization:
      • Using each Agency’s Record Index, Extracts Common Data Elements from People Records
      • Sends them to Main & Loads into Memory ONLY
      [Diagram: agencies SDE, CCC, CSU and DOL, each with Agency Data, a Metadata Editor & ETL, and a Record Index; Main @ DAS/BEST with Probabilistic Integrator - Pi and UI, Security, Workflow, Query Engine]

  13. Step 4
      During Main Initialization:
      • Extracts Common Data Elements from People Records using each Agency’s Record Index
      • Sends them to Main & Loads into Memory ONLY
      • Combines multiple records for individuals into Clusters via Probabilistic Integration
      • Utility Table of Clusters containing only Agency Record Indices remains in memory
      • Agency PII flushed from memory
      [Diagram: agencies SDE, CCC, CSU and DOL, each with Agency Data, a Metadata Editor & ETL, and a Record Index; Main @ DAS/BEST with Probabilistic Integrator - Pi and UI, Security, Workflow, Query Engine]
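The clustering in Step 4 can be illustrated with a deliberately simplified stand-in. The presentation does not describe how the Probabilistic Integrator scores matches, so the weighted-agreement score, the `WEIGHTS` and `THRESHOLD` values, and the sample records below are all assumptions made for illustration. The sketch links records whose Common Data Elements agree strongly enough, keeps only `(agency, record_index)` pairs in the resulting clusters, and then discards the PII.

```python
from itertools import combinations

# Hypothetical agreement weights and cut-off; a real system would estimate these.
WEIGHTS = {"last_name": 4.0, "first_name": 2.0, "dob": 5.0}
THRESHOLD = 8.0


def score(a: dict, b: dict) -> float:
    """Sum the weights of the fields on which two records agree exactly."""
    return sum(w for f, w in WEIGHTS.items() if a.get(f) and a.get(f) == b.get(f))


def cluster(records: dict) -> list:
    """Group (agency, record_index) keys whose pairwise score meets the threshold."""
    parent = {k: k for k in records}

    def find(k):
        # Union-find root lookup with path halving.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for a, b in combinations(records, 2):
        if score(records[a], records[b]) >= THRESHOLD:
            parent[find(a)] = find(b)

    groups = {}
    for k in records:
        groups.setdefault(find(k), set()).add(k)
    return sorted(sorted(g) for g in groups.values())


# Common Data Elements held in memory only, keyed by (agency, record index).
pii = {
    ("SDE", 0): {"first_name": "pat", "last_name": "doe", "dob": "1990-01-31"},
    ("CSU", 7): {"first_name": "patricia", "last_name": "doe", "dob": "1990-01-31"},
    ("DOL", 3): {"first_name": "lee", "last_name": "roe", "dob": "1985-06-02"},
}
clusters = cluster(pii)  # clusters hold record indices only, no PII values
pii.clear()              # drop the PII once clustering is done
print(clusters)
```

Note that `pii.clear()` only releases references in Python; it stands in for whatever memory flushing the real system performs.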

  14. Step 5
      • Use UI features to establish user Roles, Login, etc.
      [Diagram: agencies SDE, CCC, CSU and DOL, each with Agency Data, a Metadata Editor & ETL, and a Record Index; Main @ DAS/BEST with Probabilistic Integrator - Pi and UI, Security, Workflow, Query Engine]

  15. Step 5
      • Use UI features to establish user Roles, Login, etc.
      • Use UI features to:
        • Create a Query
        • Approve a Query
        • Schedule a Query
      [Diagram: agencies SDE, CCC, CSU and DOL, each with Agency Data, a Metadata Editor & ETL, and a Record Index; Main @ DAS/BEST with Probabilistic Integrator - Pi and UI, Security, Workflow, Query Engine]

  16. Step 5
      • Use UI features to establish user Roles, Login, etc.
      • Use UI features to:
        • Create a Query
        • Approve a Query
        • Schedule a Query
      • Use Query Engine to:
        • Build Agency Query Requests
        • Use ONLY Data Available for Download in Query Request
      [Diagram: agencies SDE, CCC, CSU and DOL, each with Agency Data, a Metadata Editor & ETL, and a Record Index; Main @ DAS/BEST with Probabilistic Integrator - Pi and UI, Security, Workflow, Query Engine]
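The create, approve, schedule sequence and the "Available for Download only" rule in Step 5 might be modelled as below. This is a sketch under assumptions: the `Query` class, the `State` names, the `build_requests` helper, and the rule that every involved agency must approve before scheduling are hypothetical readings of the slide, and the field names are invented.

```python
from enum import Enum


class State(Enum):
    CREATED = "created"
    APPROVED = "approved"
    SCHEDULED = "scheduled"


class Query:
    def __init__(self, fields_by_agency: dict):
        self.fields_by_agency = fields_by_agency  # agency -> requested fields
        self.state = State.CREATED
        self.approvals = set()

    def approve(self, agency: str) -> None:
        """Record one data owner's approval; all involved owners must approve."""
        if agency not in self.fields_by_agency:
            raise ValueError(f"{agency} is not part of this query")
        self.approvals.add(agency)
        if self.approvals == set(self.fields_by_agency):
            self.state = State.APPROVED

    def schedule(self) -> None:
        if self.state is not State.APPROVED:
            raise PermissionError("query must be approved by every data owner first")
        self.state = State.SCHEDULED


def build_requests(query: Query, available: dict) -> dict:
    """Build per-agency requests, rejecting any field not marked Available."""
    requests = {}
    for agency, fields in query.fields_by_agency.items():
        blocked = [f for f in fields if f not in available.get(agency, set())]
        if blocked:
            raise ValueError(f"{agency}: not available for download: {blocked}")
        requests[agency] = list(fields)
    return requests


# Hypothetical availability flags, as a steward might have set them.
available = {"SDE": {"grad_year"}, "DOL": {"wage", "wage_quarter"}}
q = Query({"SDE": ["grad_year"], "DOL": ["wage"]})
requests = build_requests(q, available)
q.approve("SDE")
q.approve("DOL")
q.schedule()
print(q.state, requests)
```

Rejecting a blocked field when the request is built, rather than silently dropping it, makes it visible to the requester that the data owner has not released that field.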

  17. Step 6
      Query Engine:
      • Uses Clusters of Indices to get the needed Agency Record Indices
      • Queries only Agency Data marked Available for Download
      • Transfers only data marked Available for Download to the Main
      • Downloads only Approved Queries
      [Diagram: agencies SDE, CCC, CSU and DOL, each with Agency Data, a Metadata Editor & ETL, and a Record Index; Main @ DAS/BEST with Probabilistic Integrator - Pi and UI, Security, Workflow, Query Engine, producing De-identified Integrated Data]
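How Step 6 assembles de-identified output from clusters of record indices can be sketched as follows. The real system fetches facts from each agency's remote site; here in-memory dictionaries stand in for those sites. The `run_query` function, the `person` surrogate key, and all data values are hypothetical.

```python
def run_query(clusters: list, fact_stores: dict, requests: dict) -> list:
    """Join requested fact fields per cluster; output carries no PII and no agency keys."""
    rows = []
    for cluster_id, members in enumerate(clusters):
        row = {"person": cluster_id}  # arbitrary surrogate, not an agency identifier
        for agency, record_index in members:
            facts = fact_stores.get(agency, {}).get(record_index, {})
            # Copy only the fields named in the approved request for this agency.
            for f in requests.get(agency, []):
                if f in facts:
                    row[f"{agency}.{f}"] = facts[f]
        rows.append(row)
    return rows


# Clusters of (agency, record index) pairs, as left in memory after integration.
clusters = [[("SDE", 0), ("DOL", 3)], [("SDE", 1)]]

# Stand-in for each agency's Fact Records, keyed by local record index.
fact_stores = {
    "SDE": {0: {"grad_year": 2012, "school": "X"}, 1: {"grad_year": 2013, "school": "Y"}},
    "DOL": {3: {"wage": 41000}},
}

# Approved requests containing only fields marked Available for Download.
requests = {"SDE": ["grad_year"], "DOL": ["wage"]}

result = run_query(clusters, fact_stores, requests)
print(result)
```

The `school` field never reaches the output because it is absent from the request, which mirrors the slide's point that only data marked Available for Download is transferred.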

  18. 3 User Roles

  19. Query Workflow

  20. Data Output

  21. PATH Components
      • Remote Components
        • Metadata Editor
        • Extract, Transform and Load Module
      • Main Components
        • Integration Engine
        • User Interface
        • Security
        • Workflow Module
        • Query Engine with Filtering
      [Diagram: agency sites, each with Agency Data, a Metadata Editor & ETL, and a Record Index; Main @ DAS/BEST with Probabilistic Integrator - Pi (Integration Engine) and UI, Security, Workflow, Query Engine, producing De-identified Integrated Data]

  22. PATH Functionality
      • Security
        • Personally Identifiable Information never written outside of Agency Data Center
        • Encrypted transfer of all data
        • PII & Fact records never transmitted together
        • Audit logs
        • Query Approval Workflow
        • Multiple Secure User Roles
      • Ease of Use
        • System Administration
        • Data Management
        • Query Filtering
        • Query results delivered as de-identified data
      [Diagram: the PATH Components diagram with callouts: Data Mgmt, Encrypted Xfer, PII & Facts separate Xfer, No PII, User Roles, Audit logs, Sys Admin, Approval req’d, Query Filtering]

  23. Competitor Components
      • Remote Components
        • Metadata Editor
        • Extract, Transform and Load Module
      • Main Components
        • Integration Engine
        • User Interface
        • Security
        • Workflow Module
        • Query Engine with Filtering
      [Diagram: agency sites, each with Agency Data, a Metadata Editor & ETL, and a Record Index; Integration Engine and UI, Security, Workflow, Query Engine; De-identified Integrated Data; callouts: Data Mgmt, Encrypted Xfer, PII & Facts separate Xfer, No PII, User Roles, Audit logs, Sys Admin, Approval req’d, Query Filtering]

  24. Competitor Deficits
      Desktop Integration Engine
      • Minimal Security
      • No Encrypted Transfer of Data
      • No Audit Logs
      • Transfer of Facts with PII
      • No Secure Logins
      • FTP or Thumb Drive Transfers
      • No Anonymized Data
      • No Access Control - No Approval Workflow
      • No Chain of Custody Assurance – Possibility for Cherry-Picked Data
      [Diagram labels: Agency Data (four sources), Copies of Agency Data, Integration Engine, PII Visible, Integrated Data]

  25. Take a Test Drive
