370 likes | 532 Views
QPS Lunch & Learn What's in the Data Warehouse?. July 6, 2011 Martha Newton. Objectives. Data Group staff Data storage concepts What is a Data Warehouse What’s in the Data Warehouse (Vanderbilt’s Enterprise Data Warehouse - EDW) How to get data out of the EDW. Objectives.
E N D
QPS Lunch & LearnWhat's in the Data Warehouse? July 6, 2011 Martha Newton
Objectives • Data Group staff • Data storage concepts • What is a Data Warehouse • What’s in the Data Warehouse (Vanderbilt’s Enterprise Data Warehouse - EDW) • How to get data out of the EDW
Objectives • Data Group staff
QPS Data Group • Data group provide data support for Pillar Metrics, Public Reporting, External Database participation etc. • 6staff members are in the CCI data group reporting to Renee Stiles, Director of Quality Measurement and Research and myself as Associate Director • Quality Value Analyst • Biostatistician • 2 are in Infection Control reporting to Vicki Brinkso, Director Infection Control and Dr. Tom Talbot, Chief Hospital Epidemiologist • 4 are in PM&I reporting to Jenny Slayton, Administrative Director PM&I
Objectives • Data Group staff • Data storage concepts
Data Storage Concepts • Computers store data on recording media (disk drives, memory sticks etc.) • Simplest structureto store data is thefix format flat file(data is stored atprescribed locations) • Spreadsheetorganizes data intorows and columns(no fixed positions)
Data Storage Concepts • A database (usually a relational database) is used to store the information entered into a computer system • Data is stored in multiple tables • Tables are linked(hence the termrelational)
Objectives • Data Group staff • Data storage concepts • What is a Data Warehouse
What is a Data Warehouse • A data warehouse is a database used to store data from operational systems for reporting purposes • Systems like billing, order entry or external databases • Data is loaded into the data warehouse using a process referred to as ETL (Extract, Transform, Load)
What is a Data Warehouse • Bill Inmon, President and Chief Technology Officer, Forest Rim™ Technology LLC Best known as “The Father of Data Warehousing.” • 50 books and 650 articles • Top to down approach
What is a Data Warehouse • Ralph Kimball is known as the “Father of Business Intelligence” for defining the concept behind “Data Marts” • Bottoms's up approach • Invented Star Schema • Books • Data Warehousing Toolkit • The Data Warehouse Lifecycle Toolkit
What is a Data Warehouse • OLTP = Online Transaction Processing • Optimized for dataupdates • Data is normalized • larger tables aredividing intosmaller table withrelationships • Eliminatesredundancy A Dimensional Modeling Manifesto Ralph Kimball DBMS August 1997
What is a Data Warehouse • OLAP = Online Analytical Processing • Optimized for reporting • Data is stored using star(snowflake) format • De-normalize data • Tables with more columns • Data may be repeated A Dimensional Modeling Manifesto Ralph Kimball DBMS August 1997
What is a Data Warehouse • Data mart for management of VHP TN population was modeled after Ralph Kimball approach to data warehousing • Utilized Star Schema
What is a Data Warehouse • De-normalized Dimensional tables • City, County and Region spelled out instead of using a code
Objectives • Data Group staff • Data storage concepts • What is a Data Warehouse • What’s in the Data Warehouse (Vanderbilt’s Enterprise Data Warehouse - EDW) EDW?
What’s in the Data Warehouse Hospital Billing System Physician Scheduling Billing System Physician Order Entry System CPOE Time Clock System Lab System Event Reporting System StarForms are part of the EHR Nursing Documentation System Cath Lab Documentation System
What’s in the Data Warehouse • Vanderbilt’s Enterprise data warehouse (EDW) contains the data from the systems used in care of the patient or billing of the healthcare encounter • Physician Order Entry system (Wiz/HEO) • Electronic Medical Record (StarPanel) • Nursing Documentation System (HED) • Hospital/Physician Billing System (Medipac/Epic) etc. • External databases • UHC Comparative Database • NHSN (National Healthcare Safety Network) etc.
What’s in the Data Warehouse • Medipac is an Example of a production system • Medipac is used to bill hospital charges (hospital charges cover the use of the facility, supplies, medications, nursing care etc.) • Coders review the medical record and assign codes that describe the patient’s disease • The codes for an individual patient can be reviewed in Medipac
What’s in the Data Warehouse • Medipac data is loaded into the EDW in an automated nightly process • Same diagnosis codes that are stored in Medipac are stored in a table in the EDW (MPAC_ABSTRACT_DIAG) • Diagnosis codes from all patients are stored in the same table in the EDW • This makes it possible to identify patients with specific diseases (i.e. Diabetes 250.00)
What’s in the Data Warehouse • StarPanel is Vanderbilt’s Electronic Health Record (EHR) • StarForms are electronic fill in the blank forms • The forms are saved in StarPanel to document work (i.e. procedure notes) • Optionally the data can also be saved to the EDW creating a table with the fill in the blank information
What’s in the Data Warehouse • Star form data can be saved to the EDW • The fill in the blank fields are saved in a table. • A new record is created for each new form created • The table also includes indentifying information
What’s in the Data Warehouse • Other servers • TSI data resides on DWPRD (managed by Mark O’Hara) • Legacy data warehouse is on the IBM mainframe in DB/2 (also managed by Mark O’Hara)
How to get data out of the EDW • EDW is not a single structure (1 database on 1 server) • Most of the EDW resides on an Oracle server BIPROD (managed by Eric Griffin) • EDW is managed collaboratively by Finance and IT
Objectives • Data Group staff • Data storage concepts • What is a Data Warehouse • What’s in the Data Warehouse (Vanderbilt’s Enterprise Data Warehouse - EDW) • How to get data out of the EDW
How to get data out of the EDW • EDW can be accessed with a variety of tools • Business Objects • Microsoft Access using ODBC connections • PL/SQL • SQL+, Toad • SciHealth • Through UHC interface (data is uploaded to UHC from the EDW • etc.
How to get data out of the EDW • Business Objects • Request access using form below from on EDW web site • Training classes available
How to get data out of the EDW • Business Objects • BO reorganizes data in the EDW into logical structures called Universes • This shields users from database issues like incompatible field types • EDW team offers training classes for BO
How to get data out of the EDW • Microsoft Access • Using ODBC (Open Database Connectivity) EDW tables on the Oracle Server can be linked into Microsoft Access • Tables can than be used just like Microsoft Access tables
How to get data out of the EDW • Microsoft Access continued • Tables are linked in a graphical view • SQL statement is created behind the scenes SELECT A.MED_REC_NUMBER, A.DISCHARGE_DT, B.DIAG_CD FROM MEDIPAC_MPAC_CASEMIX A INNER JOIN MEDIPAC_MPAC_ABSTRACT_DIAG B ON (A.UNIT_NUMBER = B.UNIT_NUMBER) AND (A.VISIT_NUMBER = B.VISIT_NUMBER) WHERE (((A.DISCHARGE_DT) Between #1/1/2011# And #12/31/2011#) AND ((B.DIAG_CD)="250.00"));
How to get data out of the EDW • PL/SQL • PL/SQL Developer is a database development tool (Write/save SQL statements) • Tool used by the EDW team to develop the ETL to load EDW as well as create maintain other EDW structures like views
How to get data out of the EDW • SQL+ • Toad • Sign on to BIPROD • Enter SQL commands (code) to retrieve data
How to get data out of the EDW • SciHealth indicators report EDW data in a dashboard view • UHC O/E Mortality
How to get data out of the EDW • EDW data is uploaded UHC monthly and can be accessed through the Clinical Database (CDB)
How to get data out of the EDW • If you have a data request for your project: • If appropriate for your job you may request access to the EDW OR • Send E-Mail request to Renee Stiles or Martha Newton • Assign work to the CCI data group staff • or forward to Infection Control or PM&I