Matching Data for EHDI Tracking Program

Matching Data for EHDI Tracking Program. Cathy Gunderson EHDI/NEST Project Manager. Colorado Department of Public Health and Environment Denver, Colorado. Faculty Disclosure Information.


Presentation Transcript


  1. Matching Data for EHDI Tracking Program Cathy Gunderson EHDI/NEST Project Manager Colorado Department of Public Health and Environment Denver, Colorado

  2. Faculty Disclosure Information In the past 12 months, I have not had a significant financial interest or other relationship with the manufacturer(s) of the product(s) or provider(s) of the service(s) that will be discussed in my presentation. This presentation will (not) include discussion of pharmaceuticals or devices that have not been approved by the FDA, or of unapproved or "off-label" uses of pharmaceuticals or devices.

  3. De-Duplicating Person Data in a Centralized Database Cathy Gunderson, Project Manager

  4. Colorado’s Title V Program

  5. Overview • Overview of EHDI/NEST Project • Person De-duplication Process • SOUNDEX • Special considerations • Scoring

  6. Overview of the EHDI/NEST Project

  7. A few FLAs (Four/Five Letter Acronyms) • CDPHE – Colorado Department of Public Health and Environment • EHDI – Early Hearing Detection and Intervention • NEST – Newborn Evaluation, Screening and Tracking • CHIRP – Clinical Health Information Records of Patients • CSHCN – Children with Special Health Care Needs – In Colorado: HCP – Title V • HCP – Health Care Program for Children with Special Needs • CRCSN – Colorado Responds to Children with Special Needs – Birth Defects registry

  8. Project Goals • Develop a comprehensive statewide EHDI program from screening to intervention • Implement a system that has a database that integrates information from NBH with PKU and Sickle Cell Disease • Create and maintain a centralized database which will help prove the efficacy of NBS • Implement the system

  9. Colorado’s Newborn Screening • Colorado screens for 8 conditions • Hemoglobinopathies – Sickle Cell • Inherited Metabolic Diseases: • Phenylketonuria (PKU) • Galactosemia • Biotinidase Deficiency • Cystic Fibrosis • Endocrine Diseases: • Hypothyroidism • Congenital Adrenal Hyperplasia (CAH) • Newborn Hearing • Tandem Mass Spectrometry - 2006

  10. Person De-Duplication

  11. The BIG Picture

  12. First Step–Understand your Data • Electronic Birth Certificate (EBC) • Reported by clerks at birthing hospitals • Reported by clerks at CDPHE for non-birthing hospitals • Reporting is required within 10 days of birth • Laboratory Services Division (LSD) • State Laboratory that processes blood spot screening • Forms mailed to LSD, processed within 24 hours • Results reported within 3 days of receipt • Transactions from other agencies

  13. Data from the Electronic Birth Certificate (Vital Records) • Unique identifier is Birth Certificate Number • Data are not 'cleaned' yet • May be duplicates if hospital sends information more than once • Fields exist for NBH screening results, already associated with the newborn • Newborn information for babies born out of state, born in transit or born at home as well as in birthing hospitals

  14. Data from EBC (cont.) Daily: • EBC processed the night before Weekly: • Infant death records • Voided Records Annually: • Correction tape for resident county

  15. Data from the Newborn Metabolic Screen (State Lab) Daily: • Unique identifier is accession number and form number • Data are final results from each screen • Demographic data on baby and mother • Information on hospital and doctor • Second screen may/may not have original form number (may have been lost) • Second screen may have new doctor

  16. Transaction Data • Input from any CHIRP or CHIRP-like application • Standard Format • Based on different types of events, e.g., birth, Dx, communication, status change • Data sent out from NEST in same format

  17. Second Step – Process the Data Daily: • Validate the data • Validate a unique identifier in input • Must be the same person • De-duplicate – SOUNDEX routine • Assign unique identifier: NEST_PID • Retain/record original EBC data • Retain/record original lab data • Retain/record original transaction data • Record all screening results (activities)

  18. De-Duplication Routine • If a potential unique number is received: • Verify that it is the same person • If not, it is an exception • Unique Numbers: • SSN – most babies don’t have one yet • EBC • Blood spot form number • Accession number combined with date
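
The identifier check on slide 18 can be sketched as a lookup followed by a verification step. This is only an illustration: the field names, the `index` dictionary, the `same_person` callback and the `IdentifierConflict` exception are all hypothetical, and the slides do not say how the real system verifies that two records describe the same person.

```python
from typing import Callable, Optional


class IdentifierConflict(Exception):
    """Raised when a known identifier points at a different person."""


# Hypothetical field names for the identifiers listed on the slide.
ID_FIELDS = ("ebc_number", "form_number", "accession_key", "ssn")


def find_by_identifier(
    incoming: dict,
    index: dict,
    same_person: Callable[[dict, dict], bool],
) -> Optional[str]:
    """Return the NEST_PID of an existing record that shares an identifier.

    `index` maps (field, value) pairs to stored records. Returns None when
    no identifier is present or known, so the caller can fall back to the
    SOUNDEX search. Raises IdentifierConflict when the identifier is known
    but the verification callback says it is a different person.
    """
    for field in ID_FIELDS:
        value = incoming.get(field)
        if not value:
            continue  # most newborns have no SSN yet, for example
        existing = index.get((field, value))
        if existing is None:
            continue
        if same_person(incoming, existing):
            return existing["nest_pid"]
        raise IdentifierConflict(
            f"{field}={value} is already recorded for a different person"
        )
    return None
```

Raising an exception for the mismatch case mirrors the slide's "it is an exception" bullet; a real system might instead route such records to a review queue.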

  19. SOUNDEX: Find Potential Matches • Find best selections for matching based on SOUNDEX keys • Base on the type of data you receive • Data from some sources are better than data from others • Reliability of data coming in • EBC considered ‘most right’

  20. Build a SOUNDEX Key for Input • Treat as all lower case • If first two letters are ei or ai change to i • Change all c to k • Change all ch to k • Change all ph to f • Change all z to s • Change all y to i

  21. Remove all duplicated letters • Remove all special characters (apostrophes, periods, hyphens, and spaces) • Keep first letter of each name part • Remove all vowels after 1st letter • Use first 4 remaining letters for last name • Use first 3 remaining letters for first name • Use middle initial • Put DOB in CCYYMMDD order
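
The key-building rules on slides 20 and 21 can be sketched in Python as follows. This is a minimal illustration, not the production routine. Several details are assumptions: `ch` is replaced before `c` (otherwise `ch` would become `kh`), "duplicated letters" is read as consecutive repeated letters, special characters are stripped first, each name is treated as a single part, and the pieces of the key are joined with `|`. The function names `name_code` and `soundex_key` are made up for this example.

```python
import re
from datetime import date

VOWELS = set("aeiou")


def name_code(name: str, length: int) -> str:
    """Reduce one name part to a short phonetic code (custom rules, not
    the classic Soundex algorithm)."""
    s = name.lower()
    s = re.sub(r"['.\-\s]", "", s)           # drop apostrophes, periods, hyphens, spaces
    if s[:2] in ("ei", "ai"):                 # leading ei/ai becomes i
        s = "i" + s[2:]
    s = s.replace("ch", "k").replace("c", "k")  # assumed order: ch before c
    s = s.replace("ph", "f").replace("z", "s").replace("y", "i")
    s = re.sub(r"(.)\1+", r"\1", s)           # collapse runs of a repeated letter
    if not s:
        return ""
    # keep the first letter, then drop every later vowel
    s = s[0] + "".join(ch for ch in s[1:] if ch not in VOWELS)
    return s[:length]


def soundex_key(last: str, first: str, middle: str, dob: date) -> str:
    """Last name (4 letters), first name (3), middle initial, DOB as CCYYMMDD."""
    return "|".join([
        name_code(last, 4),
        name_code(first, 3),
        middle[:1].lower(),
        dob.strftime("%Y%m%d"),
    ])


print(soundex_key("Gonzalez", "Phillip", "J", date(2005, 3, 9)))
# prints gnsl|flp|j|20050309
```

Under these rules, spelling variants such as "Gonzalez" and "Gonzales" reduce to the same last-name code, which is what lets the routine find candidates despite data-entry differences.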

  22. Special Considerations • Hispanic Surnames • Can be a composite: • Father’s last name • Mother’s last name • A composite name will be treated as 3 last names • Marital Status or Insurance restrictions • LAB under Mother’s last name at birth • Unmarried mom • Married mom but insurance still under maiden name • EBC under Father’s last name
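
One way to read "a composite name will be treated as 3 last names" is: the full composite plus each of its two parts. The sketch below assumes that reading and assumes the parts are separated by a space or hyphen. The helper name `last_name_variants` is hypothetical, and surnames with particles (for example "de la") would need extra handling that is not shown.

```python
import re


def last_name_variants(last_name: str) -> list[str]:
    """Return the last-name values to build keys from.

    A two-part surname yields three values: the whole name, part 1 and
    part 2. Anything else yields just the name itself.
    """
    whole = last_name.strip()
    parts = [p for p in re.split(r"[\s\-]+", whole) if p]
    if len(parts) == 2:
        return [whole, parts[0], parts[1]]
    return [whole] if whole else []


print(last_name_variants("García López"))
# prints ['García López', 'García', 'López']
```

Building keys from every variant lets a lab record filed under only one part of the surname still find the birth-certificate record filed under the composite.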

  23. Added Considerations • SOUNDEX might select a candidate, but the actual data earn no points because the spellings differ: • Lopes / Lopez • Gonzalez / Gonzales • Gomez / Gomes • We allow for points on a SOUNDEX match

  24. Example: SOUNDEX Routine

  25. SOUNDEX Key Types • 5 Key Types: • A) LastName FirstName MiddleInit Gender DOB (YYMM) • B) LastName Gender DOB (YYMMDD) TOB (HHMM) • C) LastName LastName FirstName DOB (YYMM) • D) LastName SOUNDEX • E) FirstName SOUNDEX
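
The five key types can be pictured as different combinations of already-encoded name codes and date fields. The sketch below is an assumption about layout only: the slides give the fields in each key type but not how they are concatenated, so the `A:`-style prefixes, the `|` separator and the function name `build_keys` are invented for illustration. The second last-name code in type C is assumed to come from another last-name source (for example a second surname part or a parent's last name).

```python
from datetime import date, time


def build_keys(
    last_code: str,
    other_last_code: str,
    first_code: str,
    middle_init: str,
    gender: str,
    dob: date,
    tob: time,
) -> dict[str, str]:
    """Assemble one key per key type from pre-computed name codes."""
    yymm = dob.strftime("%y%m")
    yymmdd = dob.strftime("%y%m%d")
    hhmm = tob.strftime("%H%M")
    return {
        "A": "|".join(["A:" + last_code, first_code, middle_init, gender, yymm]),
        "B": "|".join(["B:" + last_code, gender, yymmdd, hhmm]),
        "C": "|".join(["C:" + last_code, other_last_code, first_code, yymm]),
        "D": "D:" + last_code,
        "E": "E:" + first_code,
    }


keys = build_keys("gnsl", "lps", "flp", "j", "M", date(2005, 3, 9), time(14, 5))
print(keys["B"])
# prints B:gnsl|M|050309|1405
```

Prefixing each key with its type keeps a type D value from colliding with a type E value that happens to have the same letters.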

  26. SOUNDEX Keys • Up to 24 Different Values for the 5 Types • Example: LastName • Child’s Last Name • Child’s AKA Last Name • Child’s Last Name Part 1 • Child’s Last Name Part 2 • Mother’s Last Name • Mother’s Last Name Part 1 • Mother’s Last Name Part 2 • Father’s Last Name • Father’s Last Name Part 1 • Father’s Last Name Part 2

  27. Average Number of Keys • No child has all 24 keys • If Child and Mom and Dad all have same last name, Key is only created once – no duplicate SOUNDEX Keys for a child • Missing Data • On average, each of our children has 12 keys.
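
The "key is only created once" rule amounts to collecting a child's keys into a structure that ignores repeats and skips missing data. A minimal sketch, with a hypothetical function name and source labels:

```python
def distinct_keys(candidates):
    """Collapse (source, code) pairs into one entry per distinct code.

    Missing data (None or empty codes) produces no key. The first source
    that produced a code is remembered, purely for traceability.
    """
    keys = {}
    for source, code in candidates:
        if code:
            keys.setdefault(code, source)
    return keys


print(distinct_keys([
    ("child_last", "smth"),
    ("mother_last", "smth"),
    ("father_last", "smth"),
    ("child_aka_last", None),
]))
# prints {'smth': 'child_last'}
```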

  28. Scoring Routine • After a potential match is found, individual fields are compared and points awarded for matches • Actual Data Fields are compared: • Last name, first name, middle name, DOB, TOB • Mother’s last name, first name, maiden name, DOB • AKA names • Father’s last name, first name, DOB • Any unique identifiers recorded with the input and on the database (e.g., Birth Cert #, NBS form #, etc.)
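
A field-by-field scoring pass might look like the sketch below. The point values and field names are invented for illustration; the slides do not give the actual weights. Fields missing on either side simply earn no points here, which is one possible policy among several.

```python
# Hypothetical weights; the real point values are not given in the slides.
POINTS = {
    "last_name": 20,
    "first_name": 15,
    "middle_name": 5,
    "dob": 25,
    "tob": 10,
    "mother_last_name": 10,
    "birth_cert_no": 40,
}


def score(incoming: dict, existing: dict, points: dict = POINTS) -> int:
    """Sum the points for every field that is present and equal on both records."""
    total = 0
    for field, pts in points.items():
        a, b = incoming.get(field), existing.get(field)
        if a and b and str(a).strip().lower() == str(b).strip().lower():
            total += pts
    return total


incoming = {"last_name": "Lopez", "first_name": "Ana", "dob": "2005-03-09", "tob": "1405"}
existing = {"last_name": "LOPEZ ", "first_name": "Ana", "dob": "2005-03-09", "tob": "0910"}
print(score(incoming, existing))
# prints 60
```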

  29. Scoring Routine (cont.) • A score above a certain threshold indicates the same person – same NEST_PID assigned • A score under a certain threshold indicates a different person – new NEST_PID is generated • A score between those thresholds cannot be determined by the application and will need human intervention to determine
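
The two-threshold decision can be written as a small function. The threshold values here are placeholders, not the program's actual settings, and scores exactly equal to a threshold are assumed to go to human review because the slide only speaks of scores "above" and "under".

```python
def classify(score: int, same_threshold: int = 80, new_threshold: int = 40) -> str:
    """Map a match score to one of three outcomes.

    "match":  same person, reuse the existing NEST_PID
    "new":    different person, generate a new NEST_PID
    "review": cannot be decided automatically, needs human intervention
    """
    if score > same_threshold:
        return "match"
    if score < new_threshold:
        return "new"
    return "review"


print([classify(s) for s in (95, 60, 10)])
# prints ['match', 'review', 'new']
```

Moving the two thresholds closer together reduces the manual review workload at the cost of more automatic decisions that may be wrong, which is the trade-off the fine-tuning step has to manage.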

  30. Fine Tuning the De-duplication Routine • Make adjustments to the thresholds • Too many duplicates being added • Make adjustments to the points awarded for matches • Twins! Use birth type and order • Take away points for non-matches? • What if MI is present on one record and not on the other: some points? • Constant vigilance!

  31. Human Intervention • Can help fine tune the De-duplication Routine • Three options: • Override: • Add as a new person • Indicate a match and update information • Resubmit • Thread of processing / timing / Twins!

  32. Colorado Contact Cathy Gunderson, EHDI/NEST Project Manager Colorado Department of Public Health and Environment FCHSD-HCP-A4 4300 Cherry Creek Drive South Denver, CO 80246-1530 cathy.gunderson@state.co.us 303-692-2145
