Collection and Use of Industry and Occupation Data II: Overview and Goals. Rosemary D. Cress, DrPH, Research Program Director. NAACCR: June 19-25, 2010, Quebec City.
Introduction • The California Cancer Registry has collected information about cases of cancer diagnosed among all California residents since 1988. • In 2008, more than 145,000 Californians were diagnosed with cancer.
Introduction • Information collected for each patient includes: • Patient characteristics (age, sex, race, address, etc.) • Tumor characteristics (site, histology, extent of disease) • First course of treatment (surgery, chemotherapy, radiation)
Introduction • Information on industry and occupation (I&O) is collected in text fields. • The CCR began manually coding narrative I&O in 1995, then developed an autocoding program. • Autocoding stopped in 2002, for lack of resources, when the CCR moved to a new data management system.
Introduction • In 2007, the Centers for Disease Control and Prevention (CDC) funded the National Institute for Occupational Safety and Health (NIOSH) to work with the California Cancer Registry (CCR) on a pilot study exploring the feasibility of coding the industry and occupation (I&O) information contained in CCR records.
Goals • Goals of the project: • Develop an autocoding program to assign codes to I&O text fields in registry records • Use these data to calculate incidence rates and standardized rate ratios for major cancer sites for various construction occupations • Demonstrate the usefulness of this coding to other state cancer registries
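As a rough illustration of the second goal, the sketch below computes directly age-standardized incidence rates and their ratio. The slides do not say which standardization method, standard population, or denominators the project used, so the function names, age groups, weights, and counts here are all hypothetical.

```python
def age_standardized_rate(cases, person_years, std_weights, per=100_000):
    """Directly age-standardized rate: weighted mean of age-specific rates."""
    total_weight = sum(std_weights)
    weighted = sum(
        w * c / py for c, py, w in zip(cases, person_years, std_weights)
    )
    return weighted / total_weight * per


def standardized_rate_ratio(rate_group, rate_reference):
    """Ratio of two age-standardized rates (group of interest / reference)."""
    return rate_group / rate_reference


# Made-up numbers for two age groups; not CCR data.
weights = [0.6, 0.4]                 # hypothetical standard population shares
construction = age_standardized_rate([10, 40], [100_000, 100_000], weights)
reference = age_standardized_rate([5, 20], [100_000, 100_000], weights)
srr = standardized_rate_ratio(construction, reference)
print(round(construction, 1), round(reference, 1), round(srr, 2))
```

A ratio above 1 would suggest a higher standardized rate in the occupation group than in the reference group, subject to the usual caveats about denominators and confidence intervals.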
Methods • During the first year, NIOSH staff translated the older CCR autocoding program to SAS. The program uses a “look-up” table of I&O codes and their matching text. • CCR staff used the program to evaluate the completeness of I&O coding in the CCR database and to identify uncoded text strings.
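The look-up approach described above can be sketched in a few lines. This is an illustration in Python, not the project's SAS program. The normalization rules, the table contents, and the code values are placeholders invented for the example and are not real Bureau of the Census codes.

```python
import re


def normalize(text):
    """Uppercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^A-Z0-9 ]", " ", text.upper())
    return " ".join(cleaned.split())


# Hypothetical look-up table: normalized text -> placeholder code.
OCCUPATION_LOOKUP = {
    "CARPENTER": "OCC-A",
    "ELECTRICIAN": "OCC-B",
    "RETIRED": "OCC-RETIRED",
}


def autocode(text, lookup):
    """Return the code for an exact match on normalized text, else None."""
    return lookup.get(normalize(text))


print(autocode("  carpenter. ", OCCUPATION_LOOKUP))  # matched after cleanup
print(autocode("roofer", OCCUPATION_LOOKUP))         # no match: stays uncoded
```

Exact matching on normalized text keeps the coder predictable: a string is coded only when a human has already assigned a code to that same text, and everything else is left uncoded for review.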
Methods (continued) • Uncoded I&O text strings were prioritized by frequency. • CCR staff provided I&O text strings to NIOSH for coding. • USC and NIOSH staff coded text strings to the Bureau of the Census 1990 I&O codes. • These new codes and text strings were added to the SAS program, and the process was repeated.
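The prioritize-code-repeat loop might look like the following sketch, assuming that uncoded strings are ranked by how many records share them. The function name, sample strings, and placeholder codes are hypothetical.

```python
from collections import Counter


def rank_uncoded(texts, lookup):
    """Count text strings missing from the look-up table, most frequent first."""
    counts = Counter(t.strip().upper() for t in texts if t.strip())
    uncoded = {t: n for t, n in counts.items() if t not in lookup}
    return sorted(uncoded.items(), key=lambda item: (-item[1], item[0]))


# Made-up occupation text fields; not CCR data.
records = ["roofer", "Roofer", "carpenter", "drywall hanger", "ROOFER",
           "drywall hanger"]
lookup = {"CARPENTER": "OCC-A"}       # placeholder code

queue = rank_uncoded(records, lookup)
print(queue)

# After human coders assign a code to the top string, add it and rerun.
lookup["ROOFER"] = "OCC-C"            # placeholder code
print(rank_uncoded(records, lookup))
```

Working from the most frequent strings down means each manual coding decision resolves as many records as possible, which is consistent with the slides' later note that the remaining cases have unique text strings.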
Results • At the beginning of the study, 49.3% of the records in the CCR database had I&O coded. • As of April 2010, 78.4% of nearly three million records (diagnosed 1988-2007) were coded.
Results • Over 75,000 cancer patients with a construction occupation have been identified. • Over 490,000 patients are coded as “retired,” and over 800,000 as “unemployed or unknown,” making their records unusable for I&O research. • Over 600,000 remain uncoded.
Results • The remaining uncoded cases have unique text strings that will likely take several years to code. • Eventually nearly half of CCR cases will be assigned valid I&O codes (excluding retired)
Conclusions • This study has demonstrated that it is feasible to code I&O text fields for use in research into occupational risk factors for cancer. • It also demonstrated a relatively low-cost way to obtain I&O information for a large number of cancer patients.
Conclusions • However, incomplete I&O information recorded in the medical record remains a barrier.
Future Plans • In the next year, we will be working to incorporate the coded I&O data into our research dataset. • We are also exploring how to incorporate I&O information from death certificates into the CCR database.
Acknowledgements • CDC/NIOSH • Geoffrey Calvert • Sara Luckhaupt • Pamela Schumacher • Rui Shen • California Cancer Registry, Sacramento • Katrina Bauer • Mark Allen • Los Angeles Cancer Surveillance Program, USC • Dennis Deapen • Shirley Miyashiro