Session 38: Alcohol Imputation Model. Why Impute? Joseph (Joe) M. Tessmer, Mathematical Analysis Division. Why? There is a problem in alcohol reporting: the level of BAC reporting for drivers and non-occupants varies widely by state.
Session 38: Alcohol Imputation Model. Why Impute? Joseph (Joe) M. Tessmer, Mathematical Analysis Division
Wide Range of BAC Reporting by State • The share of drivers and non-occupants involved in fatal crashes with a reported alcohol test result ranged by state from less than 12 percent to more than 86 percent
Why Impute? • Reduce potential biases in estimates • Between 14% and 88% of BAC test results are missing in FARS, depending on the state • Nationally, approximately 60% of BAC data are missing for drivers and non-occupants
Why Impute? • If the individuals selected for BAC testing are not a random sample, the estimates will be biased • Often only drivers suspected of a high BAC are tested • Estimates based only on tested individuals would therefore overestimate BAC levels • 44% of tested individuals had BAC > 0
Current Imputation Procedure • Three-level discriminant analysis • For each missing BAC value, calculate the probability that: 1) BAC = 0; 2) 0 < BAC < 0.10; 3) BAC >= 0.10 • Note that the three probabilities add to 1
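As a rough illustration of what the three-level output looks like, the sketch below stores one probability triple per driver with a missing BAC and uses the triples as weights to get expected counts per class. The probabilities are made-up numbers for illustration, not output of the actual discriminant model, and the variable names are hypothetical.

```python
# Hypothetical class probabilities for three drivers with missing BAC.
# Columns: P(BAC = 0), P(0 < BAC < 0.10), P(BAC >= 0.10).
probs = [
    (0.90, 0.06, 0.04),
    (0.15, 0.25, 0.60),
    (0.70, 0.20, 0.10),
]

# Each row is a probability distribution over the three classes,
# so it must sum to 1 (up to floating-point rounding).
for row in probs:
    assert abs(sum(row) - 1.0) < 1e-9

# Using the probabilities as weights: the expected number of drivers
# in each class is the sum of that class's column.
expected_counts = [sum(column) for column in zip(*probs)]

for label, count in zip(("BAC = 0", "0 < BAC < 0.10", "BAC >= 0.10"), expected_counts):
    print(f"{label}: {count:.2f} expected drivers")
```

The expected counts add up to the number of drivers, which is why the probabilities can serve as weights in tabulations even though no individual BAC value is produced.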
Current Imputation Procedure • Provides some useful information • It was a major step forward when introduced in 1986 • It is a rigid procedure: its class boundaries are fixed at 0 and 0.10, so it cannot be used to quantify the effect of the current 0.08 BAC legislation
Current Imputation Procedure • Results cannot be used as input to other types of analysis • Cannot be used as an independent variable in crash analysis • Can be used as a weight
Why Change to Multiple Imputation? • State-of-the-art solution • Imputed values are actual BAC levels, which can be used in additional analysis
Why Change to Multiple Imputation? • Improves fidelity of results • Permits analysis at any level of BAC • The old technique gives only the probability that a value falls within one of three ranges, which is more difficult to use
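To show why imputed BAC values permit analysis at any level, here is a minimal sketch that computes the share of drivers at or above an arbitrary threshold, once per imputation. The BAC values, the function name `share_at_or_above`, and the use of five imputations per driver (for brevity) are all assumptions made for illustration.

```python
# Hypothetical imputed BAC values: one row per driver with missing BAC,
# one column per imputation (five columns here, for brevity).
imputed = [
    [0.00, 0.00, 0.00, 0.00, 0.00],
    [0.14, 0.09, 0.17, 0.12, 0.07],
    [0.00, 0.00, 0.03, 0.00, 0.00],
    [0.21, 0.18, 0.25, 0.16, 0.19],
]


def share_at_or_above(threshold, values):
    """Return the fraction of drivers with BAC >= threshold, one per imputation."""
    n_imputations = len(values[0])
    shares = []
    for m in range(n_imputations):
        column = [row[m] for row in values]  # the m-th completed data set
        shares.append(sum(bac >= threshold for bac in column) / len(column))
    return shares


# The same imputed data answer questions at different thresholds.
shares_008 = share_at_or_above(0.08, imputed)
shares_010 = share_at_or_above(0.10, imputed)
print("Share with BAC >= 0.08, per imputation:", shares_008)
print("Share with BAC >= 0.10, per imputation:", shares_010)
```

With the three-range probabilities, a 0.08 threshold falls inside the middle range and cannot be separated out; with imputed values it is just another cut point.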
Why Change to Multiple Imputation? • Can calculate the standard error of the estimates • Achieve greater confidence in results (narrower confidence limits)
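The slides do not say how the standard error is obtained. A common approach for multiply imputed data is Rubin's combining rules, sketched below as an assumption rather than as the procedure actually used. The per-imputation estimates and variances are made-up numbers, and `pool_estimates` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Combine per-imputation results with Rubin's rules.

    estimates: one point estimate per imputed data set
    variances: the squared standard error of each of those estimates
    Returns the pooled estimate and its standard error.
    """
    m = len(estimates)
    pooled = mean(estimates)          # average of the point estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # sample variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Hypothetical share of drivers with BAC >= 0.08 in each of five imputed data sets.
estimates = [0.40, 0.42, 0.39, 0.41, 0.43]
variances = [0.0004] * 5              # hypothetical squared standard errors

pooled, std_error = pool_estimates(estimates, variances)
print(f"pooled estimate = {pooled:.3f}, standard error = {std_error:.4f}")
```

The between-imputation term is what reflects the uncertainty due to the missing BAC values; a single imputed value per case would leave it out.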
Example 1 • Driver characteristics: female driver; 36 years old; seat belt used • Crash characteristics: 8:20 a.m. on a Tuesday in October; 3 passengers, all children, in the vehicle; 2 vehicles involved • Police reported no drinking, and there are no BAC data • Estimated BAC = 0.00
Example 2 • Driver characteristics: male driver; 23 years old; seat belt not used • Crash characteristics: 2:10 a.m. on a Saturday in July; no passengers; single-vehicle crash • Police reported drinking, but there are no BAC data • Estimated BAC = 0.14
Example 3 • Driver characteristics: male driver; 23 years old; seat belt not used • Crash characteristics: 2:10 a.m. on a Saturday in July; no passengers; single-vehicle crash • Police reported no drinking, and there are no BAC data • Estimated BAC = 0.00
Multiple Imputation Is a Lot of Work... • Uses several characteristics of the crash and of the driver or non-occupant to estimate 10 BAC values for each case with a missing BAC • Does it work? • Are the numbers right?
Verification Test • Select a year of FARS data • Restrict the data to cases with known BAC • Randomly recode 25% of the known BAC values as “missing”
Verification Test • Impute the “missing” data based on the remaining 75% • Compare the estimates with the actual values • Repeat the test
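The verification steps can be sketched as a small holdout experiment. Everything below is illustrative: the BAC list is made up, and `draw_from_observed` is a deliberately simple stand-in imputer (random draws from the retained values) that ignores crash and driver characteristics, unlike the imputation model the slides describe.

```python
import random


def draw_from_observed(training, n, rng):
    """Stand-in imputer (an assumption): draw n values at random from the kept data."""
    return [rng.choice(training) for _ in range(n)]


def share_positive(values):
    """Fraction of values with BAC > 0."""
    return sum(v > 0 for v in values) / len(values)


def verification_test(known_bac, impute, holdout=0.25, seed=0):
    """Hide a random share of known BAC values, impute them, and compare.

    Returns (share with BAC > 0 among the hidden actual values,
             share with BAC > 0 among the imputed values).
    """
    rng = random.Random(seed)
    indices = list(range(len(known_bac)))
    rng.shuffle(indices)
    n_hidden = round(holdout * len(indices))
    hidden, kept = indices[:n_hidden], indices[n_hidden:]

    training = [known_bac[i] for i in kept]   # the 75% left as known
    actual = [known_bac[i] for i in hidden]   # the 25% recoded as "missing"
    imputed = impute(training, len(actual), rng)
    return share_positive(actual), share_positive(imputed)


# Hypothetical known BAC values: 56 zeros and 44 positive readings.
known_bac = [0.0] * 56 + [0.05, 0.12, 0.18, 0.22] * 11

# Repeat the test with different random splits.
for seed in range(3):
    actual_share, imputed_share = verification_test(known_bac, draw_from_observed, seed=seed)
    print(f"seed {seed}: actual {actual_share:.2f}, imputed {imputed_share:.2f}")
```

Because the hidden values are known, any systematic gap between the two shares across repetitions would point to bias in the imputation.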