
Analysis of Real-World Data

Static Stability Factor and the Risk of Rollover. April 11, 2001. References: Federal Register, June 1, 2000 (description of the original linear regression analysis); Federal Register, January 12, 2001 (description of the updated linear regression analysis).

ellie




Presentation Transcript


  1. Analysis of Real-World Data Static Stability Factor and the Risk of Rollover April 11, 2001

  2. References • Federal Register, June 1, 2000 • Description of the original linear regression analysis • Federal Register, January 12, 2001 • Description of the updated linear regression analysis • Comparison with logistic regression analysis

  3. Need to Specify • Vehicles • Calendar years • States • Crash types • Variables • Statistical model

  4. Criteria for Selecting Vehicles • Reliable estimate of the Static Stability Factor (SSF) • Model years 1988 and later • Sources include: • Vehicles tested by the agency • Passenger cars tested by General Motors

  5. Vehicles Selected • 100 vehicle model groups, including: • 36 cars • 30 SUVs • 13 vans • 21 pickup trucks

  6. Criteria for Selecting Calendar Years • Vehicle Identification Numbers (VINs) for that year had been decoded and included in the State Data System (SDS) • Wanted multiple years to maximize data available for analysis

  7. Calendar Years Selected • 1994-1997 for the original linear regression analysis • 1994-1998 for the updated linear regression analysis and the logistic regression analysis

  8. Criteria for Selecting States • Part of the SDS • Provided 1994-1998 calendar year data • Include VIN on the crash file • Identify rollover occurrence even if it is not the first harmful event in the crash

  9. States Selected • Florida • Maryland • Missouri • North Carolina • Pennsylvania • Utah

  10. Other SDS VIN States • VIN available for fatalities only • Kansas • VIN added in 1998 • Georgia • Incomplete rollover information • New Mexico • Ohio

  11. Criteria for Selecting Crashes • Single-vehicle crashes of study vehicles • Excluded crashes with other participants • Pedestrian, pedalcyclist, animal, or train • Excluded certain unusual situations • No driver, parked vehicle, pulling a trailer, or emergency use (ambulance, fire, police, or military)

  12. Crashes Selected • 241,036 single-vehicle crashes, including • 48,996 rollovers • This is 0.20 rollovers per single-vehicle crash, consistent with the national estimate from the General Estimates System for these calendar years and vehicle groups
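
The 0.20 figure on this slide follows directly from the two counts it quotes. A quick arithmetic check in Python (the variable names are illustrative only):

```python
# Counts quoted on the slide.
single_vehicle_crashes = 241_036
rollovers = 48_996

# Rollovers per single-vehicle crash.
rollover_rate = rollovers / single_vehicle_crashes
print(f"{rollover_rate:.4f}")  # 0.2033, which rounds to the 0.20 on the slide
```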

  13. Criteria for Selecting Variables • Variables describing purpose of study • Rollover (yes or no) • SSF (study values range from 1.00 to 1.53) • Confounding factors • Environmental and driver factors that describe how the vehicle was used • Want variables correlated with rollover risk, including travel speed

  14. Variables Selected • Rollover • SSF • Dichotomous variables based on: • Environmental factors (light condition, weather, urbanization, speed limit, road grade, road curve, road condition, surface condition) • Driver factors (sex, age, insurance coverage, alcohol/drug use) • Number of occupants in the vehicle

  15. Summary of Available Data • Six states • Five calendar years (1994-1998) • 100 vehicle groups with a reliable estimate of SSF • 14 confounding variables, including 10 available in all six states • 241,036 single-vehicle crashes, including • 48,996 rollovers

  16. Limitations • Pennsylvania dropped key road use variables (grade and curve) from its electronic file in 1998, so 1998 Pennsylvania data were not used here • Some variables were not available for all six states (urbanization, road condition, insurance coverage, and number of occupants in vehicle) • Could not be used in analysis of combined data • Were used in logistic analysis of individual states • Reporting practices vary by state

  17. Statistical Models • Linear model of summarized data • Logistic models of individual crashes

  18. Preparing Data for the Linear Model • Limited to state-vehicle groups with at least 25 observations • 518 state-vehicle groups used in analysis • Proportion of crashes involving each variable calculated for each state-vehicle group • Values ranged from 0 to 1 • For example: • Rollover risk described by rollovers per single-vehicle crash • Urbanization described by the proportion of crashes on rural roads
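
The summarization step on this slide can be sketched as follows. This is a minimal illustration, not the agency's code: the record layout (dict keys `state`, `vehicle_group`, `rollover`, `rural`) and the sample crashes are assumptions made up for the example, and only one explanatory variable is shown.

```python
from collections import defaultdict

MIN_OBSERVATIONS = 25  # threshold stated on the slide


def summarize(crashes, min_obs=MIN_OBSERVATIONS):
    """Collapse crash records into one row per (state, vehicle group).

    Each crash is a dict with hypothetical keys 'state', 'vehicle_group',
    'rollover' and 'rural'; the last two are 0/1 indicators.
    """
    groups = defaultdict(list)
    for crash in crashes:
        groups[(crash["state"], crash["vehicle_group"])].append(crash)

    summary = {}
    for key, rows in groups.items():
        n = len(rows)
        if n < min_obs:
            continue  # group has fewer than the required observations
        summary[key] = {
            "n": n,
            # Rollovers per single-vehicle crash, between 0 and 1.
            "rollover_risk": sum(r["rollover"] for r in rows) / n,
            # Share of the group's crashes on rural roads, between 0 and 1.
            "rural": sum(r["rural"] for r in rows) / n,
        }
    return summary


# Made-up records: 30 crashes for one group, 10 for another.
crashes = (
    [{"state": "Missouri", "vehicle_group": "A", "rollover": 1, "rural": 1}] * 6
    + [{"state": "Missouri", "vehicle_group": "A", "rollover": 0, "rural": 0}] * 24
    + [{"state": "Utah", "vehicle_group": "B", "rollover": 0, "rural": 1}] * 10
)
summary = summarize(crashes)
```

With these made-up records, only the Missouri group survives the 25-observation cut; the Utah group is dropped.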

  19. Specifying Linear Model Form • Dependent variable = LOG(rollover risk) • Rollover risk set at 0.0001 for state-vehicle groups with no rollovers so they can be included in model • Five dummy variables used to capture state-to-state differences in reporting practices • Missouri used as baseline case • Linear regression of the rollover variable as a function of the summarized explanatory variables and the state dummy variables
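
A small sketch of how the dependent variable and the state dummies described on this slide might be constructed. It assumes LOG means the natural logarithm, which the slide does not state, and the `is_<state>` dummy names are hypothetical.

```python
import math

STATES = ["Florida", "Maryland", "Missouri",
          "North Carolina", "Pennsylvania", "Utah"]
BASELINE = "Missouri"   # baseline case named on the slide
RISK_FLOOR = 0.0001     # value used for groups with no rollovers


def log_rollover_risk(rollover_risk, floor=RISK_FLOOR):
    """Dependent variable: log of rollover risk (natural log assumed).

    Groups with zero rollovers get the floor value so the log is defined.
    """
    if rollover_risk == 0:
        rollover_risk = floor
    return math.log(rollover_risk)


def state_dummies(state):
    """Five 0/1 indicators, one per non-baseline state."""
    return {f"is_{s}": int(s == state) for s in STATES if s != BASELINE}


print(state_dummies("Utah"))
print(log_rollover_risk(0.0) == math.log(0.0001))  # True
```

A Missouri record gets zeros on all five dummies, so the other states' coefficients are read as differences from Missouri.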

  20. Fitting the Linear Model • Each summary data point was weighted by the sample size, capped at 250 as a trade-off between two considerations • Sample size affects reliability of estimates • Model should fit over entire range of SSF • Stepwise procedure used forward variable selection and a significance level of 0.15 for entry into and removal from the model
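
The capped weighting can be illustrated with a deliberately simplified sketch. It fits a weighted straight line with a single predictor, whereas the model on the slides has several explanatory variables chosen stepwise; the data points are made up, and the function names are hypothetical.

```python
WEIGHT_CAP = 250  # cap stated on the slide


def capped_weight(sample_size, cap=WEIGHT_CAP):
    """Weight for one summary data point: its sample size, capped."""
    return min(sample_size, cap)


def weighted_line_fit(xs, ys, weights):
    """Weighted least-squares fit of y = intercept + slope * x."""
    total = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / total
    y_bar = sum(w * y for w, y in zip(weights, ys)) / total
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, xs, ys))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Made-up summary points: (SSF, log rollover risk, number of crashes).
points = [(1.05, -1.0, 40), (1.20, -1.5, 900), (1.40, -2.2, 120)]
weights = [capped_weight(n) for _, _, n in points]
intercept, slope = weighted_line_fit(
    [p[0] for p in points], [p[1] for p in points], weights
)
print(weights)  # [40, 250, 120]
```

Without the cap, the 900-crash group would carry 900 units of weight; with it, that group counts for 250, so groups at the ends of the SSF range still influence the fit.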

  21. Results of the Linear Model • Model selected six confounding factors (DARK, FAST, CURVE, MALE, YOUNG, and DRINK) and all five state dummies • R² = 0.88 for the model of rollover risk as a function of state, road use variables, and SSF • SSF variable coefficient was: • Important in terms of the size of the estimated effect • Highly significant in the model (P<0.0001)

  22. Predictions from the Linear Model • Model describes rollover risk as a function of the explanatory variables and can be used to: • Estimate rollover risk as a function of the SSF for any mix of road-use conditions • Adjust the observed rollover rate for each summary data point to account for differences in vehicle use • Next graph shows results for average conditions observed in the study data as a whole • Rollover risk is estimated as 0.20 in both the adjusted and the unadjusted data

  23. Fit of Linear Model

  24. Interpreting the Linear Model • Estimated rollover risk given a single-vehicle crash is halved when the SSF increases by 0.21 • For example, a vehicle with an SSF of 1.00 has twice the estimated rollover risk of a vehicle with an SSF of 1.21
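
Because the dependent variable is the log of rollover risk, a fixed change in SSF multiplies the risk by a fixed factor. The sketch below works backward from the halving statement to the slope it implies, assuming a natural-log model. The slope here is derived from the 0.21 figure on this slide and is not the coefficient reported in the Federal Register notices.

```python
import math

HALVING_STEP = 0.21  # SSF increase that halves estimated risk (from the slide)

# If log(risk) = a + b * SSF, then risk ratio = exp(b * change in SSF).
# Setting exp(b * 0.21) = 0.5 gives the implied slope b.
implied_slope = math.log(0.5) / HALVING_STEP


def risk_ratio(ssf_from, ssf_to, slope=implied_slope):
    """Estimated risk at ssf_to divided by estimated risk at ssf_from."""
    return math.exp(slope * (ssf_to - ssf_from))


print(round(implied_slope, 2))           # -3.3
print(round(risk_ratio(1.00, 1.21), 3))  # 0.5
print(round(risk_ratio(1.21, 1.00), 3))  # 2.0
```

The last line reproduces the slide's example: a vehicle with an SSF of 1.00 has twice the estimated risk of one with an SSF of 1.21.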

  25. Specifying Logistic Model Forms • Variables used • Individual explanatory variables or • Scenario risk variable • Approach used with states • Model each state, and average the results or • Model pooled data with dummy variables to capture state-to-state reporting differences

  26. Concept of Scenario Risk • Data divided into cells defined by explanatory variables • For each cell, scenario risk is rollovers per single-vehicle crash • For each crash, scenario risk is adjusted to reflect rollovers per single-vehicle crash for all other crashes in the cell • Idea is to use scenario risk in the logistic model in place of all the explanatory variables
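
One reading of the adjustment on this slide is a leave-one-out rate: each crash is assigned the rollover rate of the other crashes in its cell, so its own outcome does not feed into its own predictor. The sketch below follows that reading. The record layout is hypothetical, and returning `None` for a cell with a single crash is an assumption, since the slide does not say how such cells were handled.

```python
from collections import Counter


def scenario_risks(crashes, cell_keys):
    """Leave-one-out rollover rate for each crash within its cell.

    'crashes' is a list of dicts with a 0/1 'rollover' field plus the
    explanatory variables named in 'cell_keys' (hypothetical layout).
    """
    totals = Counter()
    rollovers = Counter()
    for crash in crashes:
        cell = tuple(crash[k] for k in cell_keys)
        totals[cell] += 1
        rollovers[cell] += crash["rollover"]

    risks = []
    for crash in crashes:
        cell = tuple(crash[k] for k in cell_keys)
        others = totals[cell] - 1
        if others == 0:
            risks.append(None)  # no other crashes in the cell (assumption)
        else:
            # Rollovers among the other crashes, per other crash.
            risks.append((rollovers[cell] - crash["rollover"]) / others)
    return risks


# Made-up crashes: three share a cell, one sits alone in its cell.
crashes = [
    {"dark": 1, "curve": 0, "rollover": 1},
    {"dark": 1, "curve": 0, "rollover": 0},
    {"dark": 1, "curve": 0, "rollover": 0},
    {"dark": 0, "curve": 1, "rollover": 1},
]
print(scenario_risks(crashes, ["dark", "curve"]))  # [0.0, 0.5, 0.5, None]
```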

  27. Fitting the Logistic Models • Models from individual states were based on the explanatory variables available in that state • Models from pooled data were limited to the explanatory variables available in all six states

  28. Results of the Logistic Models • The models from the six individual states and the two models based on pooled data all fit the data well • These models were consistent in showing a large and significant effect for SSF

  29. Predictions from the Logistic Models • Logistic models describe the change in the log(odds) of rollover as a function of the change in the SSF • Results can be used to predict the absolute rollover risk as a function of the SSF for a given set of conditions • Here, estimates of average SSF and odds of rollover are based on the data as a whole • The four summary models produce similar results
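
Converting a change in log(odds) into an absolute risk can be sketched as follows. The prediction is anchored at the average SSF and the average rollover risk, as the slide describes. The SSF coefficient and the average SSF are not given on the slides, so the values passed in below are placeholders chosen only to make the example run.

```python
import math


def predicted_risk(ssf, ssf_avg, risk_avg, beta_ssf):
    """Absolute rollover risk at a given SSF from a logistic model.

    Starts from the log(odds) at the average SSF and shifts it by
    beta_ssf for each unit of SSF above or below that average.
    """
    log_odds_avg = math.log(risk_avg / (1.0 - risk_avg))
    log_odds = log_odds_avg + beta_ssf * (ssf - ssf_avg)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Placeholder inputs: 0.20 is the overall rollover rate from the slides;
# the average SSF (1.25) and coefficient (-3.0) are made up.
for ssf in (1.00, 1.25, 1.50):
    print(ssf, round(predicted_risk(ssf, 1.25, 0.20, -3.0), 3))
```

With a negative coefficient, predicted risk falls as SSF rises and equals the anchor value exactly at the average SSF.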

  30. Comparison of Linear and Logistic Models • Linear and logistic models both suggest SSF has a large effect on rollover risk • Next graph compares results of linear model with results of logistic model from pooled data with individual explanatory variables

  31. Predictions from the Models

  32. Conclusions • Advantages of linear model of summary data • All summary data can be shown • Simpler to explain • Advantages of logistic analysis • Includes full range of values and interactions because not restricted to averages for each vehicle group • Better for measuring effects of explanatory variables because most were significant in the models • In this analysis, logistic analysis appeared to confirm the general pattern of the linear results
