
Validation and Implication of Segmentation on Empirical Bayes for Highway Safety Studies

This study validates the empirical Bayes (EB) statistical approach for identifying high crash locations on highways. It compares EB estimates with those from other models, investigates the impact of data aggregation, and examines the relationship between segmentation and the accuracy of estimates.


Presentation Transcript


  1. Validation and Implication of Segmentation on Empirical Bayes for Highway Safety Studies Reginald R. Souleyrette, Robert P. Haas and T. H. Maze Iowa State University, SAIC and Iowa State University ENVIRONMENTAL HEALTH RISK 2007 Fourth International Conference on The Impact of Environmental Factors on Health MALTA; 27 - 29 June, 2007

  2. The highway safety problem Source: World Health Organization

  3. Mitigation approaches – 4Es • Education • Enforcement • Emergency Response • Engineering

  4. Engineering studies • Limited resources • Highest benefit desired • High Crash Locations • Before and After Studies • Small sample size → high variance • Selection bias → regression to the mean (RTM)
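The selection-bias point on slide 4 can be illustrated with a small simulation. This is a minimal sketch, not part of the original study: it assumes every site has the same true mean crash rate, draws Poisson counts for a "before" and an "after" year, and picks the worst sites from the before year. The function names, the number of sites, and the mean of 2.0 crashes/year are all assumptions made for illustration.

```python
import math
import random


def poisson(rng, mean):
    """Draw one Poisson variate using Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_rtm(n_sites=1000, true_mean=2.0, top_n=10, seed=1):
    """Select the top_n sites by 'before' count and compare with their 'after' counts.

    All sites share the same true mean (an assumption), so any drop in the
    selected sites' average is regression to the mean, not a safety effect.
    """
    rng = random.Random(seed)
    before = [poisson(rng, true_mean) for _ in range(n_sites)]
    after = [poisson(rng, true_mean) for _ in range(n_sites)]
    worst = sorted(range(n_sites), key=lambda i: before[i], reverse=True)[:top_n]
    before_avg = sum(before[i] for i in worst) / top_n
    after_avg = sum(after[i] for i in worst) / top_n
    return before_avg, after_avg


before_avg, after_avg = simulate_rtm()
print(f"selected sites, before: {before_avg:.1f} crashes/year")
print(f"selected sites, after:  {after_avg:.1f} crashes/year")
```

Because nothing was changed at the selected sites, the apparent improvement in the "after" year comes entirely from having picked sites on the basis of an unusually high count.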

  5. Objectives • Validate the state-of-the-art statistical approach, known as empirical Bayes (EB) • Demonstrate tradeoffs between model quality and data quantity • Investigate effect of data aggregation • … to improve identification and therefore mitigation of high crash locations

  6. Statistical approaches we could take… • Use long periods • Use large number of locations • Use Empirical Bayes (EB) • Substitutes “similar” locations for longer observation time • “Weights” site and similar-site data

  7. Mr. Smith • Mr. Smith had no crashes last year • The average of similar drivers is 0.8 crashes per year • What do we expect is the number of crashes Mr. Smith will have next year … 0?, 0.8? … • Answer … use both pieces of information and weight the expectation Hauer, E., D.W. Harwood, F.M. Council, M.S. Griffith, “The Empirical Bayes method for estimating safety: A tutorial.” Transportation Research Record 1784, pp. 126-131. National Academies Press, Washington, D.C.. 2002 http://members.rogers.com/hauer/Pubs/TRBpaper.pdf

  8. Empirical Bayes (EB) • We have two types of information • We compute an estimate which is an average of both • How much to weight the two depends on… • Quantity • Quality • Accepted practice… small scale What should the weight be???

  9. $w = \dfrac{1}{1 + (\mu \cdot Y)/\phi}$, where $w$ = weight applied to model estimate, $\mu$ = mean # crashes/year from model, $Y$ = number of years, $\phi$ = overdispersion factor • Need: model for similar sites (neg. binomial) • Need: site data • $\text{EB estimate} = w \cdot (\text{model estimate}) + (1 - w) \cdot (\text{site average})$
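The weight and estimate on slide 9 can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are made up, and the overdispersion value of 1.6 is an assumption chosen for the example, since the slides do not give one. The inputs reuse the Mr. Smith example from slide 7 (model estimate 0.8 crashes/year, 0 crashes observed in 1 year).

```python
def eb_weight(mu, years, phi):
    """Weight applied to the model estimate: w = 1 / (1 + mu*Y/phi)."""
    return 1.0 / (1.0 + (mu * years) / phi)


def eb_estimate(mu, site_average, years, phi):
    """EB estimate = w*(model estimate) + (1 - w)*(site average), in crashes/year."""
    w = eb_weight(mu, years, phi)
    return w * mu + (1.0 - w) * site_average


# Mr. Smith (slide 7): similar drivers average 0.8 crashes/year; he had 0 last year.
# phi = 1.6 is a hypothetical overdispersion factor, used only for illustration.
w = eb_weight(mu=0.8, years=1, phi=1.6)
estimate = eb_estimate(mu=0.8, site_average=0.0, years=1, phi=1.6)
print(f"w = {w:.3f}, EB estimate = {estimate:.3f} crashes/year")
# prints: w = 0.667, EB estimate = 0.533 crashes/year
```

Under these assumed values the estimate falls between 0 and 0.8, as slide 7 suggests it should. Increasing `years` lowers `w`, so more site history shifts the estimate toward the site's own average.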

  10. Objective #1 [Figure: timeline of data years, 2000 to 2004] Test effectiveness of EB by comparing: • a single year of data from many locations, with different models and the Empirical Bayes formula, vs. • several years of crash data at specific locations

  11. Objective #2 explore the relationship between segmentation and accuracy of estimates

  12. Description of Data Roads (Iowa) • All (19,400km) • Freeways (1400km) • Multilane (8000km) • 2-lane (10,000km) • Low ADT (1200 VPD) • Med ADT (2400 VPD) • High ADT (4400 VPD) • Segments • 400m (short) • 4km (med) • 6.8km (long)

  13. Description of Data Intersections (California) • Multiphase (873) • Single Phase (374) • Thru-stop (3047) • 5 years of data • large-scale validation

  14. Analysis – Intersections Three model forms: • Crashes = α(mainline traffic)^β • Crashes = α(mainline traffic)^β (cross street traffic)^γ • Crashes = α(mainline traffic)^β (cross street lanes)^δ Three types of intersections • Multiphase signals • Single phase signals • Stop sign control Intersection model parameters and descriptive statistics
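The three intersection model forms on slide 14 are power functions of traffic volume and, in the third form, of the number of cross street lanes. A minimal sketch of how they could be evaluated follows. The function names and every parameter value are hypothetical; the fitted parameters from the study are not reproduced in this transcript.

```python
def crashes_mainline(alpha, beta, mainline):
    """Form 1: crashes = alpha * (mainline traffic)**beta."""
    return alpha * mainline ** beta


def crashes_mainline_cross(alpha, beta, gamma, mainline, cross):
    """Form 2: adds a (cross street traffic)**gamma term."""
    return alpha * mainline ** beta * cross ** gamma


def crashes_mainline_lanes(alpha, beta, delta, mainline, cross_lanes):
    """Form 3: adds a (cross street lanes)**delta term."""
    return alpha * mainline ** beta * cross_lanes ** delta


# Hypothetical parameters and traffic volumes, for illustration only.
form1 = crashes_mainline(alpha=0.001, beta=0.5, mainline=10000)
form2 = crashes_mainline_cross(alpha=0.001, beta=0.5, gamma=0.5,
                               mainline=10000, cross=400)
form3 = crashes_mainline_lanes(alpha=0.001, beta=0.5, delta=1.0,
                               mainline=10000, cross_lanes=2)
print(form1, form2, form3)
```

Each form is linear in its parameters after taking logarithms, for example log(crashes) = log α + β·log(mainline traffic), which is why models of this shape are commonly fitted as negative binomial regressions with a log link.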

  15. Example intersection crash models (only 2 dimensions shown)

  16. Intersection Results: Top 10 high crash locations in 2003* Highest in 2003 Trying to predict this 4 year average “better” slightly more often than EB Not intuitive EB model “a” lowest error * California HSIS Multiphase 4 leg

  17. Using 4 years of data + EB Now, model “d” never best estimate, but still best model four times? Now, EB better more often

  18. Intersection Results: Effect on Ranking all models “comparable” EB does slightly better than 4 year average, or 2003 alone

  19. Analysis – Roads • crashes = α(length)(ADT)^β • 3 types of roads • Freeway • Multilane divided • 2-lane • 3 segmentations • 0.4, 3.8, and 11.6 km, on average • 3 traffic ranges (L,M,H) • 15 models Road segment model parameters and descriptive statistics
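The road segment model on slide 19 scales linearly with segment length and as a power of ADT. The sketch below evaluates it for the three average segment lengths listed on the slide and the three ADT levels from slide 12. The function name and the values of α and β are hypothetical and chosen only to make the example run; they are not the fitted parameters from the study.

```python
def segment_crashes(alpha, beta, length_km, adt):
    """Expected crashes per year on a segment: alpha * length * ADT**beta."""
    return alpha * length_km * adt ** beta


# Hypothetical model parameters, for illustration only.
ALPHA, BETA = 0.0005, 0.8

for label, adt in (("Low", 1200), ("Med", 2400), ("High", 4400)):
    for length_km in (0.4, 3.8, 11.6):
        expected = segment_crashes(ALPHA, BETA, length_km, adt)
        print(f"{label:>4} ADT, {length_km:>4} km: {expected:.2f} crashes/year")
```

With this model form, short segments have small expected counts. That fits the small-sample, high-variance concern raised on slide 4.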

  20. Effect of Segmentation on Correction Freeway-type segments Shortest segments Average length 0.4 km Longest segments Average length 11.6 km Medium segments Average length 3.8 km Note higher EB correction for short segments

  21. Conclusions • EB+1yr ≈ 4yrs of data • Better model did not necessarily improve prediction (at least for the 10 intersections selected) • Longer segment models are more accurate • Intersection 4-year averages and models are relatively poor predictors • But when combined using EB, better

  22. Thank you reg@iastate.edu
