Validation and Implication of Segmentation on Empirical Bayes for Highway Safety Studies

Validation and Implication of Segmentation on Empirical Bayes for Highway Safety Studies Reginald R. Souleyrette, Robert P. Haas and T. H. Maze Iowa State University, SAIC and Iowa State University ENVIRONMENTAL HEALTH RISK 2007 Fourth International Conference on The Impact of Environmental Factors on Health MALTA; 27 - 29 June, 2007

The highway safety problem Source: World Health Organization

Mitigation approaches – 4Es • Education • Enforcement • Emergency Response • Engineering

Engineering studies • Limited resources • Highest benefit desired • High Crash Locations • Before and After Studies • Small sample size  high variance • Selection bias  regression to the mean (RTM)

Objectives • Validate the state of the art statistical approach, known as empirical Bayesian • Demonstrate tradeoffs between model quality and data quantity • Investigate effect of data aggregation • … to improve identification and therefore mitigation of high crash locations

Statistical approaches we could take… • Use long periods • Use large number of locations • Use Empirical Bayes (EB) • Substitutes “similar” locations for longer observation time • “Weights” site and similar-site data

Mr. Smith • Mr. Smith had no crashes last year • The average of similar drivers is 0.8 crashes per year • What do we expect is the number of crashes Mr. Smith will have next year … 0?, 0.8? … • Answer … use both pieces of information and weight the expectation Hauer, E., D.W. Harwood, F.M. Council, M.S. Griffith, “The Empirical Bayes method for estimating safety: A tutorial.” Transportation Research Record 1784, pp. 126-131. National Academies Press, Washington, D.C.. 2002 http://members.rogers.com/hauer/Pubs/TRBpaper.pdf

Empirical Bayes (EB) • We have two types of information • We compute an estimate which is an average of both • How much to weight the two depends on… • Quantity • Quality • Accepted practice… small scale What should the weight be???

1 ________ 1+(μ∙Y)/φ w = overdispersion factor weight applied to model estimate number of years mean # crashes/year from model Need: - model for similar sites (neg. binomial) Need: site data EB estimate = w∙(model estimate) + (1-w)∙(site average)

2000 2001 2002 2003 2004 2004 Objective #1 Test effectiveness of EB by comparing: • a single year of data from many locations, with different models and the Empirical Bayes formula, vs. • several years of crash data at specific locations

Objective #2 explore the relationship between segmentation and accuracy of estimates

Description of Data Roads (Iowa) • All (19,400km) • Freeways (1400km) • Multilane (8000km) • 2-lane (10,000km) • Low ADT (1200 VPD) • Med ADT (2400 VPD) • High ADT (4400 VPD) • Segments • 400m (short) • 4km (med) • 6.8km (long)

Description of Data Intersections (California) • Multiphase (873) • Single Phase (374) • Thru-stop (3047) • 5 years of data • large-scale validation

Analysis – Intersections Three model forms: • Crashes = α(mainline traffic)β, • Crashes = α(mainline traffic)β(cross street traffic)γ • Crashes = α (mainline traffic)β(cross street lanes)δ Three types of intersections • multiphase signals • Single phase signals • Stop sign control Intersection model parameters and descriptive statistics

Example intersection crash models (only 2 dimensions shown)

Intersection ResultsTop 10 high crash locations in 2003* Highest in 2003 Trying to predict this 4 year average “better” slightly more often than EB Not intuitive EB model “a” lowest error * California HSIS Multiphase 4 leg

Using 4 years of data + EB Now, model “d” never best estimate, but still best model four times? Now, EB better more often

Intersection ResultsEffect on Ranking all models “comparable” EB does slightly better than 4 year average, or 2003 alone

Analysis – Roads • crashes=α(length)(ADT)β • 3 types of roads • Freeway • Multilane divided • 2-lane • 3 segmentations • 0.4, 3.8, and 11.6 km, on average • 3 traffic ranges (L,M,H) • 15 models Road segment model parameters and descriptive statistics

Effect of Segmentation on Correction Freeway-type segments Shortest segments Average length 0.4 km Longest segments Average length 11.6 km Medium segments Average length 3.8 km Note higher EB correction for short segments

Conclusions • EB+1yr ≈ 4yrs of data • Better model did not necessarily improve prediction (at least for the 10 intersections selected) • Longer segment models are more accurate • Intersection 4-year averages and models are relatively poor predictors • But when combined using EB, better

Thank you reg@iastate.edu

Validation and Implication of Segmentation on Empirical Bayes for Highway Safety Studies

Validation and Implication of Segmentation on Empirical Bayes for Highway Safety Studies

Presentation Transcript

Image Warping using Empirical Bayes

Validation Studies and the Lawyer

Chapter 11. Highway Traffic Safety: Studies, Statistics, and Programs

Lecture 3 Empirical Bayes and Proc Mixed

Empirical Studies of WTC

Empirical Bayes Analysis of Variance Component Models for Microarray Data

Guidelines for Conducting and Evaluating Empirical Studies

Validation and Implication of Segmentation on Empirical Bayes for Highway Safety Studies

Empirical studies

Neural Network Segmentation and Validation

Improving Highway Safety

Highway Safety Improvement Program/ Strategic Highway Safety Plan

Highway Safety Program

Standing Committee on Highway Traffic Safety

Impact, validation and regulatory implication of rapid methods

AASHTO Vision for Highway Safety

WP7: Empirical Studies

A National Strategy on Highway Safety

§❸Empirical Bayes

Empirical studies on human-in-the-loop?

The Empirical Bayes Method for Safety Estimation Doug Harwood MRIGlobal Kansas City, MO

Highway Safety Syllabus