This study uses data mining to predict road crash count by analyzing skid resistance values. The research establishes a relationship between road attributes and crash risk to help identify high-risk roads. Data from Queensland's road system is analyzed to develop models and assess road segment crash count stability and characteristics. The results show a strong correlation between actual and predicted crash counts, highlighting the importance of skid resistance in determining crash risk. The models can be used for decision support systems in road safety. Future work includes collaboration with road asset experts to enhance model quality and potential application in other regions.
USING DATA MINING TO PREDICT ROAD CRASH COUNT WITH A FOCUS ON SKID RESISTANCE VALUES
Authors: Daniel Emerson, Richi Nayak (QUT); Justin Weligamage (QDTMR)
Presenter: Daniel Emerson, Computer Science Discipline, Queensland University of Technology (QUT)
Project Details
• The work for this presentation was conducted at QUT as part of a larger skid resistance and crash analysis under the CIEAM I and CIEAM II projects from 2009 to 2011.
• Project initiators & organizers: Justin Weligamage, Richi Nayak.
• Data mining supervisor: Richi Nayak.
• Data preparation, data mining & DM strategist: Daniel Emerson.
• Road engineering advisor: Nappadol Piyatrapoomi.
Motivation (why the work was done)
• Data mining was applied as a new approach to analysing Queensland road and crash data.
• Earlier work had found a relationship between the crash risk of roads and their attributes, with skid resistance being significant (risk based on roads having crashes).
• We sought a higher resolution measure of road crash risk through the crash count method.
• Crash count data mining models can be applied in decision support systems to identify potential roads for investigation and treatment.
Introduction
• This paper presents a data mining case study in which predictive data mining is applied to model the relationship between skid resistance, road attributes and crashes, with the purpose of:
• developing models (algorithms) on sample data,
• applying the models to other data to predict high risk roads.
Data and Data Preprocessing
• Several data sources were obtained from QDTMR for the four-year period 2004 to 2007, including:
• annual 1 km (or less) road segment snapshots with a list of road variables,
• road surface texture depth test readings; seal type and seal age; roadway features, traffic flow, features such as intersections and many others,
• dated skid resistance values at 100 metre (or less) intervals, representing F60 skid resistance tests,
• crash instances, crash details and their road location.
Examination of road segment crash count
• Meeting our need for a more precise crash measure: crashes per 1 km road segment per year.
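As a rough illustration of the crash count measure, the sketch below tallies crashes per road segment per year. The record layout, segment identifiers and values are hypothetical and are not taken from the QDTMR data.

```python
from collections import Counter

# Hypothetical crash records: (road_segment_id, crash_year).
# Each segment is assumed to be a 1 km (or shorter) stretch of road.
crashes = [
    ("seg_A", 2004), ("seg_A", 2004), ("seg_A", 2005),
    ("seg_B", 2004), ("seg_B", 2006), ("seg_B", 2006), ("seg_B", 2006),
]

def crash_counts(records):
    """Return a Counter mapping (segment, year) to the number of crashes."""
    return Counter(records)

counts = crash_counts(crashes)
for (segment, year), n in sorted(counts.items()):
    print(segment, year, n)
```

Segment-years with no recorded crash simply return a count of zero when looked up in the Counter.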
Crash count characteristics
• Road segment crash count showed stability from year to year, indicating its value in crash risk analysis.
(Chart time scale: 1 year)
Clusters: crash count ranges (4 yr)
• Road segment clusters, formed by data mining on road properties, showed characteristic crash counts, thus relating road crash proneness to road properties.
Method: Applying predictive data mining
Reasons:
• To demonstrate that road segment crash count can be modelled, thus establishing a relationship between crash count and roadway features.
• To use the rules obtained from the model output in the analytical process, to further the understanding of how roadway features contribute to crash count.
• To later apply successful models in decision support.
Method: Applying predictive data mining … using a subset of quality data
• Select the target variable to be predicted (crash count).
• Select the input variables (road segment attributes).
• Select a modelling method (regression tree algorithm).
• Run a range of models with varying configurations (regression tree).
• Evaluate and understand the results.
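The steps above can be sketched with a very small regression tree. This is a minimal illustration only, not the SAS or WEKA configuration used in the study: the splitting criterion (reduction in sum of squared errors), the field names `f60`, `aadt` and `crash_count`, and all data values are assumptions made for the example.

```python
from statistics import mean

def sse(values):
    """Sum of squared errors of the values around their mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(rows, target, features, min_leaf):
    """Return the (feature, threshold) pair that most reduces SSE, or None."""
    best, best_score = None, sse([r[target] for r in rows])
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [r[target] for r in rows if r[f] < t]
            right = [r[target] for r in rows if r[f] >= t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue  # both children must hold at least min_leaf segments
            score = sse(left) + sse(right)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, target, features, max_depth=2, min_leaf=2, depth=0):
    """Grow the tree recursively; a leaf stores the mean crash count."""
    split = best_split(rows, target, features, min_leaf) if depth < max_depth else None
    if split is None:
        return {"mean": mean([r[target] for r in rows]), "n": len(rows)}
    f, t = split
    left = [r for r in rows if r[f] < t]
    right = [r for r in rows if r[f] >= t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree(left, target, features, max_depth, min_leaf, depth + 1),
        "right": build_tree(right, target, features, max_depth, min_leaf, depth + 1),
    }

def predict(tree, row):
    """Follow the splits down to a leaf and return its mean crash count."""
    while "mean" not in tree:
        branch = "left" if row[tree["feature"]] < tree["threshold"] else "right"
        tree = tree[branch]
    return tree["mean"]

# Hypothetical road segments: target is crash_count, inputs are f60 and aadt.
segments = [
    {"f60": 0.30, "aadt": 5000, "crash_count": 5},
    {"f60": 0.35, "aadt": 4000, "crash_count": 4},
    {"f60": 0.38, "aadt": 6000, "crash_count": 6},
    {"f60": 0.50, "aadt": 5000, "crash_count": 1},
    {"f60": 0.55, "aadt": 4500, "crash_count": 2},
    {"f60": 0.60, "aadt": 5500, "crash_count": 1},
]

tree = build_tree(segments, "crash_count", ["f60", "aadt"], max_depth=1)
low_friction = predict(tree, {"f60": 0.32, "aadt": 5200})
high_friction = predict(tree, {"f60": 0.58, "aadt": 5200})
```

Varying `max_depth` and `min_leaf` corresponds loosely to running a range of models with different configurations; each path from the root to a leaf reads as an IF ... THEN rule.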
Model variables
Road attribute input variables (in order of significance):
• AVG_FRICTION_AT_60_Ikm (F60 skid resistance)
• AADT (traffic rates)
• traffic_percent_heavy
• lane_count
• Texture Depth
• roughness_average
• rutting_average
• seal_age
• seal_type
• CRASH_SPEED_LIMIT
• CWAY_TYPE (single, double)
• CRAS_DIVIDED_ROAD
• ROAD_TYPE (highway, urban arterial etc.)
• Roadway Feature (roundabouts, bridges, intersections etc.)
These road segment attributes were relevant to predicting road segment crash count and became model input variables.
Target variable:
• Road segment crash count
Model results • All models show a high correlation between actual crash count and predicted crash count
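One way to quantify the agreement between actual and predicted crash counts is a correlation coefficient. The slide does not state which measure was used, so the sketch below assumes Pearson's r; the crash counts are invented for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical actual vs. predicted crash counts for five road segments.
actual = [0, 1, 2, 4, 6]
predicted = [0.5, 1.0, 2.5, 3.5, 5.5]

r = pearson(actual, predicted)
print(round(r, 3))
```

A value close to 1 indicates that segments with higher actual crash counts also receive higher predicted counts.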
Charts of actual value vs. predicted value
• Comparing models with 143 leaves and 83 leaves
(Chart axes: actual value, predicted value)
A sample output rule
Sample Rule 1.
IF AVG_FRICTION_AT_60 < 0.4095
AND CRASH_SPEED_LIMIT IS ONE OF: 90 100 110
AND 3987 <= AADT < 6105
AND CWAY_TYPE EQUALS SINGLE
THEN
NODE : 48
N : 315 (number of road segments in the group)
AVE : 4.04444 (average crashes for the group)
SD : 2.5357 (standard deviation of the predicted crash values)
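To show how such a rule could be used in decision support, the sketch below encodes Sample Rule 1 as a predicate over a road segment. The thresholds and node statistics are copied from the rule; the dictionary layout, the string value "SINGLE" and the example segments are assumptions.

```python
# Node statistics as reported for Sample Rule 1.
RULE_1 = {"node": 48, "n": 315, "ave": 4.04444, "sd": 2.5357}

def matches_rule_1(segment):
    """Return True if the segment satisfies every condition of Sample Rule 1."""
    return (
        segment["AVG_FRICTION_AT_60"] < 0.4095
        and segment["CRASH_SPEED_LIMIT"] in (90, 100, 110)
        and 3987 <= segment["AADT"] < 6105
        and segment["CWAY_TYPE"] == "SINGLE"
    )

def predicted_crash_count(segment):
    """Return the node average if the rule applies, otherwise None."""
    return RULE_1["ave"] if matches_rule_1(segment) else None

# Hypothetical road segments for illustration.
low_skid = {"AVG_FRICTION_AT_60": 0.38, "CRASH_SPEED_LIMIT": 100,
            "AADT": 5000, "CWAY_TYPE": "SINGLE"}
high_skid = dict(low_skid, AVG_FRICTION_AT_60=0.55)

print(predicted_crash_count(low_skid))
print(predicted_crash_count(high_skid))
```

A full model would hold one such rule per leaf, so a segment that does not match this rule would be handled by another node rather than returning None.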
Conclusion
• Road segment crash count can be successfully modelled with road attributes using data mining.
• A strong relationship exists between road crash count and road attributes.
• Skid resistance plays an important role in determining the crash characteristics of the road segment.
• The models may be of sufficient quality to use in decision support.
• While the models are specific to Queensland roads, the method can be trialled and evaluated elsewhere.
Future Work
• Work with road asset domain experts to analyse the rules, draw conclusions and improve the models.
• Apply models for analysis of data subsets, such as crashes with severe human outcomes.
• Apply the models to the whole-of-network dataset with the goal of identifying road segments that are skid resistance sensitive, i.e. segments where surface intervention to improve skid resistance will result in reduced crash risk.
Acknowledgement
• This study is part of an ongoing investigation into road crashes supported by CIEAM (CRC Asset Management), QDTMR and the Faculty of Science and Technology, QUT.
• Data mining tools used include:
• SAS (Statistical Analysis System)
• WEKA (Data Mining Software)
Thanks and Questions
Project Publications
[1] Nayak, R., Piyatrapoomi, N. and Weligamage, J. (2009). Application of text mining in analysing road crashes for road asset management. Proceedings of the Third World Congress on Engineering Asset Management, WCEAM 2009 (Athens, Greece, 28-30 September 2009).
[2] Nayak, R., Emerson, D., Weligamage, J. and Piyatrapoomi, N. (2010). Using Data Mining on Road Asset Management Data in Analysing Road Crashes. Proceedings of the 16th Annual TMR Engineering & Technology Forum (Brisbane, July 20, 2010).
[3] Emerson, D., Nayak, R., Weligamage, J. and Piyatrapoomi, N. (2011). Identifying differences in wet and dry road crashes using data mining. Proceedings of the Fifth World Congress on Engineering Asset Management, WCEAM 2010 (Brisbane, October 26, 2010).
[4] Nayak, R., Emerson, D., Weligamage, J. and Piyatrapoomi, N. (2011). Road Crash Proneness Prediction using Data Mining. Proceedings of the EDBT 2011 (Uppsala, Sweden, 2011).