Source Publication
Transportation Research Board (TRB) 104th Annual Meeting 2025
Author ORCID Identifier
Muhammad Adeel: 0000-0002-4097-3953
Asad J. Khattak: 0000-0002-0790-7794
Sabyasachee Mishra: 0000-0002-7198-3548
Diwas Thapa: 0000-0003-4747-6797
Document Type
Conference Proceeding
Publication Date
2025
Abstract
Road work zones (WZs) are increasingly common due to aging infrastructure and the need for capacity enhancement. They present significant safety risks because of narrow lanes, uneven traffic flow, lower speeds, and reduced visibility. This study examines the role of human behavioral factors in WZ crash injury severity and addresses the data imbalance caused by the low incidence of high-cost fatal and serious injuries. A unique dataset comprising 7,855 WZ crashes in Tennessee from 2018 to 2022 was examined. The study applies the Synthetic Minority Over-sampling Technique (SMOTE) to balance the dataset, combined with a Random Forest (RF) machine learning model, to improve prediction accuracy. Results indicate that aggressive driving, speeding, and drunk driving significantly escalate injury severity. Additionally, balancing the minority categories of crash injury severity (fatal and serious injuries) shifts the importance of contributing factors, emphasizing those more closely associated with higher injury categories. The application of SMOTE proved effective, significantly enhancing prediction performance across injury severity categories. The accuracy of the RF model improved from 71.88% to 74.36%, while the balanced accuracy increased substantially from 51.58% to 80.97%. These findings offer valuable insights for traffic safety engineers, transportation agencies, and policymakers seeking to enhance WZ design and management. The study provides a framework for analyzing imbalanced crash data, highlights critical behavioral factors, and recommends additional signage, speed limit reductions, and increased enforcement against unsafe driving behaviors. This approach aims to mitigate injury severity and improve road user safety in work zones.
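The abstract reports overall accuracy alongside balanced accuracy and describes oversampling the rare severe-injury classes. The sketch below is an illustration only, not the authors' code. It assumes balanced accuracy means the average of per-class recall (the abstract does not define it), and it shows the core SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbors. All function names, labels, and toy values are hypothetical.

```python
import random


def accuracy(y_true, y_pred):
    """Share of all predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (assumed definition, see lead-in)."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(y_pred[i] == cls for i in idx)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


def smote_like_sample(minority, k=2, rng=None):
    """Create one synthetic point on the segment between a random
    minority point and one of its k nearest minority neighbors."""
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    others = [p for p in minority if p is not base]
    # Sort the remaining minority points by squared Euclidean distance.
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbor = rng.choice(others[:k])
    gap = rng.random()  # interpolation weight in [0, 1)
    return tuple(a + gap * (b - a) for a, b in zip(base, neighbor))


# Toy, made-up labels: 90 minor-injury crashes and 10 severe ones.
y_true = ["minor"] * 90 + ["severe"] * 10
# A model that always predicts the majority class.
y_pred = ["minor"] * 100

acc = accuracy(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred)
print(acc, bal_acc)

# Toy minority-class feature vectors (hypothetical, two features each).
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
synthetic = smote_like_sample(minority_points)
print(synthetic)
```

In this toy case the always-majority model reaches 0.9 accuracy but only 0.5 balanced accuracy, because it never identifies a severe crash. That gap is why a balanced metric is informative when fatal and serious injuries are rare.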
Recommended Citation
Adeel, Muhammad; Khattak, Asad J.; Mishra, Sabyasachee; and Thapa, Diwas, "To Balance or Not to Balance? Applying a Machine Learning Technique to Oversample Severe Injury Crashes in Work Zones" (2025). Faculty Publications and Other Works -- Civil & Environmental Engineering.
https://trace.tennessee.edu/utk_civipubs/34
Submission Type
Publisher's Version
Peer Review
1