Traffic Crash Detection and Traffic Accident Analysis by Data Mining Methods


Student thesis: Doctoral Thesis


Awarding Institution
Award date: 10 Aug 2020


Road traffic crashes are recognized worldwide as a public safety problem. Each year they cause heavy casualties, economic losses, and traffic congestion. The World Health Organization (WHO) reported that over 1.35 million people die and 50 million people are injured each year in road traffic crashes. The economic cost of road crashes was estimated at 3% of GDP globally. Although traffic crashes are a leading cause of death globally, most are predictable and can be prevented by effective precautionary measures. Analyzing traffic crashes and proposing effective countermeasures is therefore important for improving road safety.

This thesis proposes a framework based on data mining methods to reduce traffic crashes and improve road safety. The framework consists of two steps. The first step is traffic crash detection, which aims to predict “when and where a traffic crash will occur” so that proactive measures can be implemented to prevent it. The second step is traffic accident analysis, which aims to explore the key factors associated with accident severity and to identify the hot spots of traffic accidents, so that effective measures can be proposed to reduce accident severity and improve road safety. Data mining is adopted as the fundamental methodology to extract “hidden knowledge” from traffic accident data.

The first step (i.e., traffic crash detection) is to establish an accurate detection model to predict crashes. Some studies have developed detection models with data mining methods. However, most existing detection models were built on traditional machine learning methods, and their detection performance was limited. Deep learning methods are state-of-the-art techniques for prediction problems and have been validated to perform well in other traffic domains, yet few studies have applied them to crash detection. Therefore, this thesis proposes a novel deep learning-based framework for crash detection, which explores the application of deep learning methods to traffic crash detection in order to improve prediction performance.

The second step (i.e., traffic accident analysis) is to explore the key factors related to accident severity and propose measures to improve road safety. Association rule mining (ARM) is an efficient data mining method for identifying key factors associated with injury severity. However, existing studies of accident severity have the following limitations: (1) Traffic accident datasets are imbalanced, and applying data mining methods to imbalanced datasets biases the results. (2) Existing studies seldom conducted a spatial analysis of traffic accidents, which can provide intuitive suggestions for policymakers. (3) In most studies, the parameter thresholds of ARM were set manually by decision-makers, which requires extensive expert knowledge of both ARM and the traffic domain. (4) Existing studies mainly analyzed the two-item rules of ARM to identify individual key factors related to accident severity, whereas research on multiple-item rules, which could identify more influential factors, is lacking. Therefore, this thesis proposes two frameworks to address the above research gaps and limitations. The main purpose is to identify more factors related to accident severity and to improve the robustness, reliability, objectivity, and intuitiveness of the results, so that effective measures can be proposed to reduce accident severity and improve road safety.
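To make the distinction between two-item and multiple-item rules concrete, the following is a minimal sketch of ARM with the commonly used measures support, confidence, and lift. It is not the thesis's implementation: the accident records, the attribute names, the helper names (`support`, `rule_metrics`, `mine_rules`), and the thresholds are all hypothetical, and the thresholds are fixed by hand here rather than optimized.

```python
from itertools import combinations

# Hypothetical toy accident records (illustrative only, not thesis data).
# Each record is a set of "attribute=value" items.
records = [
    {"speed_zone=100", "light=dark", "severity=fatal"},
    {"speed_zone=100", "light=dark", "severity=fatal"},
    {"speed_zone=100", "light=day", "severity=injury"},
    {"speed_zone=60", "light=dark", "severity=injury"},
    {"speed_zone=60", "light=day", "severity=injury"},
    {"speed_zone=60", "light=day", "severity=injury"},
    {"speed_zone=100", "light=day", "severity=fatal"},
    {"speed_zone=60", "light=dark", "severity=injury"},
]


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    return sum(itemset <= r for r in records) / len(records)


def rule_metrics(antecedent, consequent, records):
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    s_a = support(antecedent, records)
    s_c = support(consequent, records)
    s_ac = support(antecedent | consequent, records)
    confidence = s_ac / s_a if s_a else 0.0
    lift = confidence / s_c if s_c else 0.0
    return {"support": s_ac, "confidence": confidence, "lift": lift}


def mine_rules(records, consequent, max_items=2, min_support=0.2, min_lift=1.0):
    """Enumerate rules with up to `max_items` antecedent items.

    One antecedent item gives a two-item rule; two give a three-item rule.
    The thresholds are assumed values chosen by hand for this toy example.
    """
    items = sorted(
        {i for r in records for i in r if not i.startswith("severity=")}
    )
    rules = []
    for k in range(1, max_items + 1):
        for combo in combinations(items, k):
            m = rule_metrics(set(combo), consequent, records)
            if m["support"] >= min_support and m["lift"] > min_lift:
                rules.append((combo, m))
    # Strongest association with the consequent first.
    return sorted(rules, key=lambda rule: -rule[1]["lift"])


fatal = {"severity=fatal"}
rules = mine_rules(records, fatal)
for antecedent, m in rules:
    print(antecedent, {name: round(value, 3) for name, value in m.items()})
```

In this toy data the three-item rule (dark lighting and a 100 km/h zone together) has a higher lift for fatal outcomes than either two-item rule alone, which is the kind of combined effect that an analysis restricted to two-item rules would miss.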

This study evaluates the proposed frameworks by applying them to case studies of traffic crash detection and traffic accident analysis. For traffic crash detection, a novel deep learning-based framework is implemented on the datasets of I880-N and I805-N in California, USA, to improve prediction performance. Results indicate that the proposed framework achieves satisfactory performance on crash detection and model transferability, with highest crash accuracies of 70.43% and 65.12%, respectively. The proposed framework can support real-time crash detection, with a calculation time of 0.92 seconds, and has been validated to perform better than most models built on machine learning methods. For traffic accident analysis, two novel ARM-based frameworks are applied to datasets of run-off-road (ROR) accidents and motorcycle accidents, respectively, in Victoria, Australia, to improve accident severity analysis. Six individual key factors are identified as closely associated with fatal ROR accidents. Five individual key factors and four boosting factors are found to be related to fatal injuries in motorcycle accidents. The hot spots of traffic accidents related to fatal factors are presented in geographic information system (GIS) maps, which policymakers can consult directly when making decisions. Results show that the two proposed ARM-based frameworks can effectively address the limitations in traffic accident analysis (e.g., imbalanced datasets, spatial analysis, parameter optimization of ARM, and multiple-item rule analysis of ARM). The two ARM-based frameworks have been validated to perform better in terms of robustness, reliability, efficiency, and objectivity for traffic accident analysis.
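For the crash detection results reported above, the sketch below shows one way a per-class rate such as crash accuracy can be computed. It assumes that crash accuracy means the share of actual crash samples that the model flags as crashes (recall on the crash class); the abstract does not define the term, so this reading, the function name `detection_rates`, and the labels are all hypothetical.

```python
def detection_rates(y_true, y_pred):
    """Per-class rates for binary crash detection (1 = crash, 0 = non-crash).

    Assumption: "crash accuracy" is read here as recall on the crash class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # crashes detected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # crashes missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correct non-crash
    return {
        "crash_accuracy": tp / (tp + fn) if tp + fn else 0.0,
        "false_alarm_rate": fp / (fp + tn) if fp + tn else 0.0,
        "overall_accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
    }


# Hypothetical labels: 3 crash samples among 10 observations.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
rates = detection_rates(y_true, y_pred)
print(rates)
```

Because crashes are rare relative to non-crash conditions, overall accuracy can look high even when many crashes are missed, which is why a crash-specific rate is worth reporting alongside it.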

    Research areas

  • Traffic crash detection, Traffic accident analysis, Data mining, Deep learning, Association rule mining, Imbalance problem, Parameter optimization, Rule analysis, Geographic information system