Domain Knowledge Guided Feature Engineering Methods in Intelligent Transportation Systems


Student thesis: Doctoral Thesis




Awarding Institution
Award date: 16 Apr 2021


Over the past decades, intelligent transportation systems (ITS) have emerged and developed to enhance transportation safety, increase productivity and mobility, reduce fuel consumption and environmental impact, and promote sustainable transportation development. With the advancement of detection techniques and sensors, a massive amount of real-time transportation data is now collected. Feature engineering on sensor data has become an important component of machine learning applications in the ITS field. However, few current studies perform feature engineering with transportation domain knowledge in mind, which leaves the resulting models lacking in interpretability. Moreover, most current research conducts feature engineering directly on raw data, which is high-dimensional and contains redundant information, and then builds sophisticated models to improve performance. Under such circumstances, the models are prone to overfitting and generalize poorly, which limits their applicability.

This thesis aims to develop tailored domain-knowledge-guided feature engineering methods that process different kinds of sensor data to solve emerging problems in the field of ITS. Specifically, the thesis comprises three studies, each focusing on a specific domain-knowledge-guided feature engineering issue. The first two studies conduct feature engineering by drawing on aviation domain expertise, while the last one performs feature engineering on vibration data based on knowledge of vehicle dynamics in rail transport.

The first work develops a novel data-driven model for the fast assessment of terminal airspace redesigns in terms of system-level fuel burn. Given a terminal airspace design, the fuel consumption model calculates the fleet-wide fuel burn from the departure and arrival profiles specified in the design, so that different airspace designs can be compared and optimized with respect to their impact on fuel burn. The fuel consumption model is built on a Multilayer Perceptron Neural Network (MLPNN) and is trained and evaluated using Digital Flight Data Recorder (FDR) data from real operations. The feature engineering for the model draws on aviation domain knowledge, including the underlying physics of aircraft and engine operations. We demonstrate the proposed MLPNN method through a case study of the Hong Kong terminal airspace, and the results show that the proposed model is an effective tool for the fast evaluation of airspace designs with a focus on fuel burn.
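As an illustration of what physics-guided features for a fuel burn model could look like, the sketch below derives energy-based quantities from consecutive flight-record samples. It is a minimal example under stated assumptions, not the feature set used in the thesis: the record fields (`t`, `alt_m`, `tas_mps`, `mass_kg`) and the function name `energy_features` are hypothetical. The underlying idea is that the rate of change of an aircraft's specific energy (potential plus kinetic energy per unit mass) reflects the excess of thrust over drag, which is closely tied to engine fuel flow.

```python
G = 9.80665  # standard gravity, m/s^2


def energy_features(samples):
    """Derive physics-guided features from consecutive flight-record samples.

    `samples` is a list of dicts with hypothetical keys:
    t (s), alt_m (m), tas_mps (true airspeed, m/s), mass_kg (kg).
    Returns one feature dict per consecutive pair of samples.
    """
    feats = []
    for prev, cur in zip(samples, samples[1:]):
        dt = cur["t"] - prev["t"]
        if dt <= 0:
            continue  # skip duplicated or out-of-order records
        # Specific energy in J/kg: potential (g*h) plus kinetic (v^2 / 2).
        e_prev = G * prev["alt_m"] + 0.5 * prev["tas_mps"] ** 2
        e_cur = G * cur["alt_m"] + 0.5 * cur["tas_mps"] ** 2
        feats.append({
            "alt_m": cur["alt_m"],
            "tas_mps": cur["tas_mps"],
            "mass_kg": cur["mass_kg"],
            "vertical_rate_mps": (cur["alt_m"] - prev["alt_m"]) / dt,
            "accel_mps2": (cur["tas_mps"] - prev["tas_mps"]) / dt,
            # Rate of change of specific energy, W/kg.
            "specific_power_wpkg": (e_cur - e_prev) / dt,
        })
    return feats


# Illustrative input values only; these are not data from the thesis.
demo = energy_features([
    {"t": 0.0, "alt_m": 1000.0, "tas_mps": 100.0, "mass_kg": 60000.0},
    {"t": 10.0, "alt_m": 1100.0, "tas_mps": 110.0, "mass_kg": 59990.0},
])
```

Feature rows of this kind could then be passed to a regression model such as an MLP to predict fuel flow, and the per-flight predictions summed over a fleet to compare airspace designs.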

The second work proposes a novel MLPNN-based model to generate realistic aircraft trajectories for terminal airspace redesign. We take a transfer learning approach: the model is trained on existing standard routes and then applied to generate trajectories for new standard routes. The key enabler of this transfer learning is a novel input-and-output construction method for trajectory modeling in terminal airspace, comprising trajectory reconstruction, feature engineering, and output design. For feature engineering, we extract five key features from the reconstructed trajectory data based on domain knowledge of air traffic control and flight operations. The proposed model is tested on real-world operational data for its accuracy and practicability. Results demonstrate that the model can quantify the characteristics of aircraft trajectories that are transferable between standard routes and can generate realistic trajectories for new standard routes.
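One way to make trajectory characteristics comparable across routes is to describe each position relative to the standard route rather than in absolute coordinates. The sketch below is an assumption for illustration only and is not one of the five features used in the thesis: it computes the along-track and cross-track distances of a point with respect to a straight route leg in a local flat x/y frame (metres). The function name `route_relative` is hypothetical.

```python
import math


def route_relative(point, leg_start, leg_end):
    """Return (along_track_m, cross_track_m) of `point` relative to a route leg.

    All arguments are (x, y) tuples in a local flat frame, in metres.
    Cross-track is positive when the point lies to the left of the
    direction of travel from leg_start to leg_end.
    """
    dx = leg_end[0] - leg_start[0]
    dy = leg_end[1] - leg_start[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("route leg has zero length")
    px = point[0] - leg_start[0]
    py = point[1] - leg_start[1]
    along = (px * dx + py * dy) / length  # projection onto the leg
    cross = (dx * py - dy * px) / length  # signed perpendicular offset
    return along, cross


# A point 3 m along and 4 m to the left of an eastbound leg.
along_m, cross_m = route_relative((3.0, 4.0), (0.0, 0.0), (10.0, 0.0))
```

Because such quantities are expressed relative to the route itself, a model trained on them does not depend on where a particular route sits geographically, which is the kind of property a transfer between standard routes would need.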

The third work builds a novel data-driven framework that monitors the health status of the high-speed rail suspension system by measuring train vibrations. Based on multioutput support vector regression (MSVR), the proposed framework can monitor the stiffness and damping coefficients of the suspension system in real time using vibration signals measured on trains. The feature engineering relies on a simple dynamics model to select the relevant information from the multi-location vibration data. The proposed framework is evaluated on simulation data for its accuracy and tested on real-world operational data for its practicability.
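To illustrate why a simple dynamics model can point to informative vibration features, the sketch below uses a single-degree-of-freedom mass-spring-damper under harmonic base excitation as a stand-in. This is an assumption: the abstract does not specify the dynamics model used in the thesis, and the function names and parameter values here are hypothetical. For this model, the ratio of vibration amplitude above the suspension to that below it (the transmissibility) depends directly on the stiffness and damping, so amplitude ratios between measurement locations at a few frequencies carry the information a regressor would need.

```python
import math


def transmissibility(omega, mass, stiffness, damping):
    """Amplitude ratio (body / base) of a single-DOF suspension.

    omega: excitation frequency in rad/s; mass in kg;
    stiffness in N/m; damping in N*s/m.
    Undefined for an undamped system exactly at resonance.
    """
    num = stiffness ** 2 + (damping * omega) ** 2
    den = (stiffness - mass * omega ** 2) ** 2 + (damping * omega) ** 2
    return math.sqrt(num / den)


def feature_vector(freqs_hz, mass, stiffness, damping):
    """Transmissibility sampled at the given frequencies (Hz)."""
    return [
        transmissibility(2 * math.pi * f, mass, stiffness, damping)
        for f in freqs_hz
    ]


# Illustrative parameter values only; not taken from the thesis.
healthy = feature_vector([0.5, 1.0, 2.0], mass=1000.0, stiffness=4.0e4, damping=2.0e3)
degraded = feature_vector([0.5, 1.0, 2.0], mass=1000.0, stiffness=2.0e4, damping=2.0e3)
```

In a monitoring setting, the same ratios would be estimated from measured signals at two locations, and a multi-output regressor would be trained to map them back to the stiffness and damping coefficients.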

By proposing several novel data-driven approaches in the ITS area, the thesis aims to bridge existing gaps in research on domain-knowledge-guided feature engineering for sensor data. The studies in this thesis provide insights into implementing effective, specialized feature engineering for different practical problems in the ITS field.