People-centric Methods for Transportation Information Analysis and System Evaluation Using Social Media Data


Student thesis: Doctoral Thesis

Award date: 5 May 2022


During the past two decades, urban transportation systems have developed rapidly to keep pace with growing populations and travel demand. Besides improving travel efficiency, these systems link citizens with various social services and opportunities. Nevertheless, problems such as traffic noise and congestion caused by poor traffic control also negatively affect people's quality of life. Since intangible aspects of human life (e.g., happiness and health) are becoming essential evaluation metrics in urban planning, transportation authorities should consider people-centric factors in future transportation management to ensure sustainable urban growth.

However, few systematic methods consider people-oriented information in urban transportation information analysis and transportation system evaluation. With the increasing popularity of social media platforms, the time, text, and location information extracted from user-generated social media data offers researchers a new perspective for analyzing transportation-related information and guiding future transportation control and planning. It also offers a way to understand how the transport system affects people. Social media data analysis is therefore a natural candidate for enhancing current urban transportation management.

Given the problems and opportunities mentioned above, this thesis aims to develop people-centric methods for transportation information analysis and transportation system evaluation using social media data. More specifically, the thesis proposes modules to 1) detect and analyze accident- and congestion-related Chinese microblogs with location information, 2) identify and characterize the accident- and congestion-prone areas widely discussed on Sina Weibo in Shanghai, and 3) evaluate the influence of new transit stations on the local neighborhoods of Hong Kong using Twitter data. Compared with previous studies, the proposed methods, which involve people in the transportation management process and incorporate people-generated social media data into transportation information analysis and system evaluation, offer insights into future urban transportation system development.

The first work builds a deep learning model to detect accident- and congestion-related Chinese microblogs containing location information. The key feature of the proposed model is its ability to identify Chinese microblogs that contain not only a description of a traffic accident or congestion but also the event location, which makes them more convenient for downstream traffic-related spatial analysis. Moreover, it takes repost behavior, a unique feature of social networks, into account and extracts the microblogs that have reposted accident- or congestion-related microblogs. Extensive experiments and spatio-temporal analyses show the effectiveness of the proposed module for urban traffic analysis.

The second work proposes a module to find and profile the accident- and congestion-prone areas in Shanghai by analyzing accident- and congestion-related Chinese microblogs. A modified Kernel Density Estimation (KDE) method is applied to highlight regions with relatively high densities of accident and congestion microblogs, respectively. The results show that the "congestion-prone areas" expressed on social media are mainly distributed throughout the historical urban core and the northwest of Pudong New District, whereas the "accident-prone areas" are found in areas with severe accidents. The identified accident- and congestion-prone areas are then characterized in spatial, temporal, and semantic aspects to understand the nature of the incidents. The outcome can inform resource allocation and help prioritize mitigation measures.
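As a rough illustration of the density-based idea, the sketch below evaluates a plain two-dimensional Gaussian KDE on a grid and flags the highest-density cells. It is not the modified KDE used in the thesis, whose details are not given here. The function names, the bandwidth, the quantile threshold, and the treatment of coordinates as planar are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kde_grid(points, grid_x, grid_y, bandwidth):
    """Evaluate a plain 2-D Gaussian KDE on a regular grid.

    Assumption: coordinates are planar (e.g. projected metres), so
    Euclidean distance is meaningful. This is standard KDE, not the
    modified variant described in the thesis.
    """
    points = np.asarray(points, dtype=float)
    gx, gy = np.meshgrid(grid_x, grid_y)                 # each of shape (ny, nx)
    cells = np.stack([gx.ravel(), gy.ravel()], axis=1)   # (ny * nx, 2)
    # Squared distance from every grid cell to every microblog location.
    d2 = ((cells[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-d2 / (2.0 * bandwidth ** 2)) / (2.0 * np.pi * bandwidth ** 2)
    # Average the kernel contributions over all points.
    return kernel.mean(axis=1).reshape(gx.shape)


def hotspot_mask(density, quantile=0.95):
    """Flag grid cells whose density is at or above the given quantile.

    The quantile cut-off is a hypothetical choice, not the thesis's rule.
    """
    return density >= np.quantile(density, quantile)
```

In use, `points` would hold the locations of the detected accident or congestion microblogs, and the cells flagged by `hotspot_mask` would be candidate "prone areas" for further spatial, temporal, and semantic profiling.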

The last work develops a method to evaluate the influence of six new transit stations on local neighborhoods of Hong Kong using Twitter data. Tweet sentiment and tweet activity in the studied neighborhoods are used as proxies for social media sentiment and people's willingness to visit. A difference-in-differences (DID) model is applied to estimate the causal effect of the introduction of transit stations on tweet sentiment and tweet activity. Text, tweet sentiment, and footprint comparisons between two types of Twitter users are then conducted to check the influence of new transit stations on people. The results suggest that, in general, the introduction of transit stations causes a positive change in tweet activity and tweet sentiment. Nevertheless, only the change in tweet activity is statistically significant. Moreover, new transit stations tend to positively influence the tweet sentiment of local areas with high-density residential places or recreational activities. The introduction of new transit stations also increases accessibility, as inferred from the wider set of locations visited by sustaining station-influenced users. The results of this work can help urban planners directly understand the local impact of urban transit development on people.
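The DID logic can be sketched with a minimal two-group, two-period regression: the coefficient on the interaction of the treatment and post-period indicators is the DID estimate. This is a textbook formulation offered as an assumption about the general approach, not the exact specification used in the thesis. The function name and the variables `y`, `treated`, and `post` are hypothetical.

```python
import numpy as np


def did_estimate(y, treated, post):
    """Return the 2x2 difference-in-differences estimate.

    Fits y = b0 + b1*treated + b2*post + b3*(treated*post) by ordinary
    least squares and returns b3. Hypothetical inputs:
      y       - outcome per observation (e.g. tweet count or sentiment score)
      treated - 1 if the neighborhood received a new station, else 0
      post    - 1 if the observation is after the station opened, else 0
    """
    y = np.asarray(y, dtype=float)
    treated = np.asarray(treated, dtype=float)
    post = np.asarray(post, dtype=float)
    # Design matrix: intercept, group effect, period effect, interaction.
    X = np.column_stack([np.ones_like(y), treated, post, treated * post])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[3]
```

With only these four regressors, the estimate equals the change in the treated group's mean outcome minus the change in the control group's mean outcome. Assessing statistical significance, as the thesis does, would additionally require standard errors, which this sketch omits.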

Overall, the thesis addresses a research gap in urban transportation information analysis and transportation system evaluation by developing people-centric analytical modules based on social media data. With the advent of big data analytics and the call for more people-oriented urban planning, the developed methods can provide insights and policy recommendations for future urban transportation management.