Interpretable Machine Learning for Understanding and Forecasting Tropical Cyclones Intensity on Different Time Scales


Student thesis: Doctoral Thesis

Supervisor(s)
  • Lin ZHANG (Supervisor)
  • Wen ZHOU (External person) (External Co-Supervisor)
Award date: 27 Oct 2023


Tropical cyclones (TCs) pose a growing threat to coastal regions, with an increasing trend in intensity and a coastward migration. This study explores the potential of interpretable machine learning (ML) techniques for forecasting and analyzing TC intensity on different time scales.

First, on the interdecadal and interannual time scales, a pHash+Kmeans clustering algorithm is designed to cluster TCs over the western North Pacific (WNP) based on their genesis environments. The clustering results reveal interdecadal variation in WNP TCs, including an abrupt decrease in the eastern WNP after 1997 and an increasing trend in the South China Sea after 2010. They also indicate that after 1997, intense TCs show little change in frequency but a coastward migration, and that rapid intensification (RI) events shift 5.5° westward in El Niño years and 4.5° northward in La Niña years.
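
The abstract does not spell out how pHash and K-means are combined, so the following is only a minimal sketch under stated assumptions: "pHash" is taken to mean the common DCT-based perceptual hash (keep the lowest-frequency coefficients, threshold at their median), and the resulting bit vectors are grouped with plain Lloyd's K-means. The helper names (`phash`, `kmeans`, `blob`), the farthest-point initialisation, and the synthetic Gaussian fields are all hypothetical stand-ins for the real genesis-environment data.

```python
import math
import statistics

def dct_basis(n, k):
    """Return the first k rows of the orthonormal DCT-II basis for length n."""
    basis = []
    for u in range(k):
        scale = math.sqrt((1.0 if u == 0 else 2.0) / n)
        basis.append([scale * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                      for x in range(n)])
    return basis

def phash(field, k=8):
    """Hash a square 2-D field (side >= k) to k*k - 1 bits.

    Keeps the k x k lowest-frequency DCT coefficients, drops the DC term,
    and sets a bit wherever a coefficient exceeds the median of the rest.
    """
    n = len(field)
    basis = dct_basis(n, k)
    # Transform along the first axis: tmp[u][y] = sum_x basis[u][x] * field[x][y]
    tmp = [[sum(basis[u][x] * field[x][y] for x in range(n)) for y in range(n)]
           for u in range(k)]
    # Transform along the second axis and flatten to k*k coefficients.
    coeffs = [sum(tmp[u][y] * basis[v][y] for y in range(n))
              for u in range(k) for v in range(k)]
    ac = coeffs[1:]  # drop the DC (mean) term
    median = statistics.median(ac)
    return [1 if c > median else 0 for c in ac]

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20):
    """Lloyd's algorithm with deterministic farthest-point initialisation
    (an assumption made here so the sketch is reproducible)."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Next centre: the point farthest from its nearest existing centre.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def blob(n, cx, cy, sign=1.0, width=6.0):
    """Synthetic Gaussian anomaly standing in for a genesis-environment field."""
    return [[sign * math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * width ** 2))
             for y in range(n)] for x in range(n)]

warm = blob(32, 10, 12)
cold = blob(32, 10, 12, sign=-1.0)
fields = [warm, [[2 * v for v in row] for row in warm],
          cold, [[2 * v for v in row] for row in cold]]
hashes = [phash(f) for f in fields]
labels = kmeans(hashes, k=2)
```

In this toy setup the two warm-anomaly fields share one label and the two cold-anomaly fields the other. Because the hash thresholds at the median, it depends on the spatial pattern rather than the amplitude, which is one plausible reason to hash environmental fields before clustering them.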

Second, on the monthly scale, a convolutional long short-term memory (ConvLSTM) model is built to predict the sea surface and subsurface temperature fields over the Pacific. Future WNP thermodynamic conditions and the Niño 3.4 index are derived from the predictions to determine future WNP TC activity.
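
How the index is extracted from the predicted fields is not described here. As an illustration only, the sketch below computes a Niño 3.4 value in the conventional way, as the area-weighted mean sea surface temperature anomaly over 5°S to 5°N, 170°W to 120°W. The function name `nino34_index`, the nested-list grid layout, and the cosine-latitude weighting are assumptions of this sketch, not details taken from the thesis.

```python
import math

NINO34_LAT = (-5.0, 5.0)      # 5S to 5N
NINO34_LON = (190.0, 240.0)   # 170W to 120W, expressed in degrees east (0-360)

def nino34_index(sst_anom, lats, lons):
    """Cosine-latitude-weighted mean SST anomaly over the Nino 3.4 box.

    sst_anom[i][j] is the anomaly (deg C) at lats[i], lons[j].
    Longitudes are in degrees east; negative values are wrapped to 0-360.
    """
    total = 0.0
    weight_sum = 0.0
    for i, lat in enumerate(lats):
        if not NINO34_LAT[0] <= lat <= NINO34_LAT[1]:
            continue
        w = math.cos(math.radians(lat))  # grid cells shrink toward the poles
        for j, lon in enumerate(lons):
            if NINO34_LON[0] <= lon % 360.0 <= NINO34_LON[1]:
                total += w * sst_anom[i][j]
                weight_sum += w
    if weight_sum == 0.0:
        raise ValueError("grid has no points inside the Nino 3.4 box")
    return total / weight_sum
```

Applied to each predicted monthly anomaly field in turn, a function like this would yield an index time series from the model output.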

Finally, on daily and hourly time scales, the Vision Transformer (ViT) architecture is applied to estimate TC intensity from satellite images and to forecast TC intensity change from environmental conditions. Attention map analysis shows a positive correlation between cloud patterns and TC intensity, and reveals the environmental contributions to TC intensity change.
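
To make the idea of an attention map concrete, the sketch below shows how one might turn the attention of a single class-token query over image patches into a 2-D map, using standard scaled dot-product attention with a softmax. It is a simplified, single-head illustration; the function name `attention_map` and its inputs are hypothetical, and the thesis models may aggregate attention differently (for example across heads or layers).

```python
import math

def attention_map(cls_query, patch_keys, grid_shape):
    """Softmax attention of one class-token query over patch keys, as a 2-D map.

    cls_query: list of d floats; patch_keys: one list of d floats per patch,
    in row-major patch order; grid_shape: (rows, cols) of the patch grid.
    """
    rows, cols = grid_shape
    if rows * cols != len(patch_keys):
        raise ValueError("grid_shape does not match the number of patches")
    d = len(cls_query)
    # Scaled dot-product scores between the query and every patch key.
    scores = [sum(q * k for q, k in zip(cls_query, key)) / math.sqrt(d)
              for key in patch_keys]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Reshape the flat weights back onto the patch grid.
    return [weights[r * cols:(r + 1) * cols] for r in range(rows)]
```

The weights in the returned map sum to one, so patches with larger values are the ones the query attends to most; overlaying such a map on the satellite image is a common way to inspect which cloud regions a model relies on.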

In conclusion, this study proposes an interpretable ML system for understanding and forecasting TC intensity on different time scales, comprising a pHash+Kmeans clustering algorithm for analyzing long-term variation, a ConvLSTM model for monthly prediction, and ViT-based models for TC intensity estimation and forecasting.