Linear Dynamics-embedded Neural Networks Design based on State Space Models and Observer Theory

Student thesis: Doctoral Thesis

Abstract

With the rapid development of sensor technologies such as cameras, remote sensing, and distributed sensing systems, the volume and complexity of data collected across diverse domains have increased dramatically. Effectively processing such data, exploring underlying patterns, and accurately modeling system dynamics have become key challenges in both academic research and industrial applications.

In recent years, deep learning has shown remarkable success in a wide range of domains, including computer vision, natural language processing, and speech recognition, greatly advancing the field of data-driven modeling and prediction. However, as tasks grow more complex and data structures become increasingly high-dimensional and nonlinear, existing methods exhibit notable limitations. In particular, when it comes to modeling and predicting sequential and spatiotemporal data governed by complex and nonlinear dynamics, modern deep learning architectures still face several fundamental theoretical and practical challenges:

• Existing neural network frameworks often struggle with long-sequence modeling, especially in terms of maintaining a balance between computational efficiency and predictive accuracy during training and inference. Designing models that are simultaneously lightweight, scalable, and accurate remains an open problem.

• Many real-world systems exhibit time-varying and highly nonlinear dynamics, which are difficult to model with fixed architectures or static parameterizations. Accurately capturing such dynamic behaviors, while ensuring tractability and computational efficiency, is an ongoing challenge.

• The difficulty further increases when addressing spatiotemporal phenomena involving high-dimensional data. Traditional low-dimensional models often lack theoretical generalization, and most deep learning approaches rely heavily on empirical heuristics without formal guarantees. Thus, developing theoretically grounded neural architectures with strong modeling capacity is of both practical importance and research significance.

This thesis addresses these challenges by exploring the integration of domain knowledge from dynamical systems into neural network design. Specifically, this work draws from state space models and observer theory in control systems to construct linear dynamics-embedded neural networks for sequence and spatiotemporal data modeling. State space models provide a structured representation of dynamic systems, particularly in their linear form, while observers offer systematic approaches for estimating hidden states in these systems. By embedding linear dynamics and related principles into neural network architectures, we aim to improve performance and interpretability in sequential and spatiotemporal learning tasks.

The main contributions of this thesis are as follows:

• Efficient State Space Models: An efficient neural network is designed based on linear time-invariant state space models. By leveraging the structure of linear dynamics, several strategies are proposed to enhance computational efficiency. The resulting model captures stationary system behavior with a compact parameterization, enabling efficient training and inference with high accuracy on long-sequence tasks such as speech and text modeling.
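As an illustrative sketch only (scalar-state, with hypothetical function names, not the thesis's actual parameterization), the discrete linear time-invariant recurrence behind such models can be written in two equivalent forms, a sequential scan and a convolution, which is the structural property typically exploited for efficient training:

```python
def lti_ssm_scan(u, a, b, c, d, x0=0.0):
    """Sequential view: x[k+1] = a*x[k] + b*u[k], y[k] = c*x[k] + d*u[k]."""
    x, ys = x0, []
    for uk in u:
        ys.append(c * x + d * uk)  # output from current state and input
        x = a * x + b * uk         # time-invariant state update
    return ys

def lti_kernel(a, b, c, d, length):
    """Because (a, b, c, d) are fixed, the input-output map is a
    convolution with kernel h[0] = d and h[k] = c * a**(k-1) * b,
    enabling parallel (e.g. FFT-based) training over long sequences."""
    h, ak = [d], 1.0
    for _ in range(1, length):
        h.append(c * ak * b)
        ak *= a
    return h
```

The two views produce identical outputs; real models replace the scalars with learned matrices.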

• Dynamic State Space Models: By extending the concept of state space modeling to systems with time-varying dynamics, linear time-varying dynamics are incorporated into neural network design. This integration enables the model to adapt to temporal variations in complex system behaviors. By embedding time-dependent linear dynamics into the network’s layer structure, the proposed architecture achieves more flexible and accurate predictions for nonstationary sequences.
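A minimal sketch of the time-varying idea, assuming an input-dependent state transition as one illustrative gating choice (the thesis's actual parameterization may differ):

```python
import math

def dynamic_ssm_scan(u, b, c, w, x0=0.0):
    """Linear time-VARYING recurrence: unlike the LTI case, the
    transition coefficient a[k] changes at every step, here computed
    from the current input via a sigmoid gate (illustrative only)."""
    x, ys = x0, []
    for uk in u:
        a_k = 1.0 / (1.0 + math.exp(-w * uk))  # input-dependent decay in (0, 1)
        x = a_k * x + b * uk                   # time-varying state update
        ys.append(c * x)
    return ys
```

Because the transition varies per step, the convolutional shortcut of the LTI case no longer applies directly, which is the efficiency/flexibility trade-off such designs must manage.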

• Spatiotemporal Observer: Inspired by the classical Kazantzis–Kravaris–Luenberger observers, an extended spatiotemporal observer is proposed for estimating hidden states in high-dimensional systems with spatiotemporal dependencies. By theoretically transforming nonlinear spatiotemporal dynamics into a linear dynamics-embedded framework, the proposed method offers guarantees on predictive convergence and generalization. The framework demonstrates strong performance across a wide range of spatiotemporal modeling tasks.
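For intuition, a scalar discrete-time Luenberger observer, the classical linear construction that the Kazantzis–Kravaris–Luenberger observer generalizes to nonlinear systems, can be sketched as follows (illustrative, not the thesis's spatiotemporal method):

```python
def luenberger_observer(y_seq, a, c, gain, xhat0=0.0):
    """Estimate the hidden state of x[k+1] = a*x[k], y[k] = c*x[k]
    from outputs alone. The estimate is corrected by the output error,
    so the estimation error evolves as e[k+1] = (a - gain*c)*e[k] and
    converges whenever |a - gain*c| < 1."""
    xhat, est = xhat0, []
    for yk in y_seq:
        est.append(xhat)
        xhat = a * xhat + gain * (yk - c * xhat)  # predict + correct
    return est
```

The KKL construction extends this idea by mapping a nonlinear system into coordinates where the error dynamics are linear and stable, which is the kind of guarantee the proposed spatiotemporal observer builds on.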

All the above contributions fall under the broader framework of linear dynamics-embedded neural networks, which provide innovative design pathways for modern neural architectures. The proposed methods are empirically validated across a range of benchmarks, including both synthetic and real-world datasets. Experimental results demonstrate that linear dynamics-embedded neural networks not only improve prediction accuracy but also significantly enhance computational efficiency compared to conventional deep learning models.

Finally, limitations of the current research and potential future directions are discussed, including the extension of linear dynamics embedding to more complex nonlinear physical systems, and the integration of broader domain knowledge to further advance neural network design.
Date of Award: 9 Jul 2025
Original language: English
Awarding Institution:
  • City University of Hong Kong
Supervisor: Hanxiong LI
