Flow Regulation Techniques for Enhanced Network Efficiency and Resilience

Student thesis: Doctoral Thesis

Abstract

Modern networks increasingly serve delay-sensitive applications, from industrial control systems to distributed machine learning, where even minor latency fluctuations degrade performance, safety, or efficiency. Existing solutions, however, struggle to balance stringent timing requirements with scalability and adaptability. This thesis bridges the gap through three interconnected contributions that address latency-critical challenges across domains: (1) enhancing schedulability in time-sensitive networks, (2) mitigating cascading delays in safety-critical systems, and (3) minimizing synchronization-induced congestion in high-throughput computing. By designing novel methods to regulate traffic flows dynamically, we enhance network performance across deterministic, safety-critical, and AI-driven environments.

First, we challenge the flow isolation paradigm in Time-Sensitive Networking (TSN) by introducing a jitter-aware scheduling framework that allows controlled interference between time-sensitive flows. Unlike existing methods that rely on worst-case latency analysis, our approach quantifies permissible jitter bounds using a constant-complexity model that tracks the earliest and latest frame reception times across nodes. Coupled with a discrete time reference mechanism and a workload-shifting strategy, the framework schedules 3% to 124% more flows than state-of-the-art heuristics while reducing scheduling runtime by 98.44%, enabling scalable TSN deployment in industrial IoT without sacrificing determinism.

Second, we develop dynamic flow filtering and policing to mitigate cascading delays in safety-critical systems. Our resilience-oriented framework, RobustTSN, employs Per-Stream Filtering and Policing (PSFP), defined in IEEE 802.1Qci, to regulate flows based on local-safe delays and global-safe intervals. By formulating these two quantities, it dynamically admits non-critical delayed frames while blocking disruptive delays, which prevents congestion cascades. RobustTSN reduces deadline violations by 61.8% in networks under attack, ensuring real-time performance for delay-sensitive applications.

Third, we propose a proactive flow desynchronization technique for machine learning clusters, where synchronized traffic bursts significantly prolong training times. Our solution, DeSync, regulates Remote Direct Memory Access (RDMA) flows by injecting optimized randomized delays calculated via network calculus, dispersing congestion without hardware modifications. This approach improves bandwidth utilization, reduces flow completion times by 10.5%, and provides a practical solution for RDMA congestion control in data centers.

Together, these contributions demonstrate how targeted flow regulation techniques can enhance efficiency and resilience across diverse networks. By unifying jitter-aware scheduling, dynamic filtering, and proactive desynchronization, this work provides deployable solutions for industrial and AI infrastructure.
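
As a rough illustration of the idea behind the first contribution (tracking the earliest and latest frame reception times across nodes), the following minimal Python sketch propagates a reception window hop by hop and checks it against a jitter budget. It is not the thesis's model: the `Hop` type, the per-hop delay bounds, the function names, and the example numbers are all assumptions made for illustration. Each hop costs a constant amount of work, which is the property the sketch is meant to convey.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Hop:
    """Hypothetical per-hop delay bounds in microseconds (illustrative only)."""
    min_delay_us: float  # best case: no interference from other flows
    max_delay_us: float  # worst case: includes the interference that is allowed


def reception_windows(send_earliest_us: float,
                      send_latest_us: float,
                      hops: List[Hop]) -> List[Tuple[float, float]]:
    """Return the (earliest, latest) reception time at each node along a path."""
    earliest, latest = send_earliest_us, send_latest_us
    windows = []
    for hop in hops:
        if hop.min_delay_us > hop.max_delay_us:
            raise ValueError("min_delay_us must not exceed max_delay_us")
        # Constant work per hop: shift both ends of the window.
        earliest += hop.min_delay_us
        latest += hop.max_delay_us
        windows.append((earliest, latest))
    return windows


def within_jitter_bound(windows: List[Tuple[float, float]],
                        max_jitter_us: float) -> bool:
    """Check the width of the window at the final node against a jitter budget."""
    earliest, latest = windows[-1]
    return (latest - earliest) <= max_jitter_us


# Assumed example: a frame sent at time 0 over two hops.
path = [Hop(10.0, 12.0), Hop(10.0, 15.0)]
windows = reception_windows(0.0, 0.0, path)
print(windows)
print(within_jitter_bound(windows, max_jitter_us=7.0))
```

In this sketch a flow may share links with others as long as the reception window at its destination stays within the assumed jitter budget, so the window width stands in for the "permissible jitter" that the abstract describes.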
Date of Award: 21 Oct 2025
Original language: English
Awarding Institution: City University of Hong Kong
Supervisor: Jianping WANG
