
Robust Close-Range Air Combat Maneuver Decision-Making Method Based on Opponent Modeling and Reinforcement Learning

Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review

Abstract

In one-versus-one close-range air combat between Unmanned Combat Aerial Vehicles (UCAVs), existing Reinforcement Learning (RL) methods exhibit a significant trade-off between generalization performance and combat efficiency. Strategies trained against specific opponents achieve higher kill rates but generalize poorly. Conversely, models that approximate a Nash Equilibrium by training against diversified opponent strategies generalize robustly but are less efficient because of training complexity and conservative decision-making. To address this challenge, this paper proposes a robust maneuver decision-making approach based on Opponent Modeling and Reinforcement Learning (OMRL). Grounded in the perspective of the Information Horizon, the approach categorizes opponent strategies into Long-Sighted Strategies, Short-Sighted Strategies, and Fixed Strategies. OMRL uses a Long Short-Term Memory (LSTM) network to classify opponent trajectories and the Proximal Policy Optimization (PPO) algorithm to train targeted solution strategies, thereby constructing an efficient adversarial framework. During combat, OMRL identifies the opponent's strategy type in real time and invokes the corresponding solution strategy, balancing generalization performance and combat efficiency. Experimental results show that OMRL achieves an average win rate of 0.64 in testing, surpassing other state-of-the-art RL methods. This study is the first to introduce Information Horizon-based classification of opponent strategies: it systematically analyzes the characteristics of the strategy categories, trains an LSTM classifier on trajectory data, and develops the OMRL framework. Adversarial experiments and ablation studies validate the superiority and scalability of OMRL, providing an innovative theoretical and practical foundation for efficient collaborative combat involving UCAVs.
© Systems Engineering Society of China and Springer-Verlag GmbH Germany 2025.
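
The classify-then-dispatch loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `StrategyDispatcher`, the sliding-window length, the default type used before the window fills, and the callable interfaces are all assumptions made for illustration. The classifier argument stands in for the trained LSTM, and each policy callable stands in for a solution strategy trained with Proximal Policy Optimization.

```python
from collections import deque
from enum import Enum
from typing import Callable, Deque, Dict, Sequence


class OpponentType(Enum):
    """The three strategy categories named in the abstract."""
    LONG_SIGHTED = "long_sighted"
    SHORT_SIGHTED = "short_sighted"
    FIXED = "fixed"


Observation = Sequence[float]
Action = int
# Hypothetical interfaces: a trajectory classifier and a per-type policy.
Classifier = Callable[[Sequence[Observation]], OpponentType]
Policy = Callable[[Observation], Action]


class StrategyDispatcher:
    """Classify the opponent from recent observations, then delegate
    action selection to the policy registered for that opponent type."""

    def __init__(
        self,
        classifier: Classifier,
        policies: Dict[OpponentType, Policy],
        window: int = 20,  # assumed trajectory length, not from the paper
        default: OpponentType = OpponentType.FIXED,  # assumed fallback
    ) -> None:
        missing = set(OpponentType) - set(policies)
        if missing:
            raise ValueError(f"no policy registered for: {sorted(m.value for m in missing)}")
        self.classifier = classifier
        self.policies = policies
        self.history: Deque[Observation] = deque(maxlen=window)
        self.current_type = default

    def act(self, own_obs: Observation, opponent_obs: Observation) -> Action:
        # Record the latest opponent observation in the sliding window.
        self.history.append(opponent_obs)
        # Re-classify only once a full window of observations is available.
        if len(self.history) == self.history.maxlen:
            self.current_type = self.classifier(list(self.history))
        # Invoke the solution strategy matching the current estimate.
        return self.policies[self.current_type](own_obs)
```

In this sketch the opponent type is re-estimated at every step once the window is full, so a change in the opponent's behavior can switch the active policy mid-engagement. How often the published method re-classifies, and what it does before enough trajectory data is available, is not stated in the abstract.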
Original language: English
Number of pages: 34
Journal: Journal of Systems Science and Systems Engineering
Online published: 31 Oct 2025
Publication status: Online published - 31 Oct 2025

Funding

This work was partially supported by GRF: CityU 11306821 of the Hong Kong SAR Government. The authors would like to thank the anonymous reviewers and the editor for their valuable comments and suggestions throughout the review process.

Research Keywords

  • Air combat
  • opponent modeling
  • reinforcement learning
  • self-play
  • Nash equilibrium

RGC Funding Information

  • RGC-funded
