Abstract
Multi-agent reinforcement learning (MARL) has proven effective for training agents in multi-robot confrontation scenarios, such as StarCraft and robot soccer games. However, the joint action policies currently used in MARL fail to recognize and prevent actions that often lead to failures on our side. This exacerbates the cooperation dilemma, ultimately resulting in our agents acting independently and being defeated individually by their opponents. To tackle this challenge, we propose a novel joint action policy, referred to as the consensus action policy (CAP). Specifically, CAP records the number of times each joint action has caused our side to fail in the past and computes a cooperation tendency, which is integrated with each agent’s Q-value and the Nash bargaining solution to determine a joint action. The cooperation tendency promotes team cooperation by favoring joint actions with a high cooperation tendency and avoiding actions that may lead to team failure. Moreover, CAP can be extended to partially observable scenarios by combining it with deep Q-network (DQN) or actor-critic-based methods. We conducted extensive experiments to compare the proposed method with seven existing joint action policies, including four commonly used methods and three state-of-the-art (SOTA) methods, in terms of episode rewards, winning rates, and other metrics. Our results demonstrate that this approach holds great promise for multi-robot confrontation scenarios.
© 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.
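The abstract describes CAP only at a high level, so the following is a minimal illustrative sketch of the idea rather than the paper's formulation. It assumes a cooperation tendency of the form 1 / (1 + failure count) and scores each joint action by that tendency multiplied by a Nash product of per-agent gains over a disagreement payoff. The class name `ConsensusActionSketch`, its methods, the tendency formula, and the way the terms are combined are all hypothetical choices made for illustration.

```python
from math import prod


class ConsensusActionSketch:
    """Toy joint-action selector; every formula here is an assumption."""

    def __init__(self):
        # joint action (tuple of per-agent actions) -> number of past team failures
        self.failure_counts = {}

    def record_failure(self, joint_action):
        key = tuple(joint_action)
        self.failure_counts[key] = self.failure_counts.get(key, 0) + 1

    def cooperation_tendency(self, joint_action):
        # Assumed form: 1 / (1 + failures), so the tendency shrinks with each failure.
        return 1.0 / (1.0 + self.failure_counts.get(tuple(joint_action), 0))

    def select(self, q_values, disagreement):
        """q_values maps each joint action to a list of per-agent Q-values."""
        best_action, best_score = None, float("-inf")
        for joint_action, qs in q_values.items():
            gains = [q - d for q, d in zip(qs, disagreement)]
            if any(g < 0 for g in gains):
                continue  # skip actions that leave an agent below its disagreement payoff
            # Nash product of gains, weighted by the assumed cooperation tendency.
            score = self.cooperation_tendency(joint_action) * prod(gains)
            if score > best_score:
                best_action, best_score = joint_action, score
        return best_action


# Two agents with two actions each; the Q-values are made up for the example.
q = {(0, 0): [3.0, 3.0], (0, 1): [4.0, 1.0], (1, 0): [1.0, 4.0], (1, 1): [2.5, 2.5]}
policy = ConsensusActionSketch()
first = policy.select(q, disagreement=[0.0, 0.0])
policy.record_failure((0, 0))  # the team lost after playing (0, 0)
second = policy.select(q, disagreement=[0.0, 0.0])
```

In this toy setup a single recorded failure is enough to move the selection away from the joint action that previously had the highest Nash product, which is the kind of failure avoidance the abstract attributes to the cooperation tendency.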
Original language | English |
---|---|
Article number | 30 |
Number of pages | 27 |
Journal | ACM Transactions on Intelligent Systems and Technology |
Volume | 15 |
Issue number | 2 |
Online published | Feb 2024 |
DOIs | |
Publication status | Published - Apr 2024 |
Bibliographical note
Research Unit(s) information for this publication is provided by the author(s) concerned.
Funding
This work is supported in part by the Hong Kong Research Grants Council under GRF 11218621 and RIF project R5060-19.
Research Keywords
- Multi-robot confrontation
- Multi-agent reinforcement learning
- Cooperation dilemma
- Consensus action policy
Fingerprint
Dive into the research topics of 'Strengthening Cooperative Consensus in Multi-Robot Confrontation'. Together they form a unique fingerprint.
Projects
- 2 Active
- GRF: Age of Information Centric Task Scheduling in Autonomous Driving Systems
  WANG, J. (Principal Investigator / Project Coordinator) & Qiao, C. (Co-Investigator)
  1/01/22 → …
  Project: Research
- RIF-ExtU-Lead: Edge Learning: the Enabling Technology for Distributed Big Data Analytics in Cloud-Edge Environment
  Guo, S. (Main Project Coordinator [External]) & WANG, J. (Principal Investigator / Project Coordinator)
  1/05/20 → …
  Project: Research