
Toward Evaluating Robustness of Reinforcement Learning with Adversarial Policy

Xiang Zheng, Xingjun Ma, Shengjie Wang, Xinyu Wang, Chao Shen, Cong Wang*

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-reviewed

Abstract

Reinforcement learning agents are susceptible to evasion attacks during deployment. In single-agent environments, these attacks can occur through imperceptible perturbations injected into the inputs of the victim policy network. In multi-agent environments, an attacker can manipulate an adversarial opponent to influence the victim policy's observations indirectly. While adversarial policies offer a promising technique to craft such attacks, current methods are either sample-inefficient due to poor exploration strategies or require extra surrogate model training under the black-box assumption. To address these challenges, in this paper, we propose Intrinsically Motivated Adversarial Policy (IMAP) for efficient black-box adversarial policy learning in both single- and multi-agent environments. We formulate four types of adversarial intrinsic regularizers (maximizing the adversarial state coverage, policy coverage, risk, or divergence) to discover potential vulnerabilities of the victim policy in a principled way. We also present a novel bias-reduction method to balance the extrinsic objective and the adversarial intrinsic regularizers adaptively. Our experiments validate the effectiveness of the four types of adversarial intrinsic regularizers and the bias-reduction method in enhancing black-box adversarial policy learning across a variety of environments. Our IMAP successfully evades two types of defense methods, adversarial training and robust regularization, decreasing the performance of the state-of-the-art robust WocaR-PPO agents by 34%-54% across four single-agent tasks. IMAP also achieves a state-of-the-art attack success rate of 83.91% in the multi-agent game YouShallNotPass. Our code is available at https://github.com/x-zheng16/IMAP. © 2024 IEEE.
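To make the abstract's idea of an adversarial intrinsic regularizer concrete, the sketch below shows one hypothetical instantiation: a count-based state-coverage bonus added to the extrinsic attack reward with a fixed weight. The class and function names, the discretization scheme, and the weighting are illustrative assumptions, not the paper's actual IMAP implementation (which also includes policy-coverage, risk, and divergence regularizers and an adaptive bias-reduction method).

```python
from collections import defaultdict
import math

class CoverageBonus:
    """Count-based novelty bonus over discretized states.

    A hypothetical stand-in for IMAP's adversarial state-coverage
    regularizer: states visited rarely yield a larger intrinsic reward,
    encouraging the attacker to explore the victim's state space.
    """
    def __init__(self, bin_size=0.5):
        self.bin_size = bin_size
        self.counts = defaultdict(int)

    def __call__(self, state):
        # Discretize the continuous state into a hashable grid cell.
        key = tuple(round(s / self.bin_size) for s in state)
        self.counts[key] += 1
        # Bonus decays as 1/sqrt(N(s)) with the visit count.
        return 1.0 / math.sqrt(self.counts[key])

def shaped_reward(r_ext, state, bonus, lam=0.1):
    """Extrinsic attack reward plus a weighted intrinsic coverage bonus."""
    return r_ext + lam * bonus(state)
```

For example, `shaped_reward(-1.0, obs, CoverageBonus())` would add a novelty bonus on top of the attack reward for each observation; repeated visits to the same region of state space yield a shrinking bonus.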
Original language: English
Title of host publication: Proceedings - 2024 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2024)
Publisher: IEEE
Pages: 288-301
ISBN (Electronic): 979-8-3503-4105-8
Publication status: Published - 2024
Event: 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2024), Brisbane, Australia
Duration: 24 Jun 2024 - 27 Jun 2024
Internet address: https://dsn2024uq.github.io/

Publication series

Name: Proceedings - Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN

Conference

Conference: 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2024)
Place: Australia
City: Brisbane
Period: 24/06/24 - 27/06/24

Funding

We thank the anonymous reviewers and our shepherd, Dr. Stjepan Picek, for their helpful and valuable feedback, and Tencent Yunding Laboratory for providing computing resources and generous technical support. This work was partially supported by HK RGC under Grants (CityU 11218322, R6021-20F, R1012-21, RFS2122-1S04, C2004-21G, C1029-22G, and N CityU139/21), the National Natural Science Foundation of China (U21B2018, 62161160337, 61822309, U20B2049, 61773310, U1736205, 61802166, 62276067), and Shaanxi Province Key Industry Innovation Program (2021ZDLGY01-02).

Research Keywords

  • adversarial policy
  • black-box evasion attack
  • intrinsic motivation
  • reinforcement learning

RGC Funding Information

  • RGC-funded
