Constrained Intrinsic Motivation for Reinforcement Learning

Xiang Zheng, Xingjun Ma, Chao Shen, Cong Wang*

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

1 Citation (Scopus)

Abstract

This paper investigates two fundamental problems that arise when utilizing Intrinsic Motivation (IM) for reinforcement learning in Reward-Free Pre-Training (RFPT) tasks and Exploration with Intrinsic Motivation (EIM) tasks: 1) how to design an effective intrinsic objective in RFPT tasks, and 2) how to reduce the bias introduced by the intrinsic objective in EIM tasks. Existing IM methods suffer from static skills, limited state coverage, sample inefficiency in RFPT tasks, and suboptimality in EIM tasks. To tackle these problems, we propose Constrained Intrinsic Motivation (CIM) for RFPT and EIM tasks, respectively: 1) CIM for RFPT maximizes the lower bound of the conditional state entropy subject to an alignment constraint on the state encoder network for efficient dynamic and diverse skill discovery and state coverage maximization; 2) CIM for EIM leverages constrained policy optimization to adaptively adjust the coefficient of the intrinsic objective to mitigate the distraction from the intrinsic objective. In various MuJoCo robotics environments, we empirically show that CIM for RFPT greatly surpasses fifteen IM methods for unsupervised skill discovery in terms of skill diversity, state coverage, and fine-tuning performance. Additionally, we showcase the effectiveness of CIM for EIM in redeeming intrinsic rewards when task rewards are exposed from the beginning. Our code is available at https://github.com/x-zheng16/CIM. © 2024 International Joint Conferences on Artificial Intelligence. All rights reserved.
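
The adaptive-coefficient idea described for EIM tasks can be pictured with a minimal sketch. The code below is not taken from the paper or its repository: the function names, the target-return constraint, and the step size are all hypothetical, and the update is a generic Lagrangian-style heuristic rather than CIM's actual rule. It shrinks the weight on the intrinsic reward when the task return falls below a target and lets it grow otherwise.

```python
def update_intrinsic_coef(coef, ext_return, target_return, lr=0.1, max_coef=1.0):
    """One clipped update step on the intrinsic-reward coefficient.

    Hypothetical illustration: the constraint is "task return >= target_return".
    """
    # Positive gap: the constraint holds, so exploration may get more weight.
    # Negative gap: the constraint is violated, so the intrinsic weight shrinks.
    gap = ext_return - target_return
    new_coef = coef + lr * gap
    # Project back onto the allowed interval [0, max_coef].
    return min(max(new_coef, 0.0), max_coef)


def mixed_reward(r_ext, r_int, coef):
    """Combine task (extrinsic) and intrinsic rewards with the current weight."""
    return r_ext + coef * r_int


if __name__ == "__main__":
    coef = 0.5
    # Made-up per-iteration task returns, for demonstration only.
    for ext_return in [0.0, 0.5, 1.5, 2.5]:
        coef = update_intrinsic_coef(coef, ext_return, target_return=1.0)
        print(f"task return={ext_return:.1f} -> intrinsic coef={coef:.2f}")
```

In this sketch the coefficient is driven toward zero while the agent underperforms on the task, which is one simple way to limit distraction from the intrinsic objective.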
Original language: English
Title of host publication: Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (IJCAI-24)
Editors: Kate Larson
Publisher: International Joint Conferences on Artificial Intelligence
Pages: 5608-5616
ISBN (Electronic): 978-1-956792-04-1
Publication status: Published - Aug 2024
Event: 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024) - International Convention Center Jeju, Jeju Island, Korea, Republic of
Duration: 3 Aug 2024 - 9 Aug 2024
https://ijcai24.org

Publication series

Name: IJCAI International Joint Conference on Artificial Intelligence
ISSN (Print): 1045-0823

Conference

Conference: 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024)
Abbreviated title: IJCAI-24
Country/Territory: Korea, Republic of
City: Jeju Island
Period: 3/08/24 - 9/08/24

Bibliographical note

Information for this record is supplemented by the author(s) concerned.

Research Keywords

  • Reinforcement Learning
  • Intrinsic Motivation
  • Unsupervised Skill Discovery
