Skip to main navigation Skip to search Skip to main content

Learning Dynamic Attribute-factored World Models for Efficient Multi-object Reinforcement Learning

Research output: Chapters, Conference Papers, Creative and Literary WorksRGC 32 - Refereed conference paper (with host publication)peer-review

Abstract

In many reinforcement learning tasks, the agent has to learn to interact with many objects of different types and generalize to unseen combinations and numbers of objects. Often a task is a composition of previously learned tasks (e.g. block stacking). These are examples of compositional generalization, in which we compose object-centric representations to solve complex tasks. Recent works have shown the benefits of object-factored representations and hierarchical abstractions for improving sample efficiency in these settings. On the other hand, these methods do not fully exploit the benefits of factorization in terms of object attributes. In this paper, we address this opportunity and introduce the Dynamic Attribute FacTored RL (DAFT-RL) framework. In DAFT-RL, we leverage object-centric representation learning to extract objects from visual inputs. We learn to classify them into classes and infer their latent parameters. For each class of object, we learn a class template graph that describes how the dynamics and reward of an object of this class factorize according to its attributes. We also learn an interaction pattern graph that describes how objects of different classes interact with each other at the attribute level. Through these graphs and a dynamic interaction graph that models the interactions between objects, we can learn a policy that can then be directly applied in a new environment by estimating the interactions and latent parameters. We evaluate DAFT-RL in three benchmark datasets and show our framework outperforms the state-of-the-art in generalizing across unseen objects with varying attributes and latent parameters, as well as in the composition of previously learned tasks. © 2023 Neural information processing systems foundation. All rights reserved.
Original languageEnglish
Title of host publication37th Conference on Neural Information Processing Systems (NeurIPS 2023)
EditorsA. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, S. Levine
Pages19117-19144
ISBN (Electronic)9781713899921
Publication statusPublished - Dec 2023
Event37th Conference on Neural Information Processing Systems (NeurIPS 2023) - New Orleans Ernest N. Morial Convention Center, New Orleans, United States
Duration: 10 Dec 202316 Dec 2023
https://papers.nips.cc/paper_files/paper/2023
https://nips.cc/Conferences/2023

Publication series

NameAdvances in Neural Information Processing Systems
Volume36
ISSN (Print)1049-5258

Conference

Conference37th Conference on Neural Information Processing Systems (NeurIPS 2023)
Abbreviated titleNIPS '23
PlaceUnited States
CityNew Orleans
Period10/12/2316/12/23
Internet address

Bibliographical note

Full text of this publication does not contain sufficient affiliation information. With consent from the author(s) concerned, the Research Unit(s) information for this record is based on the existing academic department affiliation of the author(s).

Fingerprint

Dive into the research topics of 'Learning Dynamic Attribute-factored World Models for Efficient Multi-object Reinforcement Learning'. Together they form a unique fingerprint.

Cite this