Incremental Reinforcement Learning With Prioritized Sweeping for Dynamic Environments

Research output: Journal Publications and Reviews (RGC: 21, 22, 62); publication in refereed journal; peer-reviewed

28 Scopus Citations

Author(s)

Detail(s)

Original language: English
Pages (from-to): 621-632
Journal / Publication: IEEE/ASME Transactions on Mechatronics
Volume: 24
Issue number: 2
Online published: 14 Feb 2019
Publication status: Published - Apr 2019

Abstract

In this paper, a novel incremental learning algorithm is presented for reinforcement learning (RL) in dynamic environments, where the rewards of state-action pairs may change over time. The proposed incremental RL (IRL) algorithm learns in dynamic environments without making any assumptions or requiring any prior knowledge about the ever-changing environment. First, IRL generates a detector-agent that detects the changed part of the environment (the drift environment) by executing a virtual RL process. Then, the agent gives priority to the drift environment and its neighboring environment, iteratively updating their state-action value functions with the new rewards by dynamic programming. After this prioritized sweeping process, IRL restarts a canonical learning process to obtain a new optimal policy adapted to the new environment. The novelty is that IRL fuses the new information into the existing knowledge system incrementally while weakening the conflict between them. The IRL algorithm is compared with two direct approaches and several state-of-the-art transfer learning methods on classical maze navigation problems and on an intelligent warehouse with multiple robots. The experimental results verify that IRL can effectively improve the adaptability and efficiency of RL algorithms in dynamic environments.
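To make the prioritized sweeping step concrete, below is a minimal sketch in Python, assuming a deterministic tabular model and a dictionary-valued Q-function. The identifiers (`prioritized_sweeping`, `model`, `drift_states`, `bellman_error`) are illustrative and do not come from the paper, and the detector-agent and virtual RL process that identify the drifted states are not reproduced here.

```python
import heapq
from itertools import count
from collections import defaultdict

def prioritized_sweeping(Q, model, drift_states, gamma=0.95, theta=1e-4,
                         max_updates=10_000):
    """Sweep Bellman updates outward from the drifted states.

    Q:            dict mapping (state, action) -> value
    model:        dict mapping (state, action) -> (reward, next_state);
                  a deterministic tabular model is a simplifying assumption
    drift_states: states whose rewards were detected to have changed
    """
    # Index the model: actions available in each state, and the
    # predecessor (state, action) pairs that lead into each state.
    actions = defaultdict(list)
    predecessors = defaultdict(set)
    for (s, a), (_, s_next) in model.items():
        actions[s].append(a)
        predecessors[s_next].add((s, a))

    def bellman_error(s, a):
        r, s_next = model[(s, a)]
        target = r + gamma * max((Q[(s_next, b)] for b in actions[s_next]),
                                 default=0.0)
        return abs(target - Q[(s, a)])

    # Seed the queue with the drifted states, largest Bellman error first
    # (heapq is a min-heap, hence the negated priorities). The counter is
    # a tie-breaker so heapq never has to compare states directly.
    tie = count()
    pq = [(-bellman_error(s, a), next(tie), (s, a))
          for s in drift_states for a in actions[s]]
    heapq.heapify(pq)

    for _ in range(max_updates):
        if not pq:
            break
        neg_err, _, (s, a) = heapq.heappop(pq)
        if -neg_err < theta:
            break  # remaining errors are all below the threshold
        r, s_next = model[(s, a)]
        Q[(s, a)] = r + gamma * max((Q[(s_next, b)] for b in actions[s_next]),
                                    default=0.0)
        # The update may have made predecessor pairs stale, so re-queue
        # them with their new errors; this is how the sweep spreads from
        # the drift environment to its neighboring environment.
        for (sp, ap) in predecessors[s]:
            err = bellman_error(sp, ap)
            if err > theta:
                heapq.heappush(pq, (-err, next(tie), (sp, ap)))
    return Q
```

Ordering updates by Bellman error concentrates the dynamic programming effort where the reward change actually matters, which is the intuition behind giving priority to the drift environment and its neighbors before restarting the canonical learning process.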

Research Area(s)

  • Dynamic environments, environment drift, incremental reinforcement learning (IRL), intelligent warehouses, prioritized sweeping