
Intrinsic gradient oxygen-driven second-order memristors for continual reinforcement learning

  • Jianyu Ming
  • Ruiheng Wang
  • Jingwei Fu
  • Jing Liu
  • Siqi Wu
  • Shanshuo Liu
  • He Shao
  • Yanfei Li
  • Jin Wang
  • Yiru Wang
  • Xumeng Zhang
  • Linghai Xie*
  • Haifeng Ling*
  • Wei Huang*

* Corresponding author for this work

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review


Abstract

The intrinsic gradient in organisms governs long-term information processing and continual learning in dynamic environments. Reinforcement learning (RL) aims to emulate such temporal regulation to enhance learning efficiency and adaptability. However, the absence of intrinsic gradient construction in existing memristive devices leads to stochastic and abrupt state changes, disrupting the generation of temporally correlated internal states critical for continual RL. In this work, a second-order memristor incorporating a stable intrinsic oxygen gradient was designed via a molecular-coordinated layer, which enables a prolonged dynamic barrier evolution (>10² s). This slow dynamic response facilitates balanced oxygen ion migration and diffusion under unipolar spike stimulation, resulting in a significant conductance modulation (ΔG = –98.1%). These temporally adaptive conductance states were quantitatively mapped to learning rates in the RL algorithm, allowing the learning task timescale to co-evolve with device dynamics. Compared with conventional strategies, intrinsic-gradient-driven modulation reduced training iterations by 68.75% and 35.65% in static and dynamic environments, respectively. These findings underscore the potential of slow-dynamic second-order memristors as physically grounded time-adaptive units bridging device dynamics and algorithmic learning in neuromorphic systems. © The Author(s) 2026.
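The abstract states that conductance states were mapped to RL learning rates but does not give the mapping. The following is only an illustrative sketch of that idea, not the authors' method. Everything in it is an assumption: the exponential relaxation model, the 100 s time constant (chosen to match the >10² s timescale), the linear conductance-to-learning-rate map, the learning-rate bounds, and all function names. Only the –98.1% total modulation is taken from the abstract.

```python
import math

# Assumed device model: conductance relaxes exponentially from g0 toward
# g0 * (1 + delta_g). Only delta_g = -0.981 comes from the abstract;
# the exponential form and tau = 100 s are illustrative assumptions.
def conductance(t, g0=1.0, delta_g=-0.981, tau=100.0):
    g_final = g0 * (1.0 + delta_g)
    return g_final + (g0 - g_final) * math.exp(-t / tau)

# Assumed mapping: linear interpolation from [g_min, g_max] to
# [lr_min, lr_max], clamped at both ends. The bounds are hypothetical.
def learning_rate(g, g_min, g_max, lr_min=0.01, lr_max=0.5):
    x = (g - g_min) / (g_max - g_min)
    x = min(1.0, max(0.0, x))
    return lr_min + x * (lr_max - lr_min)

# Standard tabular Q-learning update, using the device-derived learning rate.
def q_update(q, reward, q_next_max, lr, gamma=0.9):
    return q + lr * (reward + gamma * q_next_max - q)

g_max = conductance(0.0)          # initial (highest) conductance
g_min = 1.0 * (1.0 - 0.981)       # asymptotic (lowest) conductance
times = (0.0, 100.0, 1000.0)      # seconds since stimulation began
rates = [learning_rate(conductance(t), g_min, g_max) for t in times]
```

Under these assumptions the learning rate starts at its upper bound and decays smoothly toward its lower bound as the conductance relaxes, so early updates are large and later ones are small, without an externally scheduled decay.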
Original language: English
Article number: 3367
Number of pages: 11
Journal: Nature Communications
Volume: 17
Online published: 3 Mar 2026
DOIs
Publication status: Published - 2026

Funding

The project was supported by the National Natural Science Foundation of China (Nos. 62471251, 62288102, 12204248, 22209023), the Basic Research Program of Jiangsu (Nos. BK20240033, BK20243057), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (No. SJCX21_0252).

Publisher's Copyright Statement

  • This full text is made available under CC-BY-NC-ND 4.0. https://creativecommons.org/licenses/by-nc-nd/4.0/
