Abstract
The intrinsic gradient in organisms governs long-term information processing and continual learning in dynamic environments. Reinforcement learning (RL) aims to emulate such temporal regulation to enhance learning efficiency and adaptability. However, the absence of intrinsic gradient construction in existing memristive devices leads to stochastic and abrupt state changes, disrupting the generation of temporally correlated internal states critical for continual RL. In this work, a second-order memristor incorporating a stable intrinsic oxygen gradient was designed via a molecular-coordinated layer, which enables a prolonged dynamic barrier evolution (>10² s). This slow dynamic response facilitates balanced oxygen ion migration and diffusion under unipolar spike stimulation, resulting in a significant conductance modulation (ΔG = –98.1%). These temporally adaptive conductance states were quantitatively mapped to learning rates in the RL algorithm, allowing the learning task timescale to co-evolve with device dynamics. Compared with conventional strategies, intrinsic-gradient-driven modulation reduced training iterations by 68.75% and 35.65% in static and dynamic environments, respectively. These findings underscore the potential of slow-dynamic second-order memristors as physically grounded time-adaptive units bridging device dynamics and algorithmic learning in neuromorphic systems. © The Author(s) 2026.
| Original language | English |
|---|---|
| Article number | 3367 |
| Number of pages | 11 |
| Journal | Nature Communications |
| Volume | 17 |
| Online published | 3 Mar 2026 |
| DOIs | |
| Publication status | Published - 2026 |
Funding
The project was supported by the National Natural Science Foundation of China (Nos. 62471251, 62288102, 12204248, 22209023), Basic Research Program of Jiangsu (BK20240033, BK20243057), and Postgraduate Research & Practice Innovation Program of Jiangsu Province (No. SJCX21_0252).
Publisher's Copyright Statement
- This full text is made available under CC-BY-NC-ND 4.0. https://creativecommons.org/licenses/by-nc-nd/4.0/
Fingerprint
Dive into the research topics of 'Intrinsic gradient oxygen-driven second-order memristors for continual reinforcement learning'. Together they form a unique fingerprint.