SpikingMamba: Towards Energy-Efficient Large Language Models via Knowledge Distillation from Mamba

Yulong Huang (Co-first Author), Jianxiong Tang (Co-first Author), Chao Wang (Co-first Author), Ziyi Wang, Jianguo Zhang, Zhichao Lu, Bojun Cheng*, Luziwei Leng*

*Corresponding author for this work

Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review

Abstract

Large Language Models (LLMs) have achieved remarkable performance across tasks but remain energy-intensive due to dense matrix operations. Spiking neural networks (SNNs) improve energy efficiency by replacing dense matrix multiplications with sparse accumulations, and their sparse spike activity enables efficient LLM deployment on edge devices. However, prior SNN-based LLMs often sacrifice performance for efficiency, and recovering accuracy typically requires full pretraining, which is costly and impractical. To address this, we propose SpikingMamba, an SNN-based LLM distilled from Mamba that improves energy efficiency with minimal accuracy sacrifice. SpikingMamba integrates two key components: (a) SI-LIF, a signed-integer spiking neuron that preserves semantic polarity through signed multi-level spike representations, and (b) a training-exclusive Smoothed Gradient Compensation (SGC) path that mitigates quantization loss while preserving spike-driven efficiency. We employ a single-stage distillation strategy to transfer the zero-shot ability of pretrained Mamba and further enhance it via reinforcement learning (RL). Experiments show that SpikingMamba-1.3B achieves a 4.76× energy benefit with only a 4.78% zero-shot accuracy gap relative to the original Mamba; RL yields a further 2.55% accuracy improvement, narrowing the gap from 4.78% to 2.23%. © 2026, Transactions on Machine Learning Research. All rights reserved.
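The record does not include the paper's code, so the following is a minimal PyTorch-style sketch of what a signed-integer LIF neuron with a smoothed gradient path could look like. The class name `SILIF`, the `levels` and `decay` parameters, and the exact quantization and reset rules are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SILIF(nn.Module):
    """Hypothetical signed-integer LIF neuron (sketch, not the paper's code).

    Forward: the membrane potential integrates the input and is quantized to a
    signed integer spike in {-L, ..., -1, 0, 1, ..., L}, so positive and
    negative activations both keep their polarity. Backward: gradients flow
    through a smooth surrogate of the quantizer, a stand-in for the paper's
    training-exclusive SGC path.
    """

    def __init__(self, threshold: float = 1.0, levels: int = 3, decay: float = 0.5):
        super().__init__()
        self.threshold = threshold
        self.levels = levels          # L: maximum spike magnitude per side
        self.decay = decay            # membrane leak factor

    def forward(self, x: torch.Tensor, v: torch.Tensor | None = None):
        v = torch.zeros_like(x) if v is None else v
        v = self.decay * v + x                      # leaky integration
        # Hard path: signed multi-level spike (integer-valued tensor).
        spike = torch.clamp(torch.round(v / self.threshold),
                            -self.levels, self.levels)
        # Smooth path: differentiable surrogate used only for gradients.
        smooth = torch.clamp(v / self.threshold, -self.levels, self.levels)
        # Forward equals `spike`; backward uses d(smooth)/dv (straight-through).
        out = spike.detach() + smooth - smooth.detach()
        v = v - out * self.threshold                # soft reset by emitted charge
        return out, v
```

At inference only the integer spike path is kept, so downstream weight layers see sparse signed integers and their matrix multiplications reduce to accumulations.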
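The single-stage distillation the abstract describes transfers the pretrained teacher's zero-shot ability to the spiking student. A common way to do this is token-level KL distillation on the logits; the sketch below assumes that form. The temperature value and the absence of a hard-label cross-entropy term are illustrative choices, not the paper's recipe.

```python
import torch
import torch.nn.functional as F

def distill_loss(student_logits: torch.Tensor,
                 teacher_logits: torch.Tensor,
                 temperature: float = 2.0) -> torch.Tensor:
    """Token-level KL between teacher and student next-token distributions.

    Shapes: (batch, seq_len, vocab). Illustrative sketch only.
    """
    t = temperature
    log_p_student = F.log_softmax(student_logits / t, dim=-1)
    p_teacher = F.softmax(teacher_logits / t, dim=-1)
    # KL(p_teacher || p_student), scaled by t^2 to keep gradient magnitudes
    # comparable across temperatures (standard Hinton-style scaling).
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (t * t)
```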
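The 4.76× figure is the paper's own measurement. To show where numbers of this order come from, the back-of-the-envelope estimate below uses the widely cited 45 nm CMOS per-operation energies from Horowitz (ISSCC 2014); the spike rate is a made-up example, and none of this reproduces the paper's accounting.

```python
# Illustration only: per-op energies are the 45 nm figures from Horowitz 2014,
# and the spike rates are hypothetical, not the paper's measurements.
E_MAC = 4.6   # pJ: 32-bit float multiply-accumulate (3.7 mult + 0.9 add)
E_AC = 0.9    # pJ: 32-bit float accumulate only

def energy_benefit(spike_rate: float) -> float:
    """Ratio of dense-MAC energy to spike-gated accumulate energy."""
    return E_MAC / (spike_rate * E_AC)

print(energy_benefit(spike_rate=1.0))   # ~5.1x even with no sparsity
print(energy_benefit(spike_rate=0.5))   # ~10.2x at 50% spike activity
```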
Original language: English
Number of pages: 21
Journal: Transactions on Machine Learning Research
Volume: 2026-January
Online published: 20 Jan 2026
Publication status: Published - Jan 2026

Funding

This work is supported in part by the Science and Technology Innovation 2030 Major Project (Brain Science and Brain-Like Intelligence Technology) under Grant 2022ZD0208700, the National Key Research and Development Program of China (2021YFF1200800), the National Natural Science Foundation of China (Grant Nos. 62276121 and 12326604), the Young Scientists Fund of the National Natural Science Foundation of China (Grant 62305278), the Guangdong Basic and Applied Basic Research Foundation (No. 2025A1515011758), and the Youth S&T Talent Support Programme of the Guangdong Provincial Association for Science and Technology (SKXRC2025460).

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 7 - Affordable and Clean Energy
