Algorithm-Hardware Co-design of Split-Radix Discrete Galois Transformation for KyberKEM
Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review
Detail(s)
Original language | English |
---|---|
Article number | 10114669 |
Pages (from-to) | 824-838 |
Journal / Publication | IEEE Transactions on Emerging Topics in Computing |
Volume | 11 |
Issue number | 4 |
Online published | 2 May 2023 |
Publication status | Published - Oct 2023 |
Link(s)
DOI | DOI |
---|---|
Link to Scopus | https://www.scopus.com/record/display.uri?eid=2-s2.0-85159662321&origin=recordpage |
Permanent Link | https://scholars.cityu.edu.hk/en/publications/publication(d027a4b5-948e-4836-a64c-1b6f932c9bff).html |
Abstract
KyberKEM is one of the final-round key encapsulation mechanisms in the NIST post-quantum cryptography competition. The number theoretic transform (NTT), the computing bottleneck of KyberKEM, has been widely studied. The Discrete Galois Transformation (DGT) is a variant of NTT that halves the transform length but, in theoretical analysis, requires more multiplication operations than the latest NTT algorithm. This paper proposes the split-radix DGT, a novel DGT variant that uses the split-radix method to reduce the computing complexity without compromising the transform length. Specifically, for a length-128 polynomial, the split-radix DGT algorithm saves at least 10% of the multiplication operations compared with the latest NTT algorithm in theoretical analysis. Furthermore, we propose a unified split-radix DGT processor with a dedicated stream permutation network for KyberKEM and implement it on a Xilinx Artix-7 FPGA. The processor achieves at least 49.4% faster transformation and 65.3% faster component-wise multiplication (CWM), with at most 87% of the LUT-NTT area-time product and 32% of the LUT-CWM area-time product, compared with state-of-the-art polynomial multipliers in KyberKEM with the same BFU setting on similar platforms. Lastly, we design a highly efficient KyberKEM architecture using the proposed split-radix DGT processor. The implementation results on the Artix-7 FPGA show significant performance improvements over state-of-the-art KyberKEM designs.
© 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
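For readers unfamiliar with the negative wrapped convolution that NTT-style transforms accelerate, the following is a minimal illustrative sketch, not the paper's split-radix DGT. It multiplies two polynomials modulo x^N + 1 with a naive O(N^2) transform and checks the result against schoolbook multiplication. The toy parameters (`N = 4`, `Q = 17`, `PSI = 2`) and all function names are assumptions chosen for readability; they are not Kyber's parameters (n = 256, q = 3329).

```python
# Toy parameters (assumptions for illustration only, not Kyber's).
N = 4                   # polynomial length
Q = 17                  # prime modulus with Q = 1 (mod 2N)
PSI = 2                 # primitive 2N-th root of unity mod Q (2^4 = -1 mod 17)
OMEGA = PSI * PSI % Q   # primitive N-th root of unity mod Q


def transform(a, root):
    """Naive O(N^2) transform: A[k] = sum_j a[j] * root^(j*k) mod Q."""
    return [sum(a[j] * pow(root, j * k, Q) for j in range(N)) % Q
            for k in range(N)]


def negacyclic_mul_ntt(a, b):
    """Multiply a and b modulo (x^N + 1, Q) via negative wrapped convolution."""
    # Pre-scale by powers of PSI so a cyclic convolution becomes negacyclic.
    a_s = [a[j] * pow(PSI, j, Q) % Q for j in range(N)]
    b_s = [b[j] * pow(PSI, j, Q) % Q for j in range(N)]
    # Component-wise multiplication in the transform domain.
    c_hat = [x * y % Q for x, y in zip(transform(a_s, OMEGA),
                                       transform(b_s, OMEGA))]
    # Inverse transform, then undo the scaling (needs Python 3.8+ for pow(x, -1, Q)).
    c_s = transform(c_hat, pow(OMEGA, -1, Q))
    inv_n, inv_psi = pow(N, -1, Q), pow(PSI, -1, Q)
    return [c_s[j] * inv_n * pow(inv_psi, j, Q) % Q for j in range(N)]


def negacyclic_mul_schoolbook(a, b):
    """Reference O(N^2) multiplication modulo (x^N + 1, Q)."""
    c = [0] * N
    for i in range(N):
        for j in range(N):
            k = i + j
            if k < N:
                c[k] = (c[k] + a[i] * b[j]) % Q
            else:
                # x^N = -1, so wrapped terms are subtracted.
                c[k - N] = (c[k - N] - a[i] * b[j]) % Q
    return c


a, b = [1, 2, 3, 4], [5, 6, 7, 8]
assert negacyclic_mul_ntt(a, b) == negacyclic_mul_schoolbook(a, b)
```

A hardware design would replace the naive transform with a fast butterfly-based algorithm; the sketch only shows the arithmetic identity that such designs compute.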
Research Area(s)
- Discrete Galois transform, Split-radix, Negative wrapped convolution, Post-Quantum cryptography, Key encapsulation mechanism, Hardware, FPGA
Citation Format(s)
Algorithm-Hardware Co-design of Split-Radix Discrete Galois Transformation for KyberKEM. / Li, Guangyan; Chen, Donglong; Mao, Gaoyu et al.
In: IEEE Transactions on Emerging Topics in Computing, Vol. 11, No. 4, 10114669, 10.2023, p. 824-838.