TY - JOUR
T1 - Parameter Space for the Architecture of FFT-Based Montgomery Modular Multiplication
AU - Chen, Donald Donglong
AU - Yao, Gavin Xiaoxu
AU - Cheung, Ray C.C.
AU - Pao, Derek
AU - Koç, Çetin Kaya
PY - 2016/1
Y1 - 2016/1
N2 - Modular multiplication is the core operation in public-key cryptographic algorithms such as RSA and the Diffie-Hellman algorithm. The efficiency of the modular multiplier plays a crucial role in the performance of these cryptographic methods. In this paper, improvements to FFT-based Montgomery Modular Multiplication (FFTM3) using carry-save arithmetic and pre-computation techniques are presented. Moreover, pseudo-Fermat number transform is used to enrich the supported operand sizes for the FFTM3. The asymptotic complexity of our method is O(l log, l log log l), which is the same as the Schönhage-Strassen multiplication algorithm (SSA). A systematic procedure to select suitable parameter set for the FFTM3 is provided. Prototypes of the improved FFTM3 multiplier with appropriate parameter sets are implemented on Xilinx Virtex-6 FPGA. Our method can perform 3,100-bit and 4,124-bit modular multiplications in 6.74 and 7.78 μs, respectively. It offers better computation latency and area-latency product compared to the state-of-the-art methods for operand size of 3,072-bit and above. © 2015 IEEE
AB - Modular multiplication is the core operation in public-key cryptographic algorithms such as RSA and the Diffie-Hellman algorithm. The efficiency of the modular multiplier plays a crucial role in the performance of these cryptographic methods. In this paper, improvements to FFT-based Montgomery Modular Multiplication (FFTM3) using carry-save arithmetic and pre-computation techniques are presented. Moreover, pseudo-Fermat number transform is used to enrich the supported operand sizes for the FFTM3. The asymptotic complexity of our method is O(l log, l log log l), which is the same as the Schönhage-Strassen multiplication algorithm (SSA). A systematic procedure to select suitable parameter set for the FFTM3 is provided. Prototypes of the improved FFTM3 multiplier with appropriate parameter sets are implemented on Xilinx Virtex-6 FPGA. Our method can perform 3,100-bit and 4,124-bit modular multiplications in 6.74 and 7.78 μs, respectively. It offers better computation latency and area-latency product compared to the state-of-the-art methods for operand size of 3,072-bit and above. © 2015 IEEE
KW - field-programmable gate array (FPGA)
KW - Montgomery modular multiplication
KW - number theoretic transform (NTT)
KW - parallel computation
KW - Schonhage-Strassen Algorithm
UR - http://www.scopus.com/inward/record.url?scp=84961761068&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-84961761068&origin=recordpage
U2 - 10.1109/TC.2015.2417553
DO - 10.1109/TC.2015.2417553
M3 - RGC 21 - Publication in refereed journal
SN - 0018-9340
VL - 65
SP - 147
EP - 160
JO - IEEE Transactions on Computers
JF - IEEE Transactions on Computers
IS - 1
ER -