Status | Published |
Title | Revisiting Keccak and Dilithium Implementations on ARMv7-M |
Creator | |
Date Issued | 2024-03-12 |
Source Publication | IACR Transactions on Cryptographic Hardware and Embedded Systems |
ISSN | 2569-2925 |
Volume | 2024, Issue: 2, Pages: 1-24 |
Abstract | Keccak is widely used in lattice-based cryptography (LBC), and its impact on the overall running time of an LBC scheme can be predominant on platforms lacking dedicated SHA-3 instructions. This holds true on embedded devices for Kyber and Dilithium, two LBC schemes selected by NIST to be standardized as quantum-safe cryptographic algorithms. While extensive work has been done to optimize the polynomial arithmetic in these schemes, it was generally assumed that Keccak implementations were already optimal and left little room for enhancement. In this paper, we revisit various optimization techniques for both Keccak and Dilithium on two ARMv7-M processors, i.e., Cortex-M3 and M4. For Keccak, we improve its efficiency using two architecture-specific optimizations, namely lazy rotation and memory access pipelining, on ARMv7-M processors. These optimizations yield performance gains of up to 24.78% and 21.4% for the largest Keccak permutation instance on Cortex-M3 and M4, respectively. As for Dilithium, we first apply the multi-moduli NTT for the small polynomial multiplication ct on Cortex-M3. Then, we thoroughly integrate the efficient Plantard arithmetic into the 16-bit NTTs for computing the small polynomial multiplications cs and ct on Cortex-M3 and M4. We show that the multi-moduli NTT combined with the efficient Plantard arithmetic can obtain significant speed-ups for the small polynomial multiplications of Dilithium on Cortex-M3. Combining all the aforementioned optimizations for both Keccak and Dilithium, we obtain speed-ups of 15.44% to 23.75% and 13.94% to 15.52% for Dilithium on Cortex-M3 and M4, respectively. Furthermore, we also demonstrate that the Keccak optimizations yield speed-ups of 13.35% to 15.00% for Kyber, and that they decrease the proportion of time spent on hashing in Dilithium and Kyber by 2.46% to 5.03% on Cortex-M4. |
Keyword | ARMv7-M Dilithium Keccak lattice-based cryptography Plantard arithmetic |
DOI | 10.46586/tches.v2024.i2.1-24 |
Language | English |
Scopus ID | 2-s2.0-85187790297 |
Document Type | Journal article |
Identifier | http://repository.uic.edu.cn/handle/39GCC9TT/11485 |
Collection | Faculty of Science and Technology |
Corresponding Author | Chen, Donglong |
Affiliation | 1. Guangdong Provincial Key Laboratory IRADS, BNU-HKBU United International College, Zhuhai, China 2. Hong Kong Baptist University, Hong Kong 3. Paris, France 4. Nanjing University of Aeronautics and Astronautics, Nanjing, China 5. Zhejiang Lab, Hangzhou, China 6. Sun Yat-sen University, Zhuhai, China 7. City University of Hong Kong, Hong Kong 8. Iǧdır University, Merkez, Turkey 9. University of California Santa Barbara, Santa Barbara, United States |
First Author Affiliation | Beijing Normal-Hong Kong Baptist University |
Corresponding Author Affiliation | Beijing Normal-Hong Kong Baptist University |
Recommended Citation GB/T 7714 | Huang, Junhao, Adomnicăi, Alexandre, Zhang, Jipeng, et al. Revisiting Keccak and Dilithium Implementations on ARMv7-M[J]. IACR Transactions on Cryptographic Hardware and Embedded Systems, 2024, 2024(2): 1-24. |
APA | Huang, Junhao, Adomnicăi, Alexandre, Zhang, Jipeng, Dai, Wangchen, Liu, Yao, ... & Chen, Donglong. (2024). Revisiting Keccak and Dilithium Implementations on ARMv7-M. IACR Transactions on Cryptographic Hardware and Embedded Systems, 2024(2), 1-24. |
MLA | Huang, Junhao, et al. "Revisiting Keccak and Dilithium Implementations on ARMv7-M". IACR Transactions on Cryptographic Hardware and Embedded Systems 2024.2 (2024): 1-24. |
Files in This Item: | There are no files associated with this item. |