Acta Scientiarum Naturalium Universitatis Pekinensis, 2021, Vol. 57, Issue (4): 595-604. DOI: 10.13209/j.0479-8023.2021.054


Hardware Optimization and Evaluation for Crucial Modules of Lattice-Based Cryptography

CHEN Zhaohui1,2, MA Yuan2,3,†, JING Jiwu1,4   

  1. School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049
  2. State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093
  3. School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049
  4. School of Software and Microelectronics, Peking University, Beijing 102600
  • Received: 2020-06-18 Revised: 2020-09-24 Online: 2021-07-20 Published: 2021-07-20
  • Contact: MA Yuan, E-mail: mayuan(at)


陈朝晖1,2, 马原2,3,†, 荆继武1,4   

  • Supported by: the National Natural Science Foundation of China (61872357, 61802396), the National Cryptography Development Fund (MMJJ20170205, MMJJ20180113), and a commissioned research project of Beijing Certificate Authority Co., Ltd. (BJCA2020-YF-0300)


To improve the efficiency of lattice-based cryptography in practical applications, an optimized hardware implementation technique for polynomial multiplication is proposed. Polynomial coefficients are stored in a ping-pong memory structure to increase access bandwidth. Eliminating the pre-scaling operations saves 10.5% of the modular multiplications and 16.7% of the storage space. A structure based on look-up-table shift registers and three-input adders reduces logic resource usage. A pipeline with a selectable number of stages lets the butterfly module used in polynomial multiplication meet the timing requirements of different cryptographic hardware systems. Evaluation results show that the low-area, balanced and high-performance implementations of the optimized butterfly unit reach maximum frequencies above 150, 250 and 350 MHz, respectively. Compared with existing implementations, the optimized hardware achieves a higher operating frequency with a smaller circuit area, improving the efficiency of the polynomial multiplication module by 22.8%.

Key words: post-quantum cryptography, polynomial multiplication, number theoretic transform, butterfly operation, FPGA
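
As a software illustration of what eliminating pre-scaling can mean, the following minimal sketch multiplies two polynomials modulo x^n + 1 using a number theoretic transform whose twiddle factors already contain the powers of the 2n-th root of unity, so no separate scaling pass over the inputs or the output is needed. This is a generic textbook formulation (Cooley-Tukey butterflies forward, Gentleman-Sande butterflies inverse), not the paper's hardware design. The modulus q = 12289, the power-of-two length n, and every function name are assumptions chosen for illustration.

```python
# Illustrative sketch only: n, q and all function names are assumptions,
# not parameters or interfaces taken from the paper.

def bit_reverse(k, bits):
    """Reverse the lowest `bits` bits of k."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (k & 1)
        k >>= 1
    return r

def find_psi(n, q):
    """Return a primitive 2n-th root of unity mod prime q (n a power of two)."""
    if (q - 1) % (2 * n) != 0:
        raise ValueError("q must satisfy q = 1 (mod 2n)")
    for g in range(2, q):
        psi = pow(g, (q - 1) // (2 * n), q)
        if pow(psi, n, q) == q - 1:  # psi^n = -1, so the order is exactly 2n
            return psi
    raise ValueError("no primitive 2n-th root of unity found")

def make_tables(n, q):
    """Twiddle tables holding powers of psi (and psi^-1) in bit-reversed order."""
    bits = n.bit_length() - 1
    psi = find_psi(n, q)
    psi_inv = pow(psi, -1, q)  # modular inverse, Python 3.8+
    fwd = [pow(psi, bit_reverse(k, bits), q) for k in range(n)]
    inv = [pow(psi_inv, bit_reverse(k, bits), q) for k in range(n)]
    return fwd, inv

def ntt_forward(a, fwd, q):
    """In-place negacyclic NTT using Cooley-Tukey butterflies."""
    n, t, m = len(a), len(a), 1
    while m < n:
        t //= 2
        for i in range(m):
            s = fwd[m + i]
            for j in range(2 * i * t, 2 * i * t + t):
                u, v = a[j], a[j + t] * s % q  # one modular multiplication
                a[j], a[j + t] = (u + v) % q, (u - v) % q
        m *= 2

def ntt_inverse(a, inv, q):
    """In-place inverse negacyclic NTT using Gentleman-Sande butterflies."""
    n, t, m = len(a), 1, len(a)
    while m > 1:
        h, j1 = m // 2, 0
        for i in range(h):
            s = inv[h + i]
            for j in range(j1, j1 + t):
                u, v = a[j], a[j + t]
                a[j], a[j + t] = (u + v) % q, (u - v) * s % q
            j1 += 2 * t
        t, m = 2 * t, h
    n_inv = pow(n, -1, q)
    for j in range(n):
        a[j] = a[j] * n_inv % q

def poly_mul_ntt(a, b, q=12289):
    """Multiply a and b modulo (x^n + 1, q) without a pre-scaling pass."""
    fwd, inv = make_tables(len(a), q)
    fa, fb = list(a), list(b)  # copies, so the inputs are not modified
    ntt_forward(fa, fwd, q)
    ntt_forward(fb, fwd, q)
    c = [x * y % q for x, y in zip(fa, fb)]
    ntt_inverse(c, inv, q)
    return c

def poly_mul_schoolbook(a, b, q=12289):
    """Reference O(n^2) multiplication modulo (x^n + 1, q)."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * b[j]) % q
            else:  # x^n = -1 wraps the term around with a sign change
                c[k - n] = (c[k - n] - a[i] * b[j]) % q
    return c
```

Each inner-loop iteration of `ntt_forward` and `ntt_inverse` is one butterfly operation: a modular multiplication plus a modular addition and subtraction. Comparing `poly_mul_ntt` against `poly_mul_schoolbook` on random inputs is a simple way to check the transform.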

