Acta Scientiarum Naturalium Universitatis Pekinensis, 2024, Vol. 60, Issue (6): 1001-1008. DOI: 10.13209/j.0479-8023.2024.089

Implementation of an Improved LeNet Traffic Sign Multi-classification Heterogeneous Accelerator

YANG Yongjie, ZHENG Juntai, MA Li, YANG Hao   

  1. School of Information Science and Technology, Nantong University, Nantong 226019
  • Received: 2023-12-18 Revised: 2024-03-11 Online: 2024-11-20 Published: 2024-11-20
  • Contact: YANG Yongjie, E-mail: yang.yj(at)ntu.edu.cn

Abstract:

This paper proposes a heterogeneous accelerator for multi-class traffic sign classification based on an improved LeNet. The accelerator runs on an ARM+FPGA heterogeneous platform and deploys the forward inference of the improved LeNet on the FPGA for parallel computing. On the FPGA side, the AXI-Stream protocol is used with DMA for high-speed data streaming, and techniques such as array partitioning and multi-level pipelining process the data in parallel. On the ARM side, the PYNQ framework handles data updates and accelerator scheduling. Experimental results on the GTSRB dataset show that the proposed design achieves an average inference time of 14.489 ms at a working clock frequency of 50 MHz, compared with 710 ms on an MCU, a speedup of about 49 times. The design can substantially benefit edge applications that require multi-class traffic sign classification.

Key words: LeNet, FPGA, PYNQ, heterogeneous computing
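
The reported speedup can be checked directly from the two latencies given in the abstract. The short Python sketch below does that arithmetic and also converts the accelerator latency into clock cycles at 50 MHz. The function names are illustrative and do not come from the paper. The cycle count assumes the whole measured 14.489 ms elapses on the 50 MHz clock, which the abstract does not state; in practice the figure may include DMA transfers and ARM-side scheduling overhead.

```python
# Figures taken from the abstract.
MCU_LATENCY_MS = 710.0      # inference time on the MCU
ACCEL_LATENCY_MS = 14.489   # average inference time on the ARM+FPGA design
CLOCK_HZ = 50_000_000       # FPGA working clock frequency (50 MHz)


def speedup(baseline_ms: float, accelerated_ms: float) -> float:
    """Ratio of baseline latency to accelerated latency."""
    return baseline_ms / accelerated_ms


def latency_to_cycles(latency_ms: float, clock_hz: int) -> int:
    """Clock cycles elapsed during latency_ms at clock_hz.

    Assumption: the whole latency is counted against this one clock.
    """
    return round(latency_ms * clock_hz / 1000)


if __name__ == "__main__":
    print(f"speedup: {speedup(MCU_LATENCY_MS, ACCEL_LATENCY_MS):.1f}x")
    print(f"cycles per inference: {latency_to_cycles(ACCEL_LATENCY_MS, CLOCK_HZ)}")
```

Under these assumptions the ratio 710 / 14.489 comes to roughly 49, in line with the stated speedup, and one inference corresponds to a budget of about 724,450 cycles at 50 MHz.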
