Acta Scientiarum Naturalium Universitatis Pekinensis ›› 2023, Vol. 59 ›› Issue (5): 757-763. DOI: 10.13209/j.0479-8023.2023.065

Reinforcement Learning of Spiking Neural Network Based on Knowledge Distillation

ZHANG Ling, CAO Jian, ZHANG Yuan, FENG Shuo, WANG Yuan   

  1. School of Software and Microelectronics, Peking University, Beijing 102600
  • Received: 2022-10-07  Revised: 2023-01-13  Online: 2023-09-20  Published: 2023-09-18
  • Contact: CAO Jian, E-mail: caojian(at)ss.pku.edu.cn
  • Supported by the National Key R&D Program of China (2018YFE0203801)

Abstract:

We propose Spike Distillation Network (SDN), a reinforcement learning method for spiking neural networks (SNN) based on knowledge distillation. SDN uses the spatio-temporal backpropagation (STBP) gradient descent method to distill knowledge from a deep neural network (DNN) into an SNN for reinforcement learning tasks. Experimental results show that SDN converges faster than traditional SNN and DNN reinforcement learning methods, and that it yields an SNN reinforcement learning model with fewer parameters than its DNN counterpart. Deployed on a neuromorphic chip, SDN consumes less power than the DNN, demonstrating that SDN is a high-performance SNN reinforcement learning method that accelerates the convergence of SNN reinforcement learning.
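
This page does not include source code, so the PyTorch fragment below is only a rough, hypothetical sketch of the distillation idea described in the abstract: a small leaky integrate-and-fire (LIF) SNN student is trained to match the Q-values of a pretrained DNN teacher, using a rectangular surrogate gradient in place of the non-differentiable spike (the core trick behind STBP-style training). All names, layer sizes, the timestep count T, the threshold and decay constants, and the choice of an MSE distillation loss are illustrative assumptions, not details taken from the paper.

import torch
import torch.nn as nn

class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike; rectangular surrogate gradient on the backward pass."""
    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v >= 1.0).float()

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        # Pass gradient only near the firing threshold (assumed to be 1.0).
        return grad_out * ((v - 1.0).abs() < 0.5).float()

spike = SurrogateSpike.apply

class SNNStudent(nn.Module):
    """Two-layer LIF network, rate-coded over T timesteps (illustrative)."""
    def __init__(self, obs_dim, act_dim, hidden=64, T=8, decay=0.5):
        super().__init__()
        self.fc1 = nn.Linear(obs_dim, hidden)
        self.fc2 = nn.Linear(hidden, act_dim)
        self.T, self.decay = T, decay

    def forward(self, x):
        v = torch.zeros(x.size(0), self.fc1.out_features, device=x.device)
        out = torch.zeros(x.size(0), self.fc2.out_features, device=x.device)
        for _ in range(self.T):
            v = self.decay * v + self.fc1(x)   # leaky integration of input current
            s = spike(v)                       # fire where v crosses the threshold
            v = v * (1.0 - s)                  # hard reset after a spike
            out = out + self.fc2(s)
        return out / self.T                    # averaged output approximates Q-values

# Hypothetical distillation step: a pretrained DNN Q-network acts as the teacher.
teacher = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
student = SNNStudent(obs_dim=4, act_dim=2)
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

obs = torch.randn(32, 4)                       # placeholder batch of observations
with torch.no_grad():
    target_q = teacher(obs)                    # teacher Q-values as soft targets
loss = nn.functional.mse_loss(student(obs), target_q)
opt.zero_grad()
loss.backward()
opt.step()

In the method as the abstract describes it, this teacher-to-student transfer targets reinforcement learning tasks, so a full implementation would wrap an update of this kind inside the RL training loop rather than run it on random observations.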

Key words: spiking neural network (SNN), reinforcement learning, knowledge distillation