Acta Scientiarum Naturalium Universitatis Pekinensis ›› 2023, Vol. 59 ›› Issue (6): 909-914. DOI: 10.13209/j.0479-8023.2023.085

A Transformer-based Syntax Tree Decoder for Handwritten Mathematical Expression Recognition

ZHOU Bohan (周伯瀚), CAO Jian (曹健), WANG Yuan (王源)

  1. School of Software and Microelectronics, Peking University, Beijing 102600
  • Received: 2022-11-16 Revised: 2023-01-10 Online: 2023-11-20 Published: 2023-11-20
  • Contact: CAO Jian, E-mail: caojian(at)ss.pku.edu.cn

Abstract:

Most existing tree-structured decoding methods for handwritten mathematical expression recognition are based on recurrent neural networks, which train inefficiently and require a complicated training process. To address this problem, the authors propose a handwritten mathematical expression recognition model based on the Transformer architecture that decodes the syntax tree of an expression directly. Experiments on several handwritten mathematical expression recognition datasets show that the proposed Transformer tree decoding method outperforms Transformer-based sequence (string) decoding methods on every dataset tested, and that it shows the potential to surpass recurrent neural network tree decoding methods.

Key words: handwritten mathematical expression recognition, Transformer, tree decoder, document comprehension
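
To make the contrast between sequence (string) decoding and syntax-tree decoding concrete, the following minimal Python sketch represents one expression both ways. The `Node` class, the relation labels, and the (symbol, parent, relation) step format are illustrative assumptions and are not taken from the paper; the authors' actual tree representation may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical relation labels, chosen only for this illustration.
RELATIONS = ("sup", "sub", "right")


@dataclass
class Node:
    """One symbol of the expression plus its children, keyed by relation."""
    symbol: str
    children: Dict[str, "Node"] = field(default_factory=dict)


def to_latex(node: Node) -> str:
    """Flatten the tree into the LaTeX string a sequence decoder would target."""
    out = node.symbol
    if "sub" in node.children:
        out += "_{" + to_latex(node.children["sub"]) + "}"
    if "sup" in node.children:
        out += "^{" + to_latex(node.children["sup"]) + "}"
    if "right" in node.children:
        out += " " + to_latex(node.children["right"])
    return out


def to_triples(
    node: Node, parent: str = "<root>", relation: str = "start"
) -> List[Tuple[str, str, str]]:
    """Depth-first (symbol, parent symbol, relation) steps, one per tree node.

    A tree decoder could be trained to predict such steps instead of raw
    LaTeX tokens. Parents are named by symbol here for readability; a real
    system would need unique node ids to handle repeated symbols.
    """
    steps = [(node.symbol, parent, relation)]
    for rel in RELATIONS:
        child = node.children.get(rel)
        if child is not None:
            steps.extend(to_triples(child, node.symbol, rel))
    return steps


# Example expression: x^{2} + y_{i}
tree = Node("x", {
    "sup": Node("2"),
    "right": Node("+", {"right": Node("y", {"sub": Node("i")})}),
})
latex = to_latex(tree)      # string target
triples = to_triples(tree)  # tree target
```

In this sketch the string target needs braces to encode structure, while the tree target states each symbol's parent and spatial relation explicitly, which is the kind of structural information a tree decoder predicts directly.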