Acta Scientiarum Naturalium Universitatis Pekinensis, 2024, Vol. 60, Issue (1): 71-78. DOI: 10.13209/j.0479-8023.2023.076


Radiology Report Generation Method Based on Multi-scale Feature Parsing

WANG Rui, LIANG Jianguo, HUA Rong   

  1. College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590
  • Received: 2023-05-18 Revised: 2023-07-31 Online: 2024-01-20 Published: 2024-01-20
  • Contact: HUA Rong, E-mail: huarong(at)



When deep learning models are used to generate radiology reports automatically, extreme data imbalance makes it difficult for current models to identify the features of abnormal regions, which leads to misdiagnosed and missed diseases. To improve the model's ability to identify diseases and to raise report quality, the authors propose a multi-scale feature parsing Transformer (MFPT) model for radiology report generation. A key feature enhanced attention (KFEA) module is constructed to strengthen the use of key features. A multi-modal feature fusion (MFF) module is designed to promote the fusion of semantic and visual features and to alleviate the impact of the differences between them. The paper also explores the role of a stage-aware (SA) module in optimizing primary features in the radiology report generation task. Finally, in comparison experiments against current mainstream models on the widely used radiology report dataset IU X-Ray, the proposed model achieves the current best results.

Key words: attention mechanism, feature fusion, radiology report, Transformer, image-text generation

