Acta Scientiarum Naturalium Universitatis Pekinensis, 2025, Vol. 61, Issue (4): 629-638. DOI: 10.13209/j.0479-8023.2024.121

A Multimodal Cross-Attention Model for Alzheimer’s Disease Diagnosis

LI Zhou1, LIU Yongbin1,†, OUYANG Chunping1, ZHANG Jiangtao2, PAN Xue1, JIANG Lu1, ZHONG Jin1   

  1. School of Computer, University of South China, Hengyang 421001; 2. The 305th Hospital of the Chinese People’s Liberation Army, Beijing 100017
  • Received:2024-05-30 Revised:2024-07-25 Online:2025-07-20 Published:2025-07-20
  • Contact: LIU Yongbin, E-mail: yongbinliu03(at)gmail.com

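  • Funding:
    Supported by the National Natural Science Foundation of China (61533018), the Natural Science Foundation of Hunan Province (2022JJ30495, 2025JJ50384), the Key Scientific Research Project of the Hunan Provincial Department of Education (22A0316), the Hunan Provincial Postgraduate Scientific Research Innovation Project (CX20240833), and the Chinese Information Processing Society of China Social Media Processing Committee (SMP)-Zhipu Large Model Interdisciplinary Fund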

Abstract:

To support accurate computer-aided diagnosis of patients with Alzheimer’s disease (AD) and mild cognitive impairment (MCI), this paper proposes a multimodal Alzheimer’s multi-class diagnostic framework (MAMDF). The framework fuses modalities with an asymmetric cross-attention mechanism to better reveal the relationship between clinical data and medical imaging data. To address the two MCI subtypes, which are rarely covered in previous computer-aided diagnosis work, we combine frequency-domain transformers with Transformers to build a novel deep feature extraction module for feature fusion. This module captures the internal connections among fused features and yields richer joint multimodal representations, improving the model’s diagnostic performance on the two MCI subtypes. Experimental results on the ADNI dataset show that the proposed model achieves higher accuracy and F1 scores than comparable methods. These results suggest that the model handles multimodal data fusion more effectively and mines the deep feature relationships across medical data modalities, allowing it to better integrate and analyze the multimodal information of AD patients.

Key words: multi-modal deep learning, Alzheimer’s disease diagnosis, cross-attention mechanism
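
The abstract names an asymmetric cross-attention mechanism for fusing clinical and imaging data but does not spell out its form. As a rough illustration only, the sketch below shows generic one-directional scaled dot-product cross-attention in plain Python. It is not the MAMDF implementation. It assumes that "asymmetric" means attention runs in one direction, with clinical tokens supplying the queries and imaging tokens supplying the keys and values; the abstract does not confirm either choice. All function names, dimensions, and random inputs are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def cross_attention(query_tokens, context_tokens, w_q, w_k, w_v):
    """One-directional cross-attention (assumed reading of 'asymmetric').

    Queries come from one modality; keys and values come from the other.
    Returns one fused vector per query token plus the attention weights.
    """
    queries = [matvec(w_q, t) for t in query_tokens]
    keys = [matvec(w_k, t) for t in context_tokens]
    values = [matvec(w_v, t) for t in context_tokens]
    scale = math.sqrt(len(keys[0]))

    fused, weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        attn = softmax(scores)
        # Weighted sum of the value vectors for this query token.
        out = [sum(a * v[d] for a, v in zip(attn, values))
               for d in range(len(values[0]))]
        fused.append(out)
        weights.append(attn)
    return fused, weights


def random_matrix(rows, cols, rng):
    """Small random projection matrix (stand-in for learned weights)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
clinical_dim, imaging_dim, model_dim = 4, 6, 8  # hypothetical sizes

# Hypothetical inputs: 3 clinical tokens and 5 imaging tokens.
clinical_tokens = [[rng.random() for _ in range(clinical_dim)] for _ in range(3)]
imaging_tokens = [[rng.random() for _ in range(imaging_dim)] for _ in range(5)]

w_q = random_matrix(model_dim, clinical_dim, rng)
w_k = random_matrix(model_dim, imaging_dim, rng)
w_v = random_matrix(model_dim, imaging_dim, rng)

fused, weights = cross_attention(clinical_tokens, imaging_tokens, w_q, w_k, w_v)
print(len(fused), len(fused[0]))  # 3 8
```

Each row of `weights` sums to 1 and shows how strongly one clinical token attends to each imaging token. A symmetric design would add a second pass with the roles of the two modalities swapped.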
