Acta Scientiarum Naturalium Universitatis Pekinensis ›› 2025, Vol. 61 ›› Issue (6): 1057-1063. DOI: 10.13209/j.0479-8023.2025.092


Practice and Performance of Large Language Models in Medical Data Desensitization

ZHANG Zhili1, YANG Hong2, PANG Juan1, HENG Fanxiu1,†   

  1. Information Center, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Peking University Cancer Hospital & Institute, Beijing 100142
  2. Nursing Department, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Peking University Cancer Hospital & Institute, Beijing 100142
  • Received: 2024-10-30  Revised: 2025-02-05  Online: 2025-11-20  Published: 2025-11-20
  • Contact: HENG Fanxiu, E-mail: 13501365120(at)163.com


Abstract:

To effectively protect patient privacy and data security, and to explore how different mainstream large language models perform at desensitizing medical documents, a medical data desensitization system based on large language models is designed. The mainstream open-source large language models Gemma2, Llama3, Qwen2, and Mistral are adopted as research objects and deployed privately using the large language model management framework Ollama. A unified prompt engineering template is constructed as input to each model, whose capabilities are invoked through an interface to identify target sensitive words in medical documents; the sensitive words are then replaced to complete the desensitization of each document. On a virtual private server, sensitive-word recognition for a single medical document is completed within 52.420 to 123.380 seconds. Across the desensitization of five types of medical documents, the recognition of twelve categories of sensitive words, and processing efficiency, Gemma2 demonstrates the best overall performance, followed by Llama3, Qwen2, and Mistral. The results show that even without GPU computing power, a virtual private server can complete sensitive-word recognition and processing with high quality by deploying large language models, greatly improving the accuracy of medical document desensitization.
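The pipeline described above (a privately deployed Ollama model receives a prompt template, returns the sensitive words it finds, and the caller replaces them) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes Ollama's default local REST endpoint `/api/generate`, an illustrative prompt template, and that the model is instructed to return a pure JSON list of strings; the model name, mask token, and helper names are all hypothetical.

```python
import json
from urllib import request

# Default endpoint of a privately deployed Ollama instance (assumption).
OLLAMA_URL = "http://localhost:11434/api/generate"

# Illustrative unified prompt template; the paper's actual template is not shown here.
PROMPT_TEMPLATE = (
    "Identify all sensitive words (patient names, phone numbers, ID numbers, "
    "addresses, etc.) in the following medical document. "
    "Return ONLY a JSON list of strings.\n\nDocument:\n{document}"
)

def extract_sensitive_words(document: str, model: str = "gemma2") -> list[str]:
    """Call the deployed model through Ollama's HTTP interface and parse its reply."""
    payload = json.dumps({
        "model": model,
        "prompt": PROMPT_TEMPLATE.format(document=document),
        "stream": False,  # request a single complete response
    }).encode("utf-8")
    req = request.Request(OLLAMA_URL, data=payload,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        body = json.load(resp)
    # Assumes the model followed instructions and emitted pure JSON.
    return json.loads(body["response"])

def desensitize(document: str, sensitive_words: list[str], mask: str = "***") -> str:
    """Replace each detected sensitive word with a mask, longest first
    so that overlapping substrings are not partially masked."""
    for word in sorted(sensitive_words, key=len, reverse=True):
        document = document.replace(word, mask)
    return document
```

In practice the masking step is independent of the model call, so it can be tested offline; for example, `desensitize("Patient Zhang San, phone 13800000000", ["Zhang San", "13800000000"])` yields `"Patient ***, phone ***"`.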

Key words: large language model, privacy protection, data desensitization, electronic medical records, virtual private server 
