Acta Scientiarum Naturalium Universitatis Pekinensis (北京大学学报自然科学版) ›› 2024, Vol. 60 ›› Issue (3): 413-421. DOI: 10.13209/j.0479-8023.2024.036

基于标签语义信息感知的少样本命名实体识别方法

张越1, 王长征2, 苏雪峰1,3, 闫智超1, 张广军1, 邵文远1, 李茹1,4,†   

  1. 山西大学计算机与信息技术学院, 太原 030006
  2. 山西同方知网数字出版技术有限公司, 太原 030032
  3. 山西工程科技职业大学现代物流学院, 晋中 030609
  4. 山西大学计算智能与中文信息处理教育部重点实验室, 太原 030006
  • 收稿日期:2023-05-19 修回日期:2023-07-30 出版日期:2024-05-20 发布日期:2024-05-20
  • 通讯作者: 李茹, E-mail: liru(at)sxu.edu.cn
  • 基金资助:
    山西省重点研发计划(202102020101008)、山西省科技合作交流专项(202204041101016)和山西省1331工程项目资助

Few-shot Named Entity Recognition Method Based on Semantic Information Awareness of Labels

ZHANG Yue1, WANG Changzheng2, SU Xuefeng1,3, YAN Zhichao1, ZHANG Guangjun1, SHAO Wenyuan1, LI Ru1,4,†   

  1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006
  2. Shanxi Tongfang Knowledge Network Digital Publishing Technology Co., Ltd., Taiyuan 030032
  3. School of Modern Logistics, Shanxi Vocational University of Engineering Science and Technology, Jinzhong 030609
  4. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006
  • Received: 2023-05-19 Revised: 2023-07-30 Online: 2024-05-20 Published: 2024-05-20
  • Contact: LI Ru, E-mail: liru(at)sxu.edu.cn
  • Supported by: the Key Research and Development Program of Shanxi Province (202102020101008), the Science and Technology Cooperation and Exchange Special Project of Shanxi Province (202204041101016), and the Shanxi Province 1331 Project

摘要:

在少样本命名实体识别方法中, 目前广泛应用的方法是基于原型网络的两阶段模型。但是, 该方法未充分利用实体标签中的语义信息, 且在距离计算中过度依赖实体类型原型向量, 导致模型泛化能力差。针对这些问题, 提出一种基于标签语义信息感知的少样本命名实体识别方法。该方法是一种先进行实体跨度检测, 再判断实体类型的两阶段方法。在构建实体类型原型向量时, 将对应实体类型包含的语义信息考虑在内, 通过维度转换层将其与原型向量相融合。在对新样本进行实体识别时, 将实体类型的正负样本与实体类型原型向量组成实体类型三元组, 依据样本到三元组的距离对其进行分类。在多个数据集上的实验结果证明, 该模型的性能比以往的模型有较大的提升。

关键词: 少样本命名实体识别, 标签语义信息感知, 实体类型三元组, 原型网络

Abstract:

Among approaches to few-shot named entity recognition (NER), two-stage models based on prototypical networks are widely used. However, these models do not fully exploit the semantic information in entity labels, and they rely too heavily on entity type prototype vectors when computing distances, which results in poor generalization. To address these issues, this paper proposes a few-shot NER method based on label semantic information awareness. The method works in two stages: it first detects entity spans and then classifies the type of each span. When constructing entity type prototype vectors, the semantic information carried by the corresponding entity type label is taken into account and fused with the prototype vector through a dimension transformation layer. When recognizing entities in new samples, positive and negative samples of each entity type are combined with the entity type prototype vector to form an entity type triplet, and each sample is classified according to its distance to the triplets. Experimental results on multiple datasets show that the proposed model substantially outperforms previous models.

Key words: few-shot named entity recognition (NER), semantic information awareness of labels, entity type triplet, prototypical network
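
The two ideas named in the abstract, label-aware prototypes and triplet-based classification, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not specify the fusion operator or the distance formula, so the sketch assumes that the projected label embedding is simply added to the mean support embedding, and that a span is scored by its Euclidean distance to the prototype plus its distance to the positive sample minus its distance to the negative sample. All function names, the projection matrix, and the toy vectors are hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors (the plain prototype)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def project(matrix, vector):
    """Linear map standing in for the 'dimension transformation layer'."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def fuse_prototype(support_embeddings, label_embedding, projection):
    """Prototype = mean support embedding + projected label embedding.

    Additive fusion is an assumption made for this sketch only.
    """
    prototype = mean_vector(support_embeddings)
    label = project(projection, label_embedding)
    return [p + l for p, l in zip(prototype, label)]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_score(query, prototype, positive, negative):
    """Lower is better: near the prototype and positive, far from the negative.

    This scoring rule is an assumed stand-in for the paper's triplet distance.
    """
    return (euclidean(query, prototype)
            + euclidean(query, positive)
            - euclidean(query, negative))


def classify(query, triplets):
    """Return the entity type whose triplet gives the lowest score."""
    return min(triplets, key=lambda t: triplet_score(query, *triplets[t]))


# Toy data: 2-d span embeddings, 3-d label embeddings (all values made up).
PROJECTION = [[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]]

proto_per = fuse_prototype([[1.0, 0.0], [0.8, 0.2]], [0.1, 0.0, 0.5], PROJECTION)
proto_loc = fuse_prototype([[0.0, 1.0], [0.2, 0.8]], [0.0, 0.1, 0.5], PROJECTION)

# Each triplet is (prototype, positive sample, negative sample).
triplets = {
    "PER": (proto_per, [1.0, 0.0], [0.0, 1.0]),
    "LOC": (proto_loc, [0.0, 1.0], [1.0, 0.0]),
}

predicted = classify([0.9, 0.2], triplets)
print(predicted)
```

In this toy setup the query span lies close to the PER prototype and positive sample and far from the PER negative sample, so it is assigned to PER. In a trained model the projection would be learned and the span embeddings would come from the first-stage span detector and an encoder.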