Acta Scientiarum Naturalium Universitatis Pekinensis, 2016, Vol. 52, Issue (1): 17-24. DOI: 10.13209/j.0479-8023.2016.003

An Entity Linking Approach Based on Topic-Sensitive Random Walk with Restart

LI Maolin (李茂林)

  1. Center for Intelligence Science and Technology, Beijing University of Posts and Telecommunications, Beijing 100876
  • Received: 2015-06-07 Online: 2016-01-20 Published: 2016-01-20
  • Contact: LI Maolin, E-mail: mlli(at)bupt.edu.cn

Abstract:

Entity linking is the task of linking entity mentions in text to their corresponding unambiguous entities in a knowledge base. This paper proposes an entity linking approach based on topic-sensitive random walk with restart. First, the context of each mention is used to expand the mention to its full name, and the Wikipedia knowledge base is searched to obtain a set of candidate entities. Second, a graph is constructed from these intermediate results. Finally, the stationary distribution of a topic-sensitive random walk with restart on this graph is used to rank the candidate entities, and the top-ranked candidate is selected as the target entity. Experimental results show that the approach achieves an F score of 0.623 on the KBP2014 entity linking data set, higher than the F scores of the other systems compared in the paper, and that it can effectively improve the overall performance of an entity linking system.

Key words: entity linking, random walk, Wikipedia
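The ranking step described in the abstract can be illustrated with a minimal sketch of a random walk with restart whose restart vector is concentrated on topic-related nodes. The abstract does not say how the paper builds its graph, weights its edges, or sets the restart probability. The function names, the four-node toy graph, the restart vector, and the value `restart_prob=0.15` below are therefore all assumptions made for illustration, not the author's implementation.

```python
import numpy as np


def random_walk_with_restart(adjacency, restart, restart_prob=0.15,
                             tol=1e-12, max_iter=10_000):
    """Approximate the stationary distribution of a random walk with restart.

    adjacency[i, j] is the non-negative weight of the edge from node j to node i.
    restart is the (topic-sensitive) distribution the walker jumps back to.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    restart = np.asarray(restart, dtype=float)
    restart = restart / restart.sum()

    # Column-normalise the weights so every column is a probability distribution.
    # A node without outgoing edges is assumed to jump to the restart distribution.
    col_sums = adjacency.sum(axis=0)
    has_out = col_sums > 0
    safe_sums = np.where(has_out, col_sums, 1.0)
    transition = np.where(has_out, adjacency / safe_sums, restart[:, None])

    # Power iteration: p <- (1 - c) * T p + c * r, until the L1 change is tiny.
    p = restart.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart_prob) * (transition @ p) + restart_prob * restart
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


def rank_candidates(scores, candidate_ids):
    """Sort candidate node indices by descending stationary probability."""
    return sorted(candidate_ids, key=lambda i: scores[i], reverse=True)


# Hypothetical toy graph: node 0 is the mention, nodes 1 and 2 are candidate
# entities, node 3 is a context entity connected only to candidate 1.
adjacency = np.array([
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
], dtype=float)
restart = [0.5, 0.0, 0.0, 0.5]  # assumed: restart mass on mention and context

scores = random_walk_with_restart(adjacency, restart)
ranking = rank_candidates(scores, candidate_ids=[1, 2])
print(ranking[0])  # index of the top-ranked candidate
```

In this toy setting the candidate that is connected to the context entity receives more probability mass than the other candidate, which is the intuition behind choosing the top-ranked candidate as the linked entity.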

CLC Number: