Acta Scientiarum Naturalium Universitatis Pekinensis ›› 2017, Vol. 53 ›› Issue (2): 204-210. DOI: 10.13209/j.0479-8023.2017.027

• Original Article

Word Sense Disambiguation Based on Domain Knowledge and Word Vector Model

An YANG1, Sujian LI1,2, Yun LI3

  1. Key Laboratory of Computational Linguistics (Peking University), MOE, Beijing 100871
  2. Collaborative Innovation Center for Language Ability, Xuzhou 221009
  3. Institute of Linguistics, Chinese Academy of Social Sciences, Beijing 100732
  • Received: 2016-07-29 Revised: 2016-09-23 Online: 2017-03-20 Published: 2017-03-20
  • Contact: Sujian LI

  • Funding:
    Supported by the National Natural Science Foundation of China (61273278, 61572049)

Abstract:

A word sense disambiguation (WSD) method is presented that combines domain-specific keywords with a word vector model built from unlabelled text. On an evaluation corpus from the environmental domain, the proposed method is compared with other WSD methods, including Lesk, and the comparison demonstrates its effectiveness. By incorporating knowledge from different fields, the method can also be applied to WSD tasks in other domains.

Key words: word sense disambiguation (WSD), word vector model, domain knowledge
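
The abstract does not say how the domain keywords and the word vectors are combined, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes each candidate sense has a short gloss, and scores a sense by the cosine similarity of its averaged gloss vector to the averaged context vector plus a weighted similarity to the averaged domain-keyword vector. The function names, the `domain_weight` parameter, and the two-dimensional toy vectors are all hypothetical and chosen for illustration only.

```python
import math

def mean_vector(words, vectors):
    """Average the vectors of the words that have an embedding; None if none do."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is missing or zero."""
    if a is None or b is None:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def disambiguate(context, domain_keywords, sense_glosses, vectors, domain_weight=0.5):
    """Pick the sense whose gloss is closest to the context and the domain keywords.

    Assumption: the score is a simple weighted sum of two cosine similarities.
    """
    context_vec = mean_vector(context, vectors)
    domain_vec = mean_vector(domain_keywords, vectors)

    def score(sense):
        gloss_vec = mean_vector(sense_glosses[sense], vectors)
        return (cosine(gloss_vec, context_vec)
                + domain_weight * cosine(gloss_vec, domain_vec))

    return max(sense_glosses, key=score)

# Hypothetical toy embeddings; a real model would be trained on unlabelled text.
VECTORS = {
    "river": [1.0, 0.0], "water": [0.9, 0.1], "pollution": [0.8, 0.2],
    "money": [0.0, 1.0], "loan": [0.1, 0.9], "deposit": [0.5, 0.5],
}
# Hypothetical sense inventory for the ambiguous word "bank".
SENSES = {
    "bank.shore": ["river", "water"],
    "bank.finance": ["money", "loan"],
}

chosen = disambiguate(["deposit"], ["pollution", "water"], SENSES, VECTORS)
```

In this toy setup the context word "deposit" is equally close to both senses, so the environmental keywords decide the outcome in favour of `bank.shore`; swapping in financial keywords such as "money" and "loan" selects `bank.finance` instead, which mirrors the abstract's point that changing the domain knowledge adapts the method to another domain.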

CLC Number: