Acta Scientiarum Naturalium Universitatis Pekinensis (北京大学学报(自然科学版))

基于话题分布相似度的无监督评论词消歧方法

郭瑛媚,史晓东,陈毅东,高燕   

  1. 厦门大学信息科学与技术学院, 厦门 361005;
  • 收稿日期:2012-05-31 出版日期:2013-01-20 发布日期:2013-01-20

Unsupervised Opinion Word Disambiguation Based on Topic Distribution Similarity

GUO Yingmei, SHI Xiaodong, CHEN Yidong, GAO Yan   

  1. School of Information Science and Engineering, Xiamen University, Xiamen 361005
  • Received: 2012-05-31 Online: 2013-01-20 Published: 2013-01-20

摘要: 基于话题信息、词的位置关系和互信息等特征, 提出一种无监督的跨语言词义消歧算法。该算法仅利用在线词典和web搜索引擎, 通过上下文信息选择评论句中多义评论词的词义。实验结果表明, 所提出的词义消歧算法具有较高准确率, 对于具有较多候选词义的评论词仍能表现出较好的性能。

关键词: 话题模型, 无监督, 评论词消歧

Abstract: The authors present an unsupervised cross-lingual word sense disambiguation algorithm based on topic information, the positional relations between words, and mutual information. The only resources the method uses are an online dictionary and a web search engine; it selects the sense of a polysemous opinion word in an opinion sentence from the surrounding context. Experiments show that the proposed algorithm achieves high accuracy and still performs well on opinion words with many candidate senses.

Key words: topic model, unsupervised, opinion word disambiguation
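
The idea of choosing a sense by topic distribution similarity can be illustrated with a minimal sketch. The abstract does not state the similarity measure, how the topic distributions are obtained, or how word distance and mutual information enter the score. Everything below is therefore an assumption made for illustration: cosine similarity as the measure, topic distributions given as precomputed lists of probabilities, and the hypothetical names `cosine_similarity`, `choose_sense`, and the sense labels.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two topic distributions of equal length."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # an all-zero vector carries no topic information
    return dot / (norm_p * norm_q)


def choose_sense(context_topics, sense_topics):
    """Return the candidate sense whose topic distribution is closest to the context.

    context_topics: topic distribution of the sentence containing the opinion word.
    sense_topics:   dict mapping each candidate sense label to its topic distribution.
    Both are assumed to be precomputed by some topic model (not shown here).
    """
    return max(
        sense_topics,
        key=lambda sense: cosine_similarity(context_topics, sense_topics[sense]),
    )


# Hypothetical three-topic example with two candidate senses.
context = [0.7, 0.2, 0.1]
candidates = {
    "sense_price": [0.6, 0.3, 0.1],
    "sense_height": [0.1, 0.2, 0.7],
}
best = choose_sense(context, candidates)
```

Here `best` is `"sense_price"`, because that candidate concentrates its probability mass on the same topic as the context. A fuller scoring function along the lines of the abstract would also weight word distance and mutual information, which this sketch leaves out.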
