北京大学学报(自然科学版)

多策略中文微博细粒度情绪分析研究

欧阳纯萍,阳小华,雷龙艳,徐强,余颖,刘志明   

  1. 南华大学计算机科学与技术学院, 衡阳421001;
  • 收稿日期:2013-07-09 出版日期:2014-01-20 发布日期:2014-01-20

Multi-strategy Approach for Fine-Grained Sentiment Analysis of Chinese Microblog

OUYANG Chunping, YANG Xiaohua, LEI Longyan, XU Qiang, YU Ying, LIU Zhiming   

  1. School of Computer Science and Technology, University of South China, Hengyang 421001;
  • Received: 2013-07-09 Online: 2014-01-20 Published: 2014-01-20

摘要: 针对中文微博用户的情绪分析问题, 提出一种基于多策略融合的细粒度情绪分析方法。首先采用朴素贝叶斯算法对微博的有无情绪分类问题进行研究, 然后构建有情绪微博的21维特征向量, 最后采用SVM和KNN算法对微博进行细粒度情绪分析。以新浪微博作为实验对象, 结果表明多策略集成方法好于单一分类算法。在多策略集成方法中, “NB+SVM”方法略优于“NB+KNN”方法。

关键词: 细粒度情绪分析, 中文微博, 朴素贝叶斯, SVM, KNN

Abstract: This paper investigates fine-grained sentiment analysis of Chinese microblogs and proposes a method based on multi-strategy fusion. First, a naive Bayes (NB) classifier determines whether a microblog expresses sentiment at all. Second, drawing on an emotion ontology, a 21-dimensional sentiment feature vector is constructed for each microblog that does. Finally, fine-grained sentiment is classified with a support vector machine (SVM) and with k-nearest neighbors (KNN), respectively. Experiments on Sina Weibo data show that the multi-strategy approach outperforms a single classifier, and that the “NB+SVM” strategy performs slightly better than the “NB+KNN” strategy.

Key words: fine-grained sentiment analysis, Chinese microblog, naive Bayes, support vector machine (SVM), k-nearest neighbors (KNN)
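
The two-stage structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon, the training sentences, and every function name below are hypothetical, only 3 emotion categories stand in for the paper's 21, English tokens stand in for segmented Chinese text, and only the "NB+KNN" variant is shown so that the example needs nothing beyond the Python standard library.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicon: word -> emotion category index.
# The paper uses 21 categories; 3 are used here only to keep the sketch short.
CATEGORIES = ["joy", "anger", "sadness"]
LEXICON = {"happy": 0, "love": 0, "great": 0,
           "angry": 1, "hate": 1, "furious": 1,
           "sad": 2, "cry": 2, "lonely": 2}


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing over word tokens."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for w in tokens:
                score += math.log((self.word_counts[label][w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def emotion_features(tokens):
    """Count lexicon hits per emotion category (one dimension per category)."""
    vec = [0.0] * len(CATEGORIES)
    for w in tokens:
        if w in LEXICON:
            vec[LEXICON[w]] += 1.0
    return vec


def knn_predict(train_vecs, train_labels, vec, k=3):
    """Majority vote among the k nearest training vectors (Euclidean distance)."""
    dists = sorted((math.dist(v, vec), lab)
                   for v, lab in zip(train_vecs, train_labels))
    votes = Counter(lab for _, lab in dists[:k])
    return votes.most_common(1)[0][0]


def classify(tokens, nb, train_vecs, train_labels, k=3):
    """Stage 1: NB filters out posts without emotion. Stage 2: KNN picks the emotion."""
    if nb.predict(tokens) == "none":
        return "none"
    return knn_predict(train_vecs, train_labels, emotion_features(tokens), k)


# Hypothetical training data, invented for illustration only.
TRAIN = [
    ("i am so happy today".split(), "joy"),
    ("love this great song".split(), "joy"),
    ("i hate this so angry".split(), "anger"),
    ("furious and angry about the delay".split(), "anger"),
    ("so sad i want to cry".split(), "sadness"),
    ("lonely and sad tonight".split(), "sadness"),
    ("the meeting starts at nine".split(), "none"),
    ("bus route changes next week".split(), "none"),
    ("new phone released in march".split(), "none"),
]

nb = NaiveBayes().fit(
    [tokens for tokens, _ in TRAIN],
    ["none" if label == "none" else "emotion" for _, label in TRAIN],
)
train_vecs = [emotion_features(t) for t, label in TRAIN if label != "none"]
train_labels = [label for _, label in TRAIN if label != "none"]

print(classify("i am happy and love it".split(), nb, train_vecs, train_labels))
```

Swapping `knn_predict` for an SVM trained on the same feature vectors would give the "NB+SVM" variant. Running the first stage separately means the fine-grained classifier is trained and applied only on posts that carry emotion.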
