Acta Scientiarum Naturalium Universitatis Pekinensis ›› 2017, Vol. 53 ›› Issue (2): 247-254. DOI: 10.13209/j.0479-8023.2017.031


A Microblog Sentiment Classification Method Fusing Machine Learning and Semantic Rules

Jie JIANG, Rui XIA

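  1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094
  • Received: 2016-08-02 Revised: 2016-09-25 Published: 2017-03-20 Released: 2017-03-20
  • Corresponding author: Rui XIA
  • Funding:
    Supported by the National Natural Science Foundation of China (61672288, 61305090), the Jiangsu Province Excellent Youth Fund (BK20160085), and the Open Fund of the State Key Laboratory for Novel Software Technology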

Microblog Sentiment Classification via Combining Rule-based and Machine Learning Methods

Jie JIANG, Rui XIA

  1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094
  • Received: 2016-08-02 Revised: 2016-09-25 Online: 2017-03-20 Published: 2017-03-20
  • Contact: Rui XIA

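Abstract (translated from Chinese):

To address the shortcomings of existing text sentiment analysis methods, a lexicon-based rule method for sentiment classification of Chinese microblogs and a basic feature template for machine learning methods are designed. A microblog sentiment classification method that fuses machine learning with rules is proposed: the diverse sentiment information obtained by the rule-based method is converted, expanded, and embedded into the basic feature template, forming a more effective fused feature template. An ensemble of three classification models then improves microblog sentiment classification performance.

Key words: microblog sentiment analysis, machine learning, rule-based method, feature embedding, system combination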

Abstract:

To address the shortcomings of existing text sentiment analysis methods, this paper implements a lexicon-based, rule-based sentiment classification method for Chinese microblogs and designs a basic feature set for machine learning methods. A sentiment classification method that combines the rule-based and machine learning approaches is then proposed: the various kinds of sentiment information produced by the rule-based method are converted, expanded, and added to the basic feature set, yielding a more effective integrated feature set. The proposed method outperforms each single-method baseline. Finally, an ensemble of three different classifiers is used to further improve the performance of microblog sentiment classification.

Key words: microblog sentiment analysis, machine learning, rule-based method, feature embedding, system combination
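
The pipeline summarized in the abstract (rule-based scoring, embedding rule outputs into a basic feature set, and combining three classifiers) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy English lexicon, the single negation rule, the feature names, and the use of majority voting as the combination rule. Real Chinese microblog text would also need word segmentation before this step.

```python
from collections import Counter

# Hypothetical toy lexicon and negators (assumptions, not the paper's resources).
LEXICON = {"good": 1, "great": 2, "happy": 1, "bad": -1, "awful": -2, "sad": -1}
NEGATORS = {"not", "never", "no"}


def rule_score(tokens):
    """Sum lexicon polarities, flipping a word's sign if a negator precedes it."""
    score = 0
    for i, tok in enumerate(tokens):
        polarity = LEXICON.get(tok, 0)
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score


def rule_label(tokens):
    """Map the rule score to a coarse sentiment label."""
    score = rule_score(tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def fused_features(tokens):
    """Unigram counts (basic features) plus rule-derived features in one dict."""
    features = {f"uni={tok}": n for tok, n in Counter(tokens).items()}
    features["rule_score"] = rule_score(tokens)
    features[f"rule_label={rule_label(tokens)}"] = 1
    # Raw lexicon hits, counted before any negation handling.
    features["rule_pos_words"] = sum(1 for t in tokens if LEXICON.get(t, 0) > 0)
    features["rule_neg_words"] = sum(1 for t in tokens if LEXICON.get(t, 0) < 0)
    return features


def majority_vote(predictions):
    """Most common label; ties go to the earliest prediction in the list."""
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


tokens = "this is not good".split()
features = fused_features(tokens)  # would be fed to each trained classifier
final = majority_vote(["positive", "negative", "negative"])  # three model outputs
```

In this sketch the fused dictionary plays the role of the integrated feature set: any classifier that accepts sparse feature dictionaries could be trained on it, and `majority_vote` stands in for whatever combination scheme the three classifiers actually use.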

CLC Number: