Acta Scientiarum Naturalium Universitatis Pekinensis ›› 2016, Vol. 52 ›› Issue (1): 171-177. DOI: 10.13209/j.0479-8023.2016.004

Exploiting Lexical Sentiment Membership-Based Features to Polarity Classification

SONG Jiaying (宋佳颖), HUANG Xu (黄旭), FU Guohong (付国宏)

  1. School of Computer Science and Technology, Heilongjiang University, Harbin 150080
  • Received: 2015-06-06 Online: 2016-01-20 Published: 2016-01-20
  • Contact: FU Guohong, E-mail: ghfu(at)hotmail.com
  • Supported by:
    the National Natural Science Foundation of China (61170148) and the Science and Technology Activities Project for Returned Overseas Scholars of the Heilongjiang Provincial Department of Human Resources and Social Security

Abstract:

This paper explores a feature representation for sentiment polarity classification based on lexical sentiment membership, under the framework of fuzzy set theory. Using TF-IDF values as weights, a positive and a negative polarity membership are constructed for each sentiment feature word, and the log-ratio of the two memberships serves as the feature value for a support vector machine (SVM) based polarity classifier. The classifier is evaluated on several datasets: a corpus of automobile product reviews, the NLPCC2014 sentiment classification evaluation data, and English movie reviews from IMDB. The experimental results show that the system based on sentiment membership features outperforms systems based on Boolean, frequency-based, and word-embedding feature representations, which supports the effectiveness of the proposed representation.

Key words: sentiment polarity classification, fuzzy set theory, membership, support vector machines
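
The abstract names the ingredients of the feature (TF-IDF weights, positive and negative memberships, a log-ratio) but does not give the exact formulas. The following is only a minimal sketch under stated assumptions: the membership of a word in a polarity class is taken to be its share of summed TF-IDF weight in that class, the IDF variant and the smoothing constant `eps` are illustrative choices, and the function names `tfidf_weights`, `membership_log_ratios`, and `featurize` are hypothetical. It is not the authors' implementation.

```python
import math
from collections import Counter, defaultdict


def tfidf_weights(docs):
    """Return one {word: tf-idf} dict per tokenized document.

    Assumption: tf is the relative frequency and idf = log(N / df) + 1.
    """
    n_docs = len(docs)
    df = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            word: (count / len(doc)) * (math.log(n_docs / df[word]) + 1.0)
            for word, count in tf.items()
        })
    return weights


def membership_log_ratios(docs, labels, eps=1e-6):
    """Map each word to log(mu_pos / mu_neg).

    Assumption: a word's membership in a class is its share of the
    TF-IDF mass accumulated over documents of that class (label 1 is
    positive, anything else is negative); eps avoids division by zero.
    """
    pos = defaultdict(float)
    neg = defaultdict(float)
    for doc_weights, label in zip(tfidf_weights(docs), labels):
        target = pos if label == 1 else neg
        for word, weight in doc_weights.items():
            target[word] += weight

    ratios = {}
    for word in set(pos) | set(neg):
        p = pos.get(word, 0.0)
        n = neg.get(word, 0.0)
        total = p + n + 2 * eps
        mu_pos = (p + eps) / total
        mu_neg = (n + eps) / total
        ratios[word] = math.log(mu_pos / mu_neg)
    return ratios


def featurize(doc, ratios):
    """Sparse feature vector: log-ratio for every known word in the document."""
    return {word: ratios[word] for word in set(doc) if word in ratios}


# Toy data, invented purely for illustration.
docs = [
    ["great", "car", "love"],
    ["terrible", "car", "hate"],
    ["great", "engine"],
    ["terrible", "brakes"],
]
labels = [1, 0, 1, 0]
ratios = membership_log_ratios(docs, labels)
features = featurize(["great", "car", "unknown"], ratios)
```

In this toy run, words seen only in positive reviews get a positive log-ratio, words seen only in negative reviews get a negative one, and a word spread evenly across both classes sits at zero. The sparse vectors returned by `featurize` could then be passed to any SVM implementation, which is the classifier the abstract reports using.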

CLC Number: