Acta Scientiarum Naturalium Universitatis Pekinensis ›› 2019, Vol. 55 ›› Issue (1): 98-104. DOI: 10.13209/j.0479-8023.2018.054


Sentence Style Meta Learning for Twitter Classification

YAN Leiming, YAN Luqi, WANG Chaozhi, HE Jiahui, WU Hongyu

  1. School of Computer and Software & Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science and Technology, Nanjing 210044
  • Received: 2018-04-21 Revised: 2018-08-19 Online: 2019-01-20 Published: 2019-01-20
  • Contact: YAN Leiming, E-mail: lmyan(at)nuist.edu.cn

Twitter Classification Based on Sentence Style Meta Learning

闫雷鸣, 严璐绮, 王超智, 贺嘉会, 吴宏煜   

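  1. School of Computer and Software, Nanjing University of Information Science and Technology; Jiangsu Engineering Center of Network Monitoring, Nanjing 210044
  • Corresponding author: YAN Leiming, E-mail: lmyan(at)nuist.edu.cn
  • Funding:
    Supported by the National Natural Science Foundation of China (61772281, 61703212, 61602254)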

Abstract:

Because tweets are short and their sentence structures are freely constructed, short text classification is difficult, especially in the multi-class setting. An efficient meta learning framework is proposed for Twitter classification. The tweets are clustered into many sentence styles, each corresponding to a new class label, so that the original text classification task becomes a few-shot learning task. When few-shot learning is applied to benchmark datasets, the proposed method, Meta-CNN, improves accuracy and F1 scores on multi-class Twitter classification and outperforms some traditional machine learning methods and a few deep learning approaches.

Key words: meta learning, few-shot learning, sentiment analysis, CNN

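Abstract (Chinese version):

To address the low accuracy of multi-class classification of short social media texts, a meta learning method that learns multiple sentence styles is proposed to improve Twitter text classification. Tweets are clustered into multiple sentence styles, and each style is combined with the original class labels to form a diverse set of new classes, so that the original classification problem becomes a few-shot learning problem with a larger number of classes; a deep network is then trained to learn sentence-style prototype encodings. The proposed Meta-CNN method is evaluated on several three-class Twitter datasets. The results show that its learning strategy is simple and effective: even with a limited number of samples, Meta-CNN achieves better classification accuracy and higher F1 scores than traditional machine learning classifiers and some deep learning classification methods.

Key words: meta learning, few-shot learning, sentiment analysis, convolutional neural network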
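
The relabeling idea in the abstract (each sentence style combined with the original class label to form a new class, with a prototype learned per new class) can be illustrated with a minimal Python sketch. This is not the authors' Meta-CNN: the function names, the toy 2-D vectors standing in for encoder outputs, the given style ids, and the nearest-prototype rule with Euclidean distance are all assumptions made for illustration.

```python
from collections import defaultdict
from math import dist


def combine_labels(labels, styles):
    """Pair each original class label with its sentence-style cluster id."""
    return [(label, style) for label, style in zip(labels, styles)]


def build_prototypes(embeddings, new_labels):
    """Average the embeddings of each (label, style) class into one prototype."""
    groups = defaultdict(list)
    for vec, key in zip(embeddings, new_labels):
        groups[key].append(vec)
    return {
        key: tuple(sum(col) / len(vecs) for col in zip(*vecs))
        for key, vecs in groups.items()
    }


def predict(query, prototypes):
    """Return the original label of the nearest prototype (Euclidean distance)."""
    label, _style = min(prototypes, key=lambda key: dist(query, prototypes[key]))
    return label


# Toy 2-D "embeddings" (assumption); a real system would use a trained encoder.
embeddings = [(0.0, 0.0), (0.2, 0.0), (4.0, 4.0), (0.0, 4.0), (0.2, 4.2), (4.0, 0.0)]
labels = ["pos", "pos", "pos", "neg", "neg", "neg"]
styles = [0, 0, 1, 0, 0, 1]  # hypothetical cluster ids from any clustering step

new_labels = combine_labels(labels, styles)        # 2 labels x 2 styles = 4 classes
prototypes = build_prototypes(embeddings, new_labels)
prediction = predict((3.8, 3.9), prototypes)       # mapped back to an original label
```

In this toy data each original label splits into two distant style groups, so the query near the second "pos" style is matched to the ("pos", 1) prototype, whereas a single per-label mean would sit between the groups and place the same query closer to "neg".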