Acta Scientiarum Naturalium Universitatis Pekinensis ›› 2020, Vol. 56 ›› Issue (1): 68-74. DOI: 10.13209/j.0479-8023.2019.102

Consumption Intent Recognition Algorithms for Weibo Users

JIA Yunlong1, HAN Donghong1,†, LIN Haiyuan1, WANG Guoren2, XIA Li1

  1. School of Computer Science and Engineering, Northeastern University, Shenyang 110819
  2. College of Computer, Beijing Institute of Technology, Beijing 100081
  • Received: 2019-05-22 Revised: 2019-09-24 Online: 2020-01-20 Published: 2020-01-20
  • Contact: HAN Donghong, E-mail: handonghong(at)mail.neu.edu.cn
  • Supported by:
    the National Key Research and Development Program of China (2016YFC1401900), the National Natural Science Foundation of China (61173029, 61672144, 61872072), and the Open Project of the State Key Laboratory for Novel Software Technology (KFKT2018)

Abstract:

Using transfer learning, a training set is built by merging data from the Jingdong question-and-answer platform with a small amount of labeled Weibo data, and an attention-based bidirectional long short-term memory network (Attentional-Bi-LSTM) is proposed to identify users' implicit consumption intent. For explicit intent recognition, an algorithm for extracting consumption intent objects is proposed that combines TF-IDF (term frequency-inverse document frequency) with the verb-object relation (VOB) from syntactic parsing. Experimental results show that merging the transferred Jingdong question-and-answer data with Weibo data effectively enlarges the training set, and that the neural network classifier trained on it achieves high accuracy and recall. The explicit intent object extraction method that fuses VOB and TF-IDF reaches an accuracy of 78.8%.
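The abstract does not state how the VOB relation and TF-IDF are combined, so the following is only a minimal sketch under stated assumptions: the objects of verb-object pairs are treated as candidates, and the candidate with the highest TF-IDF score is returned as the intent object. The function names, the smoothed IDF formula, and the toy English data are all hypothetical, and the `(verb, object)` pairs are assumed to come from an external parser that is not called here.

```python
import math
from collections import Counter


def tf_idf(term, doc, corpus):
    """TF-IDF of `term` in tokenized `doc`, using a smoothed IDF over `corpus`."""
    tf = Counter(doc)[term] / len(doc)
    df = sum(1 for d in corpus if term in d)  # documents containing the term
    idf = math.log((1 + len(corpus)) / (1 + df)) + 1.0  # assumed smoothing
    return tf * idf


def extract_intent_object(doc, vob_pairs, corpus):
    """Return the VOB object with the highest TF-IDF score, or None.

    `vob_pairs` is a list of (verb, object) tuples, assumed to be produced
    by a syntactic parser outside this sketch.
    """
    candidates = [obj for _verb, obj in vob_pairs if obj in doc]
    if not candidates:
        return None
    return max(candidates, key=lambda w: tf_idf(w, doc, corpus))


# Toy data (hypothetical): "movie" is common in the corpus, "phone" is rare.
doc = ["i", "want", "to", "buy", "a", "phone", "and", "watch", "a", "movie"]
corpus = [
    doc,
    ["i", "want", "to", "watch", "a", "movie"],
    ["i", "like", "this", "movie"],
]
vob_pairs = [("buy", "phone"), ("watch", "movie")]

result = extract_intent_object(doc, vob_pairs, corpus)
print(result)
```

In this toy case both candidates occur once in the post, so the rarer word across the corpus receives the higher score. A real pipeline would need Chinese word segmentation and a parser that emits VOB arcs, neither of which is shown.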

Key words: consumption intention detection, intention object extraction, transfer learning, attention mechanism