Acta Scientiarum Naturalium Universitatis Pekinensis ›› 2017, Vol. 53 ›› Issue (2): 197-203. DOI: 10.13209/j.0479-8023.2017.028


A Method for Improving Question-Answering Summarization Using Semantic Similarity

Wenhao YING (应文豪)1, Xinyan XIAO (肖欣延)2, Sujian LI (李素建)1, Yajuan LÜ (吕雅娟)2, Zhifang SUI (穗志方)1

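  1. School of Electronics Engineering and Computer Science, Peking University, Beijing 100871
  2. Baidu Inc., Beijing 100085
  • Received: 2016-07-30 Revised: 2016-10-05 Online: 2017-03-20 Published: 2017-03-20
  • Corresponding authors: Sujian LI (李素建), Zhifang SUI (穗志方)
  • Funding:
    Supported by the Baidu-Peking University Joint Project, the National Key Basic Research Program of China (2014CB340504), and the National Natural Science Foundation of China (61273278, 61375074)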

Improving Query-Focused Summarization with CNN-Based Similarity

Wenhao YING1, Xinyan XIAO2, Sujian LI1, Yajuan LÜ2, Zhifang SUI1

  1. School of Electronics Engineering and Computer Science, Peking University, Beijing 100871
  2. Baidu Inc., Beijing 100085
  • Received: 2016-07-30 Revised: 2016-10-05 Online: 2017-03-20 Published: 2017-03-20
  • Contact: Sujian LI, Zhifang SUI

Abstract (Chinese version):

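Giving a brief, direct answer (an answer summary) to a user's question in a search engine helps users obtain information more quickly. For this task, a feature-based method for extracting answer summaries is designed. To compute sentence similarity, a convolutional neural network is used to represent sentence semantics and compute similarity, and a network training method based on max-margin learning is given. Experimental results on the Baidu Knows (百度知道) question-answering corpus show that the proposed extraction method generates short answers of good quality. Compared with bag-of-words similarity, the convolutional neural network describes sentence semantics better, computes the similarity between questions and sentences more accurately, and effectively improves the quality of the answer summaries.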
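For reference, the bag-of-words baseline mentioned above can be sketched as cosine similarity over word-count vectors. This is a minimal illustration only: the function name `bow_cosine`, the toy sentences, and the whitespace-style tokenization are assumptions made here, not details taken from the paper.

```python
import math
from collections import Counter


def bow_cosine(query_tokens, sentence_tokens):
    """Cosine similarity between two bag-of-words count vectors."""
    q, s = Counter(query_tokens), Counter(sentence_tokens)
    # Only words that occur in both texts contribute to the dot product.
    dot = sum(q[w] * s[w] for w in q.keys() & s.keys())
    norm_q = math.sqrt(sum(c * c for c in q.values()))
    norm_s = math.sqrt(sum(c * c for c in s.values()))
    if norm_q == 0.0 or norm_s == 0.0:
        return 0.0  # an empty text has no defined direction; treat as dissimilar
    return dot / (norm_q * norm_s)


# Hypothetical pre-tokenized inputs, chosen only for illustration.
query = ["how", "to", "reset", "a", "router"]
paraphrase = ["steps", "for", "restoring", "factory", "settings"]
overlap = ["reset", "the", "router", "with", "a", "pin"]

print(bow_cosine(query, paraphrase))  # 0.0: no shared words, despite related meaning
print(bow_cosine(query, overlap))     # positive: three shared words
```

The first call shows the limitation that motivates a learned representation: a related sentence with no word overlap scores zero under a pure bag-of-words measure.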

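Key words (Chinese version): question-answering summarization, semantic similarity computation, max-margin learning, convolutional neural network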

Abstract:

In search services, users can obtain information more conveniently by reading succinct answers to their questions. This paper introduces a feature-based method for query-focused summarization that extracts an answer summary for a user query. A convolutional neural network (CNN) is used to learn the semantic representation of a sentence, from which the similarity between a candidate answer sentence and the user query is computed. The network is trained under a max-margin learning framework. Experiments on Baidu Knows data show that the proposed method can generate concise answers to user queries.
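The idea of CNN-based similarity trained with a max-margin objective can be illustrated with a small sketch. The abstract does not give the network architecture or the exact loss, so everything below is an assumption: a single convolution layer with tanh and max-pooling, cosine similarity between encodings, a pairwise hinge loss, hand-set filter weights (no training loop), and toy 2-dimensional word vectors. The names `conv_max_pool`, `cosine`, and `max_margin_loss` are hypothetical.

```python
import math


def conv_max_pool(word_vectors, filters, width=2):
    """Encode a sentence: apply each filter to every `width`-word window
    (followed by tanh), then max-pool over window positions."""
    windows = [
        [x for vec in word_vectors[i:i + width] for x in vec]
        for i in range(len(word_vectors) - width + 1)
    ]
    if not windows:
        raise ValueError("sentence is shorter than the filter width")
    return [
        max(math.tanh(sum(w * x for w, x in zip(f, win))) for win in windows)
        for f in filters
    ]


def cosine(u, v):
    """Cosine similarity of two dense vectors (0.0 if either is all zeros)."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def max_margin_loss(sim_pos, sim_neg, margin=0.5):
    """Pairwise hinge loss: zero once the good sentence outscores the bad
    one by at least `margin` (the margin value is an assumption)."""
    return max(0.0, margin - sim_pos + sim_neg)


# Toy 2-d word vectors and hand-set filters, purely illustrative.
filters = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
query = [[1.0, 0.0], [0.0, 1.0]]
good = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
bad = [[-1.0, 0.0], [0.0, -1.0]]

q_vec = conv_max_pool(query, filters)
sim_good = cosine(q_vec, conv_max_pool(good, filters))
sim_bad = cosine(q_vec, conv_max_pool(bad, filters))

print(max_margin_loss(sim_good, sim_bad))  # 0.0: the pair is already ranked with enough margin
print(max_margin_loss(sim_bad, sim_good))  # positive: the ranking is violated
```

In an actual training setup the filter weights and word vectors would be updated by gradient descent to reduce this loss over many (query, good sentence, bad sentence) triples; the sketch only evaluates the loss for fixed weights.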

Key words: query-focused summarization, semantic similarity, max-margin learning, convolutional neural network

CLC Number: