北京大学学报自然科学版 ›› 2021, Vol. 57 ›› Issue (1): 23-30.DOI: 10.13209/j.0479-8023.2020.082


基于细粒度可解释矩阵的摘要生成模型

王浩男1, 高扬1,3,†, 冯俊兰2, 胡珉2, 王惠欣2, 柏宇1   

  1. 北京理工大学计算机学院, 北京 100081
  2. 中国移动通信研究院, 北京 100032
  3. 北京市海量语言信息处理与云计算应用工程技术研究中心, 北京 100081
  • 收稿日期:2020-06-08 修回日期:2020-08-07 出版日期:2021-01-20 发布日期:2021-01-20
  • 通讯作者: 高扬, E-mail: gyang(at)bit.edu.cn
  • 基金资助:
    教育部–中国移动科研基金(MCM20170302)资助

Abstractive Summarization Based on Fine-Grained Interpretable Matrix

WANG Haonan1, GAO Yang1,3,†, FENG Junlan2, HU Min2, WANG Huixin2, BAI Yu1   

  1. School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081
  2. China Mobile Research Institute, Beijing 100032
  3. Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, Beijing 100081

  • Received:2020-06-08 Revised:2020-08-07 Online:2021-01-20 Published:2021-01-20
  • Contact: GAO Yang, E-mail: gyang(at)bit.edu.cn

摘要:

针对摘要模型中总结并解释长篇上下文信息存在的困难, 提出一种基于细粒度可解释矩阵, 先抽取再生成的摘要模型(fine-grained interpretable matrix, FGIM), 提升长文本对显著度、更新性和相关度的可解释抽取能力, 引导系统自动生成摘要。该模型通过一个句对判别(pair-wise)抽取器对文章内容进行压缩, 捕获文章中心度高的句子, 将抽取后的文本与生成器相结合, 实现摘要生成。在生成端通过可解释的掩码矩阵, 控制生成摘要的内容属性, 在编码器端分别使用多层Transformer和预训练语言模型BERT来验证其适用性。在标准文本摘要数据集(CNN/DailyMail和NYT50)上的实验表明, 所提模型的ROUGE指标和人工评估结果均优于当前最好的基准模型。实验中还构建两个测试数据集来验证摘要的更新度和相关度, 结果表明所提模型在可控生成方面取得相应的提升。

关键词: 生成式摘要, 可解释抽取, 中心度, 掩码矩阵, 可控生成

Abstract:

Summarization models face a great challenge in summarizing and interpreting the information of a long article. To address this, an extract-then-generate summarization model based on a fine-grained interpretable matrix (FGIM) is proposed, which improves the interpretable extraction of saliency, novelty and relevance from long texts and guides the system to generate the summary automatically. The model compresses the article with a pair-wise extractor that captures sentences of high centrality, and combines the extracted text with a generator to produce the summary. At the generation end, an interpretable mask matrix controls the content attributes of the generated summary; at the encoder end, a multi-layer Transformer and the pre-trained language model BERT are used respectively to verify the model's applicability. On the benchmark summarization datasets (CNN/DailyMail and NYT50), the proposed model outperforms the strongest baseline models on both ROUGE metrics and human evaluation. Two further test sets are constructed to verify the novelty and relevance of the summaries, and the proposed model achieves corresponding improvements in controllable generation.
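As a rough illustration of the masking idea only (this is not the authors' implementation; the function, threshold, and attention layout below are assumptions), a sentence-level mask can block the generator's attention to encoder tokens whose source sentences score low on a chosen attribute (saliency, novelty, or relevance), so that only "allowed" sentences contribute to the context vector:

```python
import numpy as np

def masked_attention(query, keys, values, sent_ids, sent_scores, threshold=0.5):
    """Single-query dot-product attention over encoder states.

    Tokens belonging to source sentences whose attribute score is below
    `threshold` are masked out before the softmax, so the generated summary
    can only draw on sentences that pass the attribute filter.
    """
    scores = keys @ query                                   # (n_tokens,)
    keep = np.array([sent_scores[s] >= threshold for s in sent_ids])
    scores = np.where(keep, scores, -1e9)                   # mask low-score sentences
    weights = np.exp(scores - scores.max())                 # stable softmax
    weights = weights / weights.sum()
    context = weights @ values                              # context vector
    return context, weights

# Toy example: 6 encoder tokens from 2 source sentences.
rng = np.random.default_rng(0)
keys = rng.normal(size=(6, 4))
values = rng.normal(size=(6, 4))
query = rng.normal(size=4)
sent_ids = [0, 0, 0, 1, 1, 1]          # token -> sentence index
sent_scores = {0: 0.9, 1: 0.2}         # sentence 1 falls below the threshold

context, weights = masked_attention(query, keys, values, sent_ids, sent_scores)
# Attention mass on sentence 1's tokens is (numerically) zero.
```

In the paper's setting the mask is a matrix over the extracted sentences rather than a single vector, but the mechanism is the same: zeroing attention paths is what makes the control over summary content interpretable.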

Key words: abstractive summarization, interpretable extraction, centrality, mask matrix, controllable generation