Acta Scientiarum Naturalium Universitatis Pekinensis ›› 2024, Vol. 60 ›› Issue (1): 53-61. DOI: 10.13209/j.0479-8023.2023.078

A Long Dialogue Summary Model Integrating Salience Discourse Context Window Sampling Methods

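WU Jie1, WANG Pengming2,†, XIONG Zhengkun1

  1. School of Information Engineering, East China Jiaotong University, Nanchang 330013
  2. School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou 325035
  • Received: 2023-05-18 Revised: 2023-08-01 Online: 2024-01-20 Published: 2024-01-20
  • Corresponding author: WANG Pengming, E-mail: zhangwuji115(at)163.com
  • Funding:
    Supported by the National Natural Science Foundation of China (62166018, 62266017) and the Key Research and Development Program of Jiangxi Province (20203BBE53029)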

A Long Dialogue Summary Model Integrating Salience Discourse Context Window Sampling Methods

WU Jie1, WANG Pengming2,†, XIONG Zhengkun1   

  1. School of Information Engineering, East China Jiaotong University, Nanchang 330013
  2. School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou 325035
  • Received: 2023-05-18 Revised: 2023-08-01 Online: 2024-01-20 Published: 2024-01-20
  • Contact: WANG Pengming, E-mail: zhangwuji115(at)163.com

Abstract:

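Targeting the characteristics of dialogue corpora, this paper proposes a long dialogue summary generation model that integrates a salience discourse context window sampling method. The model consists of two modules. 1) The salience discourse context window sampling module scores each utterance in the dialogue for salience, takes the salient utterances as sampling anchors, and then sets a sampling window so that the utterances adjacent to each anchor on the left and right are extracted together with it as a fragment; the extracted fragments thus contain richer relations between utterances. 2) The inter-fragment information fusion summary generation module uses Transformer blocks to fuse information across otherwise independent fragments, strengthening the semantic relations between fragments, and assigns blended weights to the fragments during summary generation. A consistency loss mechanism encourages the salience discourse context window sampling module to choose better sampling anchors. Experiments on QMSum, a public query-based long dialogue summarization dataset, show that the model achieves higher ROUGE scores than the best existing model.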
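The window sampling step described in the abstract can be illustrated with a minimal sketch. It assumes that salience scores are already available as one float per utterance, that the anchors are simply the `top_k` highest-scoring utterances, and that overlapping or touching windows are merged into one fragment. The function name, `top_k`, `window`, and the merging rule are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Tuple


def sample_context_windows(
    salience: List[float], top_k: int, window: int
) -> List[Tuple[int, int]]:
    """Return fragment spans as (start, end) utterance indices, end exclusive.

    Hypothetical helper: the anchor selection and merging rules are assumptions.
    """
    n = len(salience)
    # Assumed anchor rule: the top_k most salient utterances, in dialogue order.
    ranked = sorted(range(n), key=lambda i: salience[i], reverse=True)
    anchors = sorted(ranked[:top_k])

    spans: List[Tuple[int, int]] = []
    for a in anchors:
        # Take `window` neighbours on each side, clipped to the dialogue bounds.
        start, end = max(0, a - window), min(n, a + window + 1)
        if spans and start <= spans[-1][1]:
            # Assumed merging rule: fuse overlapping or touching windows.
            spans[-1] = (spans[-1][0], max(spans[-1][1], end))
        else:
            spans.append((start, end))
    return spans


if __name__ == "__main__":
    scores = [0.1, 0.9, 0.2, 0.1, 0.05, 0.8, 0.3]
    print(sample_context_windows(scores, top_k=2, window=1))
```

Each returned span keeps an anchor together with its left and right neighbours, which is how a fragment can retain the question-answer and follow-up relations that a single extracted utterance would lose.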

Key words: long dialogue summary, window sampling, salient discourse, information fusion, generative model

Abstract:

This paper proposes a long dialogue summary generation model that integrates a salience discourse context window sampling method (SDCWS), designed around the characteristics of dialogue corpora. The model consists of two modules. 1) The salience discourse context window sampling module (CWS) scores each dialogue utterance for salience, uses the salient utterances as sampling anchors, and then sets a sampling window so that the utterances adjacent to each anchor on the left and right are extracted together with it as a fragment; the resulting fragments contain richer discourse relations. 2) The inter-fragment information fusion summary generation module (IF) uses Transformer blocks to fuse information across mutually independent fragments, strengthening the semantic relationships between fragments and assigning blended weights to fragments during summary generation. A consistency loss mechanism encourages the salience discourse context window sampling module to determine better sampling anchors. Experimental results on QMSum, a publicly available query-based long dialogue summarization dataset, show that the proposed model scores significantly higher on the ROUGE evaluation metrics than the best existing model.
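One plausible reading of "assigning blended weights to fragments during summary generation" is a weighted mixture of per-fragment next-token distributions. The sketch below is only an illustration under that assumption: the softmax weighting, the function names, and the toy two-token vocabulary are hypothetical, and the abstract does not specify how the weights are actually computed.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def blend_token_distributions(
    fragment_scores: List[float], fragment_token_probs: List[List[float]]
) -> List[float]:
    """Mix per-fragment next-token distributions into a single distribution.

    Assumption: fragment weights are a softmax over fragment scores, and every
    fragment supplies a probability vector over the same vocabulary.
    """
    weights = softmax(fragment_scores)
    vocab_size = len(fragment_token_probs[0])
    return [
        sum(w * probs[v] for w, probs in zip(weights, fragment_token_probs))
        for v in range(vocab_size)
    ]


if __name__ == "__main__":
    # Two fragments with equal scores and opposite token preferences.
    mixed = blend_token_distributions([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
    print(mixed)
```

Because the weights sum to one and each input is a probability distribution, the blended output is also a valid distribution, so a fragment with a higher score contributes proportionally more to each generated token.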

Key words: long dialogue summary, window sampling, salient discourse, information fusion, generative model