Acta Scientiarum Naturalium Universitatis Pekinensis [北京大学学报(自然科学版)]

Topic Partition for Automatic Summarization

TONG Yijian (童毅见), TANG Huifeng (唐慧丰)

  1. Department of Language Engineering, PLA University of Foreign Languages, Luoyang 471003
  • Received: 2012-06-02 Online: 2013-01-20 Published: 2013-01-20

Abstract: Current topic partition methods are surveyed and classified. The authors improve one of the most effective topic partition algorithms, TextSegFault (TSF), and select either TSF or its improved variant according to the type of text, so as to meet the needs of automatic summarization. Experimental results show that the proposed topic partition method effectively avoids the loss of minor topics and the redundancy of major topics caused by traditional automatic summarization methods, yielding a structurally balanced summary.

Key words: topic partition, automatic summarization, TSF algorithm

CLC number: