Research of Chinese Implicit Discourse Relation Recognition

Acta Scientiarum Naturalium Universitatis Pekinensis

SUN Jing¹, LI Yancui¹,², ZHOU Guodong¹, FENG Wenhe³

1. Department of Computer Science and Technology, Soochow University, Suzhou 215006; 2. School of Information Engineering, Henan Institute of Science and Technology, Xinxiang 453003; 3. School of Humanities, Henan Institute of Science and Technology, Xinxiang 453003

Received: 2013-06-22 Online: 2014-01-20 Published: 2014-01-20

Abstract

Abstract: The authors use a self-built Chinese Discourse Treebank, in which 80% of the relations are implicit, to recognize implicit discourse relations. The corpus organizes discourse relations into three levels; the first level comprises four classes: causality, coordination, transition, and explanation. On this corpus, a maximum entropy classifier with contextual, lexical, and dependency tree features is used to recognize the four top-level classes. Experimental results show an overall accuracy of 62.15%; the coordination class is recognized best, with an F1 score of 75.26%.

Key words: discourse parsing, discourse relation, implicit relation recognition, Chinese Discourse Treebank
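The classification setup summarized in the abstract can be illustrated with a minimal sketch of a maximum entropy (multinomial logistic) classifier over the four top-level classes. This is not the authors' implementation: the feature strings (`ctx:...`, `lex:...`, `dep:...`), the toy training pairs, and the hyperparameters are hypothetical placeholders chosen only to show how contextual, lexical, and dependency features could be combined as sparse indicator features.

```python
import math
from collections import defaultdict

# The four top-level relation classes named in the abstract.
LABELS = ["causality", "coordination", "transition", "explanation"]

# Hypothetical toy examples: each argument pair is a bag of indicator features
# (context, lexical, dependency). These strings are illustrative assumptions,
# not the feature templates used in the paper.
TOY_DATA = [
    (["ctx:prev=none", "lex:pair=rain|cancel", "dep:roots=VV|VV"], "causality"),
    (["ctx:prev=coordination", "lex:pair=east|west", "dep:roots=NN|NN"], "coordination"),
    (["ctx:prev=none", "lex:pair=cheap|poor", "dep:roots=VA|VA"], "transition"),
    (["ctx:prev=causality", "lex:pair=plan|include", "dep:roots=VV|NN"], "explanation"),
]


def predict_proba(weights, features):
    """Return P(label | features) under a maximum entropy (softmax) model."""
    scores = {y: sum(weights[(f, y)] for f in features) for y in LABELS}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    total = sum(exps.values())
    return {y: e / total for y, e in exps.items()}


def train(data, epochs=200, lr=0.5, l2=0.01):
    """Fit weights by batch gradient ascent on the L2-regularized log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for features, gold in data:
            probs = predict_proba(weights, features)
            for f in features:
                for y in LABELS:
                    # Observed feature count minus expected count under the model.
                    grad[(f, y)] += (1.0 if y == gold else 0.0) - probs[y]
        for key, g in grad.items():
            weights[key] += lr * (g / len(data) - l2 * weights[key])
    return weights


def classify(weights, features):
    """Pick the most probable relation class for one argument pair."""
    probs = predict_proba(weights, features)
    return max(probs, key=probs.get)


if __name__ == "__main__":
    model = train(TOY_DATA)
    for feats, gold in TOY_DATA:
        print(gold, "->", classify(model, feats))
```

A real system would extract the features from annotated argument pairs and report accuracy and per-class F1 on held-out data; the toy data here only checks that the model can fit its own training examples.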

CLC number: TP391

SUN Jing, LI Yancui, ZHOU Guodong, FENG Wenhe. Research of Chinese Implicit Discourse Relation Recognition[J]. Acta Scientiarum Naturalium Universitatis Pekinensis.

Link to this article: https://xbna.pku.edu.cn/CN/

https://xbna.pku.edu.cn/CN/Y2014/V50/I1/111
