Acta Scientiarum Naturalium Universitatis Pekinensis ›› 2017, Vol. 53 ›› Issue (2): 279-286. DOI: 10.13209/j.0479-8023.2017.038

• Original Article

Exploit Comparable Corpus to Chinese Zero Pronoun Resolution

Ziyi YANG, Zhengxian GONG, Fang KONG, Guodong ZHOU

  1. Natural Language Processing Laboratory, School of Computer Science and Technology, Soochow University, Suzhou 215006
  • Received: 2016-07-21 Revised: 2016-10-03 Online: 2017-03-20 Published: 2017-03-20
  • Contact: Fang KONG

基于中英文可比较语料的中文零指代消解

杨紫怡, 贡正仙, 孔芳, 周国栋

  1. 苏州大学计算机科学与技术学院, 自然语言处理实验室, 苏州 215006
  • 通讯作者: 孔芳
  • Supported by:
    National Natural Science Foundation of China (61333018, 61472264, 61305088)

Abstract:

A bilingual approach based on a comparable corpus is proposed to better detect and resolve Chinese zero pronouns. First, the concept of an English equivalent sentence is defined. The equivalent sentence is then used to redefine the distance between sentences and to extract bilingual word alignment features. In this way, both zero pronoun detection and zero pronoun resolution in the baseline system are improved from a bilingual perspective. Experiments on the OntoNotes 5.0 corpus show that the proposed approach significantly outperforms the state-of-the-art system.

Key words: Chinese zero pronoun, bilingual, equivalent sentence, detection, resolution

摘要:

针对中文篇章中的零指代问题, 提出一种基于中英文可比较语料进行中文零指代识别和消解的方法, 并提出英文对等句的概念。利用对等句, 重新定义句子间隔, 并引入双语词对齐特征。在基准平台基础上, 从零指代项识别和零指代项消解两个方面进行研究。在 OntoNotes5.0 语料上的实验结果表明, 与目前性能最好的系统相比, 新提出的基于中英对等语料的中文零指代方法取得更好的性能。

关键词: 中文零指代, 双语, 对等句, 识别, 消解
