Acta Scientiarum Naturalium Universitatis Pekinensis ›› 2016, Vol. 52 ›› Issue (1): 113-119. DOI: 10.13209/j.0479-8023.2016.007
LI Qiang1, LI Mu2, ZHANG Dongdong2, ZHU Jingbo1
李强1, 李沐2, 张冬冬2, 朱靖波1
Abstract:
Because of data sparsity and the limited size of bilingual corpora, many high-quality phrase pairs cannot be extracted. The authors propose example-based phrase pairs, which are generated by applying decomposition, substitution, and generation operations to the phrase pairs extracted by the conventional phrase extraction method in phrase-based statistical machine translation. On Chinese-to-English newswire and spoken-language translation tasks, experimental results show significant improvements over the baseline system, with a BLEU gain of about 1% on some test sets.
Key words: statistical machine translation, phrase-based, example-based, phrase pair
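Abstract (translated from the Chinese):
Many high-quality phrase pairs are never generated because of data sparsity and the limited size of bilingual data. To address this problem within a phrase-based statistical machine translation system, decomposition, substitution, and generation operations are applied to the phrase pairs extracted by the traditional phrase extraction algorithm, producing example-based phrase pairs that the traditional method cannot extract. On Chinese-English newswire and Chinese-English spoken-language translation tasks, the method clearly improves translation quality over the baseline system on multiple test sets, with BLEU gains of about 1% on some test sets.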
CLC Number: TP391
LI Qiang, LI Mu, ZHANG Dongdong, ZHU Jingbo. Research on Example-Based Phrase Pairs in Statistical Machine Translation[J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2016, 52(1): 113-119.
李强, 李沐, 张冬冬, 朱靖波. 统计机器翻译中实例短语对研究[J]. 北京大学学报(自然科学版), 2016, 52(1): 113-119.
URL: https://xbna.pku.edu.cn/EN/10.13209/j.0479-8023.2016.007
https://xbna.pku.edu.cn/EN/Y2016/V52/I1/113