Acta Scientiarum Naturalium Universitatis Pekinensis ›› 2016, Vol. 52 ›› Issue (1): 127-133. DOI: 10.13209/j.0479-8023.2016.013

A Selectional Preference Based Translation Model for SMT

TANG Haiqing, XIONG Deyi   

  1. School of Computer Science and Technology, Soochow University, Suzhou 215006
  • Received:2015-06-19 Online:2016-01-20 Published:2016-01-20
  • Contact: XIONG Deyi, E-mail: dyxiong(at)suda.edu.cn

A Statistical Machine Translation Model Based on Selectional Preference (Chinese title: 基于选择偏向性的统计机器翻译模型)

唐海庆, 熊德意   

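  1. School of Computer Science and Technology, Soochow University, Suzhou 215006
  • Corresponding author: XIONG Deyi, E-mail: dyxiong(at)suda.edu.cn
  • Funding:
    Supported by the Young Scientists Fund of the National Natural Science Foundation of China (61403269) and the Youth Fund of the Natural Science Foundation of Jiangsu Province (BK20140355)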

Abstract:

Phrase-based statistical machine translation (SMT) uses only limited semantic knowledge, which leads to poor translation quality for a verb and its object when the two are far apart. A selectional preference based translation model is proposed. It introduces the semantic constraints that a verb imposes on its object in order to select a proper argument head word for a predicate at long distance. Conditional probability based selectional preferences for verbs are learned from a training corpus and integrated into a phrase-based translation system, which is evaluated on a Chinese-to-English translation task with large-scale training data. Experimental results show that integrating selectional preferences into SMT can effectively capture long-distance semantic dependencies and improve translation quality.

Key words: semantic knowledge, selectional preference, semantic constraints, semantic dependencies
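
The abstract does not give the exact form of the conditional probability, so the following is only a minimal sketch under a stated assumption: the preference of a verb for an object head word is estimated by relative frequency, P(object | verb) = count(verb, object) / count(verb), over verb-object pairs extracted from a corpus. The function names `train_selectional_preferences` and `sp_feature`, the `floor` value for unseen pairs, and the use of a log probability as a feature score are all hypothetical choices for illustration, not the authors' implementation.

```python
import math
from collections import Counter, defaultdict


def train_selectional_preferences(pairs):
    """Estimate P(object | verb) by relative frequency.

    `pairs` is a list of (verb, object_head) tuples, assumed to have been
    extracted beforehand (for example from dependency-parsed text).
    """
    pair_counts = Counter(pairs)
    verb_counts = Counter(verb for verb, _ in pairs)
    prefs = defaultdict(dict)
    for (verb, obj), count in pair_counts.items():
        prefs[verb][obj] = count / verb_counts[verb]
    return prefs


def sp_feature(prefs, verb, obj, floor=1e-6):
    """Log-probability score for a candidate object of a verb.

    Unseen verb-object pairs fall back to `floor` (an assumed smoothing
    constant) so the logarithm is always defined.
    """
    prob = prefs.get(verb, {}).get(obj, floor)
    return math.log(prob)


if __name__ == "__main__":
    toy_pairs = [
        ("drink", "water"),
        ("drink", "water"),
        ("drink", "tea"),
        ("eat", "rice"),
    ]
    prefs = train_selectional_preferences(toy_pairs)
    # A frequently observed object scores higher than an unseen one.
    print(sp_feature(prefs, "drink", "water"))
    print(sp_feature(prefs, "drink", "rice"))
```

In a phrase-based decoder, a score of this kind could be added as one more weighted feature next to the translation and language model scores, which is how a long-distance verb-object dependency could influence the choice of object translation.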

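Abstract (Chinese version):

Phrase-based statistical machine translation uses limited semantic knowledge, which leads to poor translation quality for long-distance verb-object pairs. To address this problem, a translation model based on verb selectional preference is proposed. It introduces the semantic constraints that a verb imposes on its object in order to find a suitable object translation for the verb. First, the selectional preferences of verbs for their objects are trained with a conditional probability method. The selectional preference is then integrated as a new feature into a phrase-based translation system. Chinese-to-English translation is carried out on a large-scale test data set, and the experimental results show that the selectional preference based translation model captures long-distance semantic dependencies well and thereby improves translation quality.

Key words (Chinese version): semantic knowledge, selectional preference, semantic constraints, semantic dependencies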
