Acta Scientiarum Naturalium Universitatis Pekinensis

Statistical Machine Translation Model Pruning Based on Translation Log

LIU Kai1, LÜ Yajuan1, JIANG Wenbin1, LIU Qun1,2

  1. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, University of Chinese Academy of Sciences, Beijing 100190; 2. Dublin City University (DCU), Dublin 9
  • Received: 2013-06-15 Online: 2014-01-20 Published: 2014-01-20

Abstract: The authors propose a novel pruning method for statistical machine translation models that is based on the translation log. The method filters translation rules by their hit counts in the translation log and keeps only the minimal rule table that the current translation model needs. Experimental results show that the pruned model retains only 1% to 3% of the original translation rules, with no significant difference in translation quality compared to the full model.

Key words: statistical machine translation, model pruning, translation log
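
The hit-count filtering described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the rule representation (source/target pairs), the log format (one list of used rules per translated sentence), the function names, and the `min_hits` threshold are all assumptions made for illustration, and the toy data is invented.

```python
from collections import Counter


def count_rule_hits(translation_log):
    """Count how often each rule appears across all logged translations.

    translation_log: iterable of lists, each holding the rules used to
    translate one sentence (hypothetical log format).
    """
    hits = Counter()
    for rules_used in translation_log:
        hits.update(rules_used)
    return hits


def prune_rule_table(rule_table, hits, min_hits=1):
    """Keep only the rules whose hit count reaches min_hits (assumed threshold)."""
    return {
        rule: score
        for rule, score in rule_table.items()
        if hits[rule] >= min_hits  # Counter returns 0 for rules never hit
    }


# Toy rule table: (source phrase, target phrase) -> score. Invented data.
rule_table = {
    ("ni hao", "hello"): 0.7,
    ("ni hao", "hi"): 0.2,
    ("shijie", "world"): 0.9,
    ("shijie", "earth"): 0.05,
}

# Toy translation log: the rules the decoder used for each sentence.
translation_log = [
    [("ni hao", "hello"), ("shijie", "world")],
    [("ni hao", "hello")],
]

hits = count_rule_hits(translation_log)
pruned = prune_rule_table(rule_table, hits)
```

In this toy case the rules never used in the log are dropped and the two used rules remain. How the paper actually collects the log and sets the cutoff is not stated in the abstract.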
