Acta Scientiarum Naturalium Universitatis Pekinensis


Research on Fast Incremental Training Algorithm for Word Alignment

LUO Wei   

  1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190
  • Received: 2012-06-05 Online: 2013-01-20 Published: 2013-01-20

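Research on a Fast Incremental Training Method for Word Alignment

LUO Wei (罗维)

  1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190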

Abstract: This study addresses incremental training for word alignment, which is the bottleneck in constructing a translation model. Building on two unsupervised word alignment models, the author proposes an incremental training algorithm that combines initialization with the online EM algorithm. Experiments show that the proposed method is efficient and does not degrade the quality of word alignment or translation.

Key words: statistical machine translation, word alignment, incremental training, expectation maximization, online algorithm

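Abstract: Focusing on word alignment, the bottleneck in the translation model construction pipeline, this work takes up incremental training of the translation model. Building on word alignment models based on unsupervised learning, it proposes a method that relies on initialization and applies the online EM algorithm, which converges faster in iterative training, in place of the commonly used batch EM algorithm, thereby achieving incremental training. Experiments show that the proposed method is both efficient and able to preserve word alignment quality and machine translation quality.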

Key words: statistical machine translation, word alignment, incremental training, expectation maximization, online algorithm
