Acta Scientiarum Naturalium Universitatis Pekinensis


A Study on Prosodic Boundaries Location and Synthesized Units Selection Algorithms in Mandarin Speech Synthesis

CHENG Yong, WU Xihong, CHI Huisheng   

  1. National Key Laboratory on Machine Perception, Peking University, Beijing 100871
  • Received: 2003-11-17 Online: 2004-05-20 Published: 2004-05-20


程勇, 吴玺宏, 迟惠生   

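  1. Department of Intelligence Science, School of Information Science and Technology, Peking University; State Key Laboratory of Visual and Auditory Information Processing, Beijing 100871; E-mail: {Chengy, Wxh};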

Abstract: A new statistical model of prosodic structure is proposed. It is based on analyzing and modeling the hierarchical stochastic properties of Mandarin Chinese, and it divides prosodic structure into three basic levels: prosodic word, prosodic phrase, and prosodic phrase cluster. Unit selection algorithms suited to large-corpus-based speech synthesis are also described and discussed. Experimental results show that the proposed model is effective and achieves high performance.

Key words: speech synthesis, prosodic structure model, prosodic boundary, synthesized units selection algorithm

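Abstract (translated from the Chinese): This paper discusses the use of statistical models for analyzing the hierarchical prosodic structure of Chinese and for prosodic modeling. Prosodic structure is divided into three basic levels (prosodic word, prosodic phrase, and prosodic phrase cluster), and a new statistics-based prosodic structure model is proposed. Experiments show that the model predicts prosodic word boundaries with a precision of 90.37% and a recall of 92.48%, and prosodic phrase boundaries with a precision of 82.43% and a recall of 85.59%. A unit selection algorithm for synthesizing continuous Chinese speech is also described; it is suited to speech synthesis systems based on large corpora. Because it considers single syllables together with two-, three-, and four-character prosodic words, it reduces the loss of sound quality caused by discontinuities at concatenation points and improves the naturalness of the synthesized speech.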
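
The idea of preferring longer units so that fewer concatenation points remain can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify; it is a minimal dynamic-programming illustration under stated assumptions: each syllable is written as one character, single syllables are always available in the corpus, and the hypothetical set `inventory` lists the multi-syllable units (up to four syllables) that the corpus happens to contain. The function name `fewest_joins` is likewise invented for this example.

```python
from typing import List, Set


def fewest_joins(syllables: List[str], inventory: Set[str], max_len: int = 4) -> List[str]:
    """Cover `syllables` with units of 1..max_len syllables using as few units as possible.

    Assumption: every single syllable exists in the corpus; longer units
    must appear in `inventory` (a hypothetical set of recorded units).
    """
    n = len(syllables)
    inf = float("inf")
    best = [inf] * (n + 1)  # best[i] = fewest units covering syllables[:i]
    back = [0] * (n + 1)    # back[i] = length of the last unit in that cover
    best[0] = 0
    for i in range(1, n + 1):
        for k in range(1, min(max_len, i) + 1):
            unit = "".join(syllables[i - k:i])
            available = k == 1 or unit in inventory
            if available and best[i - k] + 1 < best[i]:
                best[i] = best[i - k] + 1
                back[i] = k
    # Walk the back-pointers from the end to recover the chosen units.
    units: List[str] = []
    i = n
    while i > 0:
        k = back[i]
        units.append("".join(syllables[i - k:i]))
        i -= k
    return units[::-1]


if __name__ == "__main__":
    corpus_units = {"北京大学", "北京", "语音", "合成"}  # illustrative inventory only
    print(fewest_joins(list("北京大学语音合成"), corpus_units))
```

With this illustrative inventory, the eight syllables are covered by three units and therefore two concatenation points, instead of the seven that syllable-by-syllable concatenation would need. A real unit selection system would also weigh prosodic context and acoustic continuity at each join, which this sketch ignores.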

Key words (translated from the Chinese): speech synthesis, prosodic structure model, prosodic boundary, unit selection algorithm
