Acta Scientiarum Naturalium Universitatis Pekinensis

On the Importance of Components of the MFCC in Speech and Speaker Recognition

ZHEN Bin, WU Xihong, LIU Zhimin, CHI Huisheng

  1. Center for Information Science, Peking University, Beijing, 100871
  • Received: 2000-04-05 Online: 2001-05-20 Published: 2001-05-20

The Relative Importance of Individual Cepstral Components in Speech Recognition and Speaker Recognition

甄斌,吴玺宏,刘志敏,迟惠生   

  1. 北京大学信息科学中心,100871,北京

Abstract: This paper analyzes the relative importance of the individual MFCC components for both speech recognition and speaker recognition, using a DTW recognizer in various noise environments. For English digits under a Euclidean distance measure, the experimental results show that cepstral components C2 to C16 contain the most useful speaker information, while C0 and C1 are usually harmful to speaker recognition. Cepstral terms C1 to C12 are found to contain the most useful speech information. In both tasks, additive noise decreases the relative importance of the low-order MFCC terms faster than that of the middle- and high-order terms, and the size of the decrease depends on the speech SNR. Channel distortion likewise degrades the low-order terms more than the middle- and high-order terms in both tasks.
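To make the experimental setup concrete, the following is a minimal sketch of a DTW distance computed over a chosen subset of cepstral components with a Euclidean local distance. The function name `dtw_distance`, the layout of the input (one list of coefficients per frame, with index k holding Ck), and the unconstrained three-way step pattern are assumptions made for illustration; the abstract does not state the DTW path constraints or normalization the authors used.

```python
import math


def dtw_distance(ref, test, components):
    """DTW distance between two MFCC sequences using only selected components.

    ref, test:   sequences of frames; each frame is a list where index k is C_k
                 (assumed layout, not taken from the paper).
    components:  indices of the cepstral terms to keep, e.g. range(2, 17).
    """
    cols = list(components)
    # Keep only the requested cepstral terms in every frame.
    a = [[frame[k] for k in cols] for frame in ref]
    b = [[frame[k] for k in cols] for frame in test]

    n, m = len(a), len(b)
    inf = float("inf")
    # acc[i][j] = cost of the best warping path aligning a[:i] with b[:j].
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            acc[i][j] = local + min(
                acc[i - 1][j],      # advance in ref only
                acc[i][j - 1],      # advance in test only
                acc[i - 1][j - 1],  # advance in both
            )
    return acc[n][m]
```

Under these assumptions, comparing recognition results obtained with `components=range(0, 17)` against those obtained with `components=range(2, 17)` would show the effect of dropping C0 and C1, which is the kind of comparison the abstract reports.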

Key words: MFCC, speech recognition, speaker recognition

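Abstract (Chinese version): The contribution of each MFCC cepstral component to speaker recognition and speech recognition is studied by adding and removing feature components. Experiments with a DTW measure on a standard English digit speech corpus show that the most useful speech information lies in MFCC components C1 to C12, and the most useful speaker information lies in MFCC components C2 to C16. MFCC components C0 and C1 carry speaker information that has a negative effect, and using them as features lowers the recognition rate. Low-order MFCC components are more easily disturbed by additive noise and convolutional noise than high-order components.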
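The two kinds of degradation named in the abstract can be sketched as follows: additive noise mixed in at a target SNR, and convolutional (channel) noise modeled as a short FIR filter. The helper names `mix_at_snr` and `apply_channel`, the whole-utterance power definition of SNR, and the FIR channel model are illustrative assumptions; the abstract does not specify the noise types, SNR definition, or channel used in the experiments.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the speech-to-noise power ratio is snr_db.

    SNR is computed over the whole utterance (an assumption, not from the paper).
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(v * v for v in noise) / len(noise)
    if p_noise == 0.0:
        raise ValueError("noise has zero power")
    # Choose gain so that p_speech / (gain**2 * p_noise) == 10 ** (snr_db / 10).
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * v for s, v in zip(speech, noise)]


def apply_channel(signal, impulse_response):
    """Convolve the signal with a channel impulse response (causal FIR filter).

    The output is truncated to the length of the input signal.
    """
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(impulse_response):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out
```

Extracting MFCCs from the degraded signals at several SNR values and repeating the component-by-component comparison would be one way to observe how the low-order terms respond relative to the higher-order ones.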

Key words: MFCC, speaker recognition, speech recognition
