On the Importance of Components of the MFCC in Speech and Speaker Recognition

Acta Scientiarum Naturalium Universitatis Pekinensis



ZHEN Bin, WU Xihong, LIU Zhimin, CHI Huisheng

Center for Information Science, Peking University, Beijing, 100871

Received: 2000-04-05; Online: 2001-05-20; Published: 2001-05-20


Abstract

Abstract: This paper analyzes the relative importance of individual MFCC components for speech recognition and speaker recognition, using a DTW recognizer in various noise environments. For English digits under a Euclidean distance measure, the experimental results show that cepstral components C₂ to C₁₆ contain the most useful speaker information, while C₀ and C₁ are usually harmful to speaker recognition. Cepstral components C₁ to C₁₂ are found to contain the most useful speech information. In both tasks, additive noise reduces the relative importance of the low-order MFCC terms faster than that of the middle- and high-order terms, and the size of the reduction depends on the speech SNR. Channel distortion likewise degrades the low-order terms more than the middle- and high-order terms in both tasks.

Key words: MFCC, speech recognition, speaker recognition

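Abstract (translated from the Chinese version): The contribution of each MFCC cepstral component to speaker recognition and speech recognition is studied by adding and removing feature components. Experiments using a DTW measure on a standard English digit speech corpus show that the most useful speech information lies in MFCC components C₁ to C₁₂, and the most useful speaker information lies in MFCC components C₂ to C₁₆. Components C₀ and C₁ carry speaker information that has a negative effect, and using them as features lowers the recognition rate. Low-order MFCC components are more easily disturbed by additive noise and convolutional noise than high-order components.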
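The procedure summarized in the abstract (scoring a DTW recognizer with Euclidean frame distance while components are added to or removed from the feature vector) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the symmetric DTW step pattern, the convention that index k of a frame holds C_k, and the toy data are all hypothetical, since the abstract does not give these details.

```python
import math


def select_components(frames, lo, hi):
    """Keep cepstral components C_lo..C_hi (inclusive) of every frame.

    Assumption: frame[k] holds C_k, so index 0 is C0.
    """
    return [frame[lo:hi + 1] for frame in frames]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost with steps (i-1, j), (i, j-1), (i-1, j-1).

    The step pattern is an assumption; the abstract does not specify one.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(test, templates, lo, hi):
    """Return the label of the template nearest to `test` under DTW,
    using only components C_lo..C_hi."""
    test_sel = select_components(test, lo, hi)
    return min(
        templates,
        key=lambda label: dtw_distance(
            test_sel, select_components(templates[label], lo, hi)
        ),
    )


# Toy data (hypothetical): three components per frame, [C0, C1, C2].
templates = {
    "a": [[0.0, 1.0, 1.0], [0.0, 1.0, 1.0]],
    "b": [[5.0, 0.0, 0.0], [5.0, 0.0, 0.0]],
}
test = [[5.0, 1.0, 1.0], [5.0, 1.0, 1.0]]

with_c0 = classify(test, templates, 0, 2)     # C0 dominates the distance
without_c0 = classify(test, templates, 1, 2)  # C0 removed from the features
```

In this toy case the decision changes when C₀ is dropped, which is the kind of effect an add-or-remove component study measures; repeating the comparison over many utterances and component ranges would give recognition rates per range.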


ZHEN Bin, WU Xihong, LIU Zhimin, CHI Huisheng. On the Importance of Components of the MFCC in Speech and Speaker Recognition [J]. Acta Scientiarum Naturalium Universitatis Pekinensis.


URL: https://xbna.pku.edu.cn/EN/Y2001/V37/I3/371

