Acta Scientiarum Naturalium Universitatis Pekinensis (Journal of Peking University, Natural Science Edition)

A Study on Isolated-Word Speech Recognition

WU Shuzhen¹, CHENG Qiansheng²

  ¹ Department of Electronics, Peking University, Beijing 100871; ² School of Mathematical Sciences, Peking University, Beijing 100871
  • Received: 1999-11-12

Abstract: This paper describes a speech recognition method that combines finite-state vector quantization (FSVQ) with dynamic spectral features. FSVQ is a vector quantizer with memory: it uses past information to select a suitable codebook for encoding each input vector, which makes it more effective for speech recognition. The speech feature set is also improved: in addition to LPC cepstral coefficients, it includes dynamic spectral features and log energy. Based on the pronunciation characteristics of Mandarin, the distance values at word endpoints are weighted. Experimental results show that the speaker-dependent isolated-word recognition rate reaches 98%.

Key words: finite-state vector quantization, LPC cepstral coefficients, dynamic spectral features, dynamic time warping, state transition function

CLC number: