Acta Scientiarum Naturalium Universitatis Pekinensis

A Study of Speaker Identification Based on a Hybrid HMM/MLFNN Architecture

包威权, 陈珂, 迟惠生   

  1. Center for Information Sciences, Peking University, Beijing, 100871
  • Received: 1996-10-23 Online: 1997-05-20 Published: 1997-05-20

A Hybrid Architecture Based on HMM/MLFNN for Speaker Identification

BAO Weiquan, CHEN Ke, CHI Huisheng   

  1. National Laboratory on Machine Perception, Center for Information Sciences of Peking University, Beijing, 100871
  • Received: 1996-10-23 Online: 1997-05-20 Published: 1997-05-20

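Abstract: Hidden Markov models (HMMs) are combined with artificial neural networks (ANNs) for speaker identification, exploiting both the HMM's ability to describe dynamic time sequences and the ANN's strength in static classification. This paper combines a multilayer feed-forward neural network (MLFNN) with an HMM to form a hybrid model which, unlike earlier methods, requires little training data and generalizes well. Experiments on identifying 20 speakers show that the hybrid model outperforms a single HMM.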

Key words: speaker identification, hidden Markov model (HMM), multilayer feed-forward neural network (MLFNN)

Abstract: A hybrid architecture based on Hidden Markov Models (HMM) and Artificial Neural Networks (ANN) is presented. The HMM provides a good probabilistic representation of temporal sequences, while the ANN is a powerful static classifier, so integrating a Multilayer Feed-forward Neural Network (MLFNN) with an HMM yields a hybrid architecture well suited to speaker identification. Unlike most previous hybrid HMM/MLP methods, this architecture needs only a small amount of training data and generalizes well. Speaker identification experiments conducted with this architecture show better performance than a single HMM.

Key words: speaker identification, hidden Markov models (HMMs), multilayer feed-forward neural network (MLFNN)
