Acta Scientiarum Naturalium Universitatis Pekinensis ›› 2023, Vol. 59 ›› Issue (5): 773-781. DOI: 10.13209/j.0479-8023.2022.121

Construction of Tibetan Emotional Speech Database

PENGMAO Zhaxi1,2,3, CAI Zhijie1,2,†, CAIRANG Zhuoma4

  1. The College of Computer, Qinghai Normal University, Xining 810016
  2. The State Key Laboratory of Tibetan Intelligent Information Processing and Application, Xining 810008
  3. School of Computer and Information Science, Xining University, Xining 810022
  4. School of Computer Science and Technology, Southwest Minzu University, Chengdu 610041
  • Received: 2022-09-23 Revised: 2022-10-22 Online: 2023-09-20 Published: 2023-09-18
  • Contact: CAI Zhijie, E-mail: czjqhsd(at)163.com
  • Funding:
    Supported by the National Natural Science Foundation of China (61966031), a project of the Qinghai Provincial Department of Science and Technology (2019-SF-129), and the Qinghai Provincial Key Laboratory of Tibetan Information Processing and Machine Translation (2020-ZJ-Y05)

Abstract:

Current classifications of Tibetan speech emotion types are not fine-grained enough, and the databases available for speech emotion analysis are small. Based on an analysis of the emotion type classification schemes and databases for Chinese, English, and other languages, this paper proposes a construction scheme for a Tibetan emotional speech database, covering Tibetan speech emotion classification, emotional speech collection, emotional speech annotation, and validity analysis. Following this scheme, an emotion type set for Tibetan speech emotion analysis (TESCS-9) is established, and 2786 Tibetan emotional utterances are collected by recording and editing methods and then annotated. An improved fuzzy comprehensive evaluation method is used to assess the utterances, yielding a database of 2745 Tibetan emotional utterances (TESDB-2745) that lays a foundation for Tibetan speech emotion analysis.

Key words: speech signal processing, Tibetan, emotional speech, database
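
The abstract states that an improved fuzzy comprehensive evaluation method was used to screen the collected utterances, but it does not describe the improvement. As a rough illustration only, the sketch below shows the textbook weighted-average form of fuzzy comprehensive evaluation applied to listener ratings of a single utterance. The grade set, the criterion names (`emotion_match`, `audio_quality`), the weights, and the keep/discard rule are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence

# Hypothetical comment set (evaluation grades); the paper's grades are not given in the abstract.
GRADES = ("good", "fair", "poor")


def membership_row(votes: Sequence[str]) -> List[float]:
    """Fraction of listeners who chose each grade for one criterion."""
    if not votes:
        raise ValueError("at least one listener vote is required")
    return [sum(v == g for v in votes) / len(votes) for g in GRADES]


def evaluate(votes_by_criterion: Dict[str, Sequence[str]],
             weights: Dict[str, float]) -> List[float]:
    """Weighted-average fuzzy evaluation: B = W . R, one score per grade."""
    total = sum(weights[c] for c in votes_by_criterion)
    if total <= 0:
        raise ValueError("criterion weights must sum to a positive value")
    scores = [0.0] * len(GRADES)
    for criterion, votes in votes_by_criterion.items():
        w = weights[criterion] / total      # normalized weight of this criterion
        row = membership_row(votes)         # one row of the membership matrix R
        for j, r in enumerate(row):
            scores[j] += w * r
    return scores


def keep_utterance(scores: Sequence[float]) -> bool:
    """Maximum-membership rule (an assumption): discard if 'poor' scores highest."""
    best = max(range(len(GRADES)), key=lambda j: scores[j])
    return GRADES[best] != "poor"
```

Under these assumptions, an utterance rated by several listeners on each criterion gets one fuzzy score per grade, and it is kept unless the "poor" grade dominates. A screening step of this kind is one plausible way a collection of 2786 utterances could be reduced to 2745, though the paper's actual criteria and thresholds may differ.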