Speech emotion recognition based on formant characteristics feature extraction and phoneme type convergence
Authors: Liu, Z.T., Rehman, A., Wu, M., Cao, W.H. and Hao, M.
Journal: Information Sciences
Volume: 563
Pages: 309-325
ISSN: 0020-0255
DOI: 10.1016/j.ins.2021.02.016
Abstract: Speech Emotion Recognition (SER) has numerous applications including human-robot interaction, online gaming, and health care assistance. While deep learning-based approaches achieve considerable precision, they often come with high computational and time costs. Indeed, feature learning strategies must search for important features in a large amount of speech data. In order to reduce these time and computational costs, we propose a pre-processing step in which speech segments with similar formant characteristics are clustered together and labeled as the same phoneme. The phoneme occurrence rates in emotional utterances are then used as the input features for classifiers. Using six databases (EmoDB, RAVDESS, IEMOCAP, ShEMO, DEMoS and MSP-Improv) for evaluation, the level of accuracy is comparable to that of current state-of-the-art methods and the required training time is significantly reduced from hours to minutes.
Source: Scopus
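The feature named in the abstract, the phoneme occurrence rate per utterance, can be illustrated with a minimal sketch. It assumes each speech segment has already been assigned a cluster label standing in for a phoneme type. The abstract does not specify the clustering method or the number of clusters, so the function name, the labels, and the cluster count below are hypothetical and are not the authors' implementation.

```python
from collections import Counter


def phoneme_occurrence_rates(labels, n_clusters):
    """Return the fraction of segments assigned to each cluster label.

    labels: cluster (phoneme-type) index per segment, in range(n_clusters).
    n_clusters: assumed number of phoneme-type clusters (hypothetical value).
    """
    if not labels:
        # An utterance with no segments yields an all-zero feature vector.
        return [0.0] * n_clusters
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(k, 0) / total for k in range(n_clusters)]


# Hypothetical cluster labels for the segments of one utterance.
utterance_labels = [0, 2, 2, 1, 2, 0, 3, 2]
features = phoneme_occurrence_rates(utterance_labels, n_clusters=4)
print(features)  # [0.25, 0.125, 0.5, 0.125]
```

A vector like this has a fixed length regardless of utterance duration, which is one plausible reason such features are cheap to feed to a conventional classifier.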
Speech emotion recognition based on formant characteristics feature extraction and phoneme type convergence
Authors: Liu, Z.-T., Rehman, A., Wu, M., Cao, W.-H. and Hao, M.
Journal: Information Sciences
Volume: 563
Pages: 309-325
eISSN: 1872-6291
ISSN: 0020-0255
DOI: 10.1016/j.ins.2021.02.016
Source: Web of Science (Lite)