Combining Acoustic Features for Improved Emotion Recognition in Mandarin Speech
2005; Springer Science+Business Media; Language: English
DOI: 10.1007/11573548_36
ISSN: 1611-3349
Authors: Tsang-Long Pao, Yu-Te Chen, Jun-Heng Yeh, Wenyuan Liao
Topic(s): Speech Recognition and Synthesis
Abstract: Combining different feature streams to obtain a more accurate experimental result is a well-known technique. The basic argument is that if the recognition errors of systems using the individual streams occur at different points, there is at least a chance that a combined system will be able to correct some of these errors by reference to the other streams. In emotional speech recognition, there are many ways in which this general principle can be applied. In this paper, we propose using feature selection and feature combination to improve speaker-dependent emotion recognition in Mandarin speech. Five basic emotions are investigated: anger, boredom, happiness, neutral, and sadness. Combining multiple feature streams is clearly highly beneficial in our system. The best accuracy in recognizing the five emotions, 99.44%, is achieved using the MFCC, LPCC, RastaPLP, and LFPC feature streams with the nearest class mean classifier.
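The abstract does not say how the feature streams are combined or which distance the nearest class mean classifier uses. The following is a minimal sketch under two assumptions: streams are combined by concatenating per-utterance feature vectors, and the classifier assigns each utterance to the class whose mean is closest in Euclidean distance. The names `combine_streams`, `NearestClassMean`, and `fake_stream`, the stream dimensions, and the random data are all hypothetical stand-ins, not the authors' features or results.

```python
import numpy as np


def combine_streams(streams):
    """Concatenate per-utterance feature matrices (n_utterances x dim) column-wise.

    Assumption: combination by simple concatenation; the paper may differ.
    """
    return np.hstack(streams)


class NearestClassMean:
    """Assign each sample to the class with the closest mean vector.

    Assumption: Euclidean distance; the abstract does not name the metric.
    """

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        # One mean vector per class, stacked into (n_classes, n_features).
        self.means_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        # Distance from every sample to every class mean: (n_samples, n_classes).
        dists = np.linalg.norm(X[:, None, :] - self.means_[None, :, :], axis=2)
        return self.classes_[dists.argmin(axis=1)]


# Synthetic stand-in data: 5 emotion classes, 20 utterances each.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(5), 20)


def fake_stream(dim):
    """Hypothetical feature stream: a random class centre plus noise."""
    centers = rng.normal(size=(5, dim))
    return centers[labels] + 0.5 * rng.normal(size=(len(labels), dim))


# Four streams standing in for MFCC, LPCC, RastaPLP, LFPC (dimensions are made up).
streams = [fake_stream(d) for d in (12, 12, 9, 12)]
X = combine_streams(streams)

clf = NearestClassMean().fit(X, labels)
accuracy = (clf.predict(X) == labels).mean()
print(f"training accuracy on synthetic data: {accuracy:.2f}")
```

The printed figure only describes the synthetic data above and says nothing about the accuracy reported in the abstract. With real features, accuracy would be measured on held-out utterances rather than on the training set.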