Book chapter, peer-reviewed

Combining Acoustic Features for Improved Emotion Recognition in Mandarin Speech

2005; Springer Science+Business Media; Language: English

10.1007/11573548_36

ISSN

1611-3349

Authors

Tsang-Long Pao, Yu-Te Chen, Jun-Heng Yeh, Wenyuan Liao

Topic(s)

Speech Recognition and Synthesis

Abstract

Combining different feature streams to obtain a more accurate experimental result is a well-known technique. The basic argument is that if the recognition errors of systems using the individual streams occur at different points, there is at least a chance that a combined system will be able to correct some of these errors by reference to the other streams. In emotional speech recognition, there are many ways in which this general principle can be applied. In this paper, we propose using feature selection and feature combination to improve speaker-dependent emotion recognition in Mandarin speech. Five basic emotions are investigated: anger, boredom, happiness, neutral, and sadness. Combining multiple feature streams is clearly highly beneficial in our system. The best accuracy in recognizing the five emotions, 99.44%, is achieved using the MFCC, LPCC, RastaPLP, and LFPC feature streams with the nearest class mean classifier.
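The two ideas named in the abstract, combining feature streams and classifying by nearest class mean, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each stream yields one fixed-length vector per utterance, that streams are combined by simple concatenation, and that distance to a class mean is Euclidean; none of these details are given in the abstract. The names `combine_streams`, `NearestClassMean`, `stream_a`, and `stream_b` are hypothetical, and the data is synthetic.

```python
import numpy as np


def combine_streams(streams):
    """Concatenate per-utterance feature vectors from several streams.

    Assumption: combination is plain concatenation along the feature axis.
    """
    return np.concatenate(streams, axis=1)


class NearestClassMean:
    """Assign each sample to the class whose mean vector is closest."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        # One mean vector per class, stacked into shape (n_classes, n_features)
        self.means_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        # Squared Euclidean distance (an assumption) from every sample
        # to every class mean, shape (n_samples, n_classes)
        diffs = X[:, None, :] - self.means_[None, :, :]
        dists = (diffs ** 2).sum(axis=2)
        return self.classes_[dists.argmin(axis=1)]


# Synthetic placeholder data: 20 utterances for each of the five emotions.
rng = np.random.default_rng(0)
emotions = np.array(["anger", "boredom", "happiness", "neutral", "sadness"])
y = np.repeat(emotions, 20)
offsets = np.repeat(np.arange(5), 20)[:, None]

# Two hypothetical streams standing in for real acoustic features.
stream_a = offsets * 3.0 + rng.normal(size=(100, 4))
stream_b = offsets * 3.0 + rng.normal(size=(100, 6))

X = combine_streams([stream_a, stream_b])
clf = NearestClassMean().fit(X, y)
pred = clf.predict(X)
print("training accuracy on synthetic data:", (pred == y).mean())
```

In a real system the synthetic arrays would be replaced by utterance-level MFCC, LPCC, RastaPLP, and LFPC vectors, and accuracy would be measured on held-out utterances rather than on the training set.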
