Turbo Automatic Speech Recognition

Peer-reviewed article

2016; Institute of Electrical and Electronics Engineers; Volume: 24; Issue: 5; Language: English

10.1109/taslp.2016.2520364

ISSN

2329-9304

Authors

Simon Receveur, Robin Weiß, Tim Fingscheidt

Topic(s)

Speech Recognition and Synthesis

Abstract

The performance of automatic speech recognition (ASR) systems can be significantly improved by integrating further sources of information such as additional modalities, acoustic channels, or acoustic models. The resulting information fusion problem exhibits striking parallels to problems in digital communications, where the discovery of turbo codes by Berrou et al. was a groundbreaking innovation. In this paper, we show how to apply the turbo principle successfully to the domain of ASR and thereby provide solutions to the above-mentioned information fusion problem. The contribution of our work is fourfold: First, we review the turbo decoding forward-backward algorithm (FBA), giving detailed insights into turbo ASR and providing a new interpretation and formulation of the so-called extrinsic information being passed between the recognizers. Second, we present a real-time capable turbo-decoding Viterbi algorithm suitable for practical information fusion and recognition tasks. Third, we present simulation results for a multimodal example of information fusion. Finally, we demonstrate the suitability of both our turbo FBA and our turbo Viterbi algorithm for a single-channel multimodel recognition task as well, obtained by using two acoustic feature extraction methods. On a small vocabulary task (challenging, since spelling is included), our proposed turbo ASR approach outperforms even the best reference system, on average over all SNR conditions and investigated noise types, by a relative word error rate (WER) reduction of 22.4% (audio-visual task) and 18.2% (audio-only task), respectively.
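To make the idea of two recognizers exchanging extrinsic information more concrete, the following is a minimal sketch, not the formulation given in the paper. It assumes two hidden Markov models that share one state space and one transition matrix but see different observation likelihoods (`lik_a`, `lik_b`). It also assumes a deliberately simple definition of the extrinsic information: each recognizer's state posterior divided by the information it received from the other side. All names (`turbo_fba`, `forward_backward`, `ext_to_a`, `ext_to_b`) and the toy numbers are hypothetical.

```python
def normalize(row):
    """Scale a list of positive numbers so that it sums to 1."""
    total = sum(row)
    return [x / total for x in row]


def forward_backward(init, trans, lik):
    """Return state posteriors gamma[t][s] of an HMM (scaled recursions)."""
    T, S = len(lik), len(init)
    alpha = [normalize([init[s] * lik[0][s] for s in range(S)])]
    for t in range(1, T):
        alpha.append(normalize([
            lik[t][s] * sum(alpha[t - 1][r] * trans[r][s] for r in range(S))
            for s in range(S)
        ]))
    beta = [[1.0] * S for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = normalize([
            sum(trans[s][r] * lik[t + 1][r] * beta[t + 1][r] for r in range(S))
            for s in range(S)
        ])
    return [normalize([a * b for a, b in zip(alpha[t], beta[t])])
            for t in range(T)]


def weight(lik, ext):
    """Multiply frame-wise likelihoods by the received extrinsic information."""
    return [[l * e for l, e in zip(lr, er)] for lr, er in zip(lik, ext)]


def divide(gamma, ext):
    """Assumed extrinsic rule: posterior divided by what was received."""
    return [normalize([g / max(e, 1e-12) for g, e in zip(gr, er)])
            for gr, er in zip(gamma, ext)]


def turbo_fba(init, trans, lik_a, lik_b, iterations=3):
    """Alternate two forward-backward passes that exchange extrinsic info."""
    T, S = len(lik_a), len(init)
    ext_to_a = [[1.0 / S] * S for _ in range(T)]  # flat: B has said nothing yet
    for _ in range(iterations):
        gamma_a = forward_backward(init, trans, weight(lik_a, ext_to_a))
        ext_to_b = divide(gamma_a, ext_to_a)
        gamma_b = forward_backward(init, trans, weight(lik_b, ext_to_b))
        ext_to_a = divide(gamma_b, ext_to_b)
    return gamma_a, gamma_b


# Toy data (hypothetical): stream A is uninformative, stream B favors state 1.
init = [0.5, 0.5]
trans = [[0.8, 0.2], [0.2, 0.8]]
lik_a = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
lik_b = [[0.2, 0.8], [0.1, 0.9], [0.3, 0.7]]
gamma_a, gamma_b = turbo_fba(init, trans, lik_a, lik_b)
```

In this toy setup, recognizer A on its own cannot distinguish the two states, but after the exchange its posteriors lean toward the state supported by recognizer B. A practical system would additionally need to control how strongly the exchanged information is weighted, which this sketch does not attempt.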
