Peer-reviewed article

A voice-driven scene-mode recommendation service for portable digital imaging devices

2009; Institute of Electrical and Electronics Engineers; Volume: 55; Issue: 4; Language: English

10.1109/tce.2009.5373726

ISSN

1558-4127

Authors

Yoo Rhee Oh, Jae Sam Yoon, Hong Kook Kim, Myung Bo Kim, Sang Ryong Kim

Topic(s)

Video Analysis and Summarization

Abstract

In this paper, we propose a voice-driven scene-mode recommendation service that makes it easier to select scene-modes on portable digital imaging devices such as digital cameras and camcorders. The service recommends or automatically changes the scene-mode by recognizing a user's voice command containing a scene name or a scene-related word. To realize this service, we implement a system composed mainly of voice activity detection, automatic speech recognition (ASR), utterance verification, and word-to-scene-mode mapping. Because portable digital imaging devices run on embedded systems with limited resources, several optimization methods are needed. In addition, a speech adaptation database for the acoustic models is developed so that the ASR system can adjust to the characteristics of the microphones and operating environments. Finally, the performance of the voice-driven scene-mode recommendation system is measured in terms of processing time and scene-mode recognition accuracy (SMRA). The experiments show that the average processing time and the average SMRA are around 500 ms and 98.0%, respectively, for 50 scene-related words, and around 1200 ms and 96.8% for 200 scene-related words.
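The last two stages of the pipeline named in the abstract (utterance verification and word-to-scene-mode mapping), together with an accuracy measure, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the vocabulary, the scene-mode names, the confidence threshold, and the function names are hypothetical, and the accuracy function uses a plain "percentage of correct recommendations" definition, which may differ from the exact SMRA definition used in the paper.

```python
from typing import Optional, Sequence

# Hypothetical vocabulary: scene-related word -> scene-mode.
# These entries are illustrative and are not taken from the paper.
WORD_TO_SCENE_MODE = {
    "beach": "Beach",
    "sand": "Beach",
    "snow": "Snow",
    "ski": "Snow",
    "sunset": "Sunset",
    "fireworks": "Fireworks",
    "portrait": "Portrait",
    "face": "Portrait",
}


def recommend_scene_mode(
    word: str, confidence: float, threshold: float = 0.6
) -> Optional[str]:
    """Return the scene-mode for a recognized word, or None if rejected.

    The confidence check is a simple stand-in for utterance verification;
    the threshold value is an assumption, not a figure from the paper.
    """
    if confidence < threshold:
        return None  # recognition result rejected as unreliable
    # Words outside the vocabulary have no mapping and also yield None.
    return WORD_TO_SCENE_MODE.get(word.lower())


def scene_mode_recognition_accuracy(
    predicted: Sequence[Optional[str]], expected: Sequence[str]
) -> float:
    """Percentage of utterances whose recommended scene-mode is correct
    (an assumed definition of SMRA)."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        return 0.0
    correct = sum(p == e for p, e in zip(predicted, expected))
    return 100.0 * correct / len(expected)
```

In this sketch several words can map to the same scene-mode, so a recommendation counts as correct when the scene-mode matches even if the user spoke a related word rather than the scene name itself.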

Reference(s)