Perceptual similarity measurement of speech by combination of acoustic features

Article

2008; Institute of Electrical and Electronics Engineers; Language: English

10.1109/icassp.2008.4518746

ISSN

2379-190X

Authors

Yoshihiro Adachi, Shinichi Kawamoto, Shigeo Morishima, Satoshi Nakamura

Topic(s)

Music and Audio Processing

Abstract

Future Cast System is a new entertainment system in which a participant's face is captured and rendered into a movie, turning the participant into an instant computer graphics (CG) movie star; it was first exhibited at the 2005 World Exposition in Aichi, Japan. We are working to add functionality that maps not only faces but also speech individuality to the cast. Our approach is to find the speaker with the closest speech individuality and apply voice conversion. This paper investigates acoustic features for estimating the perceptual similarity of speech individuality. We propose a method that linearly combines eight acoustic features related to the perception of speech individuality. The proposed method optimizes the weights of the acoustic features with respect to perceptual similarity. We evaluated the method using Spearman's rank correlation coefficient against perceptual similarities. The experiments showed that the proposed method achieves a correlation coefficient of 0.66.
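The scoring and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which eight features are used, how their per-feature distances are computed, or how the weights are optimized, so the function names, the three-feature toy data, the fixed weights, and the listener scores below are all hypothetical. The sketch only shows a weighted linear combination of per-feature distances and its Spearman rank correlation with perceptual scores.

```python
from math import sqrt


def combined_distance(feature_distances, weights):
    """Weighted linear combination of per-feature distances for one speaker pair."""
    return sum(w * d for w, d in zip(weights, feature_distances))


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical toy data: four speaker pairs, three features per pair
# (the paper combines eight; three keeps the example short).
pair_feature_distances = [
    [0.1, 0.5, 0.2],
    [0.4, 0.1, 0.9],
    [0.8, 0.7, 0.3],
    [0.3, 0.9, 0.6],
]
weights = [0.5, 0.2, 0.3]              # assumed fixed, not optimized here
perceived_dissimilarity = [1, 3, 4, 2]  # hypothetical listener scores

acoustic = [combined_distance(d, weights) for d in pair_feature_distances]
rho = spearman(acoustic, perceived_dissimilarity)
print(f"Spearman rank correlation: {rho:.2f}")
```

In a setup like this, weight optimization would amount to searching for the weights that maximize the agreement between the combined distance and the perceptual scores; the sketch leaves that step out because the abstract does not describe it.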
