Article; Open access; Peer reviewed

A Japanese speech synthesizer based on production rules

1988; Acoustical Society of America; Volume: 84; Issue: S1; Language: English

10.1121/1.2026235

ISSN

1520-9024

Authors

Norio Higuchi, Seiichi Yamamoto, T. Shimizu

Topic(s)

Phonetics and Phonology Research

Abstract

A rule-based Japanese speech synthesizer has been developed that uses phonemes as synthesis units and generates all acoustic parameters from production rules. Converting the input romaji string (Hepburn style) into a synthetic speech waveform takes three steps: (1) generating the phoneme/boundary string with its distinctive feature matrix from 156 feature rules, (2) converting that string into sequences of acoustic parameters using 472 parameter rules, and (3) generating the speech waveform with a Klatt-type formant synthesizer. The first two steps are written in C and run on a microprocessor (M 68000); the third runs on a digital signal processor (TI TMS32010). Both male and female voices can be synthesized in real time, with three accent levels and seven speech rates. Nine kinds of subjective evaluation, covering intelligibility, naturalness, and other nonlinguistic factors, were proposed and applied to speech generated by this synthesizer. In the results, three male and two female subjects correctly identified 87.8% of the morae of the male voice and 81.6% of the morae of the female voice.
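
The three-stage conversion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 156 feature rules and 472 parameter rules are not given in the abstract, so the rule tables, the formant values, the 10 kHz sample rate, and every function name below are hypothetical stand-ins chosen for illustration. Stage 3 uses a pair of second-order digital resonators, the basic building block of Klatt-type formant synthesizers, driven by an impulse train. Consonants are rendered as silence to keep the example short.

```python
import math

FS = 10_000  # sample rate in Hz (assumed for this sketch)

# Illustrative (F1, F2) targets in Hz; rough values, not the paper's rules.
VOWEL_FORMANTS = {
    "a": (750, 1200), "i": (300, 2300), "u": (350, 1300),
    "e": (500, 1900), "o": (500, 900),
}
DIGRAPHS = ("sh", "ch", "ts")  # Hepburn digraphs treated as single phonemes


def romaji_to_phonemes(text):
    """Stage 1 (toy): split a Hepburn romaji string into phoneme symbols."""
    text = text.lower()
    phonemes, i = [], 0
    while i < len(text):
        if text[i:i + 2] in DIGRAPHS:
            phonemes.append(text[i:i + 2])
            i += 2
        else:
            # Non-letters (spaces, hyphens) become a boundary marker.
            phonemes.append(text[i] if text[i].isalpha() else "#")
            i += 1
    return phonemes


def phonemes_to_parameters(phonemes, f0=120.0):
    """Stage 2 (toy): map each phoneme to one frame of acoustic parameters."""
    frames = []
    for p in phonemes:
        if p in VOWEL_FORMANTS:
            f1, f2 = VOWEL_FORMANTS[p]
            frames.append({"f0": f0, "f1": f1, "f2": f2, "ms": 100})
        else:
            # Consonants and boundaries are rendered as silence here.
            frames.append({"f0": 0.0, "f1": 0, "f2": 0, "ms": 50})
    return frames


def resonator(signal, freq, bandwidth):
    """Second-order digital resonator: y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / FS
    c = -math.exp(-2.0 * math.pi * bandwidth * t)
    b = 2.0 * math.exp(-math.pi * bandwidth * t) * math.cos(2.0 * math.pi * freq * t)
    a = 1.0 - b - c  # normalizes the gain to 1 at 0 Hz
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y1, y2 = y, y1
    return out


def synthesize(frames):
    """Stage 3 (toy): impulse-train source through two cascaded resonators."""
    samples = []
    for fr in frames:
        n = FS * fr["ms"] // 1000
        if fr["f0"] <= 0:
            samples.extend([0.0] * n)
            continue
        period = int(FS / fr["f0"])  # samples per glottal pulse
        source = [1.0 if k % period == 0 else 0.0 for k in range(n)]
        # Resonator state is reset per frame, so transitions are abrupt.
        samples.extend(resonator(resonator(source, fr["f1"], 60.0), fr["f2"], 90.0))
    return samples


def romaji_to_waveform(text):
    return synthesize(phonemes_to_parameters(romaji_to_phonemes(text)))
```

Calling `romaji_to_waveform("sushi")` returns a list of float samples that could be scaled and written to a WAV file. A real rule-based system would replace the lookup tables with context-sensitive rules and interpolate parameters across phoneme boundaries instead of resetting them per frame.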
