This paper proposes an efficient neural network model for learning the articulatory-acoustic forward and inverse mapping of consonant-vowel sequences, including coarticulation effects. It is shown that the learned models generalize vowels as well as consonants to other contexts, and that the need for supervised training examples can be reduced by refining initial forward and inverse models using acoustic examples only. The models are initially trained on smaller sets of supervised examples and then improved by presenting auditory goals that are imitated. The acoustic outcomes of the imitations, together with the executed actions, provide new training pairs. It is shown that this unsupervised, imitation-based refinement significantly decreases the error of both the forward and the inverse model. Using a state-of-the-art articulatory speech synthesizer, our approach allows us to reproduce the acoustics from the learned articulatory trajectories, i.e., we can listen to the results and rate their quality both by error measures and by perception.
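
To make the refinement procedure concrete, the following is a minimal Python sketch of the imitation-based loop, under stated assumptions: the feature dimensions, network architectures, and the `synthesize` function are illustrative stand-ins, not the paper's actual models or synthesizer.

```python
# Sketch of imitation-based refinement of forward and inverse models.
# All names and dimensions here are assumptions for illustration only;
# synthesize() is a toy stand-in for an articulatory speech synthesizer.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
ART_DIM, AC_DIM = 6, 4  # assumed articulatory / acoustic feature dimensions

def synthesize(articulation):
    """Toy stand-in mapping articulatory parameters to acoustic features."""
    W = np.linspace(-1.0, 1.0, ART_DIM * AC_DIM).reshape(ART_DIM, AC_DIM)
    return np.tanh(articulation @ W)

# 1) Initial supervised training on a small set of (articulation, acoustics) pairs.
X_art = rng.uniform(-1, 1, size=(50, ART_DIM))
Y_ac = synthesize(X_art)
forward_net = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000).fit(X_art, Y_ac)
inverse_net = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000).fit(Y_ac, X_art)

# 2) Refinement using acoustic examples only: imitate auditory goals; the executed
#    actions together with their acoustic outcomes become new training pairs.
goals = synthesize(rng.uniform(-1, 1, size=(200, ART_DIM)))  # acoustic-only goals
actions = inverse_net.predict(goals)                          # imitation attempts
outcomes = synthesize(actions)                                # acoustic outcomes

# Retrain both models on the enlarged data set of self-generated pairs.
forward_net.fit(np.vstack([X_art, actions]), np.vstack([Y_ac, outcomes]))
inverse_net.fit(np.vstack([Y_ac, outcomes]), np.vstack([X_art, actions]))
```

The key point the sketch illustrates is that the new pairs are self-labeled: because the executed action is known, each imitation yields a valid (articulation, acoustics) example without requiring any articulatory ground truth for the auditory goals.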