Titelaufnahme
Titelaufnahme
- TitelLearning How to Speak: Imitation-Based Refinement of Syllable Production in an Articulatory-Acoustic Model
- Verfasser
- Erschienen
- SpracheEnglisch
- DokumenttypKonferenzband
- URN
- DOI
Zugriffsbeschränkung
- Das Dokument ist frei verfügbar
Links
- Social MediaShare
- NachweisKein Nachweis verfügbar
- IIIF
Dateien
Klassifikation
Abstract
This paper proposes an efficient neural network model for learning the articulatory-acoustic forward and inverse mapping of consonant-vowel sequences including coarticulation effects. It is shown that the learned models can generalize vowels as well as consonants to other contexts and that the need for supervised training examples can be reduced by refining initial forward and inverse models using acoustic examples only. The models are initially trained on smaller sets of examples and then improved by presenting auditory goals that are imitated. The acoustic outcomes of the imitations together with the executed actions provide new training pairs. It is shown that this unsupervised and imitation-based refinement significantly decreases the error of the forward as well as the inverse model. Using a state-of-the-art articulatory speech synthesizer, our approach allows to reproduce the acoustics from learned articulatory trajectories, i.e. we can listen to the results and rate their quality by error measures and perception.
Statistik
- Das PDF-Dokument wurde 10 mal heruntergeladen.