We use goal babbling, a recent approach to bootstrapping inverse models, for vowel acquisition. In contrast to motor babbling, goal babbling organizes exploration in a low-dimensional goal space. While such a goal space is naturally given in many motor learning tasks, the difficulty in modeling speech production lies in the complexity of acoustic features. Often, the first and second formants are used as low-dimensional features. However, formants cannot capture richer characteristics of acoustic signals. We propose to use high-dimensional acoustic features based on a cochlea model and to apply dimensionality reduction in order to generate a low-dimensional goal space. Instead of pre-defining targets in this goal space, we estimate a target distribution from ambient speech with a Gaussian Mixture Model. We demonstrate that goal babbling can be successfully applied in this goal space to learn a parametric model of vowel production specialized to a set of ambient speech sounds. By augmenting the goal-directed exploration along linear paths with an active selection of targets, we achieve a significant speed-up in learning.
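The exploration scheme summarized above can be illustrated with a minimal sketch. It is not the implementation used in this work: `forward` is a hypothetical toy mapping standing in for the articulatory synthesizer and the cochlea-based feature pipeline, the Gaussian mixture is set by hand rather than estimated from ambient speech, a nearest-neighbour lookup replaces the parametric inverse model, and targets are drawn at random without active selection. All function and parameter names are assumptions made for illustration.

```python
import math
import random


def forward(q):
    """Hypothetical toy forward model: 3 motor parameters -> 2-D goal space."""
    return (math.tanh(q[0] + 0.5 * q[2]), math.tanh(q[1] - 0.5 * q[2]))


def sample_target(rng, mixture):
    """Draw one target from a diagonal-covariance Gaussian mixture.

    `mixture` is a list of (weight, mean, std) components.
    """
    weights = [w for w, _, _ in mixture]
    _, mean, std = rng.choices(mixture, weights=weights, k=1)[0]
    return tuple(rng.gauss(m, s) for m, s in zip(mean, std))


def inverse(memory, goal):
    """Nearest-neighbour inverse estimate (assumption: stands in for a
    parametric inverse model). Returns the stored motor command whose
    observed outcome lies closest to `goal`."""
    return min(memory, key=lambda pair: math.dist(pair[1], goal))[0]


def goal_babbling(mixture, n_paths=200, steps=10, noise=0.05, seed=0):
    """Explore along linear paths between successive goal-space targets."""
    rng = random.Random(seed)
    home = (0.0, 0.0, 0.0)
    memory = [(home, forward(home))]   # (motor command, observed outcome)
    previous = forward(home)           # first path starts at the home outcome
    for _ in range(n_paths):
        target = sample_target(rng, mixture)
        for k in range(1, steps + 1):
            a = k / steps
            # Intermediate goal on the straight line previous -> target.
            goal = tuple((1 - a) * p + a * t for p, t in zip(previous, target))
            # Try the current inverse estimate with exploratory noise added.
            q = tuple(v + rng.gauss(0.0, noise) for v in inverse(memory, goal))
            memory.append((q, forward(q)))
        previous = target
    return memory


def reach_error(memory, goal):
    """Goal-space distance between `goal` and the outcome of the inverse estimate."""
    return math.dist(forward(inverse(memory, goal)), goal)


if __name__ == "__main__":
    # Hand-set two-component mixture (assumption, not estimated from speech).
    mixture = [(0.5, (0.5, 0.5), (0.05, 0.05)),
               (0.5, (-0.4, 0.3), (0.05, 0.05))]
    memory = goal_babbling(mixture)
    for _, mean, _ in mixture:
        print(mean, reach_error(memory, mean))
```

Because every sample is generated while pursuing a goal drawn from the target distribution, the collected data concentrates around the mixture components instead of covering the whole motor space, which is the property that distinguishes goal babbling from motor babbling.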