According to usage-based approaches to language acquisition, linguistic knowledge is represented in the form of constructions, that is, pairings of form and meaning at multiple levels of abstraction and complexity. The emergence of syntactic knowledge in infants is assumed to result from the gradual abstraction of lexically specific, item-based knowledge. In this paper, we present a computational usage-based model that accounts for the gradual emergence of a network consisting of
constructions at varying degrees of complexity, given ambiguous input examples that couple phoneme sequences with a symbolic representation of the visual context. We provide empirical results on the RoboCup dataset showing that the model can acquire, online and in a single pass over the data, a compact construction grammar that generalizes successfully to unseen data.