### Highlights
- We thoroughly assessed cognitive models of spatial language understanding.
- The models implement contrasting assumptions about the directionality of attention.
- We empirically tested predictions generated by the models.
- We found two new effects of object shape that informed modifications to the models.
- Both implemented directionalities of attention account for the data equally well.
### Abstract
Language and vision interact in non-trivial ways. Linguistically, spatial utterances are often asymmetrical in that they relate more stable objects (reference objects) to less stable objects (located objects). Researchers have claimed that this linguistic asymmetry should also be reflected in the allocation of visual attention when people process a depicted spatial relation described by spatial language. More specifically, it has been assumed that people shift their attention from the reference object to the located object. However, recent theoretical and empirical findings challenge the directionality of this attentional shift. In this article, we present the results of an empirical study based on predictions generated by computational cognitive models that implement different directionalities of attention. Moreover, we thoroughly analyze these computational models. While our results do not favor either of the implemented directionalities of attention, we found two previously unknown sources of geometric information that affect spatial language understanding. We propose modifications to the computational models that substantially improve their performance on the empirical data.