To be sociable, embodied interactive agents such as virtual characters or humanoid robots must be able to mutually coordinate behaviors, beliefs, and relationships with their human interlocutors. We argue that this requires flexible multimodal expressiveness, incremental perception of others’ behaviors, and the integration and interaction of the models underlying both in unified sensorimotor structures. We present work on probabilistic models for these three requirements with a focus on gestural behavior.