To explore intuitive verbal and non-verbal interfaces in smart environments, we recorded user interactions with an intelligent apartment.
Besides offering various interactive capabilities itself, the apartment is also inhabited by a social robot that is available as a humanoid interface.
This paper presents a multi-modal corpus that contains goal-directed actions of naive users attempting to solve a number of predefined tasks.
Alongside audio and video recordings, our dataset comprises a large amount of temporally aligned sensor data and system behavior provided by the environment and its interactive components.
Non-verbal system responses, such as changes in lighting or display contents, as well as robot and apartment utterances and gestures, serve as a rich basis for later in-depth analysis.
Manual annotations provide further information on metadata, such as the current course of the study, and on user behavior, including the modality used, all literal utterances, language features, emotional expressions, foci of attention, and addressees.