This thesis presents an approach to facilitating the analysis of nonverbal behaviour during human-human interaction. It alleviates much of the work that researchers do, from experiment control and data acquisition through tagging to the final analysis of the data. To this end, software and hardware techniques such as sensor technology, machine learning, object tracking, data processing, visualisation and Augmented Reality are used. These are combined into an Augmented-Reality-enabled Interception Interface (ARbInI), a modular wearable interface for two users. The interface mediates the users’ interaction, thereby intercepting and influencing it.
The ARbInI interface consists of two identical, mutually coupled setups of sensors and displays. By combining cameras and microphones with other sensors, the system records rich multimodal interaction cues efficiently. The recorded data can be analysed online and offline for interaction features (e.g. head gestures in head movements, objects in joint attention, speech times) using integrated machine-learning approaches. The classified features can then be tagged in the data.
For a detailed analysis, the recorded multimodal data is transferred automatically into file bundles that can be loaded in a standard annotation tool, where the data can be further tagged by hand. For statistical analyses of the complete multimodal corpus, a toolbox for a standard statistics program allows the corpus to be imported directly and automates the analysis of complex multimodal relationships between arbitrary data types.
When the optional multimodal Augmented Reality techniques integrated into ARbInI are used, the camera records exactly what the participant can see, no more and no less. This offers the following additional advantages during the experiment: (a) the experiment can be controlled through the auditory or visual displays, thereby ensuring controlled experimental conditions, (b) the experiment can be disturbed, which makes it possible to investigate how problems in interaction are discovered and solved, and (c) the experiment can be enhanced by interactively incorporating the behaviour of the user, which makes it possible to investigate how users cope with novel interaction channels.
This thesis introduces criteria for the design of scenarios in which interaction analysis can benefit from the experimentation interface and presents a set of such scenarios. These scenarios are applied in several empirical studies, yielding multimodal corpora that particularly include head gestures. The capabilities of computer-aided interaction analysis for the investigation of speech, visual attention and head movements are illustrated using this empirical data.
The effects of the head-mounted display (HMD) are evaluated thoroughly in two studies. The results show that HMD users need more head movements to achieve the same shift of gaze direction and perform fewer head gestures, at lower velocity and with fewer repetitions, than non-HMD users. From this, it can be concluded that HMD users are less willing to perform head movements unless necessary. Moreover, users establish compensation strategies, such as leaning backwards to enlarge the field of view, and increasing the number of utterances or changing the way they refer to objects to compensate for the absence of mutual eye contact. Two further studies investigate the interaction while misunderstandings are actively induced. Here, the participants use compensation strategies such as multiple verification questions and arbitrary gaze movements. Additionally, an enhancement method that highlights the visual attention of the interaction partner is evaluated in a search task. The results show a significantly shorter reaction time and fewer errors.