Among the wide range of human actions, gestures performed with the hands play an important role in everyday life. Their automatic recognition is therefore highly relevant for constructing user-friendly human-machine interfaces. This dissertation presents a new approach to recognizing manipulative gestures, i.e., gestures that interact with objects in the environment. The proposed gesture recognition can serve to realize proactive human-machine interfaces, enabling technical systems to observe humans acting in their environment and to react appropriately.
Observing the motions of natural hands requires a non-intrusive, vision-based technique. For this purpose, an adaptive skin color segmentation approach is developed that is capable of detecting skin-colored hands under a wide range of lighting conditions. The adaptation step is controlled by additional scene information, restricting updates of the color model to image areas that actually contain skin. Gesture recognition is then performed on the trajectory data that results from detecting the hands in a sequence of images.
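A minimal Python sketch of such a constrained adaptation step is given below. The normalized r-g chromaticity histogram, the bin count, and the update rate are illustrative assumptions rather than the dissertation's actual color model; the essential point shown is that the model is updated only from pixels inside regions that scene knowledge has confirmed to be skin.

    # Sketch of adaptive skin-color segmentation (hypothetical formulation):
    # a normalized r-g chromaticity histogram is thresholded to segment skin
    # and is adapted only from regions confirmed as skin by scene knowledge.
    import numpy as np

    BINS = 32  # histogram resolution per channel (assumed value)

    def to_rg(image_bgr):
        """Map a BGR uint8 image to normalized r-g chromaticity bin indices."""
        rgb = image_bgr[..., ::-1].astype(np.float32) + 1e-6
        s = rgb.sum(axis=-1)
        r, g = rgb[..., 0] / s, rgb[..., 1] / s
        return (np.clip(r * BINS, 0, BINS - 1).astype(int),
                np.clip(g * BINS, 0, BINS - 1).astype(int))

    class SkinModel:
        def __init__(self):
            self.hist = np.ones((BINS, BINS))  # uniform prior over chromaticities

        def segment(self, image_bgr, threshold=0.5):
            """Return a boolean mask of pixels classified as skin-colored."""
            r, g = to_rg(image_bgr)
            p = self.hist / self.hist.max()
            return p[r, g] > threshold

        def adapt(self, image_bgr, skin_region_mask, rate=0.1):
            """Update the color model only from pixels inside regions that
            additional scene information (e.g., a tracked hand) marks as skin."""
            r, g = to_rg(image_bgr)
            update = np.zeros_like(self.hist)
            np.add.at(update, (r[skin_region_mask], g[skin_region_mask]), 1)
            if update.sum() > 0:
                update *= self.hist.sum() / update.sum()
                self.hist = (1 - rate) * self.hist + rate * update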
To recognize gestures in context, a new method for incorporating additional scene information into the recognition process is described. The context of a gesture model consists of the current state of the hand and the object that is manipulated. The current hand state determines whether a gesture model is applicable, while the manipulated object must be present in the vicinity of the hand for the gesture model to be recognized. Through the proposed context integration, the developed recognition system is able to recognize gestures that are characterized mainly by their interaction with the environment and do not exhibit a characteristic trajectory. The performance of the gesture recognition approach is demonstrated on gestures performed in an assembly construction scenario and in a typical office environment.
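The gating role of these two context conditions can be sketched as follows; all names, data structures, and the distance threshold are assumptions introduced for illustration, and the trajectory matching itself is omitted. A gesture model is considered only if the current hand state matches the model's required state and an object of the manipulated type lies within a vicinity radius of the hand.

    # Hypothetical sketch of context-gated gesture recognition.
    from dataclasses import dataclass
    import math

    @dataclass
    class GestureModel:
        name: str
        required_hand_state: str   # e.g. "empty" or "holding" (assumed labels)
        manipulated_object: str    # object type the gesture interacts with
        max_distance: float        # vicinity radius in pixels (assumed)

    def object_near_hand(hand_pos, scene_objects, obj_type, max_dist):
        """Check whether an object of the required type lies near the hand."""
        return any(o["type"] == obj_type and
                   math.dist(hand_pos, o["pos"]) <= max_dist
                   for o in scene_objects)

    def applicable_models(models, hand_state, hand_pos, scene_objects):
        """Restrict trajectory matching to models whose context is satisfied."""
        return [m for m in models
                if m.required_hand_state == hand_state
                and object_near_hand(hand_pos, scene_objects,
                                     m.manipulated_object, m.max_distance)]

    # Usage: only the context-valid models are matched against the observed
    # hand trajectory; "pick_up_bolt" qualifies here, "put_down_bolt" does not.
    models = [GestureModel("pick_up_bolt", "empty", "bolt", 40.0),
              GestureModel("put_down_bolt", "holding", "bolt", 40.0)]
    candidates = applicable_models(models, "empty", (120.0, 80.0),
                                   [{"type": "bolt", "pos": (135.0, 90.0)}])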
The use of the recognition results for improving human-machine interfaces is shown by applying the gesture recognition within a 'situated artificial communicator' system operating in an assembly construction domain. Here, information about the executed gesture can improve dialog interaction by providing knowledge about the contents of the hand. Besides this direct improvement of the human-machine interface, the recognized gestures can also serve as context knowledge for other system components. This is demonstrated with the observation of construction gestures, which provide relevant context information for the vision algorithms that recognize the objects and assemblies in the construction scenario. In this way, the gesture recognition results also improve the human-machine interface indirectly.