TY - THES
AB - Object recognition and tracking are the main tasks in computer vision applications such as safety, surveillance, human-robot interaction, driver assistance systems, traffic monitoring, remote surgery, medical reasoning and many more. In all these applications the aim is to bring human visual perception capabilities to machines and computers. In this context, much significant research has recently been conducted to open new horizons in computer vision by using both the 2D and 3D visual aspects of the scene. While the 2D aspect captures the color or intensity of the objects in the scene, the 3D aspect captures the position of the object surfaces. These aspects are two different modalities of vision which must be fused in many computer vision applications to comprehend our three-dimensional colorful world efficiently. Nowadays, 3D vision systems based on Time of Flight (TOF), which fuse range measurements with the imaging aspect at the hardware level, have become very attractive for the aforementioned applications. However, the main limitation of current TOF sensors is their low lateral resolution, which makes these sensors inefficient for accurate image processing tasks in real-world problems. Moreover, they do not provide any color information, which is a significant property of visual data. Therefore, efforts have recently been made to combine TOF cameras with standard cameras in a binocular setup. Although this solves the problem to some extent, it still suffers from issues such as complex camera synchronization and complicated, time-consuming 2D/3D image calibration and registration, which make the final solution practically complex or even infeasible for some applications.
In contrast, the novel 2D/3D vision system called MultiCam, recently developed at the Center for Sensor Systems (ZESS), combines a TOF-PMD sensor with a CMOS chip in a monocular setup to provide high-resolution intensity or color data together with range information. This dissertation investigates different aspects of employing the MultiCam for real-time object recognition and tracking in order to find the advantages and limitations of this new camera system. The core contribution of this work is threefold. In the first part, the MultiCam is presented and important issues such as synchronization, calibration and registration are discussed. In addition, TOF range data obtained from the PMD sensor are analyzed to find the main sources of noise, and some techniques are presented to enhance the quality of the range data. This part shows that, owing to the monocular setup of the MultiCam, calibration and registration of the 2D/3D images obtained from the two sensors are readily attainable [12]. Also, thanks to a common FPGA processing unit used in the MultiCam, sensor synchronization, which is crucial in multi-sensor systems, is possible. These are the vital points which make the MultiCam suitable for vision-based object recognition and tracking. In the second part, the key point of this work is presented. With both 2D and 3D image modalities available from the MultiCam, one can fuse the information from one modality with the other easily and quickly. One can therefore exploit the advantages of both to build a fast, reliable and robust object classification and tracking system. As an example, we observe that in real-world problems, where the lighting conditions might not be adequate or the background is cluttered, 3D range data are more reliable than 2D color images.
Conversely, in cases where many small color features are required to detect an object, as in gesture recognition, the high-resolution color data can be used to extract good features. Thus, we have found that a fast fusion of the 2D/3D data obtained from the MultiCam, at pixel level, feature level and decision level, provides promising results for real-time object recognition and tracking. This is validated in different parts of this work, ranging from object segmentation to object tracking. In the last part, the results of our work are utilized in two practical applications. In the first application, the MultiCam is used to observe defined zones to guarantee the safety of personnel working in close cooperation with a robot. In the second application, an intuitive and natural interaction system between a human and a robot is implemented. This is done by a 2D/3D hand gesture tracker and classifier which is used as an interface to command the robot. These results validate the adequacy of the MultiCam for real-time object recognition and tracking under indoor conditions.
AU - Ghobadi, Seyed Eghbal
DA - 2010
KW - Objekterkennung
KW - 2D/3D Bilder
KW - PMD
KW - Object Recognition
KW - 2D/3D Imaging
KW - Object Tracking
LA - eng
PY - 2010
TI - Real time object recognition and tracking using 2D/3D images
UR - https://nbn-resolving.org/urn:nbn:de:hbz:467-4557
Y2 - 2024-12-26T21:21:02
ER -