We experience our environment through different sensors that deliver large amounts of visual, acoustic, and other sensory data. Humans, as well as computer systems that aim to extract knowledge from sensory data or to communicate with a human user or another computer system, need some kind of symbolic representation. For a computer system, visual sensory data takes the form of digital images, and the task of assigning symbolic knowledge to them is called object recognition.
Because of the significance of the object recognition task in computer science, many approaches and implementations address this problem area. The general procedure is to compute features and groupings from the data and to match them against some represented object knowledge in order to assign labels. Existing implementations can be roughly distinguished by their processing strategy, which follows either the data-driven or the conceptually driven approach. Furthermore, many different knowledge representation schemes exist. For example, knowledge about objects may concern either examples or categories, and objects may be described by their appearance or by geometric or functional features. Within this variety, each approach has its own particular strengths and weaknesses, which leads to the conclusion that integrating complementary approaches is promising.
In this thesis I propose a general framework for integrating available complementary object segmentation and recognition strategies on an equal footing, whether they are based on image data or on higher-level knowledge. The central part of the framework is the integration module, which provides a representation scheme for flexibly storing available segment and object label information and an interpretation process that generates object hypotheses from the represented data. The available information is structured by classifying spatial relations between segments and assigning object label information. The representation scheme remains open to later extension of the data basis with additional information originating from long-running processes or higher-level knowledge. The interpretation process selects probable object regions and label hypotheses from the body of uncertain and partly contradictory information. Its strategy is to select probable object regions based on their spatial relations to other segments and on probabilistically integrated object label information. To avoid premature decisions in the presence of uncertain and possibly incomplete information, the interpretation process generates competing hypotheses, which allow additional higher-level knowledge to be integrated using the same representation and interpretation mechanisms. For the final ranking of competing results for further processing, a flexible evaluation scheme supports the application of one or more general and task-specific evaluation methods.
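The pipeline described above (classify spatial relations, fuse label information, generate competing hypotheses, rank them) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the implementation developed in the thesis: segments are reduced to axis-aligned bounding boxes, label information from different modules is fused with a normalized product that assumes the modules are independent, and the names `Segment`, `spatial_relation`, `fuse_labels`, `generate_hypotheses`, and `rank_hypotheses` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical simplification: a segment is an axis-aligned box (x0, y0, x1, y1).
Box = Tuple[int, int, int, int]


@dataclass
class Segment:
    name: str
    box: Box
    # One label distribution per contributing recognition module (assumed input).
    label_votes: List[Dict[str, float]] = field(default_factory=list)


def spatial_relation(a: Box, b: Box) -> str:
    """Classify how box a relates to box b."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    if ix0 >= ix1 or iy0 >= iy1:
        return "disjoint"
    if a == b:
        return "equal"
    inter = (ix0, iy0, ix1, iy1)
    if inter == b:
        return "contains"
    if inter == a:
        return "inside"
    return "overlaps"


def fuse_labels(votes: List[Dict[str, float]], eps: float = 1e-6) -> Dict[str, float]:
    """Fuse label distributions with a normalized product (independence assumed)."""
    if not votes:
        return {}
    labels = sorted(set().union(*votes))
    scores = {label: 1.0 for label in labels}
    for vote in votes:
        for label in labels:
            # A module that never proposed a label contributes a small floor value.
            scores[label] *= vote.get(label, eps)
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


@dataclass
class Hypothesis:
    segment: str
    label: str
    probability: float


def generate_hypotheses(segments: List[Segment], top_k: int = 2) -> List[Hypothesis]:
    """Keep the top_k labels per segment as competing hypotheses."""
    hypotheses = []
    for seg in segments:
        fused = fuse_labels(seg.label_votes)
        best = sorted(fused.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
        hypotheses.extend(Hypothesis(seg.name, label, p) for label, p in best)
    return hypotheses


Evaluator = Tuple[float, Callable[[Hypothesis], float]]  # (weight, scoring function)


def rank_hypotheses(hyps: List[Hypothesis], evaluators: List[Evaluator]) -> List[Hypothesis]:
    """Rank hypotheses by a weighted sum of one or more evaluation methods."""
    return sorted(hyps, key=lambda h: sum(w * f(h) for w, f in evaluators), reverse=True)


segments = [
    Segment("s1", (0, 0, 10, 10), [{"cup": 0.6, "bowl": 0.4}, {"cup": 0.7, "bowl": 0.3}]),
    Segment("s2", (2, 2, 5, 5), [{"handle": 0.9, "cup": 0.1}]),
]
relation = spatial_relation(segments[0].box, segments[1].box)
ranked = rank_hypotheses(generate_hypotheses(segments), [(1.0, lambda h: h.probability)])
```

Keeping several hypotheses per segment, rather than only the best one, mirrors the idea of deferring decisions until further knowledge is available; the list of weighted evaluators stands in for the combination of general and task-specific evaluation methods.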
Based on this general integration framework, I implemented application systems addressing two different object recognition tasks. The analysis of the two systems demonstrates the flexibility of the framework as well as the improvements achieved by the integrated systems.
The proposed integration framework makes it feasible to efficiently build integrated systems for a given object recognition task from available modules.