For the valid analysis of human visual attention, it is often necessary to move from computer-based desktop setups to more natural real-world settings. However, the resulting loss of experimental control has to be counterbalanced by an increased number of
participants and/or items. Together with the effort required to manually annotate the gaze-cursor videos recorded by mobile eye trackers, this renders many studies infeasible.
We tackle this issue by minimizing the need for manual annotation of mobile gaze data. Our approach combines geometric modelling with inexpensive 3D marker tracking to align virtual proxies with the real-world objects. This allows us to classify fixations on objects of interest automatically while supporting a completely free-moving participant.
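To illustrate the core idea, the following sketch (in Python, with hypothetical names; the actual EyeSee3D implementation is not given here) casts the gaze ray, expressed in scene coordinates via the tracked eye-tracker pose, against simple spherical proxies standing in for the real-world objects, and labels the fixation with the nearest object hit:

```python
import numpy as np

def ray_sphere_hit(origin, direction, center, radius):
    """Distance along a unit-direction ray to the first sphere
    intersection, or None (ray origins inside a proxy are ignored
    for brevity)."""
    oc = origin - center
    b = np.dot(oc, direction)
    c = np.dot(oc, oc) - radius ** 2
    disc = b * b - c
    if disc < 0.0:
        return None
    t = -b - np.sqrt(disc)
    return t if t >= 0.0 else None

def classify_fixation(tracker_pose, gaze_dir_local, proxies):
    """tracker_pose: (R, t) of the eye tracker in scene coordinates,
    as delivered by outside-in or inside-out marker tracking.
    gaze_dir_local: gaze direction in the eye tracker's own frame.
    proxies: dict mapping object label -> (center, radius)."""
    R, t = tracker_pose
    direction = R @ gaze_dir_local
    direction = direction / np.linalg.norm(direction)
    best_label, best_dist = None, np.inf
    for label, (center, radius) in proxies.items():
        hit = ray_sphere_hit(t, direction, np.asarray(center), radius)
        if hit is not None and hit < best_dist:
            best_label, best_dist = label, hit  # keep nearest object
    return best_label  # None if the gaze misses all proxies
```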
The paper presents the EyeSee3D method as well as a comparison of expensive outside-in tracking (external cameras) and low-cost inside-out tracking (scene camera) of the eye tracker's position. The EyeSee3D approach is evaluated by comparing the results of automatic and manual classification of fixation targets, which revisits long-standing questions of annotation validity in this mobile setting.
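The abstract does not specify the evaluation metric; as one plausible illustration, per-fixation agreement between the automatic labels and a human annotator's labels could be summarized with a chance-corrected measure such as Cohen's kappa:

```python
from collections import Counter

def cohens_kappa(auto_labels, manual_labels):
    """Chance-corrected agreement between automatic and manual
    fixation-target labels (one label per detected fixation)."""
    assert len(auto_labels) == len(manual_labels)
    n = len(auto_labels)
    observed = sum(a == m for a, m in zip(auto_labels, manual_labels)) / n
    pa, pm = Counter(auto_labels), Counter(manual_labels)
    expected = sum((pa[l] / n) * (pm[l] / n) for l in set(pa) | set(pm))
    if expected == 1.0:          # degenerate case: only one label occurs
        return 1.0
    return (observed - expected) / (1.0 - expected)
```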