In everyday life, individuals are inevitably exposed to complex visual contexts in which they must concurrently track and integrate multiple moving objects in their visual field. For example, a driver must attend to and spatially integrate moving targets such as cars and pedestrians. In such environments, perceptual integration of dynamic visual targets is fundamental to good decision-making and appropriate motor responses. Experiments on multiple-object tracking (MOT) have demonstrated that attention can be allocated to more than a single focus position, contrary to what was generally postulated.
A complete understanding of the mechanisms underlying MOT has not yet been achieved. Several models offer theoretical accounts of the mechanisms involved in this cognitive process.
A first model, the so-called FINSTs model, posits pre-attentive indexes that stick to the moving targets and help attention assess these indexed objects.
A second is the grouping model, which proposes that during a visual tracking task the targets are grouped into a single object. The targets form the vertices of a virtual, deformable polygon, which is perceptually integrated as the targets move across the visual field.
Finally, a multi-focal model proposes that an independent focus of attention can be deployed on each tracked target.
At an integrative level, a limit on the number of moving targets that can be tracked has previously been shown. Young adults appear capable of tracking up to a maximum of five targets. However, this performance decreases during normal aging: elderly people have been shown to be limited to three items in a MOT task.
At a spatial level, and independently of the model considered, a recent study provides new information concerning the early stages of MOT. Its results suggest that limited capacity is split between the right and left hemifields during the target selection stage. This hemifield independence appears to be restricted to the very early stage of MOT (the selection stage), and the hemifield specificity could be integrated in a retinotopic frame of reference.
However, at the level of space representation, classical studies are restricted to two-dimensional visual space and therefore do not take into account the stereoscopic power of the visual system, which allows better discrimination between the relative positions of multiple objects in space. Nor do these approaches consider the reality of a three-dimensional (3D) world in which multiple objects move along all three dimensions of space and at different depths. Indeed, stereoscopic vision is a higher-level function of the visual system that allows us to perceive depth and to judge whether one object lies in front of or behind another. At a behavioural level, an individual constantly makes this kind of visual-perceptual judgment, whatever the task he or she is involved in. Moreover, the benefits of stereoscopic vision in providing optimal visual cues to control action have already been shown. These studies suggest that the main impact of stereoscopic vision is to disambiguate the depth information present in our 3D world in order to produce optimal behaviours. Given these perception-action interactions, it appears that some specific visual mechanisms should be evaluated in environments that simulate, in an ecologically valid way, the visual-spatial characteristics of our 3D world. Intuitively, this seems to apply to multiple-object tracking, a visual-attentional mechanism that could influence many behaviours in everyday life. However, most studies in the MOT literature evaluate this visual-attentional capacity with experimental protocols restricted to 2D visual space. This is drastically different from real-life conditions, where tracking people or moving objects in crowds or during sports such as hockey or soccer is performed in 3D space. Given these space representation considerations, it may be inappropriate to extrapolate the results obtained to real-life tasks.
Moreover, beyond the question of space representation, evaluating MOT by estimating the discrete number of elements that can be tracked may not adequately capture subtle individual differences in performance on this cognitive task. Can it be concluded that the integrative capacity of two individuals is equal when both can successfully track four targets? Based on the number of targets tracked, can it really be assumed that two experimental conditions do not differ from each other?
Beyond the limit on the number of objects tracked, there is a need for a new approach that characterizes sub-parameters better reflecting the efficiency of the attentional processes involved in multiple-object tracking.