Known techniques for detecting objects in images include iterative closest point (ICP) algorithms. These ICP algorithms are known for their effectiveness in applications such as range data registration, 3D reconstruction, object tracking and motion analysis. See for example the article “Efficient Variants of the ICP Algorithm”, by S. Rusinkiewicz and M. Levoy, 3rd International Conference on 3D Digital Imaging and Modeling, June 2001, pp. 145-152.
The principle of an ICP algorithm is to take a set of points serving as a model that delimits the contour of the object, and to match it with a set of points belonging to the acquired data. A transformation between the known model set and the set of data points is estimated by minimizing an error function, so as to express their geometrical relationship. The tracking of an arbitrary shape can be resolved by an ICP technique when a model of this shape is provided.
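By way of illustration only, the principle described above can be sketched in two dimensions as follows. The sketch assumes a brute-force nearest-neighbour search and a closed-form least-squares fit of a rotation angle and a translation; the function names (`closest_point`, `fit_rigid_2d`, `apply_rigid_2d`, `icp_2d`) and the fixed iteration count are hypothetical choices made for the example and are not taken from the cited articles.

```python
import math

def closest_point(p, points):
    """Return the point of `points` nearest to p (brute-force search)."""
    return min(points, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def fit_rigid_2d(model, matched):
    """Least-squares rotation angle and translation mapping model onto matched."""
    n = len(model)
    mx = sum(p[0] for p in model) / n
    my = sum(p[1] for p in model) / n
    dx = sum(q[0] for q in matched) / n
    dy = sum(q[1] for q in matched) / n
    s_dot = s_cross = 0.0
    for (px, py), (qx, qy) in zip(model, matched):
        ax, ay = px - mx, py - my          # centred model point
        bx, by = qx - dx, qy - dy          # centred data point
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)     # angle minimizing the squared error
    c, s = math.cos(theta), math.sin(theta)
    tx = dx - (c * mx - s * my)            # translation aligning the centroids
    ty = dy - (s * mx + c * my)
    return theta, (tx, ty)

def apply_rigid_2d(theta, t, points):
    """Apply the rotation theta followed by the translation t to every point."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in points]

def icp_2d(model, data, iterations=20):
    """Alternate matching and fitting; return the model moved onto the data."""
    current = list(model)
    for _ in range(iterations):
        matched = [closest_point(p, data) for p in current]
        theta, t = fit_rigid_2d(current, matched)
        current = apply_rigid_2d(theta, t, current)
    return current
```

In this sketch, each iteration recomputes the correspondences from the current pose, which is why a reasonably close initial pose is needed for the nearest neighbours to be the right ones.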
The article “Iterative Estimation of Rigid Body Transformations: Application to robust object tracking and Iterative Closest Point”, by M. Hersch, et al., Journal of Mathematical Imaging and Vision, 2012, Vol. 43, No. 1, pp. 1-9, presents an iterative method for executing the ICP algorithm. The aim is to determine a rigid spatial transformation $T$ that makes it possible to detect, in an image, a pattern defined by a set of points $\{x_i\}$, each of which corresponds to a point $y_i$ of the image. The classic closed-form analytic solution seeks the transformation $T$ by minimizing an error criterion of the form

$$\sum_i \left\| y_i - T x_i \right\|^2$$

where the sum runs over the set of points $x_i$ of the pattern. This solution is replaced with an iterative one, in which an initial estimate of the transformation $T$ is taken, and each iteration consists in randomly drawing a point $x_i$ from the pattern, finding its corresponding point $y_i$ in the image, and updating the transformation $T$ by subtracting a term proportional to the gradient $\nabla \left\| y_i - T x_i \right\|^2$ with respect to the translation and rotation parameters of $T$. When the transformation $T$ becomes stationary from one iteration to the next, the iterations stop and $T$ is retained as the final estimate of the transformation that makes it possible to detect the pattern in the image.
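A minimal sketch of such a gradient-based update is given below for the two-dimensional case, where the transformation is assumed to be parametrized by a rotation angle and a translation vector. It is not the method of the cited article. The step sizes `eta_t` and `eta_theta`, the nearest-neighbour matching, and the stopping rule (a run of `patience` consecutive negligible updates) are assumptions made for the example, and all names are hypothetical.

```python
import math
import random

def transform(theta, t, p):
    """Apply the rotation theta and the translation t to the point p."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1] + t[0], s * p[0] + c * p[1] + t[1])

def closest(p, points):
    """Return the point of `points` nearest to p (brute-force search)."""
    return min(points, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def sgd_icp_2d(pattern, image_points, theta=0.0, t=(0.0, 0.0),
               eta_t=0.1, eta_theta=0.01, max_iter=5000,
               tol=1e-9, patience=50, seed=0):
    """Estimate (theta, t) by one gradient step per randomly drawn point."""
    rng = random.Random(seed)
    tx, ty = t
    quiet = 0                                   # consecutive negligible updates
    for _ in range(max_iter):
        x = rng.choice(pattern)                 # random point of the pattern
        tx_i = transform(theta, (tx, ty), x)    # its current image under T
        y = closest(tx_i, image_points)         # assumed correspondence
        rx, ry = y[0] - tx_i[0], y[1] - tx_i[1] # residual y - T x
        c, s = math.cos(theta), math.sin(theta)
        # Derivative of the rotated point with respect to theta.
        dx, dy = -s * x[0] - c * x[1], c * x[0] - s * x[1]
        # Gradient of the squared residual norm with respect to each parameter.
        g_theta = -2.0 * (rx * dx + ry * dy)
        g_tx, g_ty = -2.0 * rx, -2.0 * ry
        theta -= eta_theta * g_theta
        tx -= eta_t * g_tx
        ty -= eta_t * g_ty
        step = abs(eta_theta * g_theta) + abs(eta_t * g_tx) + abs(eta_t * g_ty)
        quiet = quiet + 1 if step < tol else 0
        if quiet >= patience:                   # T considered stationary
            break
    return theta, (tx, ty)
```

Separate step sizes are used for rotation and translation because the rotation gradient grows with the distance of the point from the origin; a single step size would have to be chosen small enough for the farthest point.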
In conventional vision based on successively acquired images, the frame rate of the camera (about 60 images per second, for example) is often insufficient for ICP techniques. The repeated calculation of the same information in successive images also limits the real-time performance of ICP algorithms. In practice, they are restricted to detecting simple shapes that do not move too quickly.
Unlike conventional cameras, which record successive images at regular sampling instants, biological retinas transmit very little redundant information about the scene to be visualized, and do so asynchronously. Asynchronous event-based vision sensors deliver compressed digital data in the form of events. A presentation of such sensors can be found in “Activity-Driven, Event-Based Vision Sensors”, T. Delbrück, et al., Proceedings of 2010 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 2426-2429. Compared with conventional cameras, event-based vision sensors have the advantage of removing redundancy, reducing latency and increasing the dynamic range.
The output of such a vision sensor can consist, for each pixel address, of a sequence of asynchronous events that represent changes in the reflectance of the scene at the time they occur. Each pixel of the sensor is independent and detects changes in intensity greater than a threshold since the emission of its last event (for example a contrast of 15% on the logarithm of the intensity). When the change in intensity exceeds the set threshold, the pixel generates an ON or OFF event according to whether the intensity increases or decreases. Certain asynchronous sensors associate the detected events with measurements of light intensity. Since the sensor is not sampled on a clock like a conventional camera, it can take the sequencing of events into account with very high temporal precision (for example of about 1 μs). If such a sensor is used to reconstruct a sequence of images, an image rate of several kilohertz can be achieved, compared to a few tens of hertz for conventional cameras.
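The behaviour of a single pixel can be emulated, for illustration purposes, as in the following sketch. It assumes that the intensity is available as a list of timestamped samples, whereas a real sensor reacts in continuous time, and it interprets the 15% contrast as a threshold of 0.15 on the natural logarithm of the intensity. It also emits at most one event per sample, a simplification with respect to real sensors. The function name `pixel_events` is hypothetical.

```python
import math

def pixel_events(samples, threshold=0.15):
    """Emulate one pixel: samples is a list of (time, intensity), intensity > 0.

    Returns a list of (time, polarity) with polarity +1 (ON) or -1 (OFF).
    """
    events = []
    _, first_intensity = samples[0]
    ref = math.log(first_intensity)      # log intensity at the last event
    for t, intensity in samples[1:]:
        delta = math.log(intensity) - ref
        if abs(delta) > threshold:
            events.append((t, 1 if delta > 0 else -1))
            ref = math.log(intensity)    # reference is reset at each event
    return events
```

For example, with the samples `[(0, 100), (1, 105), (2, 130), (3, 128), (4, 90), (5, 90)]`, this emulation produces an ON event at time 2 and an OFF event at time 4, and nothing while the intensity stays close to its last reference value.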
Event-based vision sensors are promising, and it is desirable to propose effective methods for tracking objects in motion using the signals delivered by such sensors.
In “Fast sensory motor control based on event-based hybrid neuromorphic-procedural system”, ISCAS 2007, New Orleans, 27-30 May 2007, pp. 845-848, T. Delbrück and P. Lichtsteiner describe an algorithm for tracking clusters (cluster tracker) that can be used, for example, for controlling a soccer goalkeeper robot using an event-based vision sensor. Each cluster models a mobile object as a source of events. Events that fall within a cluster change the position of that cluster. A cluster is considered visible only if it has received a number of events greater than a threshold.
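The cluster behaviour summarized above can be sketched as follows. The sketch is a simplified illustration and not the algorithm of the cited article: the circular cluster shape, the mixing factor `mix` that moves the cluster toward each incoming event, and the default parameter values are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Cluster:
    x: float                 # current cluster centre
    y: float
    radius: float = 5.0      # assumed circular support of the cluster
    mix: float = 0.1         # assumed fraction of the shift toward each event
    min_events: int = 10     # visibility threshold on the event count
    count: int = 0           # events received so far

    def contains(self, ex, ey):
        """True if the event at (ex, ey) falls within the cluster."""
        return (ex - self.x) ** 2 + (ey - self.y) ** 2 <= self.radius ** 2

    def update(self, ex, ey):
        """Move the cluster toward an event that falls in it; ignore others."""
        if not self.contains(ex, ey):
            return False
        self.x += self.mix * (ex - self.x)
        self.y += self.mix * (ey - self.y)
        self.count += 1
        return True

    @property
    def visible(self):
        """A cluster is visible once it has received more than min_events."""
        return self.count > self.min_events
```

In use, each incoming event would be offered to the existing clusters through `update`, and only the clusters whose `visible` property is true would be reported as tracked objects.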
In “Asynchronous event-based visual shape tracking for stable haptic feedback in microrobotics”, Z. Ni, et al., IEEE Transactions on Robotics, 2012, Vol. 28, No. 5, pp. 1081-1089, an event-based version of the ICP algorithm is presented, which is based on minimizing a cost function in analytical form.
There is a need for a method for tracking shapes that is rapid and that has good temporal precision.