It is known practice to carry out three-dimensional reconstructions on the basis of images taken by at least two synchronized cameras taking images of the same scene. The first stereovision algorithms appeared in the 1970s. Significant progress has been made in recent years, chiefly in the form of increasingly effective matching algorithms.
The optical sensors used include elementary receivers (for example the pixels of a camera) arranged in a matrix so that each sees the scene from its own separate solid angle, and they capture images of the scene at regular intervals (generally several images per second). Each image is then represented by a table of values (one per elementary receiver), each value representing a physical characteristic of the signal received from the scene by the corresponding elementary receiver, for example a luminous intensity.
More precisely, such optical sensors generate, at each time increment $t_k$, and for each elementary receiver placed at $(x, y)$, items of information $f_k(x, y) = \delta(t, t_k) \cdot f(x, y, t)$, where $f$ is the luminous intensity perceived by the receiver placed at $(x, y)$, and $\delta$ is the Kronecker delta. The item of information sent back by such a sensor is then the matrix or frame $I_k = \{f_k(x, y)\}$, $x \in N$, $y \in M$, where $N$, $M$ are the dimensions of the matrix, this information being sent at each time increment.
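The frame-based acquisition described above can be illustrated by a minimal sketch. It assumes a regular frame period, so that t_k = k × frame_period, and a caller-supplied intensity function; the names `sample_frames`, `intensity`, and `frame_period` are hypothetical and are introduced only for illustration.

```python
def sample_frames(intensity, n_cols, n_rows, frame_period, n_frames):
    """Sample intensity(x, y, t) on an n_cols x n_rows matrix of receivers.

    Assumption: frames are taken at regular instants t_k = k * frame_period.
    Returns a list of frames; frames[k][y][x] holds f_k(x, y).
    """
    frames = []
    for k in range(n_frames):
        t_k = k * frame_period
        # Every receiver is read out at every instant, whether or not
        # the intensity it perceives has changed since the last frame.
        frame = [
            [intensity(x, y, t_k) for x in range(n_cols)]
            for y in range(n_rows)
        ]
        frames.append(frame)
    return frames


# Hypothetical scene: intensity depends linearly on position and time.
frames = sample_frames(lambda x, y, t: x + 10 * y + 100 * t,
                       n_cols=4, n_rows=3, frame_period=1.0, n_frames=3)
print(len(frames), len(frames[0]), len(frames[0][0]))
```

Each frame carries one value per receiver, which is why the volume of data grows with both the matrix size and the frame rate.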
The matching algorithm then searches these items of information for patterns that can be attributed to one and the same element of the scene, and matches the corresponding elementary receivers together. Knowing the positions of the elementary receivers thus matched, it is straightforward to recover by triangulation the point of the scene that has been seen by these two elementary receivers, and therefore to incorporate it into the 3D reconstruction of the scene.
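By way of illustration only, the matching and triangulation steps can be sketched for the simplest case. The sketch assumes a rectified camera pair (matching receivers lie on the same image row), a sum-of-absolute-differences cost, and known calibration values; none of these choices comes from the text above, and the names `match_along_row`, `triangulate_rectified`, `focal_px`, `baseline`, `cx`, and `cy` are hypothetical.

```python
def match_along_row(left_row, right_row, x_left, half_window=1, max_disparity=8):
    """Find the disparity whose window in right_row best matches left_row.

    Uses the sum of absolute differences (SAD) over a small window.
    Assumption: rectified images, so the match lies on the same row.
    """
    if x_left - half_window < 0 or x_left + half_window >= len(left_row):
        raise ValueError("window around x_left falls outside the row")
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        x_right = x_left - d
        if x_right - half_window < 0:
            break  # window would leave the right image
        cost = sum(
            abs(left_row[x_left + k] - right_row[x_right + k])
            for k in range(-half_window, half_window + 1)
        )
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def triangulate_rectified(x_left, y, disparity, focal_px, baseline, cx, cy):
    """Recover (X, Y, Z) in the left camera frame from a matched pair.

    Assumption: pinhole cameras, focal length in pixels, principal point
    (cx, cy), and a horizontal baseline in the same unit as the result.
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive to triangulate")
    z = focal_px * baseline / disparity
    return ((x_left - cx) * z / focal_px, (y - cy) * z / focal_px, z)


# Hypothetical rows: the same pattern (9, 5) appears shifted by 3 pixels.
left = [0, 0, 0, 0, 0, 9, 5, 0, 0, 0]
right = [0, 0, 9, 5, 0, 0, 0, 0, 0, 0]
d = match_along_row(left, right, x_left=5)
print(d, triangulate_rectified(5, 0, d, focal_px=800, baseline=0.5, cx=5, cy=0))
```

Even in this reduced form, the search visits every candidate disparity for every receiver of every frame, which gives a sense of the computing cost involved.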
Each image can have a size of several megabytes, and images are produced at a rate of several per second (typically 24 images per second), which represents a considerable bandwidth. The 3D reconstruction algorithms then search for patterns in the images taken by the various sensors at the same instant, with the aim of matching together patterns corresponding to one and the same element of the scene. These algorithms require software that consumes a good deal of power and computing time, which makes them unsuitable for real-time applications.
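The order of magnitude of this bandwidth can be checked with a short calculation. The resolution, pixel depth, and number of cameras below are illustrative assumptions and do not come from the text; only the rate of 24 images per second does. The name `raw_stream_rate` is hypothetical.

```python
def raw_stream_rate(width, height, bytes_per_pixel, frames_per_second, cameras=2):
    """Return (bytes per frame, total bytes per second) for uncompressed frames."""
    frame_bytes = width * height * bytes_per_pixel
    return frame_bytes, frame_bytes * frames_per_second * cameras


# Assumed example: two 1920 x 1080 cameras, 3 bytes per pixel, 24 images/s.
frame_bytes, rate = raw_stream_rate(1920, 1080, 3, 24)
print(f"{frame_bytes / 1e6:.1f} MB per frame, {rate / 1e6:.1f} MB/s in total")
```

Under these assumptions each frame is a little over 6 megabytes, and the pair of cameras produces roughly 300 megabytes per second before any matching is attempted.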