The present invention relates to an image processing system comprising means for the acquisition of video signals corresponding to the images in question, and means for processing these acquired signals.
For several years most image processing methods and apparatus were dedicated to the analysis of static scenes, because the majority of the applications envisaged did not take time information into account. The increase in the computational power of data processing hardware has, however, progressively made it possible to envisage the real-time processing of image sequences, particularly in applications such as artificial vision, monitoring systems, the detection of movement with the extraction of mobile objects from the scene, or else for the purpose of achieving better image quality by reducing the noise affecting the image sequences.
Considerable efforts have also been made to solve the problems inherent in the processing of dynamic images. The numerous psychovisual experiments carried out in connection with movement perception by the human visual system will not be described here. It will simply be recalled that, as the result of such experiments, it would appear that the human visual system can distinguish the relative movement of two regions consisting of random distributions of grey levels, provided that the succession of the images is sufficiently rapid in relation to the amplitude of the movement (see D. H. Ballard and Ch. M. Brown, "Computer Vision", Prentice-Hall Inc., 1982). This finding indicates that the movement is detected at the level of the image itself and not exclusively through the medium of a symbolic representation of the scene observed.
Movement detection at image level is generally effected in accordance with one of the three approaches described in T. Z. Young and K. S. Fu, "Handbook of pattern recognition and image processing", Academic Press, 1986. It may first be attempted to bring into correspondence physical points in the scene and to estimate the displacement of the camera between successive images. This approach leads to an extremely complex system of non-linear equations containing N unknowns, and also makes use of a considerable number of restrictive hypotheses. Another, simpler method utilizes the notion of an optical flow defined by the instantaneous velocities at each point X, Y of the image taken at a given moment. In this case, therefore, the only concern is the projection of the three-dimensional movements of the objects onto a particular plane, namely the plane of the image. This method likewise makes use of a certain number of restrictive hypotheses that are indispensable for the estimation of a velocity field, such as limited maximum velocity, spatial coherence of the instantaneous velocity field, and so on. Finally, it is possible to operate in accordance with a third approach providing detection and estimation of movement by space-time filtrations.
Movement detection by space-time filtration is particularly interesting for the following reasons. On the one hand, it is based on psychovisual experiments which indicate that the perception of movement by the human visual system makes use of mechanisms that are formally very similar. This similarity to the behaviour of organic visual systems is manifested, for example, in the case of apparent movements induced by dynamic visual illusions. In addition, the conventional space-time filters make it possible to obtain good selectivity with respect to the direction of the apparent movement detected, as well as with respect to the velocity of this apparent movement (to a lesser extent, however, because the sensitivity of man to a variation of speed is less than his sensitivity to a variation of direction). Finally, these filters permit the detection of local movements, that is to say different movements at different points of the image, in contrast to methods which seek an estimation of a global parametric movement of the scene observed.
Practically all known space-time filtration techniques agree in considering the sequence of images to be processed as a three-dimensional signal f(x, y, t). The space-time filter, which is defined in this three-dimensional space (x, y, t) and makes a filtered signal T(f(x, y, t)) correspond to f(x, y, t), must be entirely defined on a limited space-time support of this space in order to respect the rule of local character. At first sight this space would appear to be related to a conventional three-dimensional space in which the objects are defined in respect of height, width and depth, but in the present case the time dimension assumes particular significance because of its inherent characteristics, particularly its irreversibility. Known space-time filters therefore often show similar behaviour in x and y, associated with specific behaviour in respect of the time dimension.
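By way of illustration only, the notion of a filter defined on a limited space-time support can be sketched as follows. The function name, the 3x3x3 support, the kernel weights and the choice of a causal time kernel are assumptions made for this example and are not taken from any of the cited works. The sketch uses the same symmetric kernel along x and y and a different, one-sided kernel along t, so that only the present and past images are read.

```python
def spacetime_filter(seq, x, y, t,
                     spatial=(0.25, 0.5, 0.25),
                     temporal=(0.6, 0.3, 0.1)):
    """Filtered value T(f)(x, y, t) of seq[t][y][x] on a 3x3x3 support.

    Assumed, illustrative kernels: the same symmetric kernel is applied
    along x and along y; along t the kernel is one-sided, temporal[k]
    weighting the image at time t - k, so no future image is read.
    """
    if t < len(temporal) - 1:
        raise ValueError("not enough past images for the time support")
    total = 0.0
    for k, wt in enumerate(temporal):          # time: present and past only
        for j, wy in enumerate(spatial):       # y: symmetric neighbourhood
            for i, wx in enumerate(spatial):   # x: same kernel as y
                total += wt * wy * wx * seq[t - k][y + j - 1][x + i - 1]
    return total
```

The caller is expected to keep (x, y) at least one pixel away from the image border. Because the support is limited, the output at one point depends only on the 27 neighbouring samples, which is the local character referred to above.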
The simplest space-time filtration (see for example A. P. Bernat et al., "Security applications of computer motion detection", SPIE Vol. 786, Applications of Artificial Intelligence V, 1987, p. 512-517) consists in effecting a point-to-point difference between two successive images: if two corresponding points do not have the same luminance, this difference is not zero and indicates an apparent movement. This technique, which is sensitive to noise, can be improved by effecting median filtration or a spatial mean of the grey levels before calculating the difference between successive images. Filters produced in accordance with this technique certainly meet the condition of local character, but no selectivity is obtained in the direction of the movement or its velocity, and their rather mediocre performance can be improved only with the aid of techniques which are no longer within the field of space-time filtration properly speaking.
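A minimal sketch of this point-to-point difference, preceded by an optional spatial mean, is given below purely as an illustration. The function names, the 3x3 averaging window and the use of a non-zero threshold (rather than a strict test against zero) are assumptions of this example and are not taken from the cited paper.

```python
def box_mean_3x3(img):
    """3x3 spatial mean of a grey-level image given as a list of rows.
    Border pixels are averaged over the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(vals) / len(vals)
    return out


def motion_mask(prev, curr, threshold, smooth=True):
    """Point-to-point difference between two successive images.
    Returns 1 where |curr - prev| exceeds the (assumed) threshold, else 0."""
    if smooth:
        # Spatial mean of the grey levels before the difference, to reduce noise.
        prev, curr = box_mean_3x3(prev), box_mean_3x3(curr)
    return [[1 if abs(c - p) > threshold else 0
             for p, c in zip(row_p, row_c)]
            for row_p, row_c in zip(prev, curr)]
```

The mask marks where the luminance has changed but carries no information on the direction or velocity of the movement, which is the limitation noted above.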
An improved space-time filtration technique consists in making use of the known properties of linear filters and their Fourier transforms. At the cost of special processing, such as the estimation of the space-time energy of the movement, measured by the sum of the squares of two responses of linear filters in phase quadrature, or this same estimation for the energy in phase opposition by replacing the quadratic sum by a difference in the responses of two filters in phase quadrature, it is then possible to obtain filters sensitive to the direction of the movement and/or to the direction of the displacement. The use of batteries of filters finally makes it possible to estimate both the direction and the velocity of the local displacement.
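The estimation of a space-time energy as the sum of the squares of two responses in phase quadrature can be sketched as follows. This is an illustrative example under stated assumptions and not the filter of any cited work: the image is reduced to one spatial dimension x, the filter pair is a Gaussian-windowed cosine and sine, the time support is symmetric for simplicity, and the names quadrature_energy and drifting_grating, together with all parameter values, are hypothetical. A filter tuned to spatial frequency fx and temporal frequency ft responds to a pattern moving at velocity ft/fx.

```python
import math


def drifting_grating(width, frames, fx, velocity):
    """Test signal[t][x] = cos(2*pi*fx*(x - velocity*t)), a sinusoidal
    pattern displaced by `velocity` pixels per image."""
    return [[math.cos(2.0 * math.pi * fx * (x - velocity * t))
             for x in range(width)]
            for t in range(frames)]


def quadrature_energy(signal, x0, t0, fx, ft, radius=6, sigma=3.0):
    """Space-time energy at (x0, t0) of signal[t][x].

    Two linear filters in phase quadrature (cosine and sine under the
    same Gaussian window) are applied on a limited support; the energy
    is the sum of the squares of their two responses.
    """
    even = odd = 0.0
    for dt in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            window = math.exp(-(dx * dx + dt * dt) / (2.0 * sigma * sigma))
            phase = 2.0 * math.pi * (fx * dx - ft * dt)
            sample = signal[t0 + dt][x0 + dx]
            even += window * math.cos(phase) * sample
            odd += window * math.sin(phase) * sample
    return even * even + odd * odd
```

With a grating moving towards increasing x, the energy of the filter tuned to (fx, ft) is much larger than that of the filter tuned to (fx, -ft), which illustrates the selectivity with respect to the direction of displacement. Evaluating several such filters tuned to different ft/fx ratios gives a simple picture of a battery of filters.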
The paper by S. Beucher, J. M. Blosseville and F. Lenoir, given in November 1987 at the SPIE Cambridge Symposium on Advances in Intelligent Robotics Systems, entitled "Traffic spatial measurements using video image processing: application of mathematics morphology to vehicles detection", makes use of another type of filtration, morphological filtration, for the automatic measurement of vehicle flow. This morphological filtration, however, relates only to a two-dimensional image reconstructed from a mean of differences between successive images of the traffic image sequence. Here again, as with the filters previously described, there is thus behaviour of a certain type in the spatial plane of the image and specific behaviour in respect of the time dimension.
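The general idea of applying a morphological filter to a two-dimensional image obtained from a mean of differences can be sketched as follows. This example does not reproduce the procedure of the cited paper, whose details are not given here: the use of absolute differences, the 3x3 square structuring element, the choice of a grey-level opening and all function names are assumptions made for illustration.

```python
def mean_abs_difference(frames):
    """Two-dimensional image: mean over time of the absolute difference
    between successive images frames[k] and frames[k + 1]."""
    h, w = len(frames[0]), len(frames[0][0])
    n = len(frames) - 1
    return [[sum(abs(frames[k + 1][y][x] - frames[k][y][x])
                 for k in range(n)) / n
             for x in range(w)]
            for y in range(h)]


def _window(img, y, x):
    """Values of the 3x3 neighbourhood of (y, x), clipped at the border."""
    h, w = len(img), len(img[0])
    return [img[j][i]
            for j in range(max(0, y - 1), min(h, y + 2))
            for i in range(max(0, x - 1), min(w, x + 2))]


def erode(img):
    """Grey-level erosion: minimum over the 3x3 neighbourhood."""
    return [[min(_window(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]


def dilate(img):
    """Grey-level dilation: maximum over the 3x3 neighbourhood."""
    return [[max(_window(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]


def opening(img):
    """Morphological opening: erosion followed by dilation."""
    return dilate(erode(img))
```

The opening removes bright structures smaller than the structuring element, such as isolated pixels, while keeping larger regions. The morphological step acts purely in the spatial plane; the time dimension is handled separately by the averaging.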