1. Field of the Invention
The field of the invention is the analysis and encoding of electronic pictures and, more particularly, the analysis of the motion of the points of electronic pictures of this type.
2. Description of the Prior Art
In the specific example which shall be described in detail below, the method according to the invention can be applied to the analysis of high definition picture sequences, designed to be transmitted through a channel with a limited throughput. A preferred application of this type is the transmission of high definition television on MAC channels.
However, the method of the invention can also be used in any system analyzing a sequence of pictures (for example in robotics, target-tracking, searching for spatial and/or temporal parameters, etc.).
The method according to the invention is designed to be part of a picture processing chain and to form a link in the analysis of the displacement speeds of picture points in the picture plane.
An analysis of this type is valuable in a great many ways.
For the transmission of picture sequences in a channel with limited throughput, the processing of pictures is designed to reduce the volume of data transmitted, in such a way that:
at emission, a sub-sampling operation is performed, the sub-sampled data being accompanied by "assistance data" transmitted conjointly via the data channel;
at reception, a reverse operation is performed, consisting in the use of the assistance data and the sub-sampled signal to reconstruct a high definition signal.
In this scheme, the picture point motion estimation step, according to the present invention, occurs, for example, prior to the sub-sampling operation. The purpose of motion estimation, then, is to create a spatial-temporal data base, wherein the pieces of data represent the motional activity of the points, in the plane of the picture and in time. These are pieces of data which will make it possible to determine the most appropriate processing operation to accomplish the compression of data by sub-sampling.
An already known method in the field of data compression processing operations, with picture motions taken into account, is the one for the analysis of structures in spatial-temporal sub-sampling of an HDTV signal with a view to its transmission in an MAC channel as described in the proceedings of the HDTV 87 colloquium, Ottawa, 4th to 8th October, 1987, Vol. 1, pp. 6.2.1 (P. BERNARD, M. VEILLARD, CCETT). In this known method of analysis, each picture in the sequence is divided into zones, and each zone systematically undergoes three parallel processing operations in three distinct linear filters. Each filter provides for a different sub-sampling filtering operation, corresponding to a preferred filtering operation for still (motionless) pictures, pictures in moderate motion and pictures in fast motion, respectively. The outputs of the filters are then compared with the original source, and the best type of filtering is selected to determine the effectively transmitted, compressed signal.
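The selection principle of this known method (several candidate filtering operations applied in parallel to each zone, the one closest to the original source being retained) can be illustrated by the following minimal sketch. It is not the filter bank of the cited reference: the three one-dimensional reconstruction functions, their names and the squared-error criterion are assumptions chosen only to make the comparison step concrete.

```python
# Toy illustration of "filter in parallel, compare with the source, keep the best".
# The three reconstructions below are hypothetical stand-ins, not the cited filters.

def reconstruct_still(prev, cur):
    # Temporal repetition: reuse the samples of the previous picture.
    return list(prev)

def reconstruct_moderate(prev, cur):
    # Keep even samples of the current picture, fill odd ones from the previous picture.
    return [cur[i] if i % 2 == 0 else prev[i] for i in range(len(cur))]

def reconstruct_fast(prev, cur):
    # Keep even samples of the current picture, fill odd ones by spatial averaging.
    out = list(cur)
    for i in range(1, len(cur), 2):
        right = cur[i + 1] if i + 1 < len(cur) else cur[i - 1]
        out[i] = (cur[i - 1] + right) / 2
    return out

CANDIDATES = [
    ("still", reconstruct_still),
    ("moderate", reconstruct_moderate),
    ("fast", reconstruct_fast),
]

def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def select_filter(prev, cur):
    # Run every candidate on the zone and keep the one closest to the source.
    # min() returns the first minimum, so ties favour the earlier candidate.
    scored = [(name, squared_error(fn(prev, cur), cur)) for name, fn in CANDIDATES]
    return min(scored, key=lambda item: item[1])

still_zone = [5, 7, 9, 11, 13, 15, 17, 19]
moving_prev = [0, 0, 0, 0, 0, 0, 0, 0]
moving_cur = [0, 10, 20, 30, 40, 50, 60, 70]

print(select_filter(still_zone, still_zone))
print(select_filter(moving_prev, moving_cur))
```

With only three candidates, every zone is forced into one of three behaviours, which is the limitation discussed in the following paragraph.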
A battery of linear filters of this type has the drawback of enabling only a limited choice among only three types of sub-sampling filtering operations, without the possibility of adapting the filtering operation more specifically to each particular feature of the zones processed. It has been noted, in particular, that there is a heterogeneity in the definition of distinct zones within one and the same picture, as well as poor performance by this system in the processing of slow motions. This problem appears very clearly, and in a way that is very troublesome for the viewer, when a slow motion is stopped or, again, when a still object is put into motion. In these instances, there is a sudden transition from a blurred definition of the moving object to a maximum definition of a still object in the former case, and vice versa in the latter case.
A more refined, prior art approach to the problem of the encoding of a sequence of pictures consists, then, in making an a priori estimation of the motion in the picture sequence.
In this respect, T. S. HUANG ("Image Sequence Analysis: Motion Estimation" in Image Sequence Analysis; Ed. T. S. HUANG, Springer Verlag 1981) identifies three distinct methods, namely the FOURIER method, the "correspondence" or block matching method and the method using spatial or temporal gradients. The first two methods have a certain number of drawbacks. The FOURIER method is associated with a problem of phase indeterminacy and assumes uniformity of the picture background. The block-matching method is likely to entail complex signal processing operations, and attempts to simplify these operations appear to create risks of divergence in the processing algorithm.
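By way of illustration of the "correspondence" or block matching principle, the following sketch performs an exhaustive search over a small window and retains the displacement that minimizes a sum of absolute differences. The function names, the choice of this criterion and the sign convention (the block at (top, left) in the current picture is compared with the block at (top - dy, left - dx) in the previous picture) are assumptions made for the example and are not taken from the cited reference.

```python
# Exhaustive-search block matching on small 2-D pictures stored as lists of rows.
# Names, the SAD criterion and the sign convention are illustrative assumptions.

def sad(prev, cur, top, left, size, dy, dx):
    # Sum of absolute differences between the current block and the
    # previous-picture block displaced by (-dy, -dx).
    total = 0
    for y in range(top, top + size):
        for x in range(left, left + size):
            total += abs(cur[y][x] - prev[y - dy][x - dx])
    return total

def block_match(prev, cur, top, left, size, search):
    height, width = len(prev), len(prev[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose reference block would leave the picture.
            if not (0 <= top - dy and top - dy + size <= height):
                continue
            if not (0 <= left - dx and left - dx + size <= width):
                continue
            cost = sad(prev, cur, top, left, size, dy, dx)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best, best_cost

# Previous picture with a non-repeating pattern; current picture is the
# previous one shifted one point to the right.
prev = [[3 * y * y + 5 * x * x + x * y for x in range(8)] for y in range(8)]
cur = [[prev[y][x - 1] if x >= 1 else 0 for x in range(8)] for y in range(8)]

vector, cost = block_match(prev, cur, top=2, left=3, size=3, search=1)
print(vector, cost)
```

The number of candidate displacements grows with the square of the search range, and each candidate requires a full block comparison, which gives an idea of the processing complexity mentioned above.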
Among methods using spatial and temporal gradients, a number of proposed algorithms are known: LIM, J. O. and MURPHY J. A., "Measuring the Speed of Moving Objects from Signals", IEEE Trans. on Com., April 1975, pp. 474-478; NETRAVALI, A. N., ROBBINS, J. D. "Motion Compensated Television Coding: Part I", BSTJ, Vol. 58, No. 3, March 1979, pp. 631-670; SABRI, S., "Motion Compensated Interframe Prediction for NTSC Color TV Signals", IEEE Trans. on Com., Vol. COM 32, No. 8, August 1984, pp. 954-968; ROBERT, P. "Definition d'un Schema de Codage Multimodes avec Compensation de Mouvement pour les Sequences d'Images de Television" (Definition of a Multimode Encoding Scheme with Motion Compensation for Television Picture Sequences), IRISA thesis, November 1983; LABIT, C. "Estimation de Mouvement dans une Sequence d'Images de Television" (Estimation of Motion in a Television Picture Sequence), IRISA thesis, Rennes, February 1982; WALKER, D. R., RAO, K. R. "New Technique in Pel-Recursive Motion Compensation" ICC 1984, Amsterdam, pp. 703-706.
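These known estimation methods come up, in fact, against three types of limits:
limits related to the algorithmic method chosen;
limits related to the recursive design of most of the algorithms;
limits related to the choice proposed for the starting hypothesis of the estimation algorithms.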
With respect to the limits for algorithmic methods, the known methods can essentially be classified under two groups: algorithms that seek the components of the speed vector attached to a block of picture points (block matching) and algorithms that seek the components of the speed vector attached to a single picture point. The criteria used to choose either of the algorithmic methods are essentially related to the complexity of the processing operations used, and to the psycho-visual perception of the relative efficiency attached to each technique.
For the method according to the invention, it has been chosen to work preferably with a pel-recursive motion estimator and, preferably but not restrictively, with the motion estimator as described by WALKER and RAO. The reasons for this choice, which are part of the inventive step that has resulted in the method, shall appear below.
It will be noted that the method nevertheless applies equally well to block motion estimation, each block then being represented by a single representative value, which may be a vector.
The second limit is related to the recursive character of most of the known algorithms. Recursivity has the drawback of requiring several computation loops to estimate the motion of a point. These operations are, therefore, necessarily sequential, since the order n estimation can be assessed only after the order n-1 estimation is known. At current TV frequencies, such sequential processing is incompatible with the time available per picture point or, at the very least, disadvantageous.
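The sequential character of a recursive estimation can be seen in the following one-dimensional sketch. It uses a gradient update of the general form proposed by NETRAVALI and ROBBINS, in which the estimate is corrected by the displaced frame difference multiplied by the spatial gradient of the previous picture. The signal, the gain value, the helper names and the linear interpolation are assumptions made for the example; this is not the WALKER and RAO estimator.

```python
# One-dimensional pel-recursive estimation sketch (gradient update).
# d_n = d_(n-1) - gain * DFD(x, d_(n-1)) * gradient of prev at (x - d_(n-1))
# All names and numerical values are illustrative assumptions.

def sample(signal, x):
    # Linear interpolation of a sampled signal at a non-integer position.
    x = min(max(x, 0.0), len(signal) - 1.0)
    i = min(int(x), len(signal) - 2)
    frac = x - i
    return (1 - frac) * signal[i] + frac * signal[i + 1]

def pel_recursive(prev, cur, x, d_start, gain, iterations):
    d = d_start  # starting hypothesis (prediction) for the displacement
    history = [d]
    for _ in range(iterations):
        # Each pass needs the estimate of the previous pass: the loop
        # cannot be unrolled into independent parallel computations.
        dfd = cur[x] - sample(prev, x - d)
        grad = (sample(prev, x - d + 1) - sample(prev, x - d - 1)) / 2
        d = d - gain * dfd * grad
        history.append(d)
    return d, history

TRUE_DISPLACEMENT = 1.5
prev = [2.0 * i for i in range(32)]                        # luminance ramp
cur = [2.0 * (i - TRUE_DISPLACEMENT) for i in range(32)]   # same ramp, displaced

estimate, history = pel_recursive(prev, cur, x=10, d_start=0.0, gain=0.1, iterations=20)
print(estimate)
```

In this toy case the error is multiplied by a constant factor at each pass, so the number of passes needed depends directly on how far the starting hypothesis is from the true displacement, which is why the choice of initialization, discussed next, matters.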
Finally, a third type of limitation is related to the initialization mode presently recommended for known motion estimation algorithms, and essentially for estimation algorithms known as pel-recursive algorithms. Moreover, these initializing modes are generally related to the algorithmic technique and to the mode of recursivity chosen. From this point of view, it is possible to distinguish two main techniques of recursion corresponding to an improvement in point motion estimation, depending either on a spatial interpolation (see, for example, A. N. NETRAVALI, J. D. ROBBINS, already cited; P. ROBERT, C. CAFFORIO, F. ROCCA "Time/Space Recursions for Differential Motion Estimation", 2nd Internat. Tech. Symp. on Optical and Electro Optical Applied Science and Engineering, Cannes, December 1985; B. K. P. HORN, B. G. SCHUNCK, "Determining Optical Flow", Artificial Intelligence, Vol. 17, pp. 185-203, 1981; W. ENKELMANN, "Investigations of Multigrid Algorithms for the Estimation of Optical Flow Fields in Image Sequences", Workshop on Motion: Rep. and Analysis, IEEE, May 1986, Charleston), or on a temporal interpolation (Y. NINOMIYA, Y. OHTSUKA, "A Motion Compensated Interframe Coding Scheme for Television Pictures", IEEE Transactions, Vol. Com. 30, No. 1, January 1982, pages 201-211; R. PAQUIN, E. DUBOIS "A Spatio-Temporal Gradient Method for Estimating the Displacement Vector Field in Time-Varying Imagery", Computer Vision, Graphics and Image Processing, Vol. 21, 1983, pp. 205-221). The temporal interpolation is more especially suited to the processing of still picture sequences whereas the spatial interpolation is essentially satisfactory for fast picture sequences. By contrast, the available recursion techniques are ill-suited to slow motions, for which they show a high degree of directional streaking, causing substantial and incoherent pollution of the picture encoding operations.