In video coding systems it is well known to use motion estimation techniques to compress the data in a video image signal that represents a sequence of video image frames (frames). An exemplary coding system is defined in CCITT Recommendation H.261. Frequently, the motion between an image frame to be encoded and a reference frame already encoded is estimated. The estimate of the motion is encoded and employed as a representation of the frame. Such a representation can be used to reconstruct the encoded frame if the reference frame is available. To this end, all frames of the video image signal are partitioned into a set of blocks of $Q \times R$ picture elements (pels). Each block of a frame to be encoded is in turn assigned a motion vector, i.e., a displacement $\mathbf{d}$, measured relative to the location of the block (as defined by a predetermined point in each block), that identifies the location of the same-size block in the reference frame that best matches the block. The reference frame is typically a frame in the past relative to the frame to be encoded, and most often it is the immediately preceding frame. The motion vector is thus an estimate of the motion of the block from the time instant of the reference frame until the time instant of the frame to be encoded.
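The reconstruction described above can be sketched in Python as follows. This is a minimal illustration, not the method of any particular coder: the function name `reconstruct_frame`, the representation of a frame as a list of rows of luminance values, the keying of blocks by their top-left pel, and the clamping of displaced locations to the frame border are all assumptions made for the example. The `residual` argument stands for the per-pel error signal, which is added back to the displaced reference pels.

```python
def reconstruct_frame(reference, motion_vectors, residual, block_h, block_w):
    """Rebuild a frame from a reference frame, per-block motion vectors,
    and a per-pel error signal.

    reference      -- list of rows of luminance values (already decoded frame)
    motion_vectors -- dict mapping a block's top-left (row, col) to (dy, dx)
    residual       -- list of rows, same size as reference (error signal)
    block_h/_w     -- block size in pels (R rows by Q columns, say)
    """
    height, width = len(reference), len(reference[0])
    frame = [[0] * width for _ in range(height)]
    for (top, left), (dy, dx) in motion_vectors.items():
        for r in range(top, min(top + block_h, height)):
            for c in range(left, min(left + block_w, width)):
                # Assumption: displaced locations outside the frame are
                # clamped to the nearest border pel.
                rr = min(max(r + dy, 0), height - 1)
                cc = min(max(c + dx, 0), width - 1)
                frame[r][c] = reference[rr][cc] + residual[r][c]
    return frame
```

With an all-zero error signal the output is simply the motion-compensated prediction; a nonzero error signal corrects whatever the displaced blocks fail to capture.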
To select a displacement to be employed as the motion vector, a set of candidate displacements is searched. Typically the candidate set comprises all the displacements that are within a predetermined range of the location of the block in the frame to be encoded. An exhaustive search is used to find the displacement that best satisfies a particular predetermined matching criterion, and that displacement is assigned to be the motion vector. The matching criterion used most often is the minimization of the integral of the absolute error signal, which for discrete pels is a sum over the block, i.e.,

$$\min_{\mathbf{d}} \sum_{\mathbf{x}' \in N(\mathbf{x})} \left| I_n(\mathbf{x}') - I_{n-1}(\mathbf{x}' + \mathbf{d}) \right|$$

where $\mathbf{x}'$ is the location of an individual pel that is a member of $N(\mathbf{x})$, the set of locations of all the pels defining the block located at $\mathbf{x}$ (where $\mathbf{x}$ is the same predetermined point in each block and the value of $\mathbf{x}$ is measured with respect to the lattice structure of a frame). $I_n(\cdot)$ and $I_{n-1}(\cdot)$ are functions that typically yield the luminance values of the pels at the location specified by their arguments in, respectively, the frame to be encoded and the reference frame. It is noted that all displacements and locations are vectors, since they have both a horizontal and a vertical component, and are therefore, like all vectors and matrices, displayed herein in boldfaced type. In some implementations, a good estimate of the optimal displacement is determined and thereafter used as the motion vector, since determining such an estimate requires less search effort, i.e., search time, than searching for the actual optimal displacement. Several well known methods for obtaining such a good estimate are, without limitation, the 3-step search, the logarithmic search and the conjugate direction search. The displacement selected as the motion vector and the corresponding error signal for each block are employed as a representation of the block. They may be further quantized or encoded as appropriate for transmission or storage.
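The exhaustive search under the absolute-error criterion can be illustrated with the following Python sketch. The names `sad` and `full_search`, the use of the top-left pel as the predetermined point of a block, the square search window given by `search_range`, and the skipping of candidates that fall partly outside the reference frame are assumptions of the example rather than requirements of any standard. Ties are resolved in favor of the first candidate encountered in scan order.

```python
def sad(current, reference, top, left, dy, dx, block_h, block_w):
    """Sum of absolute differences between the block at (top, left) in
    `current` and the block displaced by (dy, dx) in `reference`."""
    total = 0
    for r in range(top, top + block_h):
        for c in range(left, left + block_w):
            total += abs(current[r][c] - reference[r + dy][c + dx])
    return total


def full_search(current, reference, top, left, block_h, block_w, search_range):
    """Try every displacement within +/- search_range pels and return
    (best_displacement, best_cost)."""
    height, width = len(reference), len(reference[0])
    best_cost, best_d = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Assumption: candidates reaching outside the reference frame
            # are skipped rather than padded.
            inside = (0 <= top + dy and top + dy + block_h <= height
                      and 0 <= left + dx and left + dx + block_w <= width)
            if not inside:
                continue
            cost = sad(current, reference, top, left, dy, dx, block_h, block_w)
            if best_cost is None or cost < best_cost:
                best_cost, best_d = cost, (dy, dx)
    return best_d, best_cost
```

The cost of this search grows with the square of `search_range`, which is the motivation for the reduced searches named above: they evaluate only a subset of the candidates and accept a good, though not necessarily optimal, displacement.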
Better compression of the data that represents the video images can be achieved by motion compensated interpolation, a motion estimation technique that incorporates an additional reference frame, typically located in the future relative to the frame to be encoded. An interpolative system is used to predict any frames that lie temporally between the two reference frames. For each block to be encoded, the interpolation determines a block in the past reference frame and a block in the future reference frame that, when combined, yield a best approximation of the block to be encoded. The combination is typically a weighted sum of the values of the pels of the selected blocks. The displacements from the block to be encoded to each of the determined blocks ($\mathbf{d}_m$ being a displacement to a block in the reference frame in the past and $\mathbf{d}_p$ being a displacement to a block in the reference frame in the future) are employed as estimates of the motion of the block relative to each of the reference frames and are taken as motion vectors. Additionally, an interpolation error signal is obtained by subtracting the weighted sum from the values of the pels comprising the current block.
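The weighted combination and the resulting interpolation error signal can be sketched as follows. The function name `interpolate_block`, the (row, column) ordering of the displacements, and the weight names `w_m` and `w_p` (for the past and future blocks, respectively) are hypothetical choices for this illustration; the displacements are assumed to have been determined already and to keep both displaced blocks inside their frames.

```python
def interpolate_block(current, past, future, top, left,
                      d_m, d_p, w_m, w_p, block_h, block_w):
    """Return (prediction, error) for the block at (top, left).

    d_m, d_p -- (dy, dx) displacements into the past and future frames
    w_m, w_p -- weights applied to the past and future blocks
    """
    prediction, error = [], []
    for r in range(top, top + block_h):
        pred_row, err_row = [], []
        for c in range(left, left + block_w):
            past_pel = past[r + d_m[0]][c + d_m[1]]
            future_pel = future[r + d_p[0]][c + d_p[1]]
            p = w_m * past_pel + w_p * future_pel
            pred_row.append(p)
            # Error signal: current pel minus the weighted sum.
            err_row.append(current[r][c] - p)
        prediction.append(pred_row)
        error.append(err_row)
    return prediction, error
```

When a block brightens or moves uniformly between the two reference frames, the equally weighted sum tends to leave a small error signal, which is the source of the compression gain described above.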
The weights used for the weighted sum are typically determined by employing either predetermined limitations on the weights or predetermined limitations on the displacement candidates. For example, on a block-by-block basis the weights may be restricted to either an equal contribution (1/2, 1/2) from both displaced blocks or a contribution from just one of the reference frames, (0, 1) or (1, 0) (see ISO MPEG draft proposal). The criterion for selecting the weights is typically to minimize the energy of the interpolation error signal. The motion vectors, error signal and weighting factors for each block are employed as a representation of the block. They may be further quantized or encoded as appropriate for transmission or storage.
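Selection among a restricted set of weights can be sketched as below. The name `select_weights`, the ordering of each weight pair as (past, future), and the use of the sum of squared errors as the energy measure are assumptions of the example; the three candidate pairs are those mentioned above. The inputs are the current block and the two already displaced reference blocks, each given as a list of rows.

```python
# Assumed ordering of each pair: (weight for past block, weight for future block).
CANDIDATE_WEIGHTS = [(0.5, 0.5), (1.0, 0.0), (0.0, 1.0)]


def select_weights(current_block, past_block, future_block,
                   candidates=CANDIDATE_WEIGHTS):
    """Return (weights, energy) for the candidate pair that minimizes
    the energy (sum of squared values) of the interpolation error."""
    best_weights, best_energy = None, None
    for w_m, w_p in candidates:
        energy = 0.0
        for cur_row, past_row, fut_row in zip(current_block, past_block,
                                              future_block):
            for cur, p, f in zip(cur_row, past_row, fut_row):
                err = cur - (w_m * p + w_p * f)
                energy += err * err
        # Strict comparison keeps the earliest candidate on ties.
        if best_energy is None or energy < best_energy:
            best_weights, best_energy = (w_m, w_p), energy
    return best_weights, best_energy
```

Restricting the choice to a few fixed pairs keeps the side information small, since only an index into the candidate list needs to be conveyed for each block.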
These prior compression techniques do not make optimal use of the available bandwidth and therefore require a higher bandwidth to provide an optimally reconstructed image.