Motion prediction and motion compensation are important technologies in video compression. A portion of the bits in a compressed video bitstream is used to transmit motion vector information. In particular, at a low bit rate, the bits consumed to transmit motion vector information for a high-definition video generally account for over 50% of the total number of bits in the bitstream. Therefore, an optimal motion vector needs to be selected to improve coding efficiency. In video coding of continuous dynamic images, a plurality of continuous images are divided into three types: P-frames, B-frames, and I-frames. For a P-frame, frame data is compressed by prediction according to the correlation between the P-frame and a previous adjacent frame (an I-frame or a P-frame). For a B-frame, frame data is compressed by prediction according to the correlation between the B-frame and both a previous adjacent frame and a next frame. Because of this difference between the P-frame and the B-frame, a motion vector set is acquired for the P-frame according to its previous frame only, whereas a motion vector set is acquired for the B-frame according to both its previous frame and its next frame.
A motion vector describes the motion offset between a frame and an adjacent reference frame. To improve the accuracy of inter-frame prediction, the prior art employs a non-integer pixel interpolation technology. FIG. 1 shows the positions of ½ accuracy pixels and ¼ accuracy pixels relative to the pixels of a block in an adjacent frame that serves as a reference block in inter-frame prediction. Upper-case letters A/B/C/D/E/F . . . denote integer pixel accuracy positions, lower-case letters b/h/j/m/t/aa/hh/dd/ee . . . denote ½ accuracy pixel positions, and lower-case letters a/c/d/e/g/i/k/n/p/q/r . . . denote ¼ accuracy pixel positions. A pixel in an integer pixel position is an original pixel of the image. Pixels in ½ accuracy positions and ¼ accuracy positions are pixels in non-integer pixel positions and are acquired by interpolation from integer pixels by using an interpolation filter. For example, the ½ accuracy pixel b may be acquired by applying the interpolation filter (1, −5, 20, 20, −5, 1)/32 to the integer pixel points D/E/F/G/H/I, and the ¼ accuracy pixel a may be acquired by applying the interpolation filter (1, 1)/2 to the integer pixel point F and the ½ accuracy pixel point b.
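The interpolation described above can be illustrated with a minimal sketch. The function names `half_pel` and `quarter_pel` are hypothetical, and the rounding offsets and the clipping to an 8-bit range are assumptions that the description does not specify. The six taps sum to 32, so the sketch normalizes by 32 to leave a flat region unchanged.

```python
# Hypothetical helpers illustrating sub-pixel interpolation.
# Assumptions: 8-bit samples, round-to-nearest, result clipped to [0, 255].

HALF_PEL_TAPS = (1, -5, 20, 20, -5, 1)  # taps sum to 32


def half_pel(samples):
    """Interpolate a 1/2 accuracy pixel from six integer pixels in a row."""
    if len(samples) != len(HALF_PEL_TAPS):
        raise ValueError("expected six integer-pixel samples")
    acc = sum(tap * s for tap, s in zip(HALF_PEL_TAPS, samples))
    # Add 16 before the shift by 5 (division by 32) to round to nearest.
    value = (acc + 16) >> 5
    return min(255, max(0, value))


def quarter_pel(integer_pixel, half_pixel):
    """Interpolate a 1/4 accuracy pixel with the (1, 1)/2 filter."""
    # Add 1 before the shift by 1 (division by 2) to round to nearest.
    return (integer_pixel + half_pixel + 1) >> 1


# Example: pixel b from D..I, then pixel a from F and b.
row_d_to_i = [10, 20, 30, 40, 50, 60]
b = half_pel(row_d_to_i)
a = quarter_pel(row_d_to_i[2], b)
```

Each ½ accuracy sample costs six multiply-accumulate operations, and each ¼ accuracy sample additionally depends on a ½ accuracy sample, which is why sub-pixel positions are more expensive to evaluate than integer positions.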
Herein, a B-frame is used as an example. In the prior art, a motion estimation and compensation solution is as follows: Each frame of an image includes several coding blocks; spatial candidate motion vectors of a coding block are acquired according to motion vectors of neighboring coding blocks (typically the left coding block, the upper left coding block, the upper coding block, and the upper right coding block), and median motion vectors are calculated according to the spatial candidate motion vectors; motion vectors of the coding block in the same position in a previous frame, and motion vectors of its four-neighboring and eight-neighboring coding blocks, are acquired to obtain temporal candidate motion vectors; one or a plurality of optimal motion vectors are selected, from a candidate motion vector set constituted by the spatial candidate motion vectors, the median motion vectors, and the temporal candidate motion vectors, as a forward motion vector and/or a backward motion vector for motion compensation of the current block. The same process of selecting an optimal motion vector may be applied at both a coder and a decoder. Therefore, motion vector information does not need to be transmitted, which saves the bits otherwise used for transmitting it. A typical selection process is as follows: for each piece of motion information in the candidate motion vector set of a coding block, using the reference block that the motion information points to in a forward or backward reference frame as a template block; using the mirror position of the motion vector information to acquire a block corresponding to the template block in the forward or backward reference frame; calculating the difference between each template block and its corresponding block (a mean square error or a sum of absolute pixel differences may be used); and selecting the motion information having the smallest difference as the optimal motion vector of the current coding block.
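As an illustration of the median step, the following minimal sketch computes a median motion vector from the spatial candidates. It assumes a component-wise median and, for an even number of candidates, takes the lower of the two middle values so that the result stays an integer; neither choice is stated in the description, and the function name `median_motion_vector` is hypothetical.

```python
from statistics import median_low


def median_motion_vector(candidates):
    """Return the component-wise median of (mvx, mvy) candidates.

    Assumption: the median is taken separately for the horizontal and
    vertical components; median_low keeps the result on the integer grid.
    """
    if not candidates:
        raise ValueError("at least one candidate motion vector is required")
    xs = [mv[0] for mv in candidates]
    ys = [mv[1] for mv in candidates]
    return (median_low(xs), median_low(ys))


# Example: left, upper left, upper, and upper right neighbors.
spatial = [(4, 0), (2, -2), (6, 2), (3, 1)]
median_mv = median_motion_vector(spatial)
```

Because the median is taken per component, the resulting vector need not coincide with any single spatial candidate, which is consistent with the median motion vectors being listed as separate members of the candidate set.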
Motion prediction and compensation are then performed by using the selected motion vector, to implement coding and decoding.
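The typical selection process described above can be sketched as follows. This is a simplified illustration under stated assumptions: motion vectors have integer pixel accuracy, the template block is taken from the forward reference frame, the mirrored block is taken from the backward reference frame at the negated offset, frames are lists of rows, and the sum of absolute differences is the difference measure. All function names are hypothetical.

```python
def in_bounds(frame, x, y, size):
    """Check that a size x size block at column x, row y lies in the frame."""
    return 0 <= x and 0 <= y and x + size <= len(frame[0]) and y + size <= len(frame)


def block(frame, x, y, size):
    """Extract a size x size block whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in frame[y:y + size]]


def sad(block_a, block_b):
    """Sum of absolute pixel differences between two equally sized blocks."""
    return sum(abs(p - q)
               for row_a, row_b in zip(block_a, block_b)
               for p, q in zip(row_a, row_b))


def select_mirror_mv(fwd_ref, bwd_ref, x, y, size, candidates):
    """Pick the candidate whose template and mirrored blocks differ least.

    Returns (best_mv, best_cost), or (None, None) if no candidate keeps
    both blocks inside the frames.
    """
    best_mv, best_cost = None, None
    for mvx, mvy in candidates:
        # Template block: where the candidate points in the forward reference.
        # Mirrored block: the negated offset in the backward reference.
        if not (in_bounds(fwd_ref, x + mvx, y + mvy, size)
                and in_bounds(bwd_ref, x - mvx, y - mvy, size)):
            continue
        cost = sad(block(fwd_ref, x + mvx, y + mvy, size),
                   block(bwd_ref, x - mvx, y - mvy, size))
        if best_cost is None or cost < best_cost:
            best_mv, best_cost = (mvx, mvy), cost
    return best_mv, best_cost
```

Because the selection uses only the two reference frames, which are available to both the coder and the decoder, both sides can run the same function on the same candidate set and arrive at the same motion vector without it being transmitted.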
In the existing solutions, the candidate motion vector set contains many motion vectors of non-integer pixel accuracy. Therefore, when an optimal motion vector is calculated by using these motion vectors, a large number of sub-pixel interpolation operations need to be performed, resulting in high complexity.
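To make the source of this complexity concrete, the following minimal sketch counts how many candidates would require sub-pixel interpolation. It assumes that motion vector components are stored as integers in ¼ pixel units, so that a component is at an integer pixel position only when it is a multiple of 4; this storage convention and the function names are assumptions for illustration.

```python
def needs_interpolation(mv, units_per_pixel=4):
    """True if either component of mv is not at an integer pixel position.

    Assumption: components are integers in 1/units_per_pixel pixel units.
    """
    return mv[0] % units_per_pixel != 0 or mv[1] % units_per_pixel != 0


def count_fractional(candidates, units_per_pixel=4):
    """Count the candidates whose evaluation needs sub-pixel interpolation."""
    return sum(1 for mv in candidates if needs_interpolation(mv, units_per_pixel))


# Example candidate set in quarter-pixel units.
candidate_set = [(4, 0), (5, -4), (-8, 8), (2, 2), (-3, 0)]
fractional = count_fractional(candidate_set)
```

Every candidate counted here requires interpolated reference samples for each pixel of the blocks being compared, whereas the remaining candidates can read original integer pixels directly.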