Motion prediction and motion compensation are important technologies in video compression. A portion of the bits in a compressed video bitstream is used to transmit motion vector information. Especially at a low bit rate, for a high-definition video, the bits consumed to transmit motion vector information generally account for over 50% of the total number of bits in the bitstream. Therefore, an optimal motion vector needs to be selected to improve coding efficiency. In video coding of continuous dynamic images, the images are divided into three types: P-frames, B-frames, and I-frames. For a P-frame, frame data is compressed by prediction according to the correlation between the P-frame and a previous adjacent frame (an I-frame or a P-frame). For a B-frame, frame data is compressed by prediction according to the correlation among a previous adjacent frame, the B-frame, and a next frame. Because of this difference between the P-frame and the B-frame, during selection of a motion vector, a motion vector set is acquired for the P-frame according to its previous frame only, whereas a motion vector set is acquired for the B-frame according to both its previous frame and its next frame.
A motion vector describes the motion offset between a frame and an adjacent reference frame. To improve accuracy in inter-frame prediction, the prior art employs a non-integer pixel interpolation technology. FIG. 1 shows the positions of ½ accuracy pixels and ¼ accuracy pixels relative to the pixels of a reference block in an adjacent frame used in inter-frame prediction. Upper-case letters A/B/C/D/E/F . . . denote integer pixel accuracy positions, lower-case letters b/h/j/m/t/aa/hh/dd/ee . . . denote ½ accuracy pixel positions, and lower-case letters a/c/d/e/g/i/k/n/p/q/r . . . denote ¼ accuracy pixel positions. A pixel in an integer pixel position is an original pixel of the image. A pixel in a ½ accuracy position or a ¼ accuracy position is a pixel in a non-integer pixel position and is acquired by interpolation from integer pixels by using an interpolation filter. For example, the ½ pixel b may be acquired by applying an interpolation filter (1, −5, 20, 20, −5, 1)/32 to the integer pixel points D/E/F/G/H/I; and the ¼ accuracy pixel a may be acquired by applying an interpolation filter (1, 1)/2 to the integer pixel point F and the ½ pixel point b.
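The interpolation described above can be sketched in one dimension as follows. This is a minimal illustration and not the implementation of any particular codec: the function names, the 8-bit sample range, the rounding offsets, and the border clamping are assumptions introduced here. The sketch normalizes the six-tap filter by 32, the sum of its taps, so that a flat region keeps its original value.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # six-tap filter; the taps sum to 32


def clip8(value):
    """Clamp to the 8-bit range 0..255 (assumed bit depth, not stated in the text)."""
    return max(0, min(255, value))


def half_pel(row, i):
    """Half-accuracy sample between row[i] and row[i + 1].

    Uses the six integer samples row[i - 2] .. row[i + 3]. Indices that fall
    outside the row are clamped to the nearest border sample (assumed padding).
    """
    last = len(row) - 1
    acc = 0
    for k, tap in enumerate(TAPS):
        j = min(max(i - 2 + k, 0), last)
        acc += tap * row[j]
    # Add 16 before shifting right by 5 to divide by 32 with rounding.
    return clip8((acc + 16) >> 5)


def quarter_pel(row, i):
    """Quarter-accuracy sample between row[i] and the half sample to its right.

    Applies the (1, 1)/2 filter with rounding to the integer sample and the
    half-accuracy sample.
    """
    return (row[i] + half_pel(row, i) + 1) >> 1
```

In terms of FIG. 1, row[i] plays the role of F, half_pel(row, i) the role of b, and quarter_pel(row, i) the role of a. For the ramp 0, 10, 20, 30, 40, 50, 60, 70, half_pel(row, 3) yields 35, which is midway between the samples 30 and 40.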
Herein, a B-frame is used as an example. In the prior art, a motion estimation and compensation solution is as follows: Each frame of an image includes several coding blocks; spatial candidate motion vectors of a coding block are acquired according to the motion vectors of its neighboring coding blocks (typically the left coding block, the upper left coding block, the upper coding block, and the upper right coding block), and median motion vectors are calculated according to the spatial candidate motion vectors; the motion vectors of the coding block at the same position in a previous frame, and of the four-neighborhood and eight-neighborhood coding blocks of that co-located block, are acquired as temporal candidate motion vectors; and one or a plurality of optimal motion vectors are selected, from a candidate motion vector set constituted by the spatial candidate motion vectors, the median motion vectors, and the temporal candidate motion vectors, as a forward motion vector and/or a backward motion vector for motion compensation of the current block. Because the same process of selecting an optimal motion vector can be performed at both the coding end and the decoding end, motion vector information does not need to be transmitted, which saves the bits otherwise used to transmit it.
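The construction of the candidate motion vector set can be sketched as follows. This is a simplified illustration under stated assumptions: motion vectors are modeled as (dx, dy) integer tuples, the median is taken separately for each component, duplicates are dropped, and the names median_mv and build_candidate_set are hypothetical. The text does not specify how the median is computed or whether duplicates are removed.

```python
from statistics import median_low


def median_mv(mvs):
    """Component-wise median of a non-empty list of (dx, dy) motion vectors.

    median_low is used so the result is always one of the input component
    values, even for an even number of vectors (an assumed convention).
    """
    xs = [mv[0] for mv in mvs]
    ys = [mv[1] for mv in mvs]
    return (median_low(xs), median_low(ys))


def build_candidate_set(spatial, temporal):
    """Merge spatial, median, and temporal candidates, keeping first occurrences.

    spatial:  motion vectors of the neighboring blocks in the current frame
    temporal: motion vectors of the co-located block and its neighbors in a
              previous frame
    """
    ordered = list(spatial)
    if spatial:
        # The median candidate is derived from the spatial candidates only.
        ordered.append(median_mv(spatial))
    ordered.extend(temporal)

    candidates = []
    for mv in ordered:
        if mv not in candidates:  # drop duplicates, preserve order
            candidates.append(mv)
    return candidates
```

Because both the coding end and the decoding end can build this set from already reconstructed blocks, the two ends obtain the same list without any motion vector being transmitted.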
A typical selection process is as follows: for each piece of motion vector information in the candidate motion vector set of a coding block, the reference block that it points to in a forward or backward reference frame is used as a template block; a mirror position of the motion vector information is used to acquire the block corresponding to the template in the backward or forward reference frame; the difference between each template block and its corresponding block is calculated (a mean square error or a sum of absolute pixel differences may be used); and the motion vector information having the smallest difference is selected as the optimal motion vector of the current coding block. Motion prediction or compensation is then performed by using the selected motion vector, to implement coding and decoding.
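The selection process can be sketched as follows for integer pixel positions only. Several points are assumptions made for illustration: frames are lists of rows of integer samples, the mirrored position is taken to be the negated motion vector (which presumes the forward and backward reference frames are equally distant in time from the current frame), the difference measure is the sum of absolute differences, candidates whose blocks leave the frame are skipped, and the names block_at, sad, and select_mv are hypothetical.

```python
def block_at(frame, x, y, size):
    """Return the size x size block with top-left sample (x, y), or None if it
    does not lie entirely inside the frame."""
    height, width = len(frame), len(frame[0])
    if x < 0 or y < 0 or x + size > width or y + size > height:
        return None
    return [row[x:x + size] for row in frame[y:y + size]]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(p - q)
        for row_a, row_b in zip(block_a, block_b)
        for p, q in zip(row_a, row_b)
    )


def select_mv(fwd_ref, bwd_ref, x, y, size, candidates):
    """Pick the candidate whose forward template best matches its mirrored block.

    (x, y) is the top-left corner of the current block. Returns (best_mv, cost),
    or (None, None) if no candidate stays inside both reference frames.
    """
    best_mv, best_cost = None, None
    for dx, dy in candidates:
        template = block_at(fwd_ref, x + dx, y + dy, size)
        mirrored = block_at(bwd_ref, x - dx, y - dy, size)  # mirrored vector
        if template is None or mirrored is None:
            continue
        cost = sad(template, mirrored)
        if best_cost is None or cost < best_cost:
            best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

Note that the current block itself is never read, only the two reference frames, which is why a decoding end can repeat the same selection. A sketch that supports non-integer candidates would additionally interpolate the reference samples before computing the difference.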
The motion prediction and compensation can be accurate to a non-integer pixel position. Therefore, in motion prediction and compensation at the coding and decoding ends, during selection of an optimal motion vector, the positions pointed to by a large number of candidate motion vectors need to be searched, and an optimal motion vector needs to be selected from the candidate motion vectors and used in prediction and compensation. Because the candidate motion vectors are motion vector information of image blocks that are temporally and spatially related neighbors of the current coding or decoding block, they are not accurate enough when used directly as the motion vectors of the current block. However, conducting a pixel-level motion search is extremely complex. Therefore, improving the coding performance of the existing solution while maintaining reasonable complexity is a critical issue.