Video coding, for example, under video coding standards such as H.264/Advanced Video Coding (AVC), H.265/High Efficiency Video Coding (HEVC), and Audio Video Standard (AVS), usually uses a hybrid coding framework, and mainly includes processes such as prediction, transform, quantization, and entropy coding. Video decoding is the process of converting a bitstream into video images, and includes several main processes such as entropy decoding, prediction, dequantization, and inverse transform. First, entropy decoding is performed on the bitstream to parse out encoding mode information and quantized transform coefficients. Then, on one hand, predicted pixels are obtained by using the encoding mode information and decoded reconstructed pixels; on the other hand, the quantized transform coefficients are dequantized to obtain reconstructed transform coefficients, and inverse transform is then performed on the reconstructed transform coefficients to obtain reconstructed residual information. Finally, the reconstructed residual information is added to the predicted pixels to obtain reconstructed pixels, so as to restore the video images.
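The reconstruction path described above can be sketched as follows. This is a minimal illustration of the dequantize / inverse-transform / add-prediction steps; the scalar dequantization and the identity "inverse transform" below are placeholders for the actual transforms defined by standards such as H.264/AVC and H.265/HEVC, not implementations of them.

```python
def dequantize(levels, qstep):
    """Scale quantized levels back into reconstructed transform coefficients
    (simplified uniform scalar dequantization)."""
    return [lv * qstep for lv in levels]

def inverse_transform(coeffs):
    """Toy inverse transform (identity), standing in for the real inverse
    transform that maps coefficients to residual pixel values."""
    return list(coeffs)

def reconstruct(predicted, levels, qstep):
    """Reconstructed pixels = predicted pixels + reconstructed residual."""
    residual = inverse_transform(dequantize(levels, qstep))
    return [p + r for p, r in zip(predicted, residual)]

# Three predicted pixels plus a decoded residual of (+1, -1, 0) levels
# at quantization step 8.
print(reconstruct([100, 100, 100], [1, -1, 0], qstep=8))  # → [108, 92, 100]
```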
For lossy coding, a reconstructed pixel may differ from the corresponding original pixel, and the numerical difference between the two is referred to as distortion. Distortion is generally caused by quantization: a greater quantization parameter (QP) implies a larger quantization step, and therefore stronger distortion, a blurrier image, and in general poorer pixel quality.
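The QP/distortion relationship can be demonstrated numerically. The sketch below quantizes a handful of pixel values at increasing step sizes and measures the resulting mean squared error; the step sizes are chosen arbitrarily for illustration (real codecs derive the step from QP, in H.264/HEVC roughly doubling every 6 QP units).

```python
def quantize_dequantize(pixels, qstep):
    """Round each pixel to the nearest multiple of the quantization step,
    modeling the quantize-then-dequantize round trip of lossy coding."""
    return [round(p / qstep) * qstep for p in pixels]

def mse(a, b):
    """Mean squared error between two equal-length pixel sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

orig = [13, 57, 128, 200, 251]
for qstep in (2, 8, 32):
    rec = quantize_dequantize(orig, qstep)
    print(f"qstep={qstep:2d}  MSE={mse(orig, rec):.1f}")
```

Running this shows the MSE growing monotonically with the step size, i.e. coarser quantization produces stronger distortion.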
Coding based on a knowledge base is an extension of H.264/AVC and H.265/HEVC. A decoder side includes a knowledge base in which some images and/or image regions, referred to as patches, are stored. The images or patches in the knowledge base may come from decoded reconstructed images of the video currently being decoded; for example, some representative images are extracted from the decoded reconstructed images and added to the knowledge base. Alternatively, the images or patches in the knowledge base may not come from reconstructed images of the current video; for example, they may come from reconstructed images or patches obtained by decoding another video, or from multiple images or patches pre-stored by the decoding system, where the pre-stored images or patches may be uncompressed original images. When the current video is decoded, pixel information in the knowledge base may be used; for example, predicted pixel information used during decoding may come from the pixel information in the knowledge base.
In the prediction process, predicted pixels of the original pixels corresponding to a current coding block are generated by using reconstructed pixels of a coded region. Prediction manners mainly fall into two major types: intra-frame prediction (intra prediction) and inter-frame prediction (inter prediction). In the template matching technology of intra-frame coding and the decoder-side motion vector derivation technology of inter-frame coding, a reconstructed image template around the current decoding prediction block is used to search a reconstructed region in the current frame, or to search another reconstructed frame, for one or more nearest-neighbor images with minimum differences from the template of the current decoding block, where the nearest-neighbor images are referred to as matched images. For both technologies, how to evaluate the image difference, or value-space distance, between a template image and a candidate template image in the matching process is a key issue, and directly determines the final search result. Conventional methods for calculating the image difference between two images include, for example, the sum of squared errors (SSE), sum of absolute differences (SAD), mean squared error (MSE), or mean absolute difference (MAD) of the two images' pixel domains, and, for another example, the sum of absolute transformed differences (SATD) of the transform coefficient domains obtained by performing a Hadamard transform on the two images. Image difference calculation also plays a pivotal role in other processing such as image searching and image fusion. In a conventional image difference calculation method, the signal quality improvement of a high-quality image relative to a low-quality image is falsely counted as a difference between the two images, and an image that has a relatively small difference under a conventional calculation method, such as the SSE of two images' pixel domains, may not be visually similar to the image being searched for.
As a result, subsequent image processing results are inaccurate.
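The conventional difference measures named above can be sketched in a few lines. This is an illustrative implementation of the pixel-domain measures (SSE, SAD, MSE, MAD) plus an SATD on a single 2x2 block using a 2x2 Hadamard transform; real codecs typically compute SATD over 4x4 or 8x8 blocks.

```python
def sse(a, b):
    """Sum of squared errors between two pixel sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sad(a, b):
    """Sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def mse(a, b):
    """Mean squared error."""
    return sse(a, b) / len(a)

def mad(a, b):
    """Mean absolute difference."""
    return sad(a, b) / len(a)

def satd_2x2(blk_a, blk_b):
    """SATD on one 2x2 block: Hadamard-transform the residual
    (T = H r H with H = [[1, 1], [1, -1]]), then sum the absolute
    transform coefficients."""
    r = [[blk_a[i][j] - blk_b[i][j] for j in range(2)] for i in range(2)]
    t00 = r[0][0] + r[0][1] + r[1][0] + r[1][1]
    t01 = r[0][0] - r[0][1] + r[1][0] - r[1][1]
    t10 = r[0][0] + r[0][1] - r[1][0] - r[1][1]
    t11 = r[0][0] - r[0][1] - r[1][0] + r[1][1]
    return abs(t00) + abs(t01) + abs(t10) + abs(t11)

a = [10, 20, 30, 40]
b = [12, 18, 30, 44]
print(sse(a, b), sad(a, b), mse(a, b), mad(a, b))  # → 24 8 6.0 2.0
```

Note that all of these measures count any systematic pixel-value shift, including the quality improvement of a high-quality candidate over a low-quality template, as difference, which is exactly the limitation described above.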