The disclosed embodiments of the present invention relate to video coding, and more particularly, to a video coding method using at least evaluated visual quality determined by one or more visual quality metrics and a related video coding apparatus.
Conventional video coding standards generally adopt a block based (or coding unit based) coding technique to exploit spatial redundancy. For example, the basic approach is to divide the whole source frame into a plurality of blocks (coding units), perform prediction on each block (coding unit), transform the residues of each block (coding unit) using a discrete cosine transform (DCT), and then perform quantization and entropy encoding. In addition, a reconstructed frame is generated in a coding loop to provide reference pixel data used for coding subsequent blocks (coding units). For certain video coding standards, one or more in-loop filters may be used to enhance the image quality of the reconstructed frame. For example, a de-blocking filter is included in an H.264 coding loop, and a de-blocking filter and a sample adaptive offset (SAO) filter are included in an HEVC (High Efficiency Video Coding) coding loop.
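By way of illustration only, the block based flow described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the coding loop of any particular standard: the 4x4 block size, the simple DC prediction taken from the reconstructed left neighbor, the uniform quantization step, and all function names are hypothetical choices made for brevity. Entropy encoding and in-loop filtering are omitted, and the frame dimensions are assumed to be multiples of the block size.

```python
import math

N = 4  # hypothetical block size, chosen only to keep the sketch short


def dct_matrix(n):
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


C = dct_matrix(N)
CT = [list(col) for col in zip(*C)]  # transpose of C


def encode_block(block, pred, qstep):
    """Prediction residue -> 2-D DCT -> uniform quantization."""
    residue = [[block[y][x] - pred[y][x] for x in range(N)] for y in range(N)]
    coeffs = matmul(matmul(C, residue), CT)
    return [[int(round(c / qstep)) for c in row] for row in coeffs]


def reconstruct_block(levels, pred, qstep):
    """Inverse quantization -> inverse DCT -> add prediction, clip to 8 bits."""
    coeffs = [[lv * qstep for lv in row] for row in levels]
    residue = matmul(matmul(CT, coeffs), C)
    return [[min(255, max(0, int(round(pred[y][x] + residue[y][x]))))
             for x in range(N)] for y in range(N)]


def code_frame(frame, qstep):
    """Code a frame block by block; return quantized levels and the reconstructed frame."""
    height, width = len(frame), len(frame[0])
    recon = [[0] * width for _ in range(height)]
    all_levels = []
    for by in range(0, height, N):
        for bx in range(0, width, N):
            block = [row[bx:bx + N] for row in frame[by:by + N]]
            # Assumed predictor: mean of the reconstructed column to the left,
            # or mid-gray when no left neighbor exists.
            if bx > 0:
                dc = round(sum(recon[by + y][bx - 1] for y in range(N)) / N)
            else:
                dc = 128
            pred = [[dc] * N for _ in range(N)]
            levels = encode_block(block, pred, qstep)
            rec = reconstruct_block(levels, pred, qstep)
            # The reconstructed pixels become reference data for later blocks.
            for y in range(N):
                recon[by + y][bx:bx + N] = rec[y]
            all_levels.append(levels)
    return all_levels, recon
```

Note that the prediction is formed from reconstructed pixels rather than source pixels, which mirrors the role of the reconstructed frame in the coding loop: a decoder only has access to reconstructed data, so the encoder must predict from the same data.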
For many applications (e.g., a video streaming application), the transmission channel in use typically has a limited transmission bandwidth. Under this circumstance, the encoder's output bitrate must be regulated to meet the transmission bandwidth requirement. Thus, rate control may play an important role in video coding. In general, a conventional rate control algorithm performs bit allocation based on pixel-based distortion, such as the spatial activity (image complexity) of a source frame to be encoded. However, the pixel-based distortion merely considers source content complexity, and sometimes is not correlated with the actual visual quality of a reconstructed frame generated from decoding an encoded frame. Specifically, based on experimental results, different processed images, each derived from an original image and having the same distortion (e.g., the same mean square error (MSE)) with respect to the original image, may present different visual quality to a viewer. That is, smaller pixel-based distortion does not necessarily mean better visual quality as perceived by the human visual system. Hence, an encoded frame generated under the conventional distortion-based rate control mechanism does not guarantee that a reconstructed frame generated from decoding the encoded frame would have the best visual quality.
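The observation that equal MSE does not imply equal visual quality may be illustrated with a small numerical sketch. The example below is an assumption-laden illustration rather than an experiment from this disclosure: the 8x8 gradient test image, the two distortion patterns, and the function names are hypothetical, and the structural similarity score is computed over a single global window, which is a simplification of the usual locally windowed SSIM computation and is used here only as a stand-in for a visual quality metric.

```python
def mean(values):
    return sum(values) / len(values)


def mse(a, b):
    """Mean square error between two equally sized pixel lists."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def global_ssim(a, b, peak=255.0):
    """Single-window SSIM (simplified; standard SSIM averages local windows)."""
    c1 = (0.01 * peak) ** 2
    c2 = (0.03 * peak) ** 2
    mu_a, mu_b = mean(a), mean(b)
    var_a = mean([(p - mu_a) ** 2 for p in a])
    var_b = mean([(q - mu_b) ** 2 for q in b])
    cov = mean([(p - mu_a) * (q - mu_b) for p, q in zip(a, b)])
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))


SIZE = 8
# Hypothetical original image: a smooth gradient stored in raster order.
original = [100 + 4 * x + 2 * y for y in range(SIZE) for x in range(SIZE)]

# Distortion A: uniform brightness shift of +4 on every pixel.
shifted = [p + 4 for p in original]

# Distortion B: checkerboard pattern of +4 / -4, same error energy as A.
checker = [p + (4 if (i % SIZE + i // SIZE) % 2 == 0 else -4)
           for i, p in enumerate(original)]

print("MSE  shift  :", mse(original, shifted))
print("MSE  checker:", mse(original, checker))
print("SSIM shift  :", global_ssim(original, shifted))
print("SSIM checker:", global_ssim(original, checker))
```

Both processed images have identical MSE with respect to the original, yet the structure-aware score ranks the uniform shift above the checkerboard pattern, because the shift preserves the image structure while the checkerboard adds high-frequency texture that was not in the source.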