1. Technical Field
Embodiments of the present disclosure relate generally to video coding, and more specifically to techniques for perceptual encoding of video frames.
2. Related Art
Video frames generally refer to images representing moving pictures or a static scene. Video frames may be generated and displayed at rates (e.g., thirty frames per second) suitable to create the impression of continuity between the video frames to a viewer. Video frames are typically encoded prior to transmission and/or storage. The encoding may include operations such as compression, encryption, quantization, etc. At a receiving or display end, the encoded video frames are decoded to reconstruct the original frames, or an approximation of them when the encoding is lossy, prior to display.
Perceptual encoding, or perceptual video coding, refers to video encoding techniques that make use of perceptual properties of the human visual system (HVS) in the encoding operations. For example, video frames may contain ‘redundancies’, in that the HVS does not perceive, or is less sensitive to, some of the details of the video frames. Consequently, details (or characteristics, in general) of a video frame to which the HVS is less sensitive, and which are therefore deemed to have less perceptual effect, may be treated differently in the encoding operations than details that are deemed to have a relatively greater perceptual effect.
As an example, the sensitivity of the HVS to noise and encoding artifacts in a video frame varies with the amount of ‘texture’ in the video frame. In general, the human eye is less sensitive to noise in highly textured regions of a video frame than to the same amount of noise in a region of the video frame with less texture. This psycho-visual property of the HVS is known as “Texture Masking”. The picture signal itself acts as a masker and, to some extent, masks the quantization artifacts/noise present in the signal. Different regions within a picture may have different amounts of texture or spatial detail. Coarse quantization (using fewer bits to represent a frame or a macro-block) may therefore be more noticeable in relatively ‘flat’ regions of a frame than in regions of the frame which have high texture content. Video encoding may be designed to exploit such properties of the HVS in the encoding operations.
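As a purely illustrative sketch of how texture masking might be exploited, the following Python fragment measures the texture of a block by the variance of its sample values and quantizes flat blocks more finely than textured ones. Everything specific in it is an assumption made for illustration and is not taken from this disclosure: the use of variance as the texture measure, the function names, the variance thresholds of 25 and 400, the maximum offset of 4, and the quantization parameter range of 0 to 51.

```python
from statistics import pvariance


def block_variance(block):
    """Population variance of the samples in a 2-D block (a list of rows).

    Variance is used here as a simple, assumed stand-in for 'texture'.
    """
    samples = [sample for row in block for sample in row]
    return pvariance(samples)


def qp_offset(variance, low=25.0, high=400.0, max_offset=4):
    """Map a block variance to a quantization parameter (QP) offset.

    Flat blocks (variance <= low) get a negative offset (finer quantization),
    textured blocks (variance >= high) get a positive offset (coarser
    quantization). The thresholds and max_offset are hypothetical values.
    """
    if variance <= low:
        return -max_offset
    if variance >= high:
        return max_offset
    # Linear interpolation between the two assumed thresholds.
    fraction = (variance - low) / (high - low)
    return round(-max_offset + 2 * max_offset * fraction)


def block_qp(base_qp, block, qp_min=0, qp_max=51):
    """Per-block QP: the base QP plus the texture offset, clamped to an
    assumed valid range of [qp_min, qp_max]."""
    qp = base_qp + qp_offset(block_variance(block))
    return max(qp_min, min(qp_max, qp))


# A flat block (all samples equal) and a highly textured checkerboard block.
flat = [[128] * 4 for _ in range(4)]
textured = [[0, 255, 0, 255],
            [255, 0, 255, 0],
            [0, 255, 0, 255],
            [255, 0, 255, 0]]

flat_qp = block_qp(30, flat)          # lower QP: finer quantization
textured_qp = block_qp(30, textured)  # higher QP: coarser quantization
```

With a base value of 30, the flat block receives a lower quantization parameter than the textured block, so more bits are spent where quantization noise would be more visible. A practical encoder would likely use a more elaborate texture measure and would have to reconcile such offsets with its rate control.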