The video received by a user in a networked video application (e.g., video streaming or video telephony) differs from the original video. The difference, or distortion, is incurred, for example, by quantization applied at the video encoder and by bit errors and packet losses during transmission. The latter, referred to as “channel-induced distortion” (or “channel distortion” for short), depends on many factors including, for example, the characteristics of the channel errors, the error resilience features applied at the encoder, the error concealment techniques employed at the decoder, and the motion and texture content of the underlying sequence. Accurate estimation of the channel-induced distortion enables a video service provider to optimally select operating parameters of the source encoder, the channel encoder, and other transport error control mechanisms, so as to maximize the received video quality for a given channel bandwidth.
In the prior art, an analytical model was developed which relates the average channel-induced distortion to the packet loss rate and the intra-rate, by modeling the spatial-temporal error propagation behavior as a leaky filter. However, it is difficult to attach physical meaning to the parameters of this model.
Also in the prior art, the so-called ROPE (Recursive Optimal Per-pixel Estimate) method is known, which recursively calculates the expected difference between the original and decoded value at each pixel. The ROPE method can be used to calculate the expected distortion for a new macroblock under different coding modes (inter vs. intra), so that the encoder can choose the mode that leads to the minimal distortion. However, the ROPE method cannot be used to determine the average intra-rate for a given channel loss rate before actual encoding. Moreover, the ROPE method is computationally intensive. Finally, the ROPE method is applicable only when the encoder employs only integer motion vectors for temporal prediction, and when the decoder uses a simple error concealment method that, for any lost block in the current frame, copies the co-located block in the previously reconstructed frame.
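By way of illustration only, the per-pixel recursion underlying a ROPE-style estimate may be sketched as follows for the restricted case noted above (integer motion vectors and co-located copy concealment). The sketch tracks the first and second moments of the decoder-side pixel value and assumes a single loss probability p per block; the function and variable names are hypothetical and are not taken from any particular prior-art implementation.

```python
# Illustrative ROPE-style moment recursion for ONE pixel (assumed, simplified form).
# m1 = E[decoded value], m2 = E[decoded value squared].
# "same" = co-located pixel in the previous frame (used for concealment);
# "ref"  = pixel in the previous frame pointed to by an integer motion vector.

def rope_intra(recon, prev_m1_same, prev_m2_same, p):
    """Moments for an intra-coded pixel; `recon` is the encoder reconstruction."""
    m1 = (1 - p) * recon + p * prev_m1_same
    m2 = (1 - p) * recon ** 2 + p * prev_m2_same
    return m1, m2

def rope_inter(residual, prev_m1_ref, prev_m2_ref, prev_m1_same, prev_m2_same, p):
    """Moments for an inter-coded pixel; `residual` is the quantized prediction error."""
    m1 = (1 - p) * (residual + prev_m1_ref) + p * prev_m1_same
    m2 = (1 - p) * (residual ** 2 + 2 * residual * prev_m1_ref + prev_m2_ref) \
        + p * prev_m2_same
    return m1, m2

def expected_distortion(orig, m1, m2):
    """Expected squared error E[(orig - decoded)^2] from the two moments."""
    return orig ** 2 - 2 * orig * m1 + m2

if __name__ == "__main__":
    # Previous-frame co-located pixel is known to be 90 (m1 = 90, m2 = 90^2).
    m1, m2 = rope_intra(recon=100, prev_m1_same=90, prev_m2_same=8100, p=0.1)
    print(expected_distortion(100, m1, m2))
```

Because both moments must be stored and updated for every pixel of every frame, and once per candidate coding mode, the sketch also indicates why such a method is computationally intensive.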
Further in the prior art, a frame-level recursion formula (hereinafter the “conventional frame-level recursion formula”) was developed, which relates the channel-induced distortion in a current frame to that in a previous frame. However, this model is applicable only to the simple error concealment method that, for any lost block in the current frame, copies the co-located block in the previously reconstructed frame.
All of the prior art described above considers error propagation due only to temporal inter-prediction. Further, most prior art methods do not take into account non-integer motion compensation for temporal prediction and concealment, nor do they consider the effect of deblocking filtering.
Spatial intra-prediction and deblocking filtering are two new features of the latest H.264 video coding standard that significantly improve coding efficiency over previous standards.
Accordingly, it would be desirable and highly advantageous to have a method and apparatus for estimating channel-induced distortion that overcome the above-described limitations of the prior art.