Conventional video compression standards use the PSKIP mode (or the DIRECT mode) as a tool to achieve high compression efficiency. With the PSKIP mode, a macroblock is coded by motion compensation using a default predicted motion vector derived from the motion vectors of surrounding macroblocks, with no transform coefficients to compensate for the prediction error. Therefore, only a flag signaling the use of the PSKIP mode for the macroblock needs to be sent, which often achieves high coding efficiency.
However, because of the compactness of the coded representation of the macroblock, which combines a default (usually sub-optimal) motion vector with the lack of correction of the prediction error by residual information, the distortion between the coded macroblock and the original macroblock is usually high. Furthermore, when macroblocks coded with the PSKIP mode are used as references for subsequently coded macroblocks, the large distortion introduced by PSKIP compression can propagate, resulting in low visual quality over an extended period of time. The low visual quality can be compensated for, but only at the expense of more bits spent in coding subsequent macroblocks. This problem is especially severe when the bitrate is low, as relatively more macroblocks will be coded with the PSKIP mode to achieve the low target bitrate.
Recent video coding standards, such as H.264, and associated implementations make use of rate-distortion based mode decision to search for an optimized rate-distortion trade-off among possible encoding choices such as PSKIP. However, such techniques are incapable of correctly comparing the PSKIP mode with other encoding modes. This is because the virtually zero coding rate of the PSKIP mode makes a “fair” definition of the rate-distortion cost extremely difficult. The traditional cost, distortion+lambda*rate, is useful for comparing the encoding trade-off among other encoding modes, but it reduces to distortion only for the PSKIP mode. The impact of the quantization parameter used for the other encoding modes of the same macroblock, which enters the cost through lambda, cannot be sufficiently taken into account in the calculation because of the zero rate.
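The effect of the zero rate on the traditional cost can be illustrated with a minimal sketch. The function name rd_cost and all distortion and rate values below are hypothetical and chosen only for illustration; they are not taken from H.264 or from any particular encoder.

```python
def rd_cost(distortion, rate_bits, lam):
    """Traditional rate-distortion cost: distortion + lambda * rate."""
    return distortion + lam * rate_bits


# Hypothetical per-macroblock measurements (illustrative values only).
PSKIP_DISTORTION, PSKIP_RATE = 900.0, 0   # PSKIP: virtually zero rate
INTER_DISTORTION, INTER_RATE = 400.0, 60  # a mode that codes residual data

# Lambda normally grows with the quantization parameter, so a range of
# lambda values stands in for a range of quantization settings.
lambdas = (1.0, 5.0, 20.0)

pskip_costs = [rd_cost(PSKIP_DISTORTION, PSKIP_RATE, lam) for lam in lambdas]
inter_costs = [rd_cost(INTER_DISTORTION, INTER_RATE, lam) for lam in lambdas]

for lam, j_skip, j_inter in zip(lambdas, pskip_costs, inter_costs):
    print(f"lambda={lam:5.1f}  J(PSKIP)={j_skip:7.1f}  J(inter)={j_inter:7.1f}")
```

In this sketch the PSKIP cost is the same for every lambda, while the cost of the residual-coding mode changes with lambda. The comparison between the two therefore shifts with the quantization setting without the PSKIP side of the comparison reflecting that setting at all.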
Another possible approach to improving image quality is to prohibit the use of the PSKIP mode altogether. However, such an implementation is also sub-optimal. For low complexity content encoded at low bit rates, many macroblocks will justifiably be encoded with the default predicted motion vector and no residual information. For these macroblocks, if the PSKIP mode is prohibited, a coded representation of the same reconstructed macroblock will entail coding of a NULL Information Pattern. In H.264, each such NULL pattern uses 5 bits per macroblock, corresponding to an overhead of about 200 Kbps for D1 (720×480) resolution coded at 30 frames per second (assuming all macroblocks (MBs) in a frame were coded with the NULL representation instead of the much more efficient PSKIP mode), or a 20% overhead at 1 Mbps.
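The overhead figure above follows from simple arithmetic, sketched here under the assumption of 16×16 macroblocks. The function name and its parameters are hypothetical and used only for illustration.

```python
def null_pattern_overhead_bps(width, height, fps, bits_per_mb=5, mb_size=16):
    """Bits per second spent if every macroblock carries a NULL pattern.

    Assumes mb_size x mb_size macroblocks and frame dimensions that are
    whole multiples of mb_size.
    """
    mbs_per_frame = (width // mb_size) * (height // mb_size)
    return mbs_per_frame * bits_per_mb * fps


# D1 resolution at 30 frames per second: 45 x 30 = 1350 macroblocks per frame.
overhead_bps = null_pattern_overhead_bps(720, 480, 30)
fraction_of_1mbps = overhead_bps / 1_000_000

print(overhead_bps)       # 202500 bits per second, roughly 200 Kbps
print(fraction_of_1mbps)  # about 0.2, i.e. roughly 20% of a 1 Mbps stream
```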
It would be desirable to implement non-residual mode coding of video to take advantage of the efficiency of the non-residual mode while using the non-residual mode only for macroblocks where the non-residual mode is needed and justified.