This invention relates generally to video encoding, and more particularly to determining distortion characteristics of videos.
A number of video coding standards support variable frameskip, e.g., H.263 and MPEG-4. With variable frameskip, any number of frames of the input video can be skipped during the coding. That is, the frames remain uncoded. With these video coding standards, the encoder may choose to skip frames of a video to either satisfy buffer constraints, or to optimize the video coding process. However, most encoders only skip frames to satisfy buffer constraints. In this case, the coder is forced to skip frames when limitations on the bandwidth cause the buffer to fill up. Consequently, it is not possible to add any additional frames to the buffer, and the frames remain uncoded until the buffer is drained. This type of frame skipping degrades the quality of the video because the content of the video is not considered.
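The buffer-constrained skipping described above can be illustrated with a minimal sketch. The function name, the fixed per-interval drain, and the simple fullness test are all assumptions made for illustration; they are not taken from H.263, MPEG-4, or any particular encoder.

```python
def simulate_buffer_skips(frame_bits, buffer_size, drain_per_frame):
    """Return one flag per frame: True if coded, False if skipped.

    Hypothetical model: a frame is skipped whenever adding its bits
    would overflow the buffer, regardless of the frame's content.
    """
    fullness = 0
    coded = []
    for bits in frame_bits:
        if fullness + bits <= buffer_size:
            fullness += bits          # frame fits, so it is coded
            coded.append(True)
        else:
            coded.append(False)       # buffer too full, frame stays uncoded
        # The channel drains the buffer once per frame interval.
        fullness = max(0, fullness - drain_per_frame)
    return coded


flags = simulate_buffer_skips([600, 600, 600, 600],
                              buffer_size=1000, drain_per_frame=300)
```

In this sketch the skip decision depends only on buffer fullness, which is why this kind of skipping ignores the content of the video.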
Providing an optimal coding method for a video is a problem. Specifically, a particular video could be coded with more frames having a lower spatial quality, or with fewer frames having a higher spatial quality. This trade-off between spatial and temporal quality is not a simple binary decision, but rather a decision over a finite set of coding parameters. The best set of coding parameters is the one that yields the optimal rate-distortion (R-D) curve. The two parameters of interest are the number of frames per second (fps) and a quantization parameter (QP). In the known prior art, the total distortion is measured only for coded frames, and is expressed as the mean-squared error (MSE) between pixels in the original and compressed video.
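As a rough illustration of the prior-art measure, the following sketch averages the MSE over coded frames only. Representing frames as flat lists of pixel values and marking skipped frames with None are assumptions made here for brevity, and the function names are hypothetical.

```python
def frame_mse(original, compressed):
    """Mean-squared error between two equally sized lists of pixel values."""
    if not original or len(original) != len(compressed):
        raise ValueError("frames must be non-empty and of equal size")
    return sum((o - c) ** 2 for o, c in zip(original, compressed)) / len(original)


def coded_only_distortion(originals, reconstructions):
    """Average MSE over coded frames; skipped frames (None) are ignored."""
    errors = [frame_mse(o, r)
              for o, r in zip(originals, reconstructions)
              if r is not None]
    return sum(errors) / len(errors) if errors else 0.0


originals = [[10, 20], [30, 40], [50, 60]]
reconstructions = [[12, 18], None, [50, 64]]   # the middle frame was skipped
distortion = coded_only_distortion(originals, reconstructions)
```

Because the skipped frame contributes nothing to the result, a measure of this kind cannot reflect the loss in temporal quality.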
Prior art optimized coding methods do not consider the temporal aspect of rate-distortion; see H. Sun, W. Kwok, M. Chien, and C. H. John Ju, “MPEG coding performance improvement by jointly optimizing coding mode decision and rate control,” IEEE Trans. Circuits Syst. Video Technol., June 1997; T. Wiegand, M. Lightstone, D. Mukherjee, T. G. Campbell, and S. K. Mitra, “R-D optimized mode selection for very low bit-rate video coding and the emerging H.263 standard,” IEEE Trans. Circuits Syst. Video Technol., April 1996; and J. Lee and B. W. Dickinson, “Rate-distortion optimized frame type selection for MPEG encoding,” IEEE Trans. Circuits Syst. Video Technol., June 1997. Generally, these methods assume that the frame-rate is fixed.
These methods consider optimizations on the quantization parameter (H. Sun, W. Kwok, M. Chien, and C. H. John Ju, “MPEG coding performance improvement by jointly optimizing coding mode decision and rate control,” IEEE Trans. Circuits Syst. Video Technol., June 1997), on mode decisions for motion and block coding (T. Wiegand, M. Lightstone, D. Mukherjee, T. G. Campbell, and S. K. Mitra, “R-D optimized mode selection for very low bit-rate video coding and the emerging H.263 standard,” IEEE Trans. Circuits Syst. Video Technol., April 1996), and on frame-type selection (J. Lee and B. W. Dickinson, “Rate-distortion optimized frame type selection for MPEG encoding,” IEEE Trans. Circuits Syst. Video Technol., June 1997). Such methods can achieve an optimum coding when the frame-rate is fixed and the bit rate can be met for the given frame-rate. However, these methods are less than optimal for varying frame-rates.
It should be noted that the trade-off between spatial and temporal quality, while coding, has been described by F. C. Martins, W. Ding, and E. Feig, in “Joint control of spatial quantization and temporal sampling for very low bit rate video,” Proc. ICASSP, May 1996. However, in their method, the trade-off was achieved with a user-selectable parameter.
Therefore, it is desired to provide a better method for determining distortion in a video.
The present invention provides a method for determining the distortion in a video subject to variable frameskip processing. If the input video is uncompressed and to be coded, then the distortion is an estimate of the distortion in an output compressed video, whereas if the input video is compressed, then the distortion is an actual measure of distortion in the compressed video.
The distortion for coded frames (candidate or actual) is given by a rate-distortion model, and the distortion for uncoded frames (candidate or actual) can be based on an optical flow in the video. The method according to the invention produces accurate distortion values over a range of videos with varying scene complexities. When the input video is uncoded, the method can be used for optimizing the trade-off between spatial and temporal quality in a video coder. For a compressed video, the method can also be used to compare relative quality without access to the original video.
More particularly, a method determines distortion in a video by measuring a spatial distortion in coded frames, and by measuring a temporal distortion and a spatial distortion in uncoded frames. The spatial distortion of the coded frames is combined with the temporal distortion and the spatial distortion of the uncoded frames to determine a total average distortion in the video.
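One way to picture the combination step is the following minimal sketch. It assumes, purely for illustration, that the distortion of each uncoded frame is the sum of its temporal and spatial components and that all frames are weighted equally in the average. The text above does not specify the combining rule, and the function name and inputs are hypothetical.

```python
def total_average_distortion(coded_spatial, uncoded_temporal, uncoded_spatial):
    """Combine per-frame distortions into one average over all frames.

    coded_spatial:    spatial distortion of each coded frame
    uncoded_temporal: temporal distortion of each uncoded frame
    uncoded_spatial:  spatial distortion of each uncoded frame
    Assumption: uncoded-frame distortion = temporal + spatial component.
    """
    if len(uncoded_temporal) != len(uncoded_spatial):
        raise ValueError("need one temporal and one spatial value per uncoded frame")
    n_frames = len(coded_spatial) + len(uncoded_temporal)
    if n_frames == 0:
        raise ValueError("at least one frame is required")
    uncoded = [t + s for t, s in zip(uncoded_temporal, uncoded_spatial)]
    return (sum(coded_spatial) + sum(uncoded)) / n_frames


# Two coded frames and one uncoded (skipped) frame.
avg = total_average_distortion([4.0, 8.0], [10.0], [4.0])
```

Unlike a measure restricted to coded frames, this average rises when frames are skipped, so it reflects both spatial and temporal quality.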