Videos include a plurality of sequential pictures displayed one after another. Various techniques exist to convert video into digital form. Digital video is easy to transmit, store, and manipulate. To digitally encode a video, each picture of the video may be digitally encoded. Unfortunately, the resulting encoded picture file may be large and cumbersome. Various compression schemes have been developed to reduce the size of encoded pictures.
Each picture of a video may be encoded individually, either independent of other pictures of the video (intra-coding) or dependent on other pictures of the video (predictive coding). A picture may be organized into slices and further into macroblocks and/or pixel blocks. The encoding process may begin by transforming pixel data of the picture into transform coefficients, for example through a discrete cosine transform. The coefficients are then compressed with a quantizer into quantized coefficients. The quantized coefficients are then encoded by a run-length coder. Further encoding (e.g., entropy encoding) may be used to further compress the resulting bit stream, which is then output to a channel, where it may be transmitted or stored.
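The transform, quantization, and run-length stages described above can be illustrated with a minimal sketch. It is not the encoder of any particular codec: it works on a one-dimensional block of eight samples rather than a two-dimensional pixel block, and the function names, the uniform truncating quantizer, and the "EOB" end-of-block marker are all assumptions made for illustration.

```python
import math


def dct_1d(samples):
    """Orthonormal DCT-II of a 1-D list of samples (illustrative only)."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        total = sum(
            x * math.cos(math.pi * (i + 0.5) * k / n)
            for i, x in enumerate(samples)
        )
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs


def quantize(coeffs, step):
    """Assumed uniform quantizer: divide by the step, truncate toward zero."""
    return [int(c / step) for c in coeffs]


def run_length_encode(levels):
    """Encode levels as (zero_run, level) pairs.

    Trailing zeros are collapsed into a single hypothetical "EOB" marker.
    """
    pairs = []
    run = 0
    for level in levels:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    pairs.append("EOB")
    return pairs


# A flat block has energy only in the first (DC) coefficient, so after
# quantization the run-length coder emits a single pair and the marker.
flat_block = [100] * 8
levels = quantize(dct_1d(flat_block), step=10)
symbols = run_length_encode(levels)
```

In a real encoder the symbols produced here would then pass to an entropy coder before being written to the channel.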
The size of an encoded picture is influenced by its content, and it is therefore difficult to predict precisely the file size of an encoded picture in advance. Generally, the selection of a quantizer is the single most significant factor affecting the resulting encoded picture size. However, changes to the quantizer do not always produce a predictable corresponding change in the picture's size. Only coefficients quantized to a nonzero value with a first, smaller quantizer may become smaller (and therefore more compressible) when quantized with a second, larger quantizer. Any coefficient that is quantized to zero with the first quantizer will remain zero when quantized with the second, larger quantizer, and therefore will not affect the picture size. Thus, the bit rate savings (i.e., the amount of compression) from changing the quantizer depend on the number of nonzero quantized coefficients.
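A small numeric sketch makes the point concrete. The coefficient values and the two step sizes below are hypothetical, and the quantizer is again assumed to be a uniform one that truncates toward zero.

```python
def quantize(coeffs, step):
    """Assumed uniform quantizer: divide by the step, truncate toward zero."""
    return [int(c / step) for c in coeffs]


def nonzero_count(levels):
    """Number of quantized coefficients that are not zero."""
    return sum(1 for level in levels if level != 0)


# Hypothetical transform coefficients for one block.
coeffs = [282.8, -41.0, 17.5, 6.2, -3.9, 1.1, 0.4, -0.2]

small_q = quantize(coeffs, step=4)  # first, smaller quantizer
large_q = quantize(coeffs, step=8)  # second, larger quantizer

# Every coefficient that was already zero under the smaller quantizer is
# still zero under the larger one, so it contributes no further savings.
zeros_stay_zero = all(
    large == 0 for small, large in zip(small_q, large_q) if small == 0
)
```

Here only the four coefficients that were nonzero under the smaller step can shrink when the step is doubled; the other four contribute nothing to the savings.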
Moreover, changing the quantizer also can affect the perceived quality of the picture decoded for display. The larger the quantizer (and the higher the compression ratio), the worse the perceived quality of the picture.
One previous approach to providing an encoded video of a specified size is trial and error. A maximum picture size is determined, and each picture is coded with a first selected quantizer. If the resulting encoded picture exceeds the maximum picture size, a new quantizer is selected and the picture is re-encoded. Thus, several encoding ‘passes’ over the picture may be required before an encoded picture satisfying the maximum picture size is produced.
The trial and error approach is unsatisfactory for several reasons. It can be time- and resource-consuming due to the number of passes required. The trial and error approach also fails to provide an upper bound on the number of passes over a picture, thus rendering it unsuitable for real-time applications.
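The trial and error loop can be sketched as follows. The function `encoded_size_bits` is a hypothetical stand-in for a full encoding pass, and its cost model, the step range, and the policy of raising the step by one per pass are assumptions chosen only to keep the example short.

```python
def encoded_size_bits(coeffs, step):
    """Hypothetical stand-in for one full encoding pass.

    Returns a rough bit cost for the quantized levels using a toy model.
    """
    bits = 0
    for c in coeffs:
        level = int(c / step)
        if level != 0:
            bits += 2 * abs(level).bit_length() + 1
    return bits


def encode_trial_and_error(coeffs, max_bits, start_step=1, max_step=64):
    """Re-encode with ever larger steps until the picture fits max_bits."""
    passes = 0
    for step in range(start_step, max_step + 1):
        passes += 1  # each iteration is one full pass over the picture
        size = encoded_size_bits(coeffs, step)
        if size <= max_bits:
            return step, size, passes
    raise ValueError("no step in the allowed range meets the size limit")


# Hypothetical coefficients and size limit.
coeffs = [282.8, -41.0, 17.5, 6.2, -3.9, 1.1, 0.4, -0.2]
step, size, passes = encode_trial_and_error(coeffs, max_bits=20)
```

The number of passes is known only after the loop finishes and depends on the picture content, which is the property that makes the approach a poor fit for real-time use.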
A second approach to encoding video processes each macroblock in the image sequentially, progressively adjusting the quantizer as the encoder encodes the picture. A typical approach is to calculate the average macroblock size and keep track of the number of bits used so far. Before encoding a macroblock, the encoder checks the number of bits it has used so far. If it has used more bits than allocated, it uses a larger quantization step size for the next macroblock. If it has used fewer bits than allocated, it uses a smaller quantization step size for the next macroblock. Unfortunately, because each decision depends on the macroblocks encoded before it, this sequential approach is difficult to execute simultaneously across a plurality of processors. In addition, a picture may be encoded with many different quantizers, resulting in annoying variance in perceived visual quality from one macroblock to another when decoded and displayed. Further, the same quantization step size is unlikely to be used again when the image is decoded and re-encoded, resulting in nontrivial multi-generational quality loss.
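The sequential scheme can be sketched as below. The per-macroblock cost model, the `complexity` values, the step limits, and the choice to adjust the step by one per macroblock are all assumptions for illustration and are not taken from any particular encoder.

```python
def macroblock_bits(complexity, step):
    """Toy cost model (an assumption): bits fall as the step size grows."""
    return max(1, complexity // step)


def encode_sequentially(complexities, budget_bits, start_step=8,
                        min_step=1, max_step=31):
    """Encode macroblocks in order, nudging the step to track the budget."""
    target_per_mb = budget_bits / len(complexities)  # average macroblock size
    step = start_step
    used = 0
    steps = []
    for index, complexity in enumerate(complexities):
        allocated = target_per_mb * index  # budget for macroblocks so far
        if used > allocated:
            step = min(max_step, step + 1)  # over budget: coarser step
        elif used < allocated:
            step = max(min_step, step - 1)  # under budget: finer step
        steps.append(step)
        # The running total feeds the next decision, so macroblock i
        # cannot be started before macroblocks 0..i-1 have been encoded.
        used += macroblock_bits(complexity, step)
    return steps, used


# Hypothetical picture of four macroblocks with a 200-bit budget.
steps, used = encode_sequentially([800, 800, 100, 100], budget_bits=200)
```

In this run every macroblock ends up with a different step size, and the final total still misses the budget slightly, which illustrates both the quality variance and the data dependence noted above.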