1. Field of the Invention
The invention relates to a system for bit-rate allocation and, more particularly, to bit-rate allocation for object-based video encoding.
2. Background of the Invention
In object-based video encoding, the input video is separated into two streams: a first stream for a background composite of the video and a second stream for foreground regions of the video. The background composite is stationary and is represented as a composite image (e.g., a single image composed from a series of overlapping images). The background composite is encoded only once in the first stream. The foreground regions, on the other hand, are moving and are encoded for every frame of the video in the second stream. Object-based video encoding thus differs from traditional frame-based encoding, which uses only one stream. As an alternative to conventional approaches for object-based video encoding, generation of the background composite and the foreground regions is discussed in commonly-assigned U.S. patent application Ser. Nos. 09/472,162, filed Dec. 27, 1999, and 09/609,919, filed Jul. 3, 2000, both of which are incorporated herein by reference.
Once the content of the two streams is determined, each stream is encoded at a desired bit rate. An encoder in this context includes a bit-rate allocation algorithm and the mechanics of generating the compressed (i.e., encoded) bit stream. The bit-rate allocation algorithm determines how much each video frame needs to be compressed and which frames need to be dropped to achieve a desired bit rate. If only a single stream is encoded, as in traditional frame-based encoding, all available bits are used by the bit-rate allocation algorithm to encode the single stream. In object-based encoding, which can use multiple streams, an appropriate portion of the available bits must first be assigned to each stream. Once each stream has been assigned its portion of the available bits, the bit-rate allocation algorithm processes each stream. If the bits are apportioned incorrectly between the streams, significant quality differences can arise between the streams when they are reconstructed.
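The apportionment step described above can be illustrated with a minimal sketch. The function name `split_bit_budget` and the parameter `background_fraction` are hypothetical and are introduced only for illustration; the sketch assumes that the total budget is the desired bit rate multiplied by the duration, that the background share is spent once, and that the foreground share is spread evenly over the frames. It does not indicate how the fraction should be chosen.

```python
def split_bit_budget(bit_rate_bps, duration_s, frame_count, background_fraction):
    """Split a total bit budget between a background and a foreground stream.

    All names here are hypothetical. background_fraction is an assumed input;
    selecting it well is the open problem, and this sketch does not solve it.
    """
    if not 0.0 <= background_fraction <= 1.0:
        raise ValueError("background_fraction must be between 0 and 1")
    if frame_count <= 0:
        raise ValueError("frame_count must be positive")

    # Total bits available for the whole video at the desired bit rate.
    total_bits = int(bit_rate_bps * duration_s)

    # The background composite is encoded only once, so its share is spent once.
    background_bits = int(total_bits * background_fraction)

    # The foreground regions are encoded for every frame, so the remainder
    # is divided evenly across the frames (an assumption made for simplicity).
    foreground_bits = total_bits - background_bits
    foreground_bits_per_frame = foreground_bits // frame_count

    return background_bits, foreground_bits, foreground_bits_per_frame


if __name__ == "__main__":
    bg, fg, per_frame = split_bit_budget(100_000, 10, 300, 0.25)
    print(bg, fg, per_frame)
```

A poorly chosen fraction leaves one stream with too few bits relative to the other, which is the source of the quality differences noted above.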
To obtain a pleasing reconstructed video, the reconstructed quality of the background composite and the foreground regions should be similar. When a background composite and foreground regions are encoded using lossy video compression, the amount of compression and the resulting quality are controlled by the quantization step. As an example, the quantization step for the MPEG-4 standard is set to an integer value from 1 to 31, inclusive. A low quantization step yields a better resulting quality of the reconstructed video because greater granularity exists in representing a pixel characteristic, such as the texture (e.g., color intensity) of the pixel. A low quantization step, however, results in the use of more bits to encode the video.
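The trade-off between quantization step and quality can be illustrated with a simplified sketch of uniform scalar quantization. This is not the MPEG-4 quantizer, which has additional details; the function names are hypothetical, and only the step range of 1 to 31 is taken from the text above. The number of distinct quantized levels serves here as a rough stand-in for the number of bits required.

```python
def quantize(value, step):
    """Map a value to an integer level using a uniform quantization step."""
    if not 1 <= step <= 31:
        raise ValueError("step must be an integer from 1 to 31, inclusive")
    return round(value / step)


def dequantize(level, step):
    """Reconstruct an approximate value from its quantized level."""
    return level * step


def mean_abs_error(values, step):
    """Average reconstruction error after quantizing and dequantizing."""
    errors = [abs(v - dequantize(quantize(v, step), step)) for v in values]
    return sum(errors) / len(errors)


def distinct_levels(values, step):
    """Count distinct quantized levels; more levels generally cost more bits."""
    return len({quantize(v, step) for v in values})


if __name__ == "__main__":
    # Hypothetical sample of 8-bit intensity values.
    samples = list(range(0, 256, 7))
    for step in (1, 8, 31):
        print(step, mean_abs_error(samples, step), distinct_levels(samples, step))
```

With the smallest step, the integer samples are reconstructed exactly but occupy many levels; with the largest step, far fewer levels are needed and the reconstruction error grows.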
Unfortunately, simply setting the quantization step to the same value for both the background composite and the foreground regions does not necessarily result in similar reconstructed quality between them. The dissimilarity arises because the two streams apply the quantization step to different quantities. The background composite is coded essentially as an I-frame, so the quantization step quantizes the coefficients of the transformed pixel values. When the foreground regions are encoded, by contrast, the quantization step quantizes prediction residuals. Because the same quantization step cannot generally be used to obtain the same or similar quality for the background composite and the foreground regions, an alternative basis is needed for obtaining the same or similar quality for both in the reconstructed video.