This section is intended to provide a background or context to the invention that is recited in the claims. The description herein may include concepts that could be pursued, but are not necessarily ones that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, what is described in this section is not prior art to the description and claims in this application and is not admitted to be prior art by inclusion in this section.
Video coding standards include ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC). In addition, efforts are currently underway to develop new video coding standards. One such standard under development is the scalable video coding (SVC) standard, which will become the scalable extension to the H.264/AVC standard.
Conventional video coding standards such as MPEG-1, H.261/263/264 encode video either at a given quality setting or at a relatively constant bit rate via the use of a rate control mechanism. If the video needs to be transmitted or decoded at a different quality, the data must first be decoded and then re-encoded using the appropriate setting. In some scenarios, such as in low-delay real-time applications, this “transcoding” procedure may not be feasible.
Scalable video coding overcomes this problem by encoding a “base layer” with some minimal quality, and then encoding enhancement information that increases the quality up to a maximum level. In addition to selecting between the “base” and “maximum” qualities through inclusion or exclusion of the enhancement information in its entirety, the enhancement information may often be truncated at discrete points, permitting intermediate qualities between the “base” layer and “maximum” enhancement layer. In cases where the discrete truncation points are closely-spaced, the scalability is referred to as being “fine-grained,” from which the term “fine grained scalability” (FGS) is derived. A “progressive-refinement (PR) slice,” which is also known as FGS, is introduced in Annex F of the H.264/AVC video coding standard.
A key characteristic of FGS slices is that their truncation results in an almost proportional loss of perceptual quality. For example, truncating an FGS slice to 75% of its original length may result in losing 25% of the perceptual benefit attributable to the FGS slice, as opposed to a catastrophic failure whereby 90% of the perceptual benefit is lost.
In fine grain scalability, transform coefficients are encoded in successive refinements, starting with the minimum quality provided by AVC compatible intra/residual coding. This is accomplished by repeatedly decreasing the quantization step size and applying a modified entropy coding process similar to sub-bitplane coding. For each quality enhancement layer, the coding process for the transform coefficient refinement levels is divided into two scans. Coefficients that were zero in the previous layer (or layers) are classified as belonging to a “significance pass.” Other coefficients belong to the “refinement pass.” The entropy coding mechanism used for a given coefficient depends on whether it was classified as belonging to the “significance pass” or the “refinement pass.” In practice, the two passes may be interleaved, so that a coefficient belonging to the refinement pass is coded between two coefficients belonging to the significance pass; however, the distinction in entropy coding remains.
When variable length codes (VLCs) are used to code coefficients in FGS slices, coefficient magnitudes are initially assumed to be 0 or 1. Coefficient magnitudes of two or greater are generally rare, and are due to H.264/AVC features such as isolated coefficient removal. To include magnitudes greater than one in the VLC design when these magnitudes have a low probability would lead to a design with reduced coding efficiency. Instead, information about the number of coefficients in a block with magnitude greater than 1, and the maximum magnitude in the block, is embedded into the end-of-block (EOB) symbol. Any such “high magnitude” coefficients are then coded following the EOB symbol. It is desirable to code these “high magnitude” coefficients in the most efficient way possible.
According to Annex F of the H.264/AVC standard, information about the number of coefficients with magnitude greater than 1 (CountMag2), and the maximum magnitude (MaxMag), is extracted from the EOB symbol using the following pseudo-code:
if ( EOBsymbol < NumSigCoeff*2 ) {
    MaxMag = (EOBsymbol % 2) + 2;
    CountMag2 = (EOBsymbol / 2) + 1;
} else {
    MaxMag = (EOBsymbol / NumSigCoeff) + 2;
    CountMag2 = (EOBsymbol % NumSigCoeff) + 1;
}
NumSigCoeff indicates the number of coefficients that were found to have a magnitude of 1 or greater. This value is known from the coefficients coded prior to the EOB symbol. The “%” symbol indicates the “modulus” operator (or “remainder when divided by”).
As an example, one can assume a vector of significance pass coefficients is {0, 0, 1, 0, 1, 2, 1, 0}. When decoding, all magnitudes would initially be decoded as 1, i.e. {0, 0, 1, 0, 1, 1, 1, 0}, followed by an EOBsymbol=0. The NumSigCoeff value is known to be 4 from the decoded coefficient values. According to the above pseudo-code, MaxMag=(0 % 2)+2=2, and CountMag2=(0 / 2)+1=1. Therefore, it is known that one out of the four non-zero coefficients has a magnitude of two. The only remaining question is which one out of the four has this magnitude.
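The pseudo-code and the worked example can be reproduced with a short sketch. The Python function below is a direct transcription, assuming that "/" in the pseudo-code denotes integer division; the function and variable names are illustrative and are not taken from the standard.

```python
def decode_eob(eob_symbol, num_sig_coeff):
    """Recover (MaxMag, CountMag2) from an EOB symbol.

    Transcribes the pseudo-code quoted above. Integer division is an
    assumption; the names used here are illustrative only.
    """
    if eob_symbol < num_sig_coeff * 2:
        # Small symbols: MaxMag is 2 or 3, CountMag2 grows with the symbol.
        max_mag = (eob_symbol % 2) + 2
        count_mag2 = (eob_symbol // 2) + 1
    else:
        # Larger symbols: quotient and remainder by NumSigCoeff.
        max_mag = (eob_symbol // num_sig_coeff) + 2
        count_mag2 = (eob_symbol % num_sig_coeff) + 1
    return max_mag, count_mag2


# Worked example: coefficients {0, 0, 1, 0, 1, 1, 1, 0} followed by EOBsymbol = 0.
decoded = [0, 0, 1, 0, 1, 1, 1, 0]
num_sig = sum(1 for c in decoded if c != 0)  # NumSigCoeff
max_mag, count_mag2 = decode_eob(0, num_sig)
print(num_sig, max_mag, count_mag2)
```

With the values of the example, this yields NumSigCoeff=4, MaxMag=2 and CountMag2=1, matching the figures derived in the text.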
The location and exact magnitudes of the coefficients are decoded using exp-Golomb codes. However, since the values of MaxMag and CountMag2 are known, it is sometimes possible to terminate coding early. Continuing the above example, there are four coefficients decoded with magnitude 1. These can be written as {1, 1, 1, 1}. It is also recalled that the original coefficient values are {1, 1, 2, 1}. In decoding the exp-Golomb codes, a zero bit is decoded for the first of these coefficients, indicating that its magnitude is no greater than 1. Similarly, a zero is decoded for the second coefficient. For the third coefficient, the exp-Golomb code can be truncated and simply decoded as “1” rather than “1 0” since the maximum magnitude (MaxMag) is known. The fourth coefficient does not need to be refined at all, since it is known that CountMag2=1. Therefore, only 3 bits are required to refine the “high magnitude” values instead of the 5 that would be needed if complete exp-Golomb codes were used.
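The bit counts in this example can be checked with a simplified sketch. It assumes a unary-style code in which a magnitude m is written as (m - 1) one bits followed by a terminating zero, which reproduces the "0" and "1 0" codewords quoted above but is not claimed to be the exact code of Annex F. The function names are hypothetical.

```python
def encode_refinements(mags, max_mag, count_mag2, truncate=True):
    """Encode magnitudes (each >= 1) of the non-zero coefficients.

    Assumed unary-style code: (m - 1) ones, then a terminating zero.
    With truncate=True, the terminating zero is dropped when m == max_mag,
    and coding stops once count_mag2 magnitudes above 1 have been coded.
    """
    bits = []
    remaining = count_mag2
    for m in mags:
        if truncate and remaining == 0:
            break  # every high-magnitude coefficient is already located
        bits.extend([1] * (m - 1))
        if not (truncate and m == max_mag):
            bits.append(0)  # terminator needed unless MaxMag was reached
        if m > 1:
            remaining -= 1
    return bits


def decode_refinements(bits, num_sig, max_mag, count_mag2):
    """Inverse of encode_refinements(..., truncate=True)."""
    mags = [1] * num_sig
    pos = 0
    remaining = count_mag2
    for i in range(num_sig):
        if remaining == 0:
            break  # remaining coefficients keep magnitude 1
        while mags[i] < max_mag and bits[pos] == 1:
            mags[i] += 1
            pos += 1
        if mags[i] < max_mag:
            pos += 1  # consume the terminating zero
        if mags[i] > 1:
            remaining -= 1
    return mags


original = [1, 1, 2, 1]
truncated_bits = encode_refinements(original, max_mag=2, count_mag2=1)
full_bits = encode_refinements(original, max_mag=2, count_mag2=1, truncate=False)
print(len(truncated_bits), len(full_bits))
print(decode_refinements(truncated_bits, 4, 2, 1))
```

Under these assumptions the truncated stream is 3 bits long, the complete stream is 5 bits long, and decoding the truncated stream recovers {1, 1, 2, 1}.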
Unfortunately, there are two problems with the existing approach depicted above. First, the formula used to compute MaxMag and CountMag2 from the EOB symbol is designed based on the assumption that these values are typically small (for example, less than 4). When either or both of these values are large, the number of bits required to code the EOB symbol itself becomes prohibitively large. Because motion compensation has recently been added to FGS slices, the possibility of this problem occurring is now greater than when the scheme was initially designed. Second, the exp-Golomb code may not be the most efficient VLC when the values of MaxMag or CountMag2 are large. For example, in some situations it may be better to code the magnitude of each coefficient using a binary representation rather than exp-Golomb. In other situations, it may be better to use a trained VLC known to both encoder and decoder.
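The second problem can be illustrated by comparing codeword lengths. The sketch below assumes standard order-0 exp-Golomb lengths (2*floor(log2(v+1))+1 bits for a value v >= 0) applied to the value (magnitude - 1), and a fixed-length binary code sized for the range 1..MaxMag. Both choices are assumptions made for illustration and may differ from the exact codes used in Annex F; the function names are hypothetical.

```python
def exp_golomb_len(v):
    """Length in bits of the order-0 exp-Golomb codeword for v >= 0."""
    return 2 * ((v + 1).bit_length() - 1) + 1


def fixed_binary_len(num_values):
    """Bits needed to index one of num_values equally sized symbols."""
    return (num_values - 1).bit_length()


def compare(mags, max_mag):
    """Return (exp-Golomb bits, fixed binary bits) for a list of magnitudes."""
    eg_bits = sum(exp_golomb_len(m - 1) for m in mags)
    fixed_bits = len(mags) * fixed_binary_len(max_mag)
    return eg_bits, fixed_bits


# Mostly small magnitudes: exp-Golomb needs fewer bits.
print(compare([1, 1, 1, 8], max_mag=8))
# Mostly large magnitudes: fixed-length binary needs fewer bits.
print(compare([8, 7, 8, 6], max_mag=8))
```

Under these assumptions, exp-Golomb is shorter when nearly all magnitudes are 1, while the fixed-length binary representation is shorter when most magnitudes are close to MaxMag.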