Scalable video coding (SVC) has been standardized by the Joint Video Team (JVT), formed by the Moving Picture Experts Group (MPEG) and the International Telecommunication Union-Telecommunication Standardization Sector (ITU-T). In SVC, adaptive reference fine grain scalability (AR-FGS) is a technique for improving coding efficiency by applying temporal prediction to fine grain scalability (FGS) coding, which provides signal-to-noise ratio (SNR) scalability.
SNR scalability techniques improve display quality in proportion to the received bitrate under variable network conditions. FGS is a representative SNR scalability technique: the transmitted bitstream can be truncated according to network conditions, and display quality improves in proportion to the amount of bitstream actually received. However, the encoder cannot know in advance the bitrate at which the FGS layer will be received, so FGS cannot employ a temporal prediction scheme that would otherwise yield a large coding-efficiency gain in a video codec. If a temporal prediction scheme is used in FGS without regard for this characteristic, drift occurs due to a mismatch between the reference images used for motion compensation in the encoder and the decoder, resulting in sharp degradation of both the reproduced image and coding efficiency.
Adaptive reference fine grain scalability (AR-FGS) provides both efficient drift control and improved temporal prediction performance. AR-FGS generates a reference block (or reference macroblock) for motion compensation as a weighted sum of reference blocks obtained from the partially decoded upper (enhancement) layer and the lower (base) layer. With an AR-FGS method implemented in this way, FGS coding performance can be improved while drift is kept under control.
FIG. 1 is a conceptual diagram illustrating generation of a reference block in AR-FGS according to the prior art.
Referring to FIG. 1, the size of a block is M×N, and X^n denotes the signal of the block to be coded in the FGS (enhancement) layer, while X_b^n denotes the signal of the counterpart block in the base layer. R_a^n denotes the signal of the motion compensation reference block generated as a weighted sum of the base layer and the enhancement layer, and R_e^{n−1} denotes the signal of the reference block in the enhancement layer. The quantized coefficients of the base layer are denoted Q_b^n, the transform is denoted F_X = f(X), and a quantized transform coefficient of the base layer is denoted Q_b^n(u,v).
In AR-FGS, a reference block is generated in the following two ways.
1. If the quantized coefficients in the base layer are all 0, the reference block is generated as a weighted sum of the counterpart block in the base layer and the counterpart block in the enhancement layer, using α as the weight for the enhancement layer and 1−α as the weight for the base layer, as follows:

R_a^n = (1−α)·X_b^n + α·R_e^{n−1}, if Q_b^n = 0  (1)
2. Otherwise, when at least one quantized coefficient in the base layer is not 0, the reference block is generated in the transform coefficient domain. For each coefficient position (u,v): if the quantized base-layer coefficient at that position is 0, the base-layer transform coefficient is multiplied by 1−β and the enhancement-layer transform coefficient is multiplied by β, and the two products are summed, as in Equation 2; if the quantized base-layer coefficient at that position is not 0, the base-layer coefficient is used as-is, as in Equation 3. The reference block is then generated by inverse transformation of the resulting coefficients.

F_{R_a^n}(u,v) = (1−β)·F_{X_b^n}(u,v) + β·F_{R_e^{n−1}}(u,v), if Q_b^n(u,v) = 0  (2)

F_{R_a^n}(u,v) = F_{X_b^n}(u,v), if Q_b^n(u,v) ≠ 0  (3)
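The two-case procedure above can be sketched as follows. This is a minimal illustration, not the standardized implementation: the function name, the argument layout, and the convention that the caller supplies both pixel-domain and transform-domain versions of the blocks (and performs the inverse transform itself in the second case) are assumptions made for clarity.

```python
import numpy as np

def ar_fgs_reference(Xb, Re_prev, FXb, FRe_prev, Qb, alpha, beta):
    """Hypothetical sketch of AR-FGS reference-block generation.

    Xb       : base-layer counterpart block (pixel domain)
    Re_prev  : enhancement-layer reference block (pixel domain)
    FXb      : base-layer block in the transform domain
    FRe_prev : enhancement-layer reference block in the transform domain
    Qb       : quantized base-layer transform coefficients
    """
    if not Qb.any():
        # Case 1 (Equation 1): all quantized coefficients are 0,
        # so blend in the pixel domain.
        return (1 - alpha) * Xb + alpha * Re_prev
    # Case 2 (Equations 2 and 3): blend coefficient by coefficient in the
    # transform domain; the caller then inverse-transforms the result.
    return np.where(Qb == 0,
                    (1 - beta) * FXb + beta * FRe_prev,  # Eq. 2: zero coefficient
                    FXb)                                  # Eq. 3: nonzero coefficient
```

Note that the two branches return signals in different domains (pixel versus transform), mirroring the description above: only the second case requires an inverse transform to obtain the final reference block.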
Weight values are provided per slice. Two values are transmitted separately: a weight α, used when the residual values of all pixels in a base-layer block are '0', and a weight β, used when some residuals in the block are not '0' and therefore some transform coefficients obtained in the discrete cosine transform (DCT) domain are nonzero. The weight values (α, β) apply to the upper layer and range between 0 and 1; the corresponding weights for the lower layer are (1−α, 1−β).
FGS coding is performed using the generated reference block, thereby exploiting the advantages of temporal prediction. Compared with conventional FGS coding, this approach exhibits improved performance in both real-time and general video coding.
Video coding standards such as MPEG-4 and H.264 use various prediction schemes. Among them, the skip mode is a mode in which no block data exists for the base layer and the data of the reference picture is used directly, i.e., there is no temporal change in the data. Thus, performance improvement can be expected by using the reference-picture data in the enhancement layer as well, on the assumption that the data is also unchanged there. Even if the enhancement layer is not fully transmitted, drift due to an incorrect reference is unlikely to occur in a skip-mode block.