1. Field of the Invention
The present invention relates generally to a method and an apparatus for scalable video coding. More particularly, the present invention relates to a method and an apparatus for macroblock-based scalable video coding, which can enhance decoding and encoding speed and reduce memory usage.
2. Description of the Related Art
In the recent ubiquitous environment, interest in video communication services through various networks and devices, for example, IPTV, mobile IPTV, and mobile broadcasting, is increasing. In order to satisfy the increasing industrial needs, Scalable Video Coding (SVC), which supports spatial, temporal, and quality scalability within one video stream, was standardized as Amendment 3 of H.264/Advanced Video Coding (AVC) at the end of 2007.
One of the most significant differences between the existing H.264/AVC and the SVC is that inter-layer prediction is added to the SVC. Compared to the existing single-layer coding, computations and memory added to the SVC are mostly for intra, residual, and motion up-sampling operations of the inter-layer prediction.
In the H.264/AVC SVC, three profiles, namely Scalable Baseline, Scalable High, and Scalable High Intra, are added to the existing H.264/AVC standard. The feature that most distinguishes the Scalable High profile from the Scalable Baseline profile is the support of interlaced coding tools and Extended Spatial Scalability (ESS). The SVC is specified in Annex G of the H.264/AVC standard, and includes the ESS feature, which is provided for the encoding and the decoding of signals when edge alignment of a base layer macroblock and an enhancement layer macroblock is not maintained. Meanwhile, when the spatial scaling ratio is 2 and the edges of the macroblocks are aligned across the layers, this is regarded as a special case of the ESS, which is referred to as Dyadic Spatial Scalability (DSS).
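The distinction between the DSS and the general ESS can be illustrated with a minimal sketch. The function name, its parameters, and the use of a cropping offset that is a multiple of the 16-sample luma macroblock size as the alignment test are assumptions made for illustration only; they are not taken from the standard text or from the JSVM.

```python
MB_SIZE = 16  # luma macroblock width/height in H.264/AVC


def is_dyadic(rl_width, rl_height, el_width, el_height,
              crop_left=0, crop_top=0):
    """Return True when the layer pair looks like the DSS special case.

    Hypothetical helper: el_width/el_height describe the enhancement-layer
    region that corresponds to the reference layer, and crop_left/crop_top
    give the offset of that region inside the enhancement-layer picture.
    """
    # Condition 1: the spatial scaling ratio is exactly 2 in both directions.
    ratio_is_two = (el_width == 2 * rl_width) and (el_height == 2 * rl_height)
    # Condition 2 (assumed test): macroblock edges stay aligned when the
    # cropping offset falls on a macroblock boundary.
    edges_aligned = (crop_left % MB_SIZE == 0) and (crop_top % MB_SIZE == 0)
    return ratio_is_two and edges_aligned
```

Under these assumptions, a QVGA (320x240) to VGA (640x480) pair with no cropping offset is classified as dyadic, while a pair with any other ratio, or with an offset that is not on a macroblock boundary, falls under the general ESS.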
The ESS supports an arbitrary scaling ratio and an arbitrary cropping offset between the layers. In particular, to support both full HD video at the 16:9 aspect ratio and VGA and QVGA mobile video at the 4:3 aspect ratio through a single encoding, it is necessary to use the H.264/AVC Scalable High profile including the ESS. However, since the up-sampling process increases the computational load in the ESS, a high-speed up-sampling algorithm is required for real-time service. Also, for services delivered via various multimedia devices, including mobile devices, it is necessary to minimize the memory usage.
In this regard, Joint Scalable Video Model (JSVM) 11 was developed based on the H.264/AVC SVC standard and describes intra and residual up-sampling methods for the inter-layer prediction in the ESS. The intra up-sampling and the residual up-sampling used in the JSVM 11 employ Picture-based Intra Up-Sampling (PIC-IUS) and Picture-based Residual Up-Sampling (PIC-RUS) methods, which up-sample a Reference Layer (RL) on the picture basis. The up-sampling method of the JSVM 11 in FIG. 1 decodes the RL picture in S11, parses an Enhancement Layer (EL) slice header in S12, and performs the PIC-IUS in S13. Next, when the EL slice is determined to be an intra slice in S14, the method performs the EL decoding in S16. When the EL slice is not an intra slice, that is, when it is an inter slice, the method conducts the residual up-sampling in S15 and then the EL decoding in S16.
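The control flow of FIG. 1 described above can be sketched as follows. This is only an illustration of the step ordering: the function name and the step labels are hypothetical, and no actual decoding or up-sampling is performed.

```python
def picture_based_flow(el_slice_is_intra):
    """Return the ordered steps of FIG. 1 for one EL slice (labels only)."""
    steps = [
        "S11: decode RL picture",
        "S12: parse EL slice header",
        # The whole RL picture is intra up-sampled before it is known
        # which samples the EL will actually refer to.
        "S13: PIC-IUS on the whole RL picture",
    ]
    # S14: residual up-sampling is carried out only for inter slices.
    if not el_slice_is_intra:
        steps.append("S15: PIC-RUS on the whole RL picture")
    steps.append("S16: decode EL")
    return steps
```

In this sketch an intra slice passes through four steps and an inter slice through five; in both cases S13 runs unconditionally on the whole picture.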
In the JSVM 11, as shown in FIG. 1, the up-sampling process in S13 is performed after the RL picture decoding in S11 and the EL slice header parsing in S12. At that point, it is unknown which part of the up-sampled RL picture will be referred to in the actual EL decoding. Hence, the JSVM 11 up-samples every pixel on the picture basis in advance, stores the up-sampled pixels, and refers to and uses only part of the up-sampled picture in the decoding of the EL picture. Meanwhile, since residual prediction is not applied in the intra slice, the residual up-sampling is performed on the picture basis only for the inter slice, as shown in FIG. 1. That is, the up-sampling method described in the JSVM 11 carries out the up-sampling on the picture basis regardless of whether the up-sampled RL picture is actually referred to in the decoding process of the EL picture. As a result, the up-sampling method of the JSVM 11 conducts the up-sampling even for the RL samples not referred to in the EL and thus causes unnecessary operations.
In practice, the occurrence frequency of the inter-layer intra prediction is very low in the EL inter picture, and even in the intra picture it is only about 70% on average in the 3-layer (SIF/SD/Full-HD) configuration including the ESS. Also, the occurrence frequency of the residual prediction in the EL inter picture is merely 10 to 30% on average. It is therefore wasteful to apply the up-sampling to RL samples that are not referred to in the EL, especially as the image size increases.
The high-speed intra up-sampling method for the DSS shown in FIG. 2, which is intended to optimize the SVC up-sampling operation rate, reduces the operations by up-sampling only the intra pixels instead of every pixel of the RL picture. Referring to FIG. 2, when the decoded Base Layer (BL) is the intra frame in S30, the whole frame is up-sampled in S31. Otherwise, the method determines whether the frame includes an intra MacroBlock (MB) in S40. When an intra MB is included, the method checks, for each MB of the BL, whether the current MB or any of its neighboring MBs is intra in S42, and if so, up-samples the current MB in S43. When the above process is finished for every MB of the BL, the up-sampled picture of the BL is produced in S50 and the EL is decoded in S60.
Referring to FIG. 3, the method of FIG. 2 decodes the RL picture and then loops over the RL picture on the MB basis. When the current MB (Cur MB of FIG. 3) or at least one of its four neighboring MBs (Left, Right, Top, and Bottom MBs in FIG. 3) is the intra MB, the Cur MB is up-sampled. Otherwise, the up-sampling of the Cur MB is skipped.
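The selection rule of FIG. 2 and FIG. 3 can be sketched as a loop over a per-MB intra map. The function name and the boolean-grid input are hypothetical, and treating neighbors that lie outside the picture as non-intra is an assumption not stated in the description above.

```python
def select_mbs_to_upsample(intra_map):
    """Return the set of (row, col) MB positions that would be up-sampled.

    intra_map is a list of rows of booleans, True where the RL MB is intra.
    """
    rows = len(intra_map)
    cols = len(intra_map[0]) if rows else 0

    def is_intra(r, c):
        # Assumption: MBs outside the picture count as non-intra.
        return 0 <= r < rows and 0 <= c < cols and intra_map[r][c]

    selected = set()
    for r in range(rows):
        for c in range(cols):
            # Cur MB plus its Top, Bottom, Left, and Right neighbors.
            group = [(r, c), (r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if any(is_intra(rr, cc) for rr, cc in group):
                selected.add((r, c))
    return selected
```

For a 3x3 map whose only intra MB is the center one, the sketch selects the center MB and its four neighbors and skips the four corners. An all-intra map selects every MB, which matches the whole-frame case of S31, and a map without any intra MB selects none.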
The up-sampling method of FIG. 2 and FIG. 3 reduces the operations by selectively performing the intra up-sampling while looping on the MB basis. However, the method is applicable only to the DSS and cannot save memory. In addition, the method is applicable only to the intra up-sampling and not to the residual up-sampling. Furthermore, because the selection depends only on the RL, an MB that is not referred to in the EL can still be up-sampled, so unnecessary operations are highly likely to remain.
In the related art, no apparatus or method has been suggested for reducing the operations and raising the efficiency of the memory utilization by detecting only the RL blocks referred to in the EL and performing the intra up-sampling and the residual up-sampling only on those blocks.