A conventional image coding apparatus for coding a video sequence divides each picture included in the video sequence into macroblocks (hereafter, a macroblock may be referred to as “MB” for short). The size of a macroblock is 16 by 16 pixels (16 pixels in the horizontal direction and 16 pixels in the vertical direction). The conventional image coding apparatus then codes the macroblocks one by one in raster-scan order. As a result, the conventional image coding apparatus generates a coded stream (a coded video sequence) by coding and compressing the video sequence. A conventional image decoding apparatus then decodes this coded stream on a macroblock-by-macroblock basis, also in raster-scan order, to reproduce the pictures of the original video sequence.
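The raster-scan traversal described above can be sketched as follows. This is a minimal illustration, not part of any standard text: the function name `macroblocks_in_raster_order` is hypothetical, and rounding the block count up for picture dimensions that are not multiples of 16 is an assumption made here.

```python
MB_SIZE = 16  # macroblock edge length in pixels, as stated in the text


def macroblocks_in_raster_order(width, height, mb_size=MB_SIZE):
    """Yield (mb_index, x, y) for each macroblock, left to right, top to bottom.

    x and y are the pixel coordinates of the macroblock's top-left corner.
    Assumption: dimensions that are not multiples of mb_size are rounded up.
    """
    mbs_per_row = (width + mb_size - 1) // mb_size
    mb_rows = (height + mb_size - 1) // mb_size
    for row in range(mb_rows):
        for col in range(mbs_per_row):
            yield row * mbs_per_row + col, col * mb_size, row * mb_size


# A 64-by-32-pixel picture holds 4 macroblocks per row and 2 rows.
order = list(macroblocks_in_raster_order(64, 32))
```

In this sketch, the macroblock index increases along each row before moving to the next row, which is the order in which both the coding apparatus and the decoding apparatus visit the macroblocks.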
The conventional coding methods include the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) H.264 standard (see Non Patent Literature 1 and Non Patent Literature 2, for example). When images coded according to the H.264 standard are to be decoded, the coded stream is first read, each piece of header information is decoded, and then variable-length decoding is performed. Next, inverse quantization and inverse frequency transform are performed on the coefficient information obtained by the variable-length decoding and, as a result, a difference image is generated. Then, according to a macroblock type (mb_type) obtained by the variable-length decoding, intra-picture prediction or motion compensation is performed to generate a predicted image. After this, a reconstruction process is performed by adding the difference image to the predicted image and, as a result, a reconstructed image is generated. Lastly, deblocking filtering is performed on the reconstructed image to obtain a decoded image.
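The reconstruction step, in which the difference image is added to the predicted image, can be illustrated with a minimal sketch. The function name `reconstruct` is hypothetical, and the sketch assumes 8-bit samples whose sums are clipped to the valid range of 0 to 255. It covers only the addition and omits the preceding decoding steps and the deblocking filter.

```python
def reconstruct(predicted, difference, bit_depth=8):
    """Add a difference image to a predicted image, sample by sample.

    Both inputs are lists of rows of integers with identical dimensions.
    Assumption: results are clipped to [0, 2**bit_depth - 1].
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + d, 0), max_val) for p, d in zip(pred_row, diff_row)]
        for pred_row, diff_row in zip(predicted, difference)
    ]


# 2-by-2 example: one sum overflows and one underflows, so both are clipped.
predicted = [[100, 250], [5, 0]]
difference = [[10, 10], [-10, 0]]
reconstructed = reconstruct(predicted, difference)
```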
In this way, the processes from variable-length decoding to deblocking filtering are performed on a macroblock-by-macroblock basis and, as a result, the coded images are decoded. A generally known method for enhancing the decoding speed is to pipeline the decoding process on a macroblock-by-macroblock basis (see Patent Literature 1, for example). In the pipeline processing performed on a macroblock-by-macroblock basis (the macroblock-based pipeline processing), the series of processes from variable-length decoding to deblocking filtering (the decoding process) is divided into stages, and these stages are performed in parallel.
FIG. 1 is a diagram showing an example of the macroblock-based pipeline processing in the case where the decoding process is divided into four stages.
In the example shown in FIG. 1, the processes of stage 0 to stage 3 are performed on one macroblock. In stage 0, variable-length decoding is performed on the coded stream, and coding information and coefficient information for each pixel are output. In stage 1, inverse quantization and inverse frequency transform are performed on the coefficient information obtained in stage 0 and, as a result, a difference image is generated. In stage 2, intra-picture prediction or motion compensation is performed according to the macroblock type obtained by the variable-length decoding and, as a result, a predicted image is generated. Then, the predicted image is added to the difference image obtained in stage 1 to generate a reconstructed image. In stage 3, deblocking filtering is performed on the reconstructed image obtained in stage 2. In this way, the pipeline processing allows different macroblocks to be processed in the respective stages at the same time, which implements parallel processing and thus enhances the decoding speed. Here, the cycle of a time slot (TS) in the pipeline processing is determined by the processing cycle of the stage having the longest processing cycle (i.e., the longest stage). On this account, when the processing cycle of only one stage is longer than those of the other stages, the other stages cannot start processing the next macroblocks until the longest stage completes its processing. This causes unnecessary idle time. In order for the pipeline processing to operate effectively, it is important for the processing cycles of the stages to be equal to each other.
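The effect of an unbalanced stage on the time slot can be illustrated with a simple model. The following is a sketch under stated assumptions: all stages advance in lock step, one time slot at a time; each stage takes a fixed number of cycles per macroblock; and the cycle counts used in the example are hypothetical values, not measurements. The function name `pipeline_stats` is likewise hypothetical.

```python
def pipeline_stats(stage_cycles, num_macroblocks):
    """Model a lock-step macroblock pipeline.

    stage_cycles: cycles each stage needs per macroblock (assumed constant).
    Returns the time slot length, the total pipelined and sequential cycle
    counts, and the idle cycles each stage spends in every time slot.
    """
    time_slot = max(stage_cycles)  # TS is set by the longest stage
    num_stages = len(stage_cycles)
    # The last macroblock enters stage 0 in slot N-1 and leaves the final
    # stage num_stages - 1 slots later.
    pipelined = (num_macroblocks + num_stages - 1) * time_slot
    sequential = num_macroblocks * sum(stage_cycles)
    idle_per_slot = [time_slot - c for c in stage_cycles]
    return {
        "time_slot": time_slot,
        "pipelined": pipelined,
        "sequential": sequential,
        "idle_per_slot": idle_per_slot,
    }


# Hypothetical cycle counts for the four stages of FIG. 1.
balanced = pipeline_stats([100, 100, 100, 100], num_macroblocks=10)
unbalanced = pipeline_stats([100, 100, 250, 100], num_macroblocks=10)
```

In this model, lengthening only stage 2 from 100 to 250 cycles raises the time slot to 250 cycles, so each of the other three stages sits idle for 150 cycles in every time slot, and the total pipelined time grows from 1300 to 3250 cycles.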
As described above, according to the H.264 standard, an image is coded for each 16-by-16-pixel macroblock. However, the size of 16 by 16 pixels is not necessarily optimal as a unit of coding. In general, the higher the image resolution, the higher the correlation between neighboring blocks. On account of this, a larger unit of coding can increase the compression efficiency. In recent years, the use of high definition (HD) images has increased. Moreover, since super high resolution displays of, for example, 4K2K (4096 pixels by 2048 pixels) have been developed, the resolution of images to be processed is expected to become increasingly higher. As the image resolution increases in this way, it becomes difficult to code such higher-resolution images efficiently with the H.264 standard.
Under these circumstances, the technologies proposed as next-generation image coding standards include technologies that address the stated problem (see Non Patent Literatures 3, 4, and 5). With these technologies, the size of the unit block of coding, which is fixed in the conventional H.264 standard, is made variable, thereby allowing coding to be performed for each block that is larger than the conventional 16-by-16-pixel unit block.
Non Patent Literature 3 defines, in addition to the 16-by-16-pixel macroblock, macroblocks having sizes larger than 16 by 16 pixels, such as a 32-by-32-pixel macroblock, a 64-by-64-pixel macroblock, and, at the maximum, a 128-by-128-pixel macroblock.
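As a rough illustration of the arithmetic involved, the following sketch counts how many unit blocks of each size cover one 4K2K picture (4096 pixels by 2048 pixels, the example resolution given above). The function name `blocks_per_picture` is hypothetical, and the sketch shows only how the number of coding units falls as the block size grows; it does not measure compression efficiency.

```python
def blocks_per_picture(width, height, block_size):
    """Return the number of square blocks needed to cover a picture.

    Assumption: partial blocks at the right and bottom edges count as whole.
    """
    cols = -(-width // block_size)   # ceiling division
    rows = -(-height // block_size)
    return cols * rows


WIDTH, HEIGHT = 4096, 2048  # 4K2K resolution from the text
counts = {
    size: blocks_per_picture(WIDTH, HEIGHT, size)
    for size in (16, 32, 64, 128)
}
```

Under these assumptions, a 4K2K picture contains 32768 blocks of 16 by 16 pixels but only 512 blocks of 128 by 128 pixels.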
Hereafter, in order to be distinguished from the conventional 16-by-16-pixel macroblock, a macroblock having a size larger than 16 by 16 pixels is referred to as a super macroblock. As in the case of the H.264 standard, a super macroblock has a hierarchical structure. When a block includes four sub-blocks, a structural pattern is further defined for each of the sub-blocks.
FIG. 2 is a diagram showing possible structural patterns in the case where the size of a super macroblock is 64 by 64 pixels. With such an increase in the macroblock size, the compression efficiency for high-resolution images can be improved.
Non Patent Literature 4 defines a 32-by-32-pixel super macroblock. Non Patent Literature 4 describes the technology of performing motion compensation block by block for the case where the block is an internal sub-block of the super macroblock. Moreover, as with the technology described in Non Patent Literature 3, since the super macroblock has a hierarchical structure, motion compensation is performed according to this hierarchical structure.
FIG. 3 is a diagram showing possible structural patterns of the super macroblock described in Non Patent Literature 4. As shown in FIG. 3, Non Patent Literature 4 describes motion compensation performed per block having a size of 32 by 32 pixels, 32 by 16 pixels, or 16 by 32 pixels, none of which is defined in the H.264 standard.
Non Patent Literature 5 defines a 32-by-32-pixel super macroblock, and describes the technology of performing intra-picture prediction block by block for the case where the block has a size of 32 by 32 pixels, which is not defined in the H.264 standard.
As described thus far, in order to improve the compression efficiency for high-resolution images, methods for increasing the size of the block used as the unit of coding and decoding have been proposed in recent years. Specifically, methods for making the block unit of motion compensation and intra-picture prediction larger than 16 by 16 pixels have been proposed. However, as of now, no method has been proposed for performing inverse quantization (quantization in the case of coding) and inverse frequency transform (frequency transform in the case of coding) per block having a size larger than 16 by 16 pixels.