An image coding device which codes a moving picture divides each picture forming the moving picture into plural macroblocks each including 16×16 pixels (a macroblock may be abbreviated as MB). Then, the image coding device codes the macroblocks in the raster scan order. The image coding device generates a coded stream by coding and compressing the moving picture. An image decoding device decodes this coded stream on a macroblock-by-macroblock basis in the raster scan order, and reproduces the pictures of the original moving picture.
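The division into macroblocks and the raster scan order can be illustrated with a minimal sketch. The function name `macroblock_origins` and the picture dimensions used below are hypothetical and chosen only for illustration; the sketch assumes the picture width and height are multiples of 16.

```python
def macroblock_origins(width, height, mb_size=16):
    """Yield the top-left (x, y) pixel coordinate of each macroblock.

    Macroblocks are visited in raster scan order: left to right within
    a row of macroblocks, then top to bottom, row by row.
    Assumes width and height are multiples of mb_size.
    """
    for y in range(0, height, mb_size):
        for x in range(0, width, mb_size):
            yield (x, y)


# Hypothetical 64x32 picture: 4 macroblocks per row, 2 rows.
origins = list(macroblock_origins(64, 32))
```

For this hypothetical 64×32 picture, the first row of four macroblocks is visited before the first macroblock of the second row.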
One conventional image coding system is the ITU-T H.264 standard (see Non-Patent Literature (NPL) 1). To decode an image coded in accordance with the H.264 standard, an image decoding device first reads a coded stream. Then, the image decoding device decodes various header information pieces, and thereafter performs variable-length decoding. The image decoding device performs inverse quantization on coefficient information obtained by variable-length decoding, and performs inverse frequency transform, thereby generating a difference image.
Next, the image decoding device performs intra prediction or motion compensation according to a macroblock type obtained by variable-length decoding. The image decoding device thereby generates a predicted image. After that, the image decoding device performs a reconstruction process by adding the difference image to the predicted image. Then, the image decoding device decodes a current image by performing a deblocking filtering process on the reconstructed image.
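The reconstruction process described above, in which the difference image is added to the predicted image, can be sketched as follows. This is only an illustration under stated assumptions: the function name `reconstruct` is hypothetical, blocks are represented as plain lists of rows, and 8-bit samples clipped to the range 0 to 255 are assumed.

```python
def reconstruct(predicted, difference, bit_depth=8):
    """Add a difference block to a predicted block, sample by sample.

    Each sum is clipped to the valid sample range [0, 2**bit_depth - 1].
    Both blocks are lists of rows with identical dimensions.
    """
    max_value = (1 << bit_depth) - 1
    return [
        [min(max(p + d, 0), max_value) for p, d in zip(p_row, d_row)]
        for p_row, d_row in zip(predicted, difference)
    ]


# Hypothetical 2x2 example values (not taken from any real stream).
predicted = [[100, 250], [5, 128]]
difference = [[10, 20], [-10, 0]]
reconstructed = reconstruct(predicted, difference)
```

In this example the sums 270 and -5 fall outside the 8-bit range and are clipped to 255 and 0, respectively.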
In this manner, the image decoding device performs processing from the variable-length decoding process through the deblocking filtering process for each macroblock, to decode a coded image. As a technique of accelerating this decoding processing, a technique of executing the decoding processing by pipeline processing for macroblock units is generally used (see PTL 1). In pipeline processing performed for macroblock units, a series of processes from a variable-length decoding process through a deblocking filtering process is divided into several stages. Then, the processes at the stages are executed in parallel.
FIG. 62 illustrates an example of pipeline processing performed in the case where the decoding processing described above is divided into five stages. In the example illustrated in FIG. 62, the processes at the first stage through the fifth stage are sequentially performed on one macroblock. At the same time, the processes at the first stage through the fifth stage are performed in parallel on plural macroblocks different from one another.
At the first stage, the image decoding device performs variable-length decoding on a coded stream, and outputs coding information such as a motion vector, and coefficient information corresponding to data on each pixel. At the second stage, the image decoding device performs inverse quantization and inverse frequency transform on the coefficient information obtained at the first stage, thereby generating a difference image.
At the third stage, the image decoding device performs motion compensation according to a macroblock type obtained by variable-length decoding, thereby generating a predicted image. At the fourth stage, the image decoding device performs a reconstruction process using the difference image obtained at the second stage and a predicted image, which is either the predicted image obtained by motion compensation at the third stage or a predicted image obtained by an intra prediction process performed at the fourth stage. At the fifth stage, the image decoding device performs a deblocking filtering process.
In this way, the image decoding device simultaneously processes plural different macroblocks at the stages using pipeline processing. Accordingly, the image decoding device can execute parallel processing, and accelerate decoding processing.
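The way different macroblocks occupy the five stages in each time slot can be sketched as below. This is a minimal model under simple assumptions, not a description of any particular decoder: the function name `pipeline_schedule` is hypothetical, stages and macroblocks are numbered from 0, and each macroblock is assumed to advance exactly one stage per time slot.

```python
def pipeline_schedule(num_macroblocks, num_stages=5):
    """Return one dict per time slot mapping stage index -> macroblock index.

    Macroblock n enters stage 0 in time slot n and moves one stage
    forward in each following time slot.
    """
    num_slots = num_macroblocks + num_stages - 1
    schedule = []
    for ts in range(num_slots):
        slot = {}
        for stage in range(num_stages):
            mb = ts - stage
            if 0 <= mb < num_macroblocks:
                slot[stage] = mb
        schedule.append(slot)
    return schedule


# Hypothetical run with 8 macroblocks through the five stages.
schedule = pipeline_schedule(8)
```

Once the pipeline is full (from the fifth time slot onward in this model), all five stages work on five different macroblocks at once. Under these assumptions, 8 macroblocks finish in 12 time slots rather than the 40 that fully sequential processing would need.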
At this time, the number of cycles in each time slot (TS) of pipeline processing is determined by the stage with the longest processing cycle. Accordingly, if the processing cycle at only a certain stage is long, the processes on the next macroblocks cannot be started at the other stages until the longest process at that stage is completed. Consequently, this causes idle time. To efficiently execute pipeline processing, it is important to make settings such that the processing cycles at the stages included in pipeline processing are as equal as possible.
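The effect of one long stage on the time slot length and on idle time can be made concrete with a small calculation. The function names and the per-stage cycle counts below are hypothetical values chosen only for illustration; they are not measurements of any actual decoder.

```python
def time_slot_cycles(stage_cycles):
    """The time slot length equals the longest processing cycle among the stages."""
    return max(stage_cycles)


def idle_cycles(stage_cycles):
    """Cycles each stage spends waiting within one time slot."""
    slot = time_slot_cycles(stage_cycles)
    return [slot - cycles for cycles in stage_cycles]


def utilization(stage_cycles):
    """Fraction of the available stage-cycles spent on actual processing."""
    slot = time_slot_cycles(stage_cycles)
    return sum(stage_cycles) / (slot * len(stage_cycles))


# Hypothetical cycle counts for the five stages; the third stage dominates.
unbalanced = [300, 400, 1000, 350, 450]
# The same total work (2500 cycles) spread evenly over the stages.
balanced = [500, 500, 500, 500, 500]
```

With the unbalanced values, the time slot is 1000 cycles long and the other four stages are idle for 550 to 700 cycles each, so only half of the available stage-cycles do useful work. With the balanced values, the time slot shrinks to 500 cycles and no stage is idle.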
An image coding device in conformity with the H.264 standard codes an image on a macroblock-by-macroblock basis (each macroblock includes 16×16 pixels), as described above. However, 16×16 pixels do not necessarily form an optimal unit for coding. Generally, the higher the resolution of an image is, the higher the correlation between adjacent blocks is. Accordingly, for high resolution images, compression efficiency can be further improved by increasing the size of a coding unit.
In recent years, extremely high definition displays have been developed, such as a 4K2K (3840 pixels×2160 pixels) display, for instance. Thus, it is expected that the resolution of images to be handled will become increasingly high. As the resolution of images increases in this way, the image coding device in conformity with the H.264 standard is becoming unable to code such high resolution images efficiently.
Techniques proposed as next-generation image coding standards include a technique for addressing such a problem (NPL 2). With such a technique, the size of a coding unit block in conformity with the conventional H.264 standard can be changed. The image coding device according to this technique can code an image on a block-by-block basis (each block is larger than a conventional 16×16 pixel block), and appropriately code extremely high definition images.
Specifically, a coding unit (CU) is defined as a data unit for coding in NPL 2. This coding unit is a data unit for which switching between intra prediction and inter prediction, which performs motion compensation, is possible, and is defined as the most basic block size for coding, as with a macroblock in the conventional coding standard.
The size of such a coding unit is one of 4×4 pixels, 8×8 pixels, 16×16 pixels, 32×32 pixels, 64×64 pixels and 128×128 pixels. A coding unit having the largest size is referred to as a largest coding unit (LCU).
4096-pixel data is included in a 64×64-pixel coding unit. 16384-pixel data is included in a 128×128-pixel coding unit. Thus, a 128×128-pixel coding unit includes 4 times the data of a 64×64-pixel coding unit.
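The pixel counts above follow directly from the block dimensions, as the short calculation below shows. The function name `pixels_in_block` is hypothetical; the comparison with a 16×16-pixel macroblock is added only to relate the coding unit sizes to the conventional unit described earlier.

```python
def pixels_in_block(size):
    """Number of pixels in a square block with the given side length."""
    return size * size


cu64 = pixels_in_block(64)    # 64 x 64 coding unit
cu128 = pixels_in_block(128)  # 128 x 128 coding unit
mb16 = pixels_in_block(16)    # conventional 16 x 16 macroblock

ratio_128_to_64 = cu128 // cu64
ratio_64_to_mb = cu64 // mb16
ratio_128_to_mb = cu128 // mb16
```

A 64×64-pixel coding unit therefore holds the data of 16 conventional macroblocks, and a 128×128-pixel coding unit holds the data of 64.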
FIG. 63 illustrates examples of plural coding units which include 128×128 pixels and 64×64 pixels. Furthermore, a transform unit (TU) is defined in NPL 2. A transform unit is defined as a block size for frequency transform. Specifically, the size of such a transform unit is one of 4×4 pixels, 8×8 pixels, 16×16 pixels, 32×32 pixels, and 64×64 pixels.
In addition, a prediction unit (PU) is further defined as a data unit for intra prediction or inter prediction. The size of a prediction unit is selected from among various rectangular sizes of 4×4 pixels or more within a coding unit, such as 128×128 pixels, 64×128 pixels, 128×64 pixels, and 64×64 pixels.