An image coding apparatus that codes a moving picture segments each picture constituting the moving picture into macroblocks, and codes the moving picture on a macroblock basis. The image coding apparatus thus generates a bit stream representing the coded moving picture.
FIG. 44 is a diagram showing a structure of a picture to be coded.
The picture is segmented into macroblocks composed of 16×16 pixels, and coded. Here, a plurality of macroblocks included in the picture constitute a slice, and a plurality of slices constitute the picture. A structural unit of one row of macroblocks horizontally arranged from a left end to a right end of the picture is referred to as a macroblock line (MB line).
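The segmentation above can be illustrated with a minimal sketch that computes how many macroblocks make up one MB line and how many MB lines make up a picture. The function name `macroblock_grid` is hypothetical, and rounding up a dimension that is not a multiple of 16 is an assumption, not something stated in the text.

```python
import math

MB_SIZE = 16  # macroblock edge length in pixels, as stated in the text


def macroblock_grid(width, height, mb_size=MB_SIZE):
    """Return (macroblocks per MB line, number of MB lines) for a picture.

    Assumption: a dimension that is not a multiple of mb_size is rounded up,
    i.e. the last row or column of macroblocks is only partly covered.
    """
    mbs_per_line = math.ceil(width / mb_size)
    mb_lines = math.ceil(height / mb_size)
    return mbs_per_line, mb_lines


hd_grid = macroblock_grid(1920, 1080)
uhd_grid = macroblock_grid(3840, 2160)
```

Under these assumptions, a 1920×1080 picture has 120 macroblocks per MB line and 68 MB lines.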
FIG. 45 is a diagram showing a structure of a bit stream.
The bit stream is hierarchical and, as shown in FIG. 45(a), includes a header and a plurality of pictures arranged in coding order. The header includes, for example, a sequence parameter set (SPS) that is referred to when decoding a sequence composed of the plurality of pictures. As shown in FIG. 45(b), each of the coded pictures includes a header and a plurality of slices. Likewise, as shown in FIG. 45(c), each of the slices includes a header and a plurality of macroblocks (MBs). The header at the beginning of the picture in FIG. 45(b) includes, for example, a picture parameter set (PPS) that is referred to when decoding the picture.
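The hierarchy of FIG. 45 can be sketched as nested data structures. This is only an illustration: the class and field names are hypothetical, and the headers are modeled as plain dictionaries rather than as actual SPS or PPS syntax.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Macroblock:
    data: bytes = b""  # coded macroblock payload (placeholder)


@dataclass
class Slice:
    header: dict = field(default_factory=dict)
    macroblocks: List[Macroblock] = field(default_factory=list)


@dataclass
class CodedPicture:
    header: dict = field(default_factory=dict)  # may hold, e.g., a PPS
    slices: List[Slice] = field(default_factory=list)


@dataclass
class BitStream:
    header: dict = field(default_factory=dict)  # may hold, e.g., an SPS
    pictures: List[CodedPicture] = field(default_factory=list)  # coding order


def count_macroblocks(stream: BitStream) -> int:
    """Walk the hierarchy stream -> picture -> slice -> macroblock."""
    return sum(
        len(s.macroblocks) for picture in stream.pictures for s in picture.slices
    )


def make_example_stream() -> BitStream:
    """Build a toy stream: 2 pictures, 2 slices each, 3 macroblocks per slice."""
    pictures = [
        CodedPicture(
            header={"pps": {}},
            slices=[Slice(macroblocks=[Macroblock() for _ in range(3)]) for _ in range(2)],
        )
        for _ in range(2)
    ]
    return BitStream(header={"sps": {}}, pictures=pictures)
```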
FIG. 46 is a diagram showing a structure of a conventional image decoding apparatus.
An image decoding apparatus 1300 includes a memory 1310 and a decoding engine 1320. The memory 1310 includes a stream buffer 1311 having an area for storing a bit stream, and a frame memory 1312 having an area for storing decoded image data outputted by the decoding engine 1320. The image decoding apparatus 1300 obtains coded image data such as macroblocks and pictures included in the bit stream sequentially from the beginning side, and stores the coded image data into the stream buffer 1311.
The decoding engine 1320 sequentially reads the coded image data from the stream buffer 1311 in decoding order, decodes the coded image data, and stores the decoded image data generated by the decoding into the frame memory 1312. The decoding engine 1320 decodes the coded image data with reference to the decoded image data already stored in the frame memory 1312.
The decoded image data stored in the frame memory 1312 is outputted to a display device in display order, and displayed.
FIG. 47 is a diagram showing a structure of the decoding engine 1320.
The decoding engine 1320 includes an entropy decoding unit 1321, an inverse transform unit 1322, an adder 1323, a deblocking filter 1324, a motion compensation unit 1325, a weighted prediction unit 1326, an intra-picture prediction unit 1327, and a switch 1328.
The entropy decoding unit 1321 performs entropy decoding on coded image data to generate quantized data indicating quantized values, and outputs the quantized data to the inverse transform unit 1322.
The inverse transform unit 1322 performs inverse quantization, inverse orthogonal transform, and the like on the quantized data to transform the quantized data into difference image data.
The adder 1323 generates decoded image data by adding the difference image data outputted from the inverse transform unit 1322 and predicted image data outputted from either the weighted prediction unit 1326 or the intra-picture prediction unit 1327 via the switch 1328.
The deblocking filter 1324 removes coding distortion included in the decoded image data generated by the adder 1323, and stores the decoded image data without the coding distortion into the frame memory 1312.
The motion compensation unit 1325 reads the decoded image data stored in the frame memory 1312 and performs motion compensation thereon to generate predicted image data, and outputs the predicted image data to the weighted prediction unit 1326.
The weighted prediction unit 1326 applies a weight to the predicted image data outputted from the motion compensation unit 1325, and outputs the weighted predicted image data to the switch 1328.
The intra-picture prediction unit 1327 performs intra-picture prediction using the decoded image data generated by the adder 1323 to generate predicted image data, and outputs the predicted image data to the switch 1328.
In the case where the difference image data outputted from the inverse transform unit 1322 is generated by intra-picture prediction, the switch 1328 outputs the predicted image data that is outputted from the intra-picture prediction unit 1327, to the adder 1323. In the other case where the difference image data outputted from the inverse transform unit 1322 is generated by inter-picture prediction, the switch 1328 outputs the predicted image data that is outputted from the weighted prediction unit 1326, to the adder 1323.
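The roles of the switch 1328 and the adder 1323 can be sketched as two small functions. The function names are hypothetical, samples are modeled as flat lists of integers, and clipping the sum to an 8-bit range is an assumption that the text does not state.

```python
def select_prediction(is_intra, intra_pred, inter_pred):
    """Model of the switch: pass intra-picture prediction for intra-coded data,
    otherwise pass the (weighted) inter-picture prediction."""
    return intra_pred if is_intra else inter_pred


def add_prediction(diff, pred, max_value=255):
    """Model of the adder: difference image data plus predicted image data.

    Assumption: samples are 8-bit, so the sum is clipped to [0, max_value].
    """
    return [min(max(d + p, 0), max_value) for d, p in zip(diff, pred)]


def reconstruct(diff, is_intra, intra_pred, inter_pred):
    """Combine the switch and the adder to produce decoded samples."""
    return add_prediction(diff, select_prediction(is_intra, intra_pred, inter_pred))
```

In this sketch the deblocking filter is omitted; it would operate on the output of `reconstruct` before the data is stored in the frame memory.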
Recent years have seen increases in resolution and frame rate of images. HD (High Definition) image coding and decoding are currently implemented, but image coding and decoding using higher resolutions and higher frame rates are also expected. More specifically, moving pictures having a so-called 4k2k resolution are under consideration for practical use.
FIG. 48 is an illustration of HD and 4k2k.
HD bit streams are distributed via terrestrial digital broadcasting, BS digital broadcasting, and the like, where pictures having a resolution of “1920×1080 pixels” are decoded and displayed at a frame rate of 30 frames per second. 4k2k bit streams are scheduled to be experimentally distributed via high BS digital broadcasting from 2011, where pictures having a resolution of “3840×2160 pixels” are decoded and displayed at a frame rate of 60 frames per second.
In short, a 4k2k bit stream has vertical and horizontal resolutions two times those of an HD bit stream, and has a frame rate two times that of the HD bit stream.
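As a worked check of this comparison, the following sketch computes the number of pixels per second for both formats from the resolutions and frame rates given above. The function name `pixel_rate` is hypothetical, and pixel rate is used here only as a rough proxy for decoding load.

```python
def pixel_rate(width, height, frames_per_second):
    """Pixels that must be decoded per second."""
    return width * height * frames_per_second


HD_RATE = pixel_rate(1920, 1080, 30)
UHD_RATE = pixel_rate(3840, 2160, 60)

# Doubling the width, the height, and the frame rate gives 2 * 2 * 2 = 8.
RATIO = UHD_RATE // HD_RATE
```

By this measure, a 4k2k bit stream requires eight times the pixel throughput of an HD bit stream.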
Furthermore, coding and decoding of 8k4k images (7680×4320 pixels) having vertical and horizontal resolutions two times those of 4k2k images are expected to come under consideration.
Such increases in resolution and frame rate of images inevitably result in significant increases in processing load that is placed on decoding engines of image decoding apparatuses. For example, in the case of decoding a 4k2k bit stream, the decoding engine 1320 of the image decoding apparatus 1300 shown in FIG. 46 requires an operation frequency of 1 GHz or more that is practically difficult to achieve. This is why parallel decoding processing is being considered.
FIG. 49 is a block diagram showing a structure of an image decoding apparatus that executes parallel decoding processing.
An image decoding apparatus 1400 includes the memory 1310 and a decoder 1420. The decoder 1420 includes N decoding engines 1421 (for example, N=4) which function similarly to the decoding engine 1320 shown in FIGS. 46 and 47. Each of the N decoding engines 1421 (first to N-th decoding engine 1421) extracts a portion to be processed by the decoding engine 1421 itself from a bit stream stored in the stream buffer 1311, decodes the extracted portion, and outputs it to the frame memory 1312.
Each of FIGS. 50A and 50B is an illustration of an example of parallel decoding processing.
As an example, the image decoding apparatus 1400 obtains a bit stream composed of four area bit streams, and stores the obtained bit stream in the stream buffer 1311. Each of the four area bit streams is an independent stream and, as shown in FIG. 50A, represents a moving picture in one of four areas generated by dividing one picture into four equal parts. Each of the four decoding engines 1421 (that is, N=4) in the image decoding apparatus 1400 extracts the area bit stream to be processed by the decoding engine 1421 itself from the stream buffer 1311, decodes the extracted area bit stream, and causes the moving picture to be displayed in the area corresponding to the area bit stream.
As another example, the image decoding apparatus 1400 obtains a bit stream including pictures each composed of four slices, and stores the obtained bit stream in the stream buffer 1311. The four slices are generated by dividing one picture into four equal parts in the vertical direction, as shown in FIG. 50B. Each of the four decoding engines 1421 (that is, N=4) in the image decoding apparatus 1400 extracts the slice to be processed by the decoding engine 1421 itself from the stream buffer 1311, decodes the extracted slice, and causes the moving picture to be displayed in the area corresponding to the slice.
However, generating one bit stream as four area bit streams and decoding the four area bit streams as shown in FIG. 50A requires restrictions on moving picture coding methods. That is, the whole system needs to be changed, which incurs a heavy load.
Likewise, dividing one picture into four equal parts and coding and decoding the four parts as slices as shown in FIG. 50B also requires restrictions on moving picture coding methods.
In detail, in MPEG-2 (Moving Picture Experts Group phase 2), which is a moving picture coding and decoding standard, slices are always separated at boundaries of MB lines. In H.264/AVC, the sizes and positions of slices set in pictures are arbitrary, and a picture may contain only one slice. Accordingly, uniformly setting the positions and sizes of slices as shown in FIG. 50B necessitates changes in the whole system including an operational standard for digital broadcasting systems, which incurs a heavy load.
This has led to the study of an image decoding apparatus that performs parallel decoding on a bit stream representing a moving picture coded according to the operational standard, with no need to restrict or change the operational standard. For example, this image decoding apparatus segments each picture in a bit stream generated according to MPEG-2 into slices, and performs parallel decoding processing on the slices.
Such an image decoding apparatus, however, cannot appropriately execute parallel decoding processing. That is, since the image decoding apparatus segments each picture into slices and decodes these slices in parallel, the image decoding apparatus cannot appropriately execute parallel decoding processing on a bit stream, such as an H.264/AVC bit stream, where the sizes and positions of slices are arbitrarily set. In other words, unequal loads are placed on a plurality of decoding engines included in the image decoding apparatus, making it impossible to achieve decoding that effectively utilizes parallel processing. For example, in the case where one picture is composed of one slice, the picture cannot be segmented, and one decoding engine is required to decode the whole picture.
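The load imbalance described above can be illustrated with a minimal sketch. It assumes that slices are handed to decoding engines in round-robin order and that the load of a slice is proportional to its macroblock count; both are simplifications introduced here, and the function name `engine_loads` is hypothetical.

```python
def engine_loads(slice_sizes, n_engines):
    """Return the number of macroblocks each engine must decode.

    slice_sizes: macroblock count of each slice in the picture, in order.
    Assumption: slice i is assigned to engine i % n_engines.
    """
    loads = [0] * n_engines
    for index, size in enumerate(slice_sizes):
        loads[index % n_engines] += size
    return loads


# A picture of 8160 macroblocks coded as four equal slices balances well.
balanced = engine_loads([2040, 2040, 2040, 2040], 4)

# The same picture coded as a single slice leaves three engines idle.
single_slice = engine_loads([8160], 4)
```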
In view of this, there is proposed an image decoding apparatus that performs variable length decoding on a bit stream generated according to H.264/AVC, segments each picture obtained by the variable length decoding into MB lines, and decodes the MB lines in parallel (for example, see Patent Literature (PTL) 1).
FIG. 51 is an illustration of decoding processing performed by the image decoding apparatus in Patent Literature 1.
In this image decoding apparatus, a first decoding engine decodes the 0th MB line in a picture, a second decoding engine decodes the first MB line in the picture, and a third decoding engine decodes the second MB line in the picture.
Each decoding engine sequentially decodes macroblocks from the left end to the right end of the corresponding MB line. In macroblock decoding, a decoding target macroblock depends on the macroblocks located to its left, upper left, above, and upper right. That is, when decoding the macroblock, each decoding engine needs information obtained by decoding the left, upper-left, upper, and upper-right macroblocks of the decoding target macroblock. Hence, each decoding engine starts decoding the decoding target macroblock only after the decoding of these macroblocks is completed. In the case where any of the left, upper-left, upper, and upper-right macroblocks is not present, each decoding engine starts decoding the decoding target macroblock after the decoding of the remaining macroblocks is completed. Thus, the image decoding apparatus executes parallel decoding on macroblocks that are located two macroblocks apart horizontally and one macroblock apart vertically from each other.
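The dependency rule above can be turned into a small scheduling sketch. It assumes, for illustration only, that every macroblock takes one time step to decode and that one decoding engine is available per MB line; the function name `wavefront_schedule` is hypothetical.

```python
def wavefront_schedule(mbs_per_line, mb_lines):
    """Return {(x, y): earliest start step} for every macroblock.

    A macroblock at column x of MB line y may start once its left, upper-left,
    upper, and upper-right neighbors (those that exist) have finished.
    Assumption: each macroblock takes exactly one time step.
    """
    start = {}
    finish = {}
    for y in range(mb_lines):
        for x in range(mbs_per_line):
            neighbors = [(x - 1, y), (x - 1, y - 1), (x, y - 1), (x + 1, y - 1)]
            # Neighbors outside the picture are absent from `finish` and skipped.
            ready = max((finish[n] for n in neighbors if n in finish), default=0)
            start[(x, y)] = ready
            finish[(x, y)] = ready + 1
    return start


schedule = wavefront_schedule(mbs_per_line=5, mb_lines=3)

# Macroblocks decoded at the same time step, grouped by step.
by_step = {}
for (x, y), step in schedule.items():
    by_step.setdefault(step, []).append((x, y))
```

Under these assumptions, and for pictures at least two macroblocks wide, the macroblock at column x of MB line y starts at step x + 2y, so macroblocks decoded at the same step are offset by two columns per MB line, which matches the spacing stated above.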
However, there is an instance where the image decoding apparatus in Patent Literature 1 segments a slice included in an H.264/AVC picture. In such a case, each decoding engine needs to have a function of appropriately recognizing a segment of the slice, as the slice. This complicates the structure of the image decoding apparatus.
In view of this, there is proposed an image decoding apparatus that appropriately executes parallel decoding processing by a simple structure (for example, see Patent Literature 2).
FIG. 52 is a block diagram showing a structure of the image decoding apparatus in Patent Literature 2.
An image decoding apparatus 1100 in Patent Literature 2 includes a memory 1150 including a stream buffer 1151, a segment stream buffer 1152, and a frame memory 1153, and a decoder 1110 including a stream segmentation unit 1130 and N decoding engines 1120. The stream segmentation unit 1130 segments, for each coded picture included in a bit stream stored in the stream buffer 1151, the coded picture into a plurality of macroblock lines, and assigns each of the plurality of macroblock lines to a portion of a corresponding one of N segment streams to be generated (N is an integer equal to or greater than 2), thereby generating the N segment streams. The N decoding engines 1120 obtain the N segment streams from the stream segmentation unit 1130 via the segment stream buffer 1152, and decode the N segment streams in parallel. Moreover, in the case where, when generating the N segment streams, a slice included in the coded picture is segmented into a plurality of slice portions and assigned to a plurality of segment streams, the stream segmentation unit 1130 reconstructs, for each segment stream, a slice portion group made up of one or more slice portions assigned to the segment stream, as a new slice.
Thus, a coded picture is segmented into a plurality of macroblock lines, and each of the plurality of macroblock lines is assigned to and decoded by a corresponding one of the N decoding engines 1120 as a portion of a segment stream. This enables the N decoding engines 1120 to equally share the load of decoding processing, making it possible to appropriately execute parallel decoding processing. For example, even in the case where an H.264/AVC coded picture is composed of one slice, the coded picture is segmented into a plurality of macroblock lines, so that the load of decoding the slice is not placed on one decoding engine 1120 but equally shared by the N decoding engines 1120.
When a coded picture is segmented into a plurality of macroblock lines, there is a possibility that a slice extending over a plurality of macroblock lines is segmented into a plurality of slice portions and these slice portions are assigned to different segment streams. In this case, the whole slice in the coded picture is not included in one segment stream. Instead, a slice portion group made up of one or more slice portions which are segments of the slice is included in each segment stream. There is also a possibility that such a slice portion group does not have a header indicating the beginning of the slice portion group and end information indicating the end of the slice portion group.
Accordingly, the image decoding apparatus 1100 in Patent Literature 2 reconstructs the slice portion group as a new slice. As a result, the decoding engine 1120 that decodes the segment stream including the slice portion group can easily recognize the slice portion group as a new slice and appropriately decode the slice portion group, without requiring special processing for recognizing the slice portion group and appropriately decoding the slice portion group. That is, in the image decoding apparatus 1100 in Patent Literature 2, there is no need to provide each of the N decoding engines 1120 with a function or a structure for such special processing. Since conventional decoding circuits can be used as the decoding engines 1120 for decoding the segment streams, the whole structure of the image decoding apparatus can be simplified.
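The segmentation into segment streams and the grouping of slice portions can be sketched as follows. This is a simplified model and not the method of Patent Literature 2 itself: slices are described only by macroblock address ranges in raster order, MB line y is assumed to go to segment stream y mod N, and header and end-information reconstruction is not modeled. The function name `segment_picture` is hypothetical.

```python
from collections import defaultdict


def segment_picture(slices, mbs_per_line, n_streams):
    """Split slices at MB-line boundaries and distribute the portions.

    slices: list of (first_mb, last_mb) inclusive macroblock addresses.
    Returns one dict per segment stream mapping an original slice index to its
    slice portion group, a list of (mb_line, first_x, last_x) portions.
    Assumption: MB line y is assigned to segment stream y % n_streams.
    """
    streams = [defaultdict(list) for _ in range(n_streams)]
    for slice_index, (first_mb, last_mb) in enumerate(slices):
        mb = first_mb
        while mb <= last_mb:
            line = mb // mbs_per_line
            # A portion ends at the end of the slice or of the MB line.
            portion_end = min(last_mb, (line + 1) * mbs_per_line - 1)
            portion = (line, mb % mbs_per_line, portion_end % mbs_per_line)
            streams[line % n_streams][slice_index].append(portion)
            mb = portion_end + 1
    return [dict(stream) for stream in streams]


# One slice covering a picture of 4 MB lines, each 4 macroblocks wide.
one_slice = segment_picture([(0, 15)], mbs_per_line=4, n_streams=2)

# Two slices, the first ending in the middle of MB line 1.
two_slices = segment_picture([(0, 5), (6, 15)], mbs_per_line=4, n_streams=2)
```

In this model, each list stored under a slice index corresponds to one slice portion group, which is the unit that would be reconstructed as a new slice for that segment stream.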
The image decoding apparatus in Patent Literature 1 also has a problem that its performance improvement is limited, because the image decoding apparatus is capable of parallel decoding processing of macroblocks but incapable of parallel decoding processing of variable length codes.
In view of this, there is also proposed an image decoding apparatus that performs parallel decoding processing of variable length codes (for example, see Patent Literature 3).
The image decoding apparatus in Patent Literature 3 performs variable length decoding processing on a plurality of pictures or slices included in a bit stream, and stores intermediate data obtained by the variable length decoding processing in an intermediate data buffer. The image decoding apparatus extracts each picture from the intermediate data stored in the intermediate data buffer, and performs parallel decoding processing on the picture on an MB line basis using a plurality of image decoding processing units.