In recent years, several video compression standards have come into wide use. Examples of such standards include H.261 and H.263 by the ITU-T (International Telecommunication Union Telecommunication Standardization Sector); MPEG-1, MPEG-2, MPEG-4, etc. (MPEG: Moving Picture Experts Group) by the ISO/IEC (International Organization for Standardization/International Electrotechnical Commission); and H.264/MPEG-4 AVC (Advanced Video Coding) by the JVT (Joint Video Team), a joint team of the ITU-T and the MPEG. Furthermore, next-generation video compression techniques are now being studied by the ITU-T, the ISO/IEC, etc.
One of the important elements of an image compression technique is orthogonal transform, which is performed to reduce redundancy in the spatial direction. Orthogonal transform exploits the property that the correlation between adjacent pixels in an image signal is high: transforming the signal using appropriate orthogonal transform bases distributes the energy unevenly among the transformed coefficients, and allocating bits according to that distribution reduces the amount of bits to be transmitted. For example, in image compression standards such as H.261, H.263, MPEG-1, MPEG-2, MPEG-4, and H.264 (MPEG-4 AVC), discrete cosine transform (hereinafter also referred to as “DCT”) is used as such orthogonal transform. It is known that DCT provides, especially for a natural image signal, performance close to that of the Karhunen-Loeve transform (hereinafter also referred to as “KLT”), which is the optimum transform. When a natural image is transformed by DCT, energy is concentrated in the low-frequency region, and little energy remains in the high-frequency region.
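The energy concentration described above can be illustrated with a minimal sketch. It assumes a one-dimensional, 8-sample orthonormal DCT-II and a hypothetical row of smoothly varying pixel values; the function names and the sample data are illustrative only and are not taken from any of the cited standards.

```python
import math

def dct_matrix(n):
    """Return the n x n orthonormal DCT-II matrix (each row is one basis vector)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def forward(basis, samples):
    """Project the samples onto each basis vector."""
    return [sum(b * v for b, v in zip(row, samples)) for row in basis]

def inverse(basis, coeffs):
    """Invert an orthonormal transform by applying the transposed matrix."""
    n = len(basis)
    return [sum(basis[k][i] * coeffs[k] for k in range(n)) for i in range(n)]

# Hypothetical row of adjacent, highly correlated pixel values (assumption).
pixels = [100, 102, 104, 107, 109, 112, 114, 117]

basis = dct_matrix(len(pixels))
coeffs = forward(basis, pixels)

# Share of the total energy carried by the two lowest-frequency coefficients.
energy = [c * c for c in coeffs]
low_share = sum(energy[:2]) / sum(energy)
print(f"energy share of the two lowest-frequency coefficients: {low_share:.4f}")
```

Because almost all of the energy falls in the first few coefficients for such a signal, an encoder can spend few or no bits on the remaining ones.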
On the other hand, the orthogonal transform proposed by ITU-T and ISO/IEC as being one of the next generation image compression techniques utilizes the Karhunen-Loeve transform that is the optimum transform in order to achieve a higher coding efficiency (Non Patent Literature (NPL) 1).
In general, when using the Karhunen-Loeve transform as orthogonal transform, information about orthogonal transform bases of a current image must be transmitted because such orthogonal transform bases depend on the content of the current image. For this reason, the amount of information required to be transmitted in the Karhunen-Loeve transform is larger, by the amount of the orthogonal transform bases, than in orthogonal transform such as DCT that does not require transmission of information about such bases. Here, orthogonal transform bases means an orthogonal transform matrix, and thus may be referred to as an orthogonal transform basis matrix.
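As a rough illustration of why the bases depend on the image content, the following sketch derives a KLT basis for the simplest possible case: pairs of adjacent pixels, where the eigenvectors of the 2x2 covariance matrix have a closed form. This two-point simplification, the function name `klt_2pt`, and the sample data are assumptions made for illustration; it is not the method of NPL 1.

```python
import math

def klt_2pt(pairs):
    """Return the 2x2 KLT basis (rows are basis vectors) for a list of 2-sample pairs.

    The first row is the principal eigenvector of the sample covariance matrix,
    obtained from the closed-form principal-axis angle of a symmetric 2x2 matrix.
    """
    n = len(pairs)
    mx = sum(p[0] for p in pairs) / n
    my = sum(p[1] for p in pairs) / n
    cxx = sum((p[0] - mx) ** 2 for p in pairs) / n
    cyy = sum((p[1] - my) ** 2 for p in pairs) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in pairs) / n
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

# Two hypothetical "images": one with fully correlated neighbors,
# one where only the first pixel of each pair varies.
basis_a = klt_2pt([(1, 1), (2, 2), (3, 3), (4, 4)])
basis_b = klt_2pt([(1, 0), (2, 0), (3, 0), (4, 0)])

print(basis_a)  # differs from basis_b: the basis follows the data
print(basis_b)
```

Since the two inputs yield different bases, a decoder cannot reconstruct the signal unless it is told, or can infer, which basis the encoder used. A fixed transform such as DCT has no such requirement.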
To address this, Patent Literature 1 discloses an approach for increasing coding efficiency by deriving information about the bases from a reference image in motion compensation, instead of including such information about the orthogonal transform bases in a coded stream. In addition, Patent Literature 1 discloses an approach for reducing the amount of orthogonal transform basis information required to be transmitted: orthogonal transform bases to be used for decoding are defined in advance for each of the intra prediction modes included in a coded stream, and switching between these orthogonal transform bases is performed according to the prediction mode in the intra prediction.
FIG. 34 is a block diagram showing a structure of a conventional image decoding apparatus according to Patent Literature 1. An image decoding apparatus 1010 shown in FIG. 34 includes a motion compensation unit 1207, an inverse quantization unit 1215, an inverse orthogonal transform unit 1216, a variable length decoding unit 1220, a frame memory 1222, and a transform basis accumulation unit 1251. When a coded image stream 1214 is received by this image decoding apparatus 1010, the variable length decoding unit 1220 detects synchronization words indicating the starting portions of the frames included in the coded image stream 1214. Next, the variable length decoding unit 1220 recovers, for each of the macroblocks, orthogonal transform basis ID information 1250, a motion vector 1205, and a quantized orthogonal transform coefficient 1221 used for each unit of orthogonal transform.
The motion vector 1205 is transmitted to the motion compensation unit 1207. Here, the motion compensation unit 1207 extracts, from the frame memory 1222, a prediction image 1206 that is an image portion including a motion corresponding to the motion vector 1205. The quantized orthogonal transform coefficient 1221 is decoded through processing by the inverse quantization unit 1215 and the inverse orthogonal transform unit 1216, and is added to the prediction image 1206 to form a final decoded image 1217.
The transform basis accumulation unit 1251 stores the same orthogonal transform basis set Ai as the one stored at the image coding apparatus side. Based on the orthogonal transform basis ID information 1250, an orthogonal transform basis 1219 is selected, and the selected orthogonal transform basis 1219 is output to the inverse orthogonal transform unit 1216. The inverse orthogonal transform unit 1216 inversely transforms the orthogonal transform coefficient using the selected orthogonal transform basis 1219 to recover a signal on an image space. The decoded image 1217 is output to a display device with predetermined display timing, which results in reproduction of a video.
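The decoding flow described above (basis selection by ID, inverse quantization, inverse orthogonal transform, and addition to the prediction image) can be sketched as follows. This is a minimal one-dimensional, two-sample illustration under stated assumptions: the table `BASIS_SET`, the function `decode_block`, the uniform quantization step, and the two example bases are all hypothetical and merely stand in for the stored basis set Ai and the units of FIG. 34.

```python
import math

R = math.sqrt(0.5)

# Hypothetical stand-in for the basis set held by the transform basis
# accumulation unit: basis ID -> orthonormal matrix (rows are basis vectors).
BASIS_SET = {
    0: [[1.0, 0.0], [0.0, 1.0]],
    1: [[R, R], [R, -R]],
}

def decode_block(basis_id, quantized, qstep, prediction):
    """Reconstruct one block from quantized coefficients and a prediction."""
    basis = BASIS_SET[basis_id]               # select the basis by its ID
    coeffs = [q * qstep for q in quantized]   # inverse quantization (uniform step assumed)
    n = len(basis)
    # Inverse orthogonal transform: apply the transpose of the orthonormal basis.
    residual = [sum(basis[k][i] * coeffs[k] for k in range(n)) for i in range(n)]
    # Add the residual to the motion-compensated prediction.
    return [p + r for p, r in zip(prediction, residual)]

decoded_identity = decode_block(0, [1, -1], 2, [10, 10])
decoded_rotated = decode_block(1, [1, -1], 2, [10, 10])
print(decoded_identity, decoded_rotated)
```

Note that every call reads a full basis matrix from the table. In hardware this lookup corresponds to a memory read from the transform basis accumulation unit for each unit of orthogonal transform.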
As described above, decoding an image stream coded using KLT requires storing orthogonal transform bases in the transform basis accumulation unit 1251 and reading the orthogonal transform bases from the transform basis accumulation unit 1251. For this reason, access for reading and storing the orthogonal transform basis information significantly increases the memory bandwidth and memory access latency required for the transform basis accumulation unit 1251.