The present disclosure relates to image reproducing techniques, and more particularly to image reproduction for decoding and reproducing moving picture streams encoded by inter-picture prediction encoding which compresses the amount of information by reducing time-directional redundancy.
In recent years, moving image encoding techniques such as MPEG-2 (ISO/IEC 13818-2), MPEG-4 (ISO/IEC 14496-2), and H.264 (ISO/IEC 14496-10) have been actively researched and applied to various fields such as computers, communications, household AV equipment, and broadcasting.
In such moving image encoding techniques, the amount of information is compressed by two types of encoding: intra-picture prediction encoding, which reduces spatial redundancy within a single picture, and inter-picture prediction encoding, which reduces temporal redundancy by generating a predictive image with reference to previous and subsequent pictures (i.e., reference images) that have already been encoded and decoded, and then encoding the difference between the predictive image and the picture to be encoded. Decoding and reproducing a moving picture stream that has been subjected to inter-picture prediction encoding requires a reference image memory for temporarily storing the reference pictures used to generate the predictive image. The reference image memory is generally an external memory such as a DRAM, or a memory embedded in a system LSI. Generating the predictive image involves memory access to the reference image memory.
On the other hand, in recent years, high-definition flat panel displays such as large-screen plasma displays and liquid crystal televisions have spread rapidly on the market. Meanwhile, small household cameras such as digital video cameras and digital still cameras that can record high-definition television (HDTV) moving pictures have become available at affordable prices of about tens of thousands of yen. In order to decode and reproduce moving picture streams captured by such a camera, a reference image memory with a high memory bandwidth needs to be mounted so that the moving picture streams are decoded normally even when frequent memory access (i.e., traffic) to the reference image memory occurs.
In order to secure a high memory bandwidth, it is necessary to use, for example, a plurality of DRAMs with a data bit width of 32 bits, or high-performance DRAMs such as low power double data rate 2 (LPDDR2) SDRAMs, which operate at a high operating frequency. In either case, however, packaging costs and power consumption increase, which makes it difficult to reduce manufacturing costs and power consumption. In particular, reductions in cost and power consumption are strongly demanded in small household cameras such as digital video cameras and digital still cameras operating on small batteries. Therefore, reducing the cost and power consumption of decoding moving picture streams is being actively researched.
Next, a general method of decoding and reproducing a moving picture stream will be described below.
FIG. 11 simply illustrates a conventional image reproducing device, which receives and sequentially decodes a moving picture stream and outputs a reproduction image. The moving picture stream to be reproduced is input from an input terminal 2. In a decoder 603, a picture layer, a slice layer, and a macroblock layer are sequentially decoded in each picture. The decoded picture is output from an output terminal 13 to a display controller (not shown). The pictures (e.g., I/P-pictures) to be left as reference images are written and temporarily stored in a reference image memory 5. Where the reference image memory 5 has, for example, a 32-bit data bus, the picture is written such that 4 pixels (i.e., 8 bits×4 pixels=32 bits) are stored in a single address. Where the reference image memory 5 has a 64-bit data bus, the picture is written such that 8 pixels (i.e., 8 bits×8 pixels=64 bits) are stored in a single address.
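The storage layout described above can be sketched as follows. This is only an illustrative sketch: the function name pack_pixels and the byte order within a word (first pixel in the least significant byte) are assumptions and are not specified in the description above.

```python
def pack_pixels(pixels, bus_bits=32):
    """Pack 8-bit pixels into memory words, one word per address.

    bus_bits=32 gives 4 pixels per address; bus_bits=64 gives 8.
    Byte order inside a word is an assumption (first pixel in the low byte).
    """
    per_word = bus_bits // 8
    if len(pixels) % per_word:
        raise ValueError("row length must be a multiple of the pixels per word")
    words = []
    for i in range(0, len(pixels), per_word):
        word = 0
        for j, p in enumerate(pixels[i:i + per_word]):
            word |= (p & 0xFF) << (8 * j)
        words.append(word)
    return words


row = list(range(16))            # one 16-pixel row of a macroblock
words32 = pack_pixels(row, 32)   # 4 addresses of 4 pixels each
words64 = pack_pixels(row, 64)   # 2 addresses of 8 pixels each
```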
In decoding a picture (e.g., a P/B-picture), which has been subjected to inter-picture prediction encoding, reference images stored in the reference image memory 5 are sequentially read to generate a predictive image. The predictive image is added to a decoded difference value and output from the output terminal 13 to the display controller (not shown).
In reading each of the reference images, which are temporarily stored in the reference image memory 5 in the above-described manner, the initial read address of the reference image in the two-dimensional space is calculated based on the position of the macroblock to be decoded on the screen and the motion vector value of the macroblock. The initial read address is then converted to a read address (e.g., a 4-pixel address in a memory with a 32-bit data bus) of the reference image memory 5.
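The address calculation described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the motion vector is taken in whole pixels, the reference image is assumed to be stored in raster order with a line stride that is a multiple of the pixels per address, and handling of positions outside the picture is omitted.

```python
def initial_read_address(mb_x, mb_y, mv_x, mv_y, mb_size=16):
    """Top-left pixel of the reference block in the two-dimensional space.

    mb_x, mb_y: macroblock position on the screen, in macroblock units.
    mv_x, mv_y: motion vector, assumed here to be in whole pixels.
    """
    return mb_x * mb_size + mv_x, mb_y * mb_size + mv_y


def to_memory_address(px, py, stride_pixels, pixels_per_word=4):
    """Convert a 2-D pixel position to (memory address, pixel offset in the word).

    Assumes raster-order storage; pixels_per_word=4 corresponds to a
    32-bit data bus with 8-bit pixels.
    """
    linear = py * stride_pixels + px
    return linear // pixels_per_word, linear % pixels_per_word


px, py = initial_read_address(mb_x=3, mb_y=2, mv_x=5, mv_y=-1)
address, offset = to_memory_address(px, py, stride_pixels=1920)
```

A nonzero offset means the initial read address does not fall on a boundary of the addresses read from the memory, so the first word read contains pixels that are not needed.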
FIGS. 13A and 13B illustrate examples. Assume that the calculated initial read address of the reference image in the two-dimensional space coincides with a boundary of the addresses read from the reference image memory 5 (FIG. 13A). Reading for generating a predictive image of 16×16 pixels then produces read traffic of 256 bytes (64 addresses of 4 pixels each), and no transfer contains a pixel that is invalid for the reading. On the other hand, assume that the calculated initial read address of the reference image in the two-dimensional space does not coincide with a boundary of the addresses read from the reference image memory 5 (FIG. 13B). Reading for generating a predictive image of 16×16 pixels then produces read traffic of 320 bytes (80 addresses of 4 pixels each), so that many transfers contain invalid pixels, which increases the overhead in reading.
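The two figures above can be reproduced with a short calculation. This is a sketch only: the function name read_traffic is hypothetical, and it assumes 8-bit pixels, 4 pixels per address (a 32-bit data bus), and whole-pixel block positions.

```python
def read_traffic(x_offset, width, height, pixels_per_word=4):
    """Return (addresses read, bytes read) for a width x height block.

    x_offset is the horizontal pixel position of the block's left edge;
    only its value modulo pixels_per_word affects the result.
    """
    first_word = x_offset // pixels_per_word
    last_word = (x_offset + width - 1) // pixels_per_word
    addresses = (last_word - first_word + 1) * height
    return addresses, addresses * pixels_per_word  # one byte per pixel


aligned = read_traffic(x_offset=0, width=16, height=16)     # (64, 256)
misaligned = read_traffic(x_offset=1, width=16, height=16)  # (80, 320)
```

In the misaligned case each of the 16 rows touches 5 addresses instead of 4, so 64 of the 320 bytes transferred are invalid pixels.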
According to the MPEG-2 standard, motion compensation is performed in relatively large blocks of 16×16 pixels in generating the predictive image, so the overhead in reading is not so problematic. By contrast, according to the MPEG-4 standard, motion compensation is performed not only in blocks of 16×16 pixels, but also in blocks of 8×8 pixels. Furthermore, according to the H.264 standard, motion compensation in finer blocks of 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, and 4×4 pixels is supported as shown in FIG. 12 to further improve the accuracy of the motion compensation, which increases the overhead in reading.
Specifically, for example, FIGS. 13C and 13D illustrate motion compensation in a size of 4×4 pixels. Assume that the calculated initial read address of the reference image in the two-dimensional space coincides with a boundary of the addresses read from the reference image memory 5 (FIG. 13C). Reading for generating a predictive image of 4×4 pixels then produces read traffic of 16 bytes (4 addresses of 4 pixels each), and no transfer contains an invalid pixel. On the other hand, assume that the calculated initial read address of the reference image in the two-dimensional space does not coincide with a boundary of the addresses read from the reference image memory 5 (FIG. 13D). Reading for generating a predictive image of 4×4 pixels then produces read traffic of 32 bytes (8 addresses of 4 pixels each), so that half of the transferred pixels are invalid, and the overhead in reading is larger than in the motion compensation in 16×16 pixels.
That is, decoding of a moving picture stream requires random access on a block-by-block basis from a given pixel position indicated by a motion vector in the reference image memory 5. This hinders efficient access to the reference image memory 5 depending on the pixel position, thereby increasing the overhead in the memory access. With a decrease in the size for motion compensation for generating a predictive image, the overhead in the memory access (read access) increases.
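The relation between the size for motion compensation and the read overhead can be illustrated with a small calculation. This is a sketch under simplifying assumptions: 8-bit pixels, 4 pixels per address, whole-pixel block positions, and the least favorable horizontal alignment. The function name worst_case_overhead is hypothetical.

```python
# Motion compensation block sizes (width, height) supported by H.264.
PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]


def worst_case_overhead(width, height, pixels_per_word=4):
    """Ratio of bytes fetched to bytes needed for the least favorable alignment."""
    worst_offset = pixels_per_word - 1
    words_per_row = (worst_offset + width - 1) // pixels_per_word + 1
    fetched = words_per_row * height * pixels_per_word
    return fetched / (width * height)


# Overhead ratio per block size, e.g. 1.25 for 16x16 and 2.0 for 4x4.
overhead = {size: worst_case_overhead(*size) for size in PARTITIONS}
```

Under these assumptions the ratio depends only on the block width: 1.25 for 16-pixel-wide blocks, 1.5 for 8-pixel-wide blocks, and 2.0 for 4-pixel-wide blocks.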
Specifications of the reference image memory such as the capacity, the bit width of the data bus, and the operating frequency are determined by the resolution and the frame rate of the moving picture streams supported by a decoder, which are defined by levels in standards such as MPEG-2 and H.264. The specifications of the reference image memory are determined on the assumption of the worst case possible within the scope of the standards. Therefore, small household cameras and similar devices that handle HDTV moving pictures need to include a high-performance reference image memory, which makes it difficult to reduce costs and power consumption.
The worst case here is, specifically, the case where pictures in a moving picture stream that have been subjected to inter-picture prediction encoding are encoded as follows.
    Access to the reference image memory for generating a predictive image is the transfer with the greatest overhead, including invalid pixels, in all macroblocks in a picture.
    The size for motion compensation in a macroblock is the smallest (e.g., 4×4 pixels in the H.264 standard) in all macroblocks in a picture.
    A B-picture, which is capable of forward prediction, backward prediction, and bidirectional prediction, is encoded by bidirectional prediction in all macroblocks in a picture.
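The three worst-case conditions above can be combined into a rough estimate of the read traffic. This is only a sketch of the simplified model used in this description: it counts luminance pixels only, assumes 8-bit pixels, 4 pixels per address, and whole-pixel block positions, and ignores any additional pixels a real decoder may read. The function names, the 1920×1088 picture size, and the 30 pictures per second are hypothetical values chosen for illustration.

```python
def worst_case_bytes_per_macroblock(block=4, pixels_per_word=4, directions=2):
    """Read traffic for one 16x16 macroblock under the worst-case conditions.

    block: smallest motion compensation size (4 for 4x4 in H.264).
    directions: 2 for bidirectional prediction (two reference images).
    """
    blocks_per_mb = (16 // block) ** 2
    # Least favorable alignment: each block row touches one extra address.
    words_per_row = (pixels_per_word - 1 + block - 1) // pixels_per_word + 1
    bytes_per_block = words_per_row * block * pixels_per_word
    return blocks_per_mb * bytes_per_block * directions


def worst_case_bytes_per_second(width, height, fps):
    """Worst-case read traffic if every picture were such a B-picture."""
    macroblocks = (width // 16) * (height // 16)
    return macroblocks * worst_case_bytes_per_macroblock() * fps


per_mb = worst_case_bytes_per_macroblock()              # 16 blocks x 32 bytes x 2
per_second = worst_case_bytes_per_second(1920, 1088, 30)
```

Under these assumptions a worst-case macroblock reads 1024 bytes, four times the 256 bytes needed for a single aligned 16×16 block.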
Japanese Patent Publication No. 2000-50272, Japanese Patent Publication No. 2000-78568, Japanese Patent Publication No. 2000-04440, and Japanese Patent No. 4384130 suggest techniques related to reduction in the bandwidth of a memory as solutions to the problem.
According to Japanese Patent Publication No. 2000-50272 and Japanese Patent Publication No. 2000-78568, the size of a decoded image is reduced by filtering, and then the image is stored in a reference image memory. An image enlarged by filtering the reduced-size image read from the reference image memory is used as a reference image.
Japanese Patent Publication No. 2000-04440 teaches storing an image, which is obtained by compressing the decoded image by Hadamard transform and quantization, in a reference image memory. An image, which is obtained by expanding the compressed image read from the reference image memory by inverse quantization and inverse Hadamard transform, is used as a reference image.
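The general idea of compressing by Hadamard transform and quantization, and expanding by the inverse operations, can be sketched as follows. This is a generic illustration and not the method of the cited publication: the 4-point transform, the uniform quantization step, and all names are assumptions made for this sketch.

```python
# 4-point Hadamard matrix; it is symmetric and H * H = 4 * I.
H = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def transform(vec):
    """Multiply a 4-element vector by H."""
    return [sum(h * v for h, v in zip(row, vec)) for row in H]


def compress(pixels, step):
    """Hadamard transform followed by uniform quantization (assumed quantizer)."""
    return [round(c / step) for c in transform(pixels)]


def expand(levels, step):
    """Inverse quantization followed by the inverse Hadamard transform."""
    coeffs = [q * step for q in levels]
    return [round(v / 4) for v in transform(coeffs)]


pixels = [10, 12, 11, 9]
lossless = expand(compress(pixels, step=1), step=1)  # step 1 loses nothing
lossy = expand(compress(pixels, step=4), step=4)     # coarser step, approximate
```

With a larger quantization step fewer bits are needed per coefficient, but the expanded image only approximates the decoded image, which is the source of the compression distortion in this kind of approach.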
Japanese Patent No. 4384130 teaches adaptively controlling compression distortion, which is caused by performing irreversible transform processing such as scale-down and compression, in storing a decoded image in a reference image memory so that the compression distortion is not temporally accumulated in decoding subsequent pictures.