1. Field of the Invention
The present invention relates to a method of compressing a video image, and more particularly to a decoding device and a decoding program for video image data in which decoding is performed at high speed, on a processor system including a cache memory, on a bitstream encoded by an interframe prediction method.
2. Description of the Related Art
In recent years, video image data has been expressed as digital data and used for purposes such as image distribution services, whether stored on digital media or delivered via network communications, digital television broadcasting using satellites, or the like. Generally, a video image contains a large amount of data. Therefore, a video image is compressed in accordance with a compression encoding technique in order to reduce the data amount, and is decoded for reproduction. In the MPEG (Moving Picture Experts Group) system, an international standard for video image compression, video image data is encoded into a bitstream.
FIG. 1 schematically explains the interframe prediction method. A frame image is divided into macroblocks, which are rectangular regions each having a size of 16 pixels vertically and 16 pixels horizontally. Interframe prediction is conducted for each macroblock. In the interframe prediction method, when the video image data is encoded, an offset value is encoded, the offset value being obtained by subtracting each pixel value of a macroblock for reference image from the pixel value of the corresponding pixel of an input macroblock.
A macroblock for reference image herein is image data of a region specified by motion vector information, extracted in the size of a macroblock out of an input image block. The motion vector information is given to a decoder together with the offset value. In the decoder, a decoded image macroblock can be obtained by synthesizing, that is, adding, the offset value and the pixel value of the macroblock for reference image.
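The relation between the offset value and the reference macroblock can be sketched as follows. This is a minimal illustration only, assuming frames held as plain lists of pixel rows, integer-pixel motion vectors, and no clipping of pixel values; the function names `extract_macroblock`, `encode_offsets`, and `decode_macroblock` are hypothetical and do not appear in the source.

```python
MB_SIZE = 16  # a macroblock is 16 x 16 pixels, as stated in the text


def extract_macroblock(frame, top, left, size=MB_SIZE):
    """Return the size x size block whose top-left pixel is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def encode_offsets(input_mb, reference_mb):
    """Encoder side: offset = input pixel - reference pixel, per pixel."""
    return [
        [p - r for p, r in zip(in_row, ref_row)]
        for in_row, ref_row in zip(input_mb, reference_mb)
    ]


def decode_macroblock(offsets, reference_mb):
    """Decoder side: decoded pixel = offset + reference pixel, per pixel."""
    return [
        [d + r for d, r in zip(off_row, ref_row)]
        for off_row, ref_row in zip(offsets, reference_mb)
    ]


# Toy 32 x 32 reference frame (assumed data, for illustration only).
reference_frame = [[(y * 32 + x) % 200 for x in range(32)] for y in range(32)]

# The motion vector is assumed to point at row 3, column 5 of the reference frame.
reference_mb = extract_macroblock(reference_frame, top=3, left=5)
input_mb = [[value + 1 for value in row] for row in reference_mb]

offsets = encode_offsets(input_mb, reference_mb)
decoded_mb = decode_macroblock(offsets, reference_mb)
```

In this toy case the decoder recovers the input macroblock exactly, because only the offsets and the motion vector are needed once the reference pixels are available.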
FIG. 2 explains the above decoding processes, in other words, motion compensation processing. FIG. 3 is a flow chart of the motion compensation processing of a conventional example. Referring first to the flow chart of FIG. 3, an encoded bitstream, i.e., a bitstream which has been encoded and transmitted, is analyzed in step S1. In step S2, a motion vector is obtained. In step S3, the position of a reference macroblock on a reference frame corresponding to the above motion vector is determined. In step S4, a preload of the reference macroblock is conducted. This preload is conducted in order to load the image data of the reference macroblock from an external memory when that image data is not stored in a cache memory in the processor. Generally, a video image contains a very large amount of data, and it is rare that the pixel data corresponding to all of the pixels in a reference macroblock are stored in the cache memory. For this reason, a preload instruction is issued at the above timing in order to avoid a decline of performance due to a cache miss.
Next, in step S5, the offset value between the macroblocks, in other words, the offset value between the macroblock for input image and the macroblock for reference image described with reference to FIG. 1, is obtained from the bitstream. In step S6, a decoded macroblock is created by synthesizing the pixel data of the offset value and the pixel data of the reference macroblock. In step S7, it is determined whether or not the decoding processes for all of the macroblocks on the frame are completed. If they are not completed, the processes in step S2 and the following steps are repeated for the macroblock in the next position to be decoded. When it is determined that the decoding processes for all of the macroblocks on the frame are completed, the decoded frame image is output in step S8 and the processing ends.
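The flow of steps S1 to S8 can be sketched as a simple loop. This is an illustration under stated assumptions, not the conventional decoder itself: the bitstream analysis of step S1 is assumed to have already produced one record per macroblock in raster order, each record being a dictionary with a motion vector `"mv"` and a 16 x 16 array `"offsets"`, and the preload of step S4 is modeled as an optional callback. The names `decode_frame`, `records`, and `preload` are hypothetical.

```python
MB = 16  # macroblock edge length in pixels, as in the text


def decode_frame(records, reference_frame, width, height, preload=None):
    """Decode one frame following the order of steps S1 to S8 (sketch)."""
    decoded = [[0] * width for _ in range(height)]
    it = iter(records)                      # S1: bitstream assumed already analyzed
    for mb_top in range(0, height, MB):
        for mb_left in range(0, width, MB):
            record = next(it)
            dx, dy = record["mv"]           # S2: obtain the motion vector
            ref_top = mb_top + dy           # S3: position of the reference macroblock
            ref_left = mb_left + dx
            if preload is not None:
                preload(ref_top, ref_left)  # S4: request a preload (modeled as a callback)
            offsets = record["offsets"]     # S5: obtain the offset values
            for y in range(MB):             # S6: synthesize offset and reference pixels
                for x in range(MB):
                    decoded[mb_top + y][mb_left + x] = (
                        reference_frame[ref_top + y][ref_left + x] + offsets[y][x]
                    )
            # S7: the two loops continue until every macroblock is decoded
    return decoded                          # S8: output the decoded frame


# Toy data (assumed): a 32 x 32 frame, zero motion vectors, zero offsets.
toy_reference = [[(y * 32 + x) % 200 for x in range(32)] for y in range(32)]
zero_offsets = [[0] * MB for _ in range(MB)]
toy_records = [{"mv": (0, 0), "offsets": zero_offsets} for _ in range(4)]
preload_calls = []
toy_decoded = decode_frame(
    toy_records, toy_reference, 32, 32,
    preload=lambda top, left: preload_calls.append((top, left)),
)
```

The callback makes the timing of the preload explicit: it is requested as soon as the reference position is known in step S3, before the offsets are read in step S5.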
As described above, the decoding processes of a video image require a great deal of calculation due to the very large amount of data involved. Recently, however, various types of devices incorporating processors are also required to be able to reproduce a video image. In order to reproduce a video image, it is desirable that a cache memory is provided and that as much of the image data of the reference frame as possible is stored in that cache memory prior to the decoding of the image data. However, the size of the reference frame image is much larger than that of the cache memory, and there are various limitations on increasing the size of the cache memory in view of the requirements for lower cost and lower power consumption of the devices.
Because of these limitations on the size of the cache memory, the occurrence rate of cache misses rises. As a result, there is a problem that memory access performance declines, since loading data from an external memory, a main memory, or the like takes a long time when a cache miss occurs. Especially in a video image data decoding device employing a motion compensation technique, there is a problem that high-speed decoding cannot be realized due to the decline of memory access performance caused by cache misses, because such a decoding device accesses the reference frame image frequently.
In order to avoid the decline of access performance due to a cache miss, a preload processing is conducted so that the data of a reference macroblock is stored in the cache memory at the time when the position of the macroblock is determined, as described above. Because the processor can conduct other processes after issuing the preload instruction, the data access time can be hidden from the user.
However, even in a configuration in which such a preload is conducted, there is a problem that, depending on the position of the reference macroblock, a region that needs to be preloaded is not preloaded, or an already loaded region is preloaded again before a region that has not yet been preloaded, leading to a decline of memory access performance.
FIGS. 4 and 5 explain the problems of such a conventional preloading method. FIG. 4 explains a case in which a cacheline boundary falls within a line of a reference macroblock, so that the reference macroblock extends over two cachelines, and a cache miss is caused because the region beyond the cacheline boundary is not preloaded. Specifically, in FIG. 4, the preload specifying address is set to the front address of a line of the reference macroblock (reference rectangular region). Therefore, while region “A”, which does not extend beyond the cacheline boundary, is preloaded, region “B”, which lies beyond the boundary, is not preloaded. A cache miss is thus caused after the preload, leading to a decline of memory access performance.
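The situation of FIG. 4 can be reproduced with simple address arithmetic. The sketch below is illustrative only and assumes one byte per pixel, a 32-byte cacheline, and a preload that fetches only the cacheline containing the specified address; none of these values are given in the source, and the function names are hypothetical.

```python
def cachelines_touched(start_addr, length, line_size):
    """Indices of every cacheline overlapped by bytes [start_addr, start_addr + length)."""
    first = start_addr // line_size
    last = (start_addr + length - 1) // line_size
    return list(range(first, last + 1))


def missed_after_front_preload(start_addr, length, line_size):
    """Cachelines still absent when only the line holding start_addr is preloaded."""
    return cachelines_touched(start_addr, length, line_size)[1:]


LINE_SIZE = 32   # assumed cacheline size in bytes
ROW_BYTES = 16   # one macroblock row: 16 pixels at an assumed 1 byte per pixel

# Row starting at address 32 lies within one cacheline: nothing is missed.
aligned_missed = missed_after_front_preload(32, ROW_BYTES, LINE_SIZE)     # []

# Row starting at address 24 covers bytes 24..39 and straddles the boundary
# at address 32: cacheline 0 (region "A") is preloaded, cacheline 1 (region "B") is not.
straddling_missed = missed_after_front_preload(24, ROW_BYTES, LINE_SIZE)  # [1]
```

Whether a row straddles a boundary depends only on the start address modulo the cacheline size, which is why the problem appears or disappears with the position indicated by the motion vector.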
FIG. 5 explains a case in which an already preloaded region is preloaded again before a region that has not yet been preloaded. In FIG. 5, the macroblock depicted by a dashed line is the reference macroblock referred to upon decoding the macroblock immediately prior to the current macroblock, and for the lines included in that immediately prior macroblock, the portions up to the cacheline boundary are already preloaded. Accordingly, in FIG. 5, region “A” is a region shared by the immediately prior macroblock and the current macroblock, and region “B” is a region which, although not shared, was already preloaded at the time of the preload for the immediately prior macroblock.
In contrast, region “C” is a region which has to be preloaded at the time of the preload of the current reference macroblock. Because a preload of a reference macroblock is conventionally conducted in order from the upper lines to the lower lines, the preload starts from line (1) in FIG. 5. Accordingly, the preload of lines (3) and (4), which actually need to be preloaded, is conducted after the preload of lines (1) and (2). When the time taken by the other processes conducted in parallel with the preload is short, the preload processing ends with lines (3) and (4) remaining not preloaded. This situation results in a cache miss and may cause a decline of memory access performance.
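The ordering problem of FIG. 5 can be illustrated with a small model. The sketch assumes that each line of the reference macroblock maps to a single cacheline, that one preload request covers one line, and that only a limited number of requests (`budget`) complete before the pixel data is needed; the cacheline numbers, the budget, and the function name `rows_left_uncached` are all hypothetical.

```python
def rows_left_uncached(row_cachelines, cached, budget):
    """Preload rows top to bottom, at most `budget` rows; return rows still missing.

    row_cachelines: for each row (top first), the cachelines that row needs.
    cached: cachelines already in the cache before this preload starts.
    Returned row numbers are 1-based, matching lines (1) to (4) in the text.
    """
    cached = set(cached)
    for lines in row_cachelines[:budget]:   # conventional order: upper lines first
        cached.update(lines)
    return [
        number
        for number, lines in enumerate(row_cachelines, start=1)
        if not set(lines) <= cached
    ]


# Four lines of the current reference macroblock, each in its own cacheline (assumed).
rows = [[10], [20], [30], [40]]
# Lines (1) and (2) were already loaded by the preload for the previous macroblock.
already_cached = {10, 20}

short_budget = rows_left_uncached(rows, already_cached, budget=2)  # [3, 4]
full_budget = rows_left_uncached(rows, already_cached, budget=4)   # []
```

With a budget of two requests, both are spent on lines that were already cached, so lines (3) and (4) are left to miss, which is the behavior described for FIG. 5.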
The following prior art documents relate to the above-described decoding of video data employing a motion compensation technique, or to the control of a cache memory in such a system.
[Patent literature 1] Japanese Unexamined Patent Application Publication No. 4-170654 “Cache Memory Controlling System”
[Patent literature 2] Japanese Unexamined Patent Application Publication No. 6-180669 “Cache System”
[Patent literature 3] Japanese Unexamined Patent Application Publication No. 11-215509 “Motion Compensation Processing Method and System, and Storage Medium for Storing Processing Program thereof”
Patent literature 1 discloses a cache memory controlling system in which, when the data length of one line is constant within a single piece of image data but differs among pieces of image data, the data length of one line is specified, for each piece of image data, as the data length of one entry of the cache memory, so that the cache hit ratio is increased.
Patent literature 2 discloses a technique in which, in the case of a cache miss, data which is not contiguous in the main memory but is positioned around the cache-missed data on the frame is stored in the cache memory so that the hit rate is improved.
Patent literature 3 discloses a technique in which data is preloaded to a cache memory in accordance with the address of the region adjacent, on the right, to the reference region specified by a motion vector, so that the hit rate is improved when the next macroblock is decompressed, in order to realize motion compensation at a higher speed.
However, these techniques solve neither the problem that a region beyond a cacheline boundary is not preloaded, as explained with reference to FIG. 4, nor the problem that an already preloaded region is preloaded before a region that has not yet been preloaded, as explained with reference to FIG. 5.