1. Field of the Invention
The present invention relates to decoding picture images compressed according to the H.264 encoding scheme, and more particularly to an H.264 decoder equipped with multiple operation units, which performs image decoding processes.
2. Description of the Related Art
H.264 decoders have usually been developed for an operation model equipped with a single operation unit, such as a single-thread/single-core-based model. In this case, the operations in a process for decoding compressed image data must be performed in sequence. In order to improve the operation rate of an application program based on this kind of operation model, the operating frequency of the hardware processor, i.e., the relevant operation unit, must be raised. However, raising the operating frequency of the hardware processor not only causes the structure of the processor to become more complex, but also raises the rate of power consumption.
In order to solve the above problems, a multi-thread/multi-core based processor adopting multiple operation units, i.e., a plurality of processors, has been developed. At present, the structure of the multiple operation units is embodied in various forms. The most popular form of the multiple operation units is a combination of a general-purpose Reduced Instruction Set Computer (RISC) processor and a high-rate Digital Signal Processor (DSP). When processors of this form are used, the general-purpose RISC processor takes charge of usual types of operations, and the DSP is used where high-rate operations are required, such as operations in multimedia.
Another structure of the multiple operation units in this category connects processors of the same kind in parallel in the form of an array. The structure of the multiple operation units of this type adopts an operation model in which each unit processor is assigned an operation suitable for it and performs that operation separately, or in which operations are performed in the form of a pipeline while the processors are connected with one another.
A pipelined method, in which many processors cooperate with one another to execute application programs having a large quantity of operations, has merit in that it can maximize the performance of a processor equipped with multiple operation units. Nevertheless, this pipelined method has a drawback in that the application programs must be divided into several operation blocks. In addition, in order to carry out operations efficiently in the form of a pipeline, the amounts of operations of the operation blocks should be similar to one another, and there should be no dependency among the blocks. In the case of the H.264 decoding process, however, the amounts of operations of the blocks differ from one another, and some of the operations must be performed sequentially, which makes it difficult to apply the pipelined operation model to the H.264 decoding process.
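The need for operation blocks of similar size can be illustrated with a short calculation. The sketch below assumes one operation unit per pipeline stage running in steady state, so that every stage advances at the pace of the slowest one. The function name `pipeline_efficiency` and the stage costs are hypothetical values chosen only for illustration; they are not measurements of an H.264 decoder.

```python
def pipeline_efficiency(stage_costs):
    """Average steady-state utilization of a pipeline with one unit per stage.

    Each stage can only hand off work once per bottleneck period, so a
    stage with cost c is busy for c out of max(stage_costs) time units.
    """
    if not stage_costs or min(stage_costs) <= 0:
        raise ValueError("stage_costs must be non-empty and positive")
    bottleneck = max(stage_costs)
    return sum(stage_costs) / (len(stage_costs) * bottleneck)


# Hypothetical costs (arbitrary units), not taken from the source.
balanced = pipeline_efficiency([2, 2, 2, 2])    # every stage equally loaded
unbalanced = pipeline_efficiency([1, 1, 2, 4])  # one stage dominates

print(balanced, unbalanced)  # prints: 1.0 0.5
```

With equal stage costs every unit is always busy, whereas in the unbalanced case the units are idle half of the time on average while they wait for the slowest stage.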
The operation blocks of the H.264 decoding process can usually be classified into four parts: Variable Length Decoding (VLD), Inverse Transform & DeQuantization (ITDQ), intra PREDiction & inter PREDiction (PRED), and Deblocking Filter (DF). The VLD signifies an operation for analyzing the construction of video data (i.e., bit streams) transmitted in a compressed state. The ITDQ denotes an operation for inversely transforming and dequantizing the coefficient values of received macroblocks. The PRED represents operations for making predictions according to an intra mode on the basis of images of the present frame, and for making predictions according to an inter mode on the basis of images of previous frames. The DF denotes an operation for reducing the degradation in picture quality caused by blocking at the boundary lines among macroblocks. In addition, each operation block requires a different amount of operations. Usually, among the H.264 decoder operation blocks, the ITDQ operation requires the smallest amount of operations, while the amount of required operations increases in the order of the VLD operation, the DF operation, and the PRED operation. Since the ITDQ and PRED operations cannot be performed until after the VLD operation is completed, a sequential operation is needed among these operation blocks.
Therefore, in order to apply the pipelined operation model to the H.264 decoder according to the structure of conventional multiple operation units, a method is used in which the operation units proceed from one operation to the next in synchronization with one another, centered on the operation unit that performs the VLD operation. In other words, operations are performed in such a way that the operation unit which performs the VLD operation delivers instructions to the other operation units. For example, when a first operation unit has completed the VLD operation of the first macroblock, the first operation unit delivers a signal, which is necessary to perform the ITDQ and PRED operations on the first macroblock, to a second operation unit. Then, while the second operation unit performs the ITDQ and PRED operations on the first macroblock, the first operation unit performs the VLD operation of the second macroblock. Next, when the VLD operation of the second macroblock is completed, the first operation unit delivers a signal, which is necessary to perform the ITDQ and PRED operations on the second macroblock, to a third operation unit. Once the VLD operation, the ITDQ operation, and the PRED operation have been performed with respect to all macroblocks in this manner, the DF operation is performed.
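The hand-off described above can be sketched as a small timing simulation. This is a minimal sketch under stated assumptions, not the decoder's actual implementation: the function name `conventional_schedule`, the round-robin choice of the receiving unit beyond the third unit, and the default cost values are all hypothetical. The first operation unit performs only the VLD operation, and each remaining unit starts the ITDQ and PRED operations of a macroblock once it has received the signal and finished its previous macroblock.

```python
def conventional_schedule(num_macroblocks, num_units,
                          vld_cost=1.0, itdq_pred_cost=3.0):
    """Simulate the VLD-centered pipeline.

    Unit 1 performs VLD for every macroblock in order. Units 2..num_units
    receive ITDQ+PRED work in round-robin order (an assumption).
    Returns tuples (macroblock, unit, signal_time, start, end), where
    signal_time is when unit 1 finishes VLD for that macroblock.
    """
    if num_units < 2:
        raise ValueError("need one VLD unit and at least one other unit")
    num_workers = num_units - 1
    worker_free_at = [0.0] * num_workers  # when each worker becomes idle
    schedule = []
    vld_end = 0.0
    for mb in range(num_macroblocks):
        vld_end += vld_cost                # unit 1 finishes VLD of this macroblock
        worker = mb % num_workers          # round-robin hand-off (assumption)
        start = max(vld_end, worker_free_at[worker])
        end = start + itdq_pred_cost
        worker_free_at[worker] = end
        schedule.append((mb + 1, worker + 2, vld_end, start, end))
    return schedule


# Three macroblocks on three units, with hypothetical costs.
for entry in conventional_schedule(3, 3):
    print(entry)
```

In this run the third macroblock is signaled at time 3.0 but cannot start on the second unit until time 4.0, which shows how the hand-off stalls when few operation units are available.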
One operation unit basically carries out the DF operation related to one macroblock. At this time, since a specific macroblock whose DF operation is to be performed is influenced by the result of the DF operation of the macroblock just above it and by the result of the DF operation of the macroblock on the right side of the macroblock just above it, the DF operation cannot be performed with respect to the specific macroblock until after the DF operation has been completed with respect to these two other macroblocks. For instance, in a case of decoding images of a Quarter Common Intermediate Format (QCIF) size as in TABLE 1, in order to carry out the DF operation with respect to the first macroblock of the third macroblock row (the 23rd macroblock), the DF operation should first be completed with respect to the first macroblock (the 12th macroblock) and the second macroblock (the 13th macroblock) of the second macroblock row.
TABLE 1
 1  2  3  4  5  6  7  8  9 10 11
12 13 14 15 16 17 18 19 20 21 22
23 24 25 26 27 28 29 30 31 32 33
34 35 36 37 38 39 40 41 42 43 44
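The DF dependency rule described for TABLE 1 can be expressed as a short function. This is a sketch that models only the two dependencies named in the text (the macroblock just above and the one to its right). The function name `df_dependencies` is hypothetical, and the handling of the first row and of the rightmost column is an assumption, since the text does not describe those cases.

```python
MB_PER_ROW = 11  # a QCIF picture is 176 pixels wide, i.e. 11 macroblocks of 16 pixels


def df_dependencies(mb_number, mb_per_row=MB_PER_ROW):
    """Return the 1-based macroblock numbers whose DF must finish first."""
    if mb_number < 1:
        raise ValueError("macroblock numbers start at 1")
    row, col = divmod(mb_number - 1, mb_per_row)
    deps = []
    if row > 0:  # assumption: the first row has no DF dependencies
        deps.append(mb_number - mb_per_row)          # macroblock just above
        if col < mb_per_row - 1:                     # assumption: no above-right at the right edge
            deps.append(mb_number - mb_per_row + 1)  # macroblock above and to the right
    return deps


print(df_dependencies(23))  # prints: [12, 13]
```

For the 23rd macroblock the function returns the 12th and 13th macroblocks, matching the example given for TABLE 1.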
FIG. 1 is a view illustrating an example of a pipelined operation for decoding picture images compressed in the H.264 format in an H.264 decoder having the structure of multiple operation units (threads). In FIG. 1, the usual amounts of operations required in the four H.264 decoder operation blocks are displayed normalized to the amount of operations required in the VLD operation.
A conventional pipelined decoding method using the structure of multiple operation units in this manner has merit in that it can be simply embodied, but it can be inefficient in a case where the number of operation units used is small. This is because, in a conventional H.264 decoder equipped with multiple operation units, one operation unit performs only the VLD operation, as illustrated in FIG. 1. Therefore, waiting time is incurred before the VLD operation of the next macroblock can be performed, until after the VLD operation of the prior macroblock has been completed. Moreover, because it is rare for a video recorder to be embodied by assigning a plurality of operation units to it, a scheme for maximizing overall performance by using resources efficiently is required in practice, even when only a small number of operation units is used.
Moreover, since the other operation units in the current method operate according to a synchronization signal from the operation unit performing the VLD operation, the role of that operation unit becomes very important. In other words, the VLD operation unit takes charge of the synchronization of all the operation units. Consequently, because the VLD operation unit requires a large quantity of operations, an operation unit having a high operating frequency must be used for it. However, as described above, since operation units having a similar level of operating frequency are used in the structure of multiple operation units, it is difficult to assign a high operating frequency only to the operation unit which takes charge of the VLD operation. Hence, it is necessary to distribute the operations required for the synchronization over all operation units of the structure of multiple operation units.