1. Field of the Invention
Methods and apparatuses consistent with the present invention relate to multimedia decoding, and more particularly, to efficiently designing a multimedia decoding system and an efficient decoding apparatus based on a multi-core platform.
2. Description of the Related Art
H.264 data is large in volume and computationally intensive to decode, and thus an efficient method of decoding the H.264 data in an asymmetric multi-core platform has been suggested. However, unlike Moving Picture Experts Group 2 (MPEG-2) data, which can be processed slice by slice, the H.264 data has mutual dependency between different frames and within the same frame, and thus a decoding apparatus based on a multi-core platform may not easily process the H.264 data in parallel.
Related art partitioning methods for processing the H.264 data in parallel include a data partitioning method that partitions the data to be processed among processors and a functional partitioning method that partitions the operations among operation modules, as in a pipeline method.
FIG. 1 is a diagram illustrating an example of a multi-core platform system using a functional partitioning method according to the related art.
In the functional partitioning method, the multi-core platform system includes a plurality of processors and certain functions are allocated to the processors. For example, the multi-core platform system may include first through fourth processors 110, 120, 130, and 140. A data reading function 112, a pre-processing and initializing function 114, and a data storage function 116 are allocated to the first processor 110, and an entropy decoding function 122 is allocated to the second processor 120. An inverse transformation and inverse quantization function 132 and an intra prediction and motion compensation function 134 are allocated to the third processor 130, and a deblocking function 142 is allocated to the fourth processor 140.
If operation loads of the first through fourth processors 110, 120, 130, and 140 are not equal, the functional partitioning method may not guarantee a predetermined level of performance. In more detail, processing times 150, 160, 170, and 180 of the first through fourth processors 110, 120, 130, and 140 differ from each other, and the processing time 170 of the third processor 130, which is the longest, creates a critical path corresponding to an excess processing time 190 in the multi-core platform system. Accordingly, data may not be efficiently processed in parallel and usability of the multi-core platform system may be reduced.
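The effect of unequal operation loads in the functional partitioning method can be illustrated with a minimal sketch. The processing times used here are hypothetical values chosen only for illustration and are not taken from FIG. 1, and the function name `pipeline_idle_times` is likewise an assumption. The sketch assumes that each processor performs one pipeline stage, so that the slowest stage sets the pace of the whole pipeline.

```python
def pipeline_idle_times(stage_times):
    """Return (bottleneck index, idle time per stage, utilization).

    Assumes one stage per processor and that every stage must wait
    for the slowest stage before the pipeline can advance.
    """
    if not stage_times:
        raise ValueError("stage_times must not be empty")
    critical = max(stage_times)                   # longest processing time
    bottleneck = stage_times.index(critical)      # stage on the critical path
    idle = [critical - t for t in stage_times]    # waiting time per stage
    utilization = sum(stage_times) / (critical * len(stage_times))
    return bottleneck, idle, utilization


# Hypothetical processing times (arbitrary units) for processors 110 to 140.
times = [3, 4, 9, 5]
bottleneck, idle, utilization = pipeline_idle_times(times)
print(bottleneck, idle, round(utilization, 3))
```

With these assumed values the third stage is the bottleneck, and the other three processors spend part of every pipeline step waiting for it.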
FIGS. 2A and 2B are diagrams illustrating examples of a multi-core platform system using a data partitioning method according to the related art.
Referring to FIG. 2A, according to an example of the data partitioning method, a frame 200 is partitioned into first through third slices 210, 212, and 214 which are respectively processed by first through third processors 220, 230, and 240. That is, each of the first through third processors 220, 230, and 240 performs the entire decoding process. In the data partitioning method, simple data processing may be efficiently performed in parallel. However, if mutual dependency exists between pieces of data, the data partitioning method may not be easily performed, and an additional operation is required to resolve the mutual dependency, so that the performance of the data partitioning method is greatly reduced. Thus, the data partitioning method is not appropriate for an H.264-based decoding system that requires intra prediction as well as inter prediction.
Referring to FIG. 2B, according to another example of the data partitioning method, a frame 250 is partitioned into first through third slices 260, 270, and 280 which are processed by the first through third processors 220, 230, and 240, respectively. In FIG. 2B, correlations between data sizes and operation loads of the first through third slices 260, 270, and 280 may not be easily predicted. That is, each of the first and third processors 220 and 240 which respectively process the first and third slices 260 and 280 has a small operation load, while the second processor 230 that processes the second slice 270 has a large operation load. In more detail, due to different sizes of the first through third slices 260, 270, and 280, the operation loads are not equally allocated to the first through third processors 220, 230, and 240 and thus resources may not be efficiently utilized.
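The imbalance of FIG. 2B can be expressed as a short calculation. The following is a minimal sketch under stated assumptions: the operation loads are hypothetical numbers, not values from the figure, the function name `data_partition_speedup` is an assumption, and the slices are assumed to be decoded fully in parallel with no dependency overhead.

```python
def data_partition_speedup(slice_loads):
    """Speedup over a single processor when each slice has its own processor.

    The frame is finished only when the most heavily loaded processor
    finishes, so the speedup is total load divided by the largest load.
    """
    if not slice_loads or max(slice_loads) <= 0:
        raise ValueError("slice_loads must contain a positive load")
    return sum(slice_loads) / max(slice_loads)


# Hypothetical operation loads (arbitrary units) for slices 260, 270, and 280.
loads = [2.0, 8.0, 2.0]
speedup = data_partition_speedup(loads)
ideal_speedup = len(loads)
print(speedup, ideal_speedup)
```

Under these assumed loads, three processors yield a speedup of only 1.5 rather than the ideal 3, because the first and third processors are idle while the second processor is still working.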
Also, if mutual dependency exists between pieces of data, a parallel processing structure may not be easily implemented and much processing time may be required. Furthermore, each core must hold the data for all of the operations, while the local resources of the multi-core platform system are limited, which causes inefficiency.
FIGS. 3A and 3B are diagrams for describing differences of data-instruction distributions in accordance with characteristics of applications according to the related art.
When an operation is performed, both data and instructions are required. As in a memory 300 illustrated in FIG. 3A, when the size of data 310 is small, while the size of instructions 320 is large, a functional partitioning method is more advantageous than a data partitioning method. On the other hand, as in the memory 300 illustrated in FIG. 3B, when the size of data 330 is large, while the size of instructions 340 is small, the data partitioning method is more advantageous than the functional partitioning method.
However, data-instruction characteristics may vary in accordance with characteristics of applications and may also vary in accordance with modules of a program. Accordingly, if a partitioning method of multiprocessors for processing data in parallel is determined to be only one of the functional partitioning method and the data partitioning method, the data may not be flexibly processed in accordance with the data-instruction characteristics.
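The relationship between data-instruction characteristics and the more advantageous partitioning method can be sketched as follows. This is an illustration only: the text above describes just the two extreme cases, so the simple size comparison used here is an assumption, and the function name, module names, and sizes are all hypothetical.

```python
def preferred_partitioning(data_size, instruction_size):
    """Pick the partitioning method suggested by FIGS. 3A and 3B.

    Small data with large instructions favors functional partitioning;
    large data with small instructions favors data partitioning.
    The direct comparison is an assumed simplification.
    """
    if data_size < instruction_size:
        return "functional"
    if data_size > instruction_size:
        return "data"
    return "either"


# Hypothetical (data size, instruction size) pairs in KB for two modules.
modules = {"module_a": (16, 120), "module_b": (200, 24)}
choices = {name: preferred_partitioning(d, i) for name, (d, i) in modules.items()}
print(choices)
```

Because the two hypothetical modules of the same program favor different methods, fixing a single partitioning method for the whole program serves at most one of them well.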
Furthermore, in a single-core based H.264 decoding system, the size of instructions is 820 kilobytes (KB) and the size of data is 200 KB, while the size of a local memory is only 256 KB. Thus, the H.264 decoding system may not be efficiently implemented by using the limited resources of the local memory.
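The size mismatch can be checked with simple arithmetic. The sketch below uses only the three figures given above; the variable names are assumptions, as is the premise that instructions and data would have to share the same local memory.

```python
INSTRUCTION_KB = 820    # instruction size stated above
DATA_KB = 200           # data size stated above
LOCAL_MEMORY_KB = 256   # local memory size stated above

# Combined footprint if instructions and data share one local memory (assumed).
footprint_kb = INSTRUCTION_KB + DATA_KB
shortfall_kb = footprint_kb - LOCAL_MEMORY_KB
ratio = footprint_kb / LOCAL_MEMORY_KB

print(footprint_kb, shortfall_kb, round(ratio, 2))
```

The combined footprint of 1020 KB exceeds the 256 KB local memory by 764 KB, which is roughly four times its capacity, and the instructions alone already exceed the local memory.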