The present disclosure relates to a parallel video processing apparatus and method for a multicore computing system, and more particularly, to a technology for improving the processing speed of each core and reducing power consumption in a computing system having a plurality of cores by deriving a video unit size to be allocated to each core, in consideration of core performance and the computational complexity of a video unit to be processed, and by segmenting an input image screen and allocating the segments to the corresponding cores according to the derived video unit size.
As the requirements for low power consumption and high performance in consumer electronics (CE) devices have recently increased, the need for multicore systems is increasing. Such multicore systems include a symmetric multi-processing (SMP) system having a plurality of identical cores and an asymmetric multi-processing (AMP) system having various heterogeneous cores, such as a general purpose processor (GPP), a digital signal processor (DSP), a graphics processing unit (GPU), or the like.
In order to improve performance by executing software that processes a large amount of data in parallel on multiple cores, the entire data to be processed is segmented, and the segmented data are allocated to the respective cores so that each core processes its own portion.
For example, in the case where the data to be processed is video data, as illustrated in FIG. 1, the video data of a single image screen is segmented into various video units, and threads for processing the segmented video data are then allocated to the respective cores so as to be processed.
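The segmentation and allocation described above can be sketched as follows. This is a minimal illustration only, not the disclosed apparatus: the names `split_rows`, `process_band`, and `process_frame` are hypothetical, the frame is assumed to be a list of pixel rows, the per-unit work is a placeholder sum, and the thread pool merely stands in for threads allocated to cores (it does not pin a thread to a particular core).

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(height, parts):
    """Split `height` rows into `parts` contiguous, near-equal (start, end) bands."""
    base, extra = divmod(height, parts)
    bands, start = [], 0
    for i in range(parts):
        # The first `extra` bands absorb one leftover row each.
        size = base + (1 if i < extra else 0)
        bands.append((start, start + size))
        start += size
    return bands


def process_band(frame, band):
    """Placeholder per-unit work (assumption): sum the pixel values in the band."""
    start, end = band
    return sum(sum(row) for row in frame[start:end])


def process_frame(frame, workers):
    """Segment the frame into one band per worker and process the bands concurrently."""
    bands = split_rows(len(frame), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # One task per video unit; results come back in band order.
        return list(pool.map(lambda band: process_band(frame, band), bands))
```

Under these assumptions, a 1080-row screen divided among four workers yields four bands of 270 rows each, which suits identical cores but takes no account of differences in core performance.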
FIGS. 1A to 1D exemplarily illustrate a variety of typical video units. Video data is processed in parallel by using a tile technique in the case where left and right images are processed in a head-mounted display (HMD) as illustrated in FIG. 1A, or in the case where independent images which are grouped to form a single screen are processed as illustrated in FIG. 1B, or in the case where a single screen is segmented into a plurality of tiles so as to be processed individually as illustrated in FIG. 1C.
Meanwhile, in the case where a screen including a plurality of slices is processed as illustrated in FIG. 1D, video data is processed in parallel by using a wavefront technique, wherein different numbers of slices are allocated to cores.
However, in the case of a computing system having an asymmetric multicore structure, it is difficult to predict an execution time of a video unit for each core due to different performance and computational characteristics of the cores, and thus it is difficult to efficiently allocate threads for processing video units to asymmetric multiple cores.
Accordingly, the present invention proposes a method for efficiently allocating video units to respective cores in consideration of the processing speed of video data and power consumption, by segmenting a single image screen into video units whose locations and sizes reflect the asymmetric performance of the cores and allocating each video unit to the core matched to its size.
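As a rough illustration of sizing video units according to core performance, the sketch below divides the rows of a screen in proportion to a relative performance score per core. It is not the derivation claimed in this disclosure: the function names, the performance scores, and the uniform per-row cost are all assumptions made for the example, and the rounding uses a simple largest-remainder rule.

```python
def proportional_bands(height, core_perf):
    """Split `height` rows into contiguous bands sized in proportion to `core_perf`.

    `core_perf` holds hypothetical relative performance scores, one per core.
    """
    total = sum(core_perf)
    ideal = [height * p / total for p in core_perf]
    sizes = [int(x) for x in ideal]
    leftover = height - sum(sizes)
    # Hand the rows lost to truncation to the cores with the largest remainders.
    order = sorted(range(len(core_perf)),
                   key=lambda i: ideal[i] - sizes[i], reverse=True)
    for i in order[:leftover]:
        sizes[i] += 1
    bands, start = [], 0
    for size in sizes:
        bands.append((start, start + size))
        start += size
    return bands


def finish_time(bands, core_perf, cost_per_row=1.0):
    """Time at which the slowest core finishes, assuming a uniform cost per row."""
    return max((end - start) * cost_per_row / perf
               for (start, end), perf in zip(bands, core_perf))
```

For example, with assumed relative performances of 2:1:1 and a 1080-row screen, the bands are 540, 270, and 270 rows and the slowest core finishes after 270 time units, whereas an equal split of 360 rows per core would finish after 360 time units because the two slower cores become the bottleneck.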