1. Technical Field
The present invention relates generally to an apparatus and method for predicting performance attributable to the parallelization of hardware acceleration devices and, more particularly, to an apparatus and method capable of predicting the degree of performance improvement obtained when a task is performed in parallel using hardware acceleration devices in a real-time processing system. That is, the present invention relates to an apparatus and method capable of predicting the degree of performance improvement, relative to central processing unit (CPU)-based software execution, that is obtained when the execution of software is parallelized using general-purpose multi-core hardware acceleration devices, such as field programmable gate arrays (FPGAs), many integrated cores (MICs), or graphics processing units (GPUs).
2. Description of the Related Art
A conventional real-time data processing system processes data using software that runs on a central processing unit (CPU). As the volume of real-time data has recently increased explosively, a problem arises in that real-time data processing performance is degraded when only such a conventional, CPU-based real-time data processing system is used.
In order to solve this problem, a real-time data processing system adopts a method of parallelizing software processing using hardware acceleration devices, such as FPGAs, MICs, or GPUs. For example, Korean Patent No. 10-0463642 (entitled “Apparatus for accelerating Multimedia Processing using Coprocessor”) discloses an acceleration apparatus that improves multimedia processing performance through parallel processing.
A real-time data processing system using hardware acceleration devices seeks to improve overall performance by modularizing the parts of the entire task that can be easily parallelized and then executing those modules in parallel on the hardware acceleration devices. That is, the parts that must be performed sequentially are performed in software by a high-speed CPU, and the parts that can be parallelized are performed by the hardware acceleration devices.
It is, however, difficult to predict the improvement in performance (i.e., acceleration) attributable to parallelization. Because a CPU typically has the fastest processing speed among hardware elements, performance may not improve unless an appropriate level of parallelization is performed by the hardware acceleration devices. In other words, the time it takes to perform the entire task may become shorter or longer depending on the number of modules of the entire task that have been parallelized and the degree of parallel processing that is performed within the parallelized modules.
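The dependence described above can be illustrated with a minimal sketch. It is not the method of the present invention; it is a simplified timing model under stated assumptions. The names Module, lanes, lane_slowdown, and transfer_ms, as well as all numeric values, are hypothetical and are introduced only for illustration. The model assumes that a single parallel unit of an acceleration device is slower than the CPU, that work divides evenly across the parallel units, and that offloading a module adds a fixed data transfer cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """One module of the entire task (all fields are illustrative assumptions)."""
    name: str
    cpu_ms: float                # time to run the module on the CPU alone
    offloaded: bool = False      # True if run on a hardware acceleration device
    lanes: int = 1               # degree of parallel processing on the device
    lane_slowdown: float = 4.0   # one device lane is this many times slower than the CPU
    transfer_ms: float = 0.0     # fixed cost of moving data to and from the device


def module_time(m: Module) -> float:
    """Modeled execution time of a single module in milliseconds."""
    if not m.offloaded:
        return m.cpu_ms
    # Work is assumed to divide evenly across the lanes.
    return m.transfer_ms + m.cpu_ms * m.lane_slowdown / m.lanes


def total_time(modules: list[Module]) -> float:
    """Modules are assumed to run one after another, so their times add up."""
    return sum(module_time(m) for m in modules)


def speedup(baseline: list[Module], candidate: list[Module]) -> float:
    """A value above 1 means the candidate is faster than the baseline."""
    return total_time(baseline) / total_time(candidate)


# CPU-only baseline with three hypothetical modules.
cpu_only = [Module("parse", 10.0), Module("filter", 40.0), Module("aggregate", 50.0)]

# Same task, with "filter" offloaded at an insufficient degree of parallelism.
under_parallelized = [
    Module("parse", 10.0),
    Module("filter", 40.0, offloaded=True, lanes=2, transfer_ms=5.0),
    Module("aggregate", 50.0),
]

# Same task, with "filter" offloaded at a higher degree of parallelism.
well_parallelized = [
    Module("parse", 10.0),
    Module("filter", 40.0, offloaded=True, lanes=32, transfer_ms=5.0),
    Module("aggregate", 50.0),
]

if __name__ == "__main__":
    for label, plan in [("2 lanes", under_parallelized), ("32 lanes", well_parallelized)]:
        print(f"{label}: total {total_time(plan):.1f} ms, "
              f"speedup {speedup(cpu_only, plan):.2f}x")
```

Under these assumed values, offloading the same module with two lanes makes the entire task slower than CPU-only execution, whereas offloading it with 32 lanes makes the entire task faster. This is the sense in which the outcome depends on both which modules are parallelized and the degree of parallel processing.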
In order to improve task performance, it is possible to perform parallel processing using as many hardware acceleration devices as possible. In this case, problems arise in that the hardware acceleration devices occupy an excessively large area and high expenses are incurred.
A conventional parallelization processing method using hardware acceleration devices will now be described. A system designer determines parts of the entire task to be parallelized based on his or her experience. The system designer determines a task flow so that the parts determined to be parallelized are performed by the hardware acceleration devices and the remaining parts are performed by a CPU. A real-time data processing system actually performs data processing in accordance with the determined task flow, and the system designer checks the degree of improved performance attributable to parallelization based on the actual data processing of the real-time data processing system.
This conventional method depends greatly on the experience of the system designer in determining the task flow required for the parallelization processing of the task, and requires a lot of time to implement parallelization processing using the hardware acceleration devices.
Moreover, the performance of parallelization processing using hardware acceleration devices can be determined only after actual implementation has been completed. Accordingly, if existing hardware acceleration devices are redesigned or replaced with new hardware acceleration devices because of design errors or an insufficient improvement in performance, a problem arises in that the development period increases because the stages from the design stage to the final implementation stage must be repeated. That is, although a lot of time and effort must be invested in order to construct parallelization processing using hardware acceleration devices, the expected effect cannot be predicted in advance, and an actual test can be performed and the performance of parallelization processing can be measured only in the final stage. As a result, problems arise in that a lot of time is required to check the performance of parallelization processing, and a lot of time and effort are repeatedly wasted in modifying the configuration of parallelization processing and measuring the performance again.