1. Field of the Invention
The present invention relates, generally, to a method of executing a parallel application on a manycore cluster system and to the manycore cluster system and, more particularly, to a method of executing a parallel application on a manycore cluster system based on a parallel computing framework, and to the manycore cluster system.
2. Description of the Related Art
In recent years, even general desktop computers have been equipped with quad-core or hexa-core central processing units (CPUs) for high-performance computing (HPC). Graphics processing units (GPUs), which can perform general computations in addition to graphics processing, have also become capable of HPC through the compute unified device architecture (CUDA) or the open computing language (OpenCL). To use hardware for HPC in this way, a parallel programming model suited to that hardware should be used. OpenCL has recently been in the limelight as a representative parallel programming model.
OpenCL makes it possible to write programs that run on multiple platforms (for example, a plurality of CPUs or GPUs), and to extend the use of the GPU to areas other than graphics processing (general-purpose GPU computing). Since OpenCL can operate on hardware produced by various manufacturers, many manufacturers develop frameworks suited to their own hardware on the basis of OpenCL.
As disclosed in Korean Unexamined Patent Application Publication No. 2009-0063122A (published on Jun. 17, 2009), managing the workload of a system is important for reducing the load on the system and improving processing speed.
However, an OpenCL application can run on only a single node. Thus, in order to expand the OpenCL application to a cluster environment and distribute the workload, a message passing interface (MPI) for communication between devices and between nodes should be added to the OpenCL application. Given the nature of parallel programming, adding the MPI to the OpenCL application can increase the complexity of coding. Further, when the compute devices in the manycore cluster system differ from each other, the workload should be distributed among the nodes of the manycore cluster system.
Thus, distributing the workload created by execution of the OpenCL application in the cluster environment requires much additional programming. As a result, programming productivity and the portability of OpenCL are reduced.
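By way of illustration only, the following minimal Python sketch shows the kind of partitioning logic a programmer might otherwise have to write by hand when the nodes have different compute capabilities. It is not taken from the present invention or from the cited publication. The function name `split_work` and the integer per-node weights (for example, relative throughput estimates) are hypothetical assumptions, and the MPI communication that would carry each range to its node is omitted.

```python
def split_work(total_items, node_weights):
    """Split total_items work-items into contiguous (offset, count) ranges,
    one per node, roughly proportional to each node's weight.

    node_weights is a hypothetical list of non-negative integers, e.g.
    relative throughput estimates for the compute devices of each node.
    """
    total_weight = sum(node_weights)
    if total_weight <= 0:
        raise ValueError("at least one node must have a positive weight")

    # Integer floor of each node's proportional share.
    counts = [total_items * w // total_weight for w in node_weights]

    # Flooring may leave a few items unassigned; give them, one each,
    # to the nodes with the largest weights.
    leftover = total_items - sum(counts)
    by_weight = sorted(range(len(node_weights)),
                       key=lambda i: node_weights[i], reverse=True)
    for i in by_weight[:leftover]:
        counts[i] += 1

    # Convert the per-node counts into contiguous index ranges.
    ranges = []
    offset = 0
    for count in counts:
        ranges.append((offset, count))
        offset += count
    return ranges


if __name__ == "__main__":
    # Three nodes, the first assumed to be four times as fast as the others.
    for node, (offset, count) in enumerate(split_work(1000, [4, 1, 1])):
        print(f"node {node}: offset={offset}, count={count}")
```

Even in this simplified form, the partitioning is separate from the kernel code itself and would have to be combined with explicit inter-node communication, which is the additional programming burden described above.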
Accordingly, technology for solving the above-described problems is required.
The foregoing is intended merely to aid in the understanding of the background of the present invention, and is not intended to mean that the present invention falls within the purview of the related art that is already known to those skilled in the art.