1. Field of the Invention
The present invention relates to parallel processing, and more particularly, to parallel processing capable of improving the reusability of work sharing control blocks when such control blocks are generated dynamically while a plurality of tasks are performed in parallel by multiple threads.
2. Description of the Related Art
Open Multi-Processing (OpenMP) is an application program interface (API) for writing multi-threaded parallel programs in a shared memory environment, and is a set of compiler directives for expressing shared memory parallelism.
Parallel programming using OpenMP follows a compiler directive standard in which directives are inserted only into those portions of sequential code that are to be parallelized. A user inserts an appropriate OpenMP directive into a portion to be parallelized, and a compiler then generates multi-threaded code on the basis of the inserted directive.
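As a minimal sketch of this directive-based style, the following C function is ordinary sequential code with a single OpenMP directive added to its loop. The function name sum_squares is hypothetical and chosen only for illustration. When compiled without OpenMP support, the directive is ignored and the loop runs sequentially with the same result.

```c
#include <assert.h>

/* Returns 1^2 + 2^2 + ... + n^2 (hypothetical example function). */
long sum_squares(int n)
{
    assert(n >= 0);
    long sum = 0;

    /* The only change to the sequential code: the directive asks the
     * compiler to split the loop iterations among threads and to combine
     * the per-thread partial sums into sum. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 1; i <= n; i++) {
        sum += (long)i * i;
    }
    return sum;
}
```

For example, sum_squares(10) yields 385 regardless of the number of threads used.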
When a parallel section is specified in a program code by a compiler directive, a master thread generates additional threads through an OpenMP runtime library call to enable parallel programming. A parallel section in which the generated threads perform tasks is divided into sub task sections, and each thread performs an allocated task, which makes it possible to perform work sharing in the parallel section.
By default, the work sharing of the OpenMP has an implied barrier, which causes a thread that has completed its tasks to wait until the other threads complete their tasks. When an instruction word “NOWAIT” is specified in the program, the implied barrier is removed, and each thread can immediately advance to the next task without waiting.
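The effect of removing the implied barrier can be sketched as follows. In OpenMP for C, the "NOWAIT" instruction word corresponds to the nowait clause. The function scale_and_fill and its parameters are hypothetical; the sketch assumes the two loops touch different arrays, so a thread may safely start the second loop while other threads are still in the first.

```c
#include <assert.h>
#include <stddef.h>

/* Multiplies every element of a by k, and sets every element of b to k.
 * The two loops are independent of each other (illustrative assumption). */
void scale_and_fill(double *a, int n, double *b, int m, double k)
{
    assert(a != NULL && b != NULL);

    #pragma omp parallel
    {
        /* nowait removes the implied barrier at the end of this loop, so
         * a thread that finishes its share moves on immediately. */
        #pragma omp for nowait
        for (int i = 0; i < n; i++) {
            a[i] *= k;
        }

        /* No nowait here: the implied barrier at the end of this loop
         * makes every thread wait until all of b has been written. */
        #pragma omp for
        for (int j = 0; j < m; j++) {
            b[j] = k;
        }
    }
}
```

While some threads are still in the first loop and others have entered the second, both work sharing regions are active at once, which is the situation that requires more than one control block.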
Information required for work sharing may be specified in a control block in order for the threads to share the information. When the instruction word “NOWAIT” is specified in the program, a plurality of control blocks for managing a plurality of work sharing processes are needed.
For example, assume that one parallel section has four NOWAIT work sharing regions and that four threads perform tasks in the regions. If a first thread performs tasks in the first NOWAIT region while the other threads perform tasks in the second NOWAIT region, two control blocks are needed. In the same manner, each time threads enter a further work sharing region according to a continuous series of NOWAIT instruction words, an additional control block is needed.
The number of control blocks that is required for each thread to perform work sharing until barrier synchronization occurs is determined by the number of divided tasks that have not been completed by all of the threads, which can be known only at runtime.
That is, because the number of control blocks can be known only at runtime, the control blocks cannot be sized exactly in advance, which causes an unnecessary use of memory resources.
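The dependence on runtime progress can be illustrated with a small counting sketch. It rests on an assumption not stated in the text: a region needs a control block from the moment the first thread enters it until the last thread leaves it, and threads pass through the regions in order. Under that assumption, the number of blocks in use equals the span between the slowest and the fastest thread. The function name live_control_blocks and the array region_of_thread are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* region_of_thread[t] is the index of the NOWAIT region that thread t is
 * currently executing. Returns the number of regions that at least one
 * thread has entered but not every thread has left (assumed model). */
int live_control_blocks(const int *region_of_thread, int nthreads)
{
    assert(region_of_thread != NULL && nthreads > 0);

    int slowest = region_of_thread[0];
    int fastest = region_of_thread[0];
    for (int t = 1; t < nthreads; t++) {
        if (region_of_thread[t] < slowest) slowest = region_of_thread[t];
        if (region_of_thread[t] > fastest) fastest = region_of_thread[t];
    }
    /* Every region from the slowest thread's to the fastest thread's
     * is still in use by some thread. */
    return fastest - slowest + 1;
}
```

With one thread in the first region and three threads in the second (positions 0, 1, 1, 1), the sketch reports two control blocks, matching the example above. Because the positions depend on thread scheduling, the result cannot be computed before the program runs.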
The following methods can be used to prevent the unnecessary use of memory resources due to the generation of control blocks: a method of statically arranging the control blocks to have the same size and managing the control blocks; a method of dynamically arranging the control blocks and managing the control blocks; and a method of statically arranging the control blocks to have the same size and dynamically adding the control blocks, if necessary. The first method may cause a memory overflow, and the second method has a problem in that all of the threads must be synchronized to generate a new control block.
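The overflow risk of the first method can be sketched with a fixed-size pool of control blocks. All names here (POOL_SIZE, control_block, static_pool, pool_acquire, pool_thread_done) are hypothetical, and the fields of the control block are assumed for illustration only. The sketch is deliberately single-threaded; an actual runtime would need atomic operations or locks around these updates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 2  /* hypothetical fixed number of control blocks */

typedef struct {
    bool in_use;
    int region_id;     /* work sharing region served by this block */
    int threads_done;  /* threads that have finished the region */
} control_block;

typedef struct {
    control_block slots[POOL_SIZE];
} static_pool;

void pool_init(static_pool *p)
{
    assert(p != NULL);
    for (int i = 0; i < POOL_SIZE; i++) {
        p->slots[i].in_use = false;
        p->slots[i].region_id = -1;
        p->slots[i].threads_done = 0;
    }
}

/* Returns the index of a free slot claimed for region_id, or -1 when
 * every slot is taken (the overflow case of the static method). */
int pool_acquire(static_pool *p, int region_id)
{
    assert(p != NULL);
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!p->slots[i].in_use) {
            p->slots[i].in_use = true;
            p->slots[i].region_id = region_id;
            p->slots[i].threads_done = 0;
            return i;
        }
    }
    return -1;
}

/* Records that one thread finished the region held in slot idx. When all
 * nthreads have finished, the slot is released; returns true in that case. */
bool pool_thread_done(static_pool *p, int idx, int nthreads)
{
    assert(p != NULL && idx >= 0 && idx < POOL_SIZE);
    assert(p->slots[idx].in_use);
    p->slots[idx].threads_done++;
    if (p->slots[idx].threads_done == nthreads) {
        p->slots[idx].in_use = false;
        return true;
    }
    return false;
}
```

With POOL_SIZE set to 2, a third active region cannot obtain a control block until every thread has left one of the first two regions, at which point the released slot can be handed out again.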
Therefore, a method of smoothly performing tasks in all work sharing regions while limiting the generation of the control blocks is needed.