In one conventional form of parallel processing in a multicore processor, one process is defined as a parent process, and the core executing the parent process causes another core to asynchronously execute a child process. Upon completing the child process, the core executing the child process notifies the core executing the parent process of the result, and the core executing the parent process uses the result to continue the parent process. Since communication between parent and child is limited to the activation and the termination of the child, this form of parallel processing is suitable for a multicore processor system having no cache coherency mechanism between cores and for a sparsely connected multicore processor, such as one without a shared memory.
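The parent/child form described above can be illustrated with a minimal sketch. It assumes worker threads as stand-ins for cores; the names `parent_process` and `child_process` and the workload are hypothetical and do not come from the cited art.

```python
from concurrent.futures import ThreadPoolExecutor


def child_process(data):
    # Hypothetical child workload, executed asynchronously on another
    # "core" (a worker thread stands in for a core in this sketch).
    return sum(x * x for x in data)


def parent_process(data):
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Activation: the first of the two points at which parent and
        # child communicate.
        future = pool.submit(child_process, data)
        # The parent continues its own work while the child runs.
        local_count = len(data)
        # Termination: the second communication point, where the parent
        # receives the child's result and continues with it.
        child_result = future.result()
    return local_count, child_result
```

Because the parent and child exchange data only at `submit` and `result`, nothing in this sketch relies on the two sharing state while the child is running.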
When one core executes only one parent process at a time while the other cores execute only child processes instructed by the parent process, the other cores can be controlled from the parent process. This operation is suitable for implementing parallel processing in an asymmetric multicore processor, that is, a multicore processor whose processors have different capacities, and in a multicore processor system not equipped with an OS compatible with multicore processors. In the field of embedded devices in particular, multiple processes that perform parallel processing are still rarely activated at the same time, and this form of parallel processing can be implemented with simple hardware; a form of parallel processing requiring no OS compatible with multicore processors is therefore extremely suitable.
A multicore processor can be operated efficiently by estimating the time at which a process will complete and using the estimated time in the method of controlling the other cores described above. For example, a technique is disclosed that collects, from the other cores, predicted termination times for all tasks and determines the core to which a process is allocated based on the collected predicted times (see, e.g., Japanese Laid-Open Patent Publication No. H9-160890).
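Allocation based on collected predicted termination times can be sketched as follows. This is only an illustration under stated assumptions: the selection rule (pick the core predicted to finish earliest) and the names `pick_core`, `allocate`, and `task_cost` are hypothetical and are not taken from the cited publication.

```python
def pick_core(predicted_finish_times):
    # predicted_finish_times maps a core id to the predicted time at which
    # all tasks currently assigned to that core terminate.
    # Assumed rule: choose the core predicted to become free earliest.
    return min(predicted_finish_times, key=predicted_finish_times.get)


def allocate(predicted_finish_times, task_cost):
    # Allocate one new task with an estimated cost and return the chosen
    # core together with the updated predictions (the input is not mutated).
    core = pick_core(predicted_finish_times)
    updated = dict(predicted_finish_times)
    updated[core] += task_cost
    return core, updated
```

For instance, with predictions `{0: 5, 1: 2, 2: 9}` and a task of cost 4, the sketch selects core 1 and raises its predicted termination time to 6.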
In another technique using an estimated time, a hardware or software delay is predicted in a system requiring real-time operation, and a timer is set in consideration of the predicted delay time. A technique is disclosed that enables packet transmission within a processing request time by transmitting a packet when an interrupt is generated by the timer that takes the delay time into consideration (see, e.g., Japanese Laid-Open Patent Publication No. 2001-156842).
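One way to read "setting a timer in consideration of the predicted delay" is to fire the timer earlier than the requested time by the predicted delay. The sketch below assumes that reading; the function name, the parameters, and the microsecond unit are hypothetical and are not drawn from the cited publication.

```python
def timer_fire_time(request_time_us, predicted_delay_us):
    # Assumed rule: fire the timer early by the predicted hardware or
    # software delay so that transmission completes by request_time_us.
    fire_at = request_time_us - predicted_delay_us
    if fire_at < 0:
        # The predicted delay alone already exceeds the requested time.
        raise ValueError("predicted delay exceeds the processing request time")
    return fire_at
```

Under this assumption, a request time of 1000 µs with a predicted delay of 150 µs yields a timer set for 850 µs.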
However, in the conventional techniques described above, a core completing a child process notifies the core executing the parent process of the completion or the result of the child process through inter-core communication. The notified core interrupts the parent process to execute a process corresponding to the interrupt, a reception process for the notification, a process for returning to the parent process, and so on. Consequently, overhead arises from the interruption and resumption of processing. In addition, since another process intervenes during the parent process, the contents of the cache memory are overwritten with the contents of the intervening process, and the cache hit rate decreases upon return to the parent process, resulting in reduced processing efficiency.
If the number of cores increases and more child processes are executed, the problems described above become more prominent because the frequency of communication increases in proportion to the number of child processes. As the number of child processes increases, the parent process is frequently interrupted by communication from the child processes, resulting in reduced processing efficiency of the core executing the parent process.