Recently, an increasing number of devices employ a multi-core processor system, in which a single system has multiple cores. When a program that runs on a multi-core processor system is created, parallelism is extracted from a conventional program written for a single-core processor system, in which a single system has a single core. In particular, because software has recently grown in complexity and scale, programs for multi-core processor systems are created by reusing existing software resources instead of being newly written, thereby reducing the number of steps required for program creation and verification. One parallelism extraction method is a technique that extracts the parallelism manually or by means of a compiler (referred to as Prior Art 1).
By applying Prior Art 1 and assigning the extracted parallel processing to cores, the multi-core processor system can execute processing at a higher speed than when the processing is executed by a single core. Each of the cores executes a program in units of threads. The memory and devices used by threads are managed on a per-process basis, and each process has one or more threads. The relationship between processes and threads is such that threads belonging to the same process share a memory space, whereas threads belonging to different processes do not share any memory space.
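The sharing of a memory space among threads of one process can be illustrated as follows. This is a minimal sketch in Python, not part of any cited technique; the function name `run_in_threads` and the dictionary `memory` are hypothetical names chosen for illustration. It shows only the same-process case; the isolation between different processes is not demonstrated.

```python
import threading

def run_in_threads(n):
    """Start n threads in this process and let each write to one shared object."""
    memory = {}  # a single object in the process's memory space (hypothetical name)

    def worker(i):
        # Every thread sees the same dictionary because all threads of a
        # process share its memory space. Each thread writes a distinct key.
        memory[i] = i * i

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return memory
```

After all threads are joined, the creating thread can read every value written by the other threads without any copying, which is the property described above.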
As a technique that capitalizes on the throughput of multiple cores, a technique has been disclosed that, when multiple application software programs (hereinafter, “apps”) are activated, executes the apps on separate cores to exploit the parallelism of the cores (referred to as Prior Art 2). Further, as a technique for distributing load among multiple cores, a technique has been disclosed that periodically acquires the load of each core and relocates the threads assigned to the cores (referred to as Prior Art 3).
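One step of load-based thread relocation in the spirit of Prior Art 3 might look like the following. This is a minimal sketch under stated assumptions, not the disclosed method: the function `rebalance`, the representation of a thread as a `(name, load)` pair, and the policy of moving the thread whose load is closest to half the gap are all hypothetical. A scheduler would be assumed to call the function periodically.

```python
def rebalance(assignment):
    """Move at most one thread from the busiest core to the idlest core.

    assignment maps a core id to a list of (thread_name, load) pairs.
    Returns (thread_name, from_core, to_core), or None if no move helps.
    """
    # Acquire the load quantity of each core.
    loads = {core: sum(load for _, load in threads)
             for core, threads in assignment.items()}
    busiest = max(loads, key=loads.get)
    idlest = min(loads, key=loads.get)
    gap = loads[busiest] - loads[idlest]

    # Moving a thread of load l changes the gap to |gap - 2*l|,
    # so only threads with 0 < l < gap reduce the imbalance.
    candidates = [t for t in assignment[busiest] if 0 < t[1] < gap]
    if not candidates:
        return None

    # Assumed policy: pick the thread whose load is closest to half the gap.
    thread = min(candidates, key=lambda t: abs(gap / 2 - t[1]))
    assignment[busiest].remove(thread)
    assignment[idlest].append(thread)
    return (thread[0], busiest, idlest)
```

The sketch presupposes that any thread can be moved freely between cores, which is the requirement of Prior Art 3.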
As a technique for determining whether to perform parallel processing, a technique has been disclosed that determines whether another core can execute a fork command of a parent thread and, if so, causes a child thread to be executed by the other core. As a technique used when a child thread is executed by another core, a technique has been disclosed that copies the context area of a thread to a storage area managed by the other core (see, e.g., Japanese Laid-Open Patent Publication Nos. 2003-29984 and H5-127904). The context area is an area that stores data used by a thread, such as CPU register values, a program counter, and a stack pointer.
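The fork determination and the copying of a context area can be sketched as follows. This is an illustrative model only, assuming a context area that holds register values, a program counter, and a stack pointer as described above; the classes `Context` and `Core`, the `busy` flag used as the availability criterion, and the function `fork_child` are hypothetical and are not taken from the cited publications.

```python
import copy
from dataclasses import dataclass, field

@dataclass
class Context:
    registers: dict       # CPU register values
    program_counter: int
    stack_pointer: int

@dataclass
class Core:
    core_id: int
    busy: bool = False                           # assumed availability criterion
    storage: dict = field(default_factory=dict)  # thread name -> Context

def fork_child(parent_name, child_name, home, others):
    """Run the child on another core if one can accept it.

    Returns the chosen core, or None when no other core is available.
    """
    target = next((c for c in others if not c.busy), None)
    if target is None:
        return None
    # Copy the parent's context area into storage managed by the other core,
    # so the child no longer touches the original area.
    target.storage[child_name] = copy.deepcopy(home.storage[parent_name])
    target.busy = True
    return target
```

Because the copy is deep, a later change to the child's registers on the other core leaves the parent's context area unchanged.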
Further, as a technique for preventing the address collisions that occur when threads are migrated to other cores, a technique has been disclosed that provides a memory address space shared by multiple threads and a memory address space not shared by threads (see, e.g., Japanese Laid-Open Patent Publication No. H9-146904). In the technique of Patent Document 3, address conversion is performed on memory accesses to the latter address space so that the address space can be switched arbitrarily, thereby preventing address collisions.
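The idea of converting addresses only for the non-shared address space can be modeled roughly as follows. This is a sketch under assumptions and does not reproduce the mechanism of the cited publication: the boundary `SHARED_LIMIT`, the per-thread base addresses, and the class `Memory` are all hypothetical.

```python
SHARED_LIMIT = 0x1000  # assumed boundary: logical addresses below it are shared

class Memory:
    def __init__(self):
        self.cells = {}         # physical address -> value
        self.private_base = {}  # thread id -> physical base of its private space

    def map_private(self, thread_id, base):
        # Switching the private space of a thread only requires a new base.
        self.private_base[thread_id] = base

    def translate(self, thread_id, addr):
        if addr < SHARED_LIMIT:
            return addr  # shared space: no conversion
        # Non-shared space: convert relative to the thread's own base.
        return self.private_base[thread_id] + (addr - SHARED_LIMIT)

    def write(self, thread_id, addr, value):
        self.cells[self.translate(thread_id, addr)] = value

    def read(self, thread_id, addr):
        return self.cells[self.translate(thread_id, addr)]
```

In this model, two threads that use the same logical address in the non-shared space reach different physical locations, so no collision occurs, while accesses to the shared space reach the same location.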
Child threads generated from a parent thread include a blocking thread, which operates while the operation of the parent thread is excluded, and a non-blocking thread, which operates independently of the parent thread. FIG. 15 depicts the operations of the blocking thread and the non-blocking thread.
FIG. 15 is an explanatory view of the operations of the blocking thread and the non-blocking thread. Reference numeral 1501 designates the operation of a blocking thread and reference numeral 1502 designates the operation of a non-blocking thread.
A process 1503, depicted under reference numeral 1501, includes a parent thread 1504 and a blocking child thread 1505. The parent thread 1504 and the blocking child thread 1505 access the same context area 1506. During the execution of the blocking child thread 1505, the execution of the parent thread 1504 is suspended, and the blocking child thread 1505 updates the data stored in the context area 1506. After the execution of the blocking child thread 1505 is completed, the parent thread 1504 inherits the updated context area 1506 and continues processing.
In this manner, the blocking child thread 1505 performs operations independent of those of the parent thread 1504; however, if it were executed concurrently with the parent thread 1504, the two threads could access the context area 1506 at the same time. Accordingly, the blocking child thread 1505 is executed only while the execution of the parent thread 1504 is completely excluded.
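The behavior of a blocking child thread can be illustrated as follows. This is a minimal sketch in Python; the function name, the dictionary standing in for the context area 1506, and the log entries are hypothetical. The parent is suspended by joining the child immediately after starting it.

```python
import threading

def run_parent_with_blocking_child():
    # Stands in for the context area shared by parent and child.
    context = {"value": 1, "log": []}

    def child():
        context["log"].append("child start")
        context["value"] *= 10          # the child updates the shared context
        context["log"].append("child end")

    context["log"].append("parent before child")
    t = threading.Thread(target=child)
    t.start()
    t.join()  # the parent does nothing until the child has completed

    # The parent inherits the updated context and continues processing.
    context["value"] += 1
    context["log"].append("parent resumes")
    return context
```

Since the parent never runs while the child is running, no exclusive processing is needed around the context area, but the two threads also gain nothing from running on separate cores.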
A process 1507, depicted under reference numeral 1502, includes a parent thread 1508 and a non-blocking child thread 1509. The parent thread 1508 and the non-blocking child thread 1509 access the same context area 1510. The parent thread 1508 continues to be executed during the execution of the non-blocking child thread 1509 and accesses the context area 1510 at a timing different from that of the non-blocking child thread 1509.
In this manner, the non-blocking child thread 1509 accesses the memory at a timing different from that of the parent thread 1508, and concurrent execution is therefore permitted. If concurrent access to the context area 1510 is possible and the parent thread 1508 and the non-blocking child thread 1509 are to be synchronized, code is inserted into both threads to perform exclusive processing or synchronous processing using communication between the threads.
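The inserted code for exclusive processing and synchronous processing can be illustrated as follows. This is a minimal sketch in Python in which a lock serves as the exclusive processing and a join serves as the synchronous processing; the function name and the dictionary standing in for the context area 1510 are hypothetical.

```python
import threading

def run_parent_with_nonblocking_child(iterations=1000):
    context = {"count": 0}   # stands in for the shared context area
    lock = threading.Lock()  # exclusive processing inserted into both threads

    def add(n):
        for _ in range(n):
            with lock:       # only one thread updates the context at a time
                context["count"] += 1

    child = threading.Thread(target=add, args=(iterations,))
    child.start()
    add(iterations)          # the parent keeps running alongside the child
    child.join()             # synchronous processing: wait for the child
    return context["count"]
```

Without the lock, the read-modify-write on the shared counter could interleave between the two threads and lose updates, which is the kind of defect that arises when the insertion of exclusive processing is overlooked.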
In the technique of Prior Art 1, code for exclusive processing or synchronous processing is inserted into the source code when parallel operation is to be performed. If the insertion is performed manually, however, the code for exclusive processing or synchronous processing may be omitted by mistake, resulting in program defects. Parallelism extraction by a compiler has the problem that it is difficult to determine the various operating states of software and to select the proper exclusive processing or synchronous processing.
With either manual insertion or parallelism extraction by a compiler, the generated execution object differs from the execution object for the single-core processor system. Consequently, the number of verification steps is reduced little or not at all, and a problem arises in that an enormous number of verification steps may be required for the execution object for the multi-core processor system.
The technique of Prior Art 2 has a problem in that a single app is executed at the same speed as on a single-core processor system and does not benefit from the multi-core processor system. In this case, a further problem is that load concentrates on a single core even though multiple cores are present. The technique of Prior Art 3 requires that threads be migratable to other cores. If a child thread is a blocking thread, however, a problem arises in that the child thread blocks the parent thread, making migration difficult.