In a multi-core processor system or a single processor system equipped with a multi-task operating system (OS), multiple processes or threads can operate simultaneously.
For example, a case is assumed where a thread #0 and a thread #1 simultaneously attempt to add 1 to a variable x in memory. This process of adding 1 to the variable x is performed as three operation steps by a central processing unit (CPU). The operation steps include reading the value of the variable x from the memory, adding 1 to the read value, and writing the result of the addition to the variable x in the memory.
FIG. 14 is an explanatory diagram of an example of access of shared resources in a multi-core processor system. It is assumed that in the multi-core processor system, for example, the thread #0 and the thread #1 are executed simultaneously at a CPU #0 and a CPU #1, respectively. At time t1, when the thread #0 reads the value of the variable x from the memory according to a Load command, the value of the variable x is 0. The read value of the variable x is stored to a register R1 in the CPU #0. At time t2, 1 is added to the value in the register R1 and the result of the addition is stored to the register R1. Also at time t2, the thread #1 is executed at the CPU #1 and reads the value of the variable x according to a Load command, in the same way as the thread #0. Because the value of the variable x is still 0 at this moment, 0 is stored in the register R2 of the CPU #1.
time t1: R1 = x = 0
time t2: R1 = R1 + 1 = 1, R2 = x = 0
At time t3, the thread #0 writes the value in the register R1 to the variable x in the memory, while the thread #1 adds 1 to the value in the register R2. At time t4, the thread #1 writes the result of this addition to the variable x. The result of the addition is 1 and thus, 1 is written to the variable x again.
time t3: x = R1 = 1, R2 = R2 + 1 = 1
time t4: x = R2 = 1
In this manner, when the thread #1 reads the value of the variable x from the memory before the thread #0 writes back the result of adding 1 to the variable x, the value of the variable x increases by only 1 despite the fact that 1 is added to the variable x twice. This problem also occurs in a single processor system running a multi-task OS, because multiple processes or threads are executed through a time-sharing procedure in such a system.
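The interleaving at times t1 to t4 can be replayed deterministically as a minimal sketch in C. The structure `machine_state` and the functions `replay_lost_update` and `replay_serialized` are hypothetical names introduced only for this illustration; the sketch runs the steps in a single thread in the order described above rather than using real concurrency.

```c
/* Shared variable in "memory" and the two per-CPU registers from the text.
 * All names here are illustrative assumptions, not part of the original. */
typedef struct {
    int x;   /* shared variable x in memory */
    int r1;  /* register R1 of the CPU #0 (thread #0) */
    int r2;  /* register R2 of the CPU #1 (thread #1) */
} machine_state;

/* Replays the t1..t4 interleaving step by step and returns the final x. */
int replay_lost_update(void)
{
    machine_state m = { .x = 0, .r1 = 0, .r2 = 0 };

    m.r1 = m.x;        /* t1: thread #0 loads x (0) into R1 */
    m.r1 = m.r1 + 1;   /* t2: thread #0 adds 1, so R1 = 1 */
    m.r2 = m.x;        /* t2: thread #1 loads x, still 0, into R2 */
    m.x  = m.r1;       /* t3: thread #0 stores R1, so x = 1 */
    m.r2 = m.r2 + 1;   /* t3: thread #1 adds 1, so R2 = 1 */
    m.x  = m.r2;       /* t4: thread #1 stores R2, so x = 1 again */

    return m.x;
}

/* The same two increments, but thread #1 starts only after thread #0
 * has written its result back to x. */
int replay_serialized(void)
{
    machine_state m = { .x = 0, .r1 = 0, .r2 = 0 };

    m.r1 = m.x;        /* thread #0: load */
    m.r1 = m.r1 + 1;   /* thread #0: add */
    m.x  = m.r1;       /* thread #0: store, x = 1 */

    m.r2 = m.x;        /* thread #1: load, now sees 1 */
    m.r2 = m.r2 + 1;   /* thread #1: add */
    m.x  = m.r2;       /* thread #1: store, x = 2 */

    return m.x;
}
```

Calling `replay_lost_update()` yields 1, whereas `replay_serialized()` yields 2, which is the difference between the overlapping and the non-overlapping execution of the three operation steps.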
When shared resources, such as data in a shared memory and a hardware device for direct memory access (DMA), are used by multiple processes or threads, exclusive access control has to be performed. Under exclusive access control, an exclusion flag placed in the memory is updated using a special command provided by the CPU, such as a swap command or an exclusive load/store command, in order to manage information on whether the shared resources associated with the exclusion flag are being used.
In data writing by a special command, if multiple CPUs simultaneously attempt to write values to the same flag, only one CPU succeeds in writing. After a value is written to the flag according to the special command, each thread checks the result of the special command, and only the thread that has written the value indicating that the shared resources are in use to the exclusion flag acquires the access right to the shared resources. In the case of the single processor system, causing the OS to update the exclusion flag makes it possible to perform exclusive access control without using the special command. After finishing using the shared resources, the thread having acquired the access right writes a value indicative of release of the resources to the exclusion flag at time t2 and thereby declares the end of use of the shared resources.
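As a minimal sketch of updating an exclusion flag with a swap-type command, the following C code uses the C11 `atomic_exchange` operation as a stand-in for the special command. The flag values `FLAG_RELEASED` and `FLAG_IN_USE` and the functions `try_acquire` and `release_flag` are hypothetical names chosen for this illustration; how the swap is realized on a particular CPU is not specified here.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative flag values (assumptions, not taken from the original). */
enum { FLAG_RELEASED = 0, FLAG_IN_USE = 1 };

/* One acquisition attempt: atomically write "in use" to the exclusion flag
 * and obtain the value that was stored before the write. If several CPUs
 * do this at the same time, only one of them sees FLAG_RELEASED as the
 * previous value, and only that one acquires the access right. */
bool try_acquire(atomic_int *flag)
{
    int previous = atomic_exchange(flag, FLAG_IN_USE);
    return previous == FLAG_RELEASED;
}

/* Declares the end of use by writing the "released" value to the flag. */
void release_flag(atomic_int *flag)
{
    atomic_store(flag, FLAG_RELEASED);
}
```

A caller checks the return value of `try_acquire`: `true` means the access right was acquired, and `false` means another thread already holds it, in which case the caller must wait by one of the methods described next.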
A thread having failed to acquire the access right must wait until the thread having acquired the access right finishes using the shared resources and releases the access right. Spinlock and trylock are known as two methods of causing a thread having failed to acquire the access right to wait until release of the access right.
FIG. 15 is an explanatory diagram of an example of spinlock. As indicated in FIG. 15, in spinlock, a thread having failed to acquire access right repeats an exclusive access right acquiring process until the value of the exclusion flag is replaced with a value indicative of the shared resources being released. In spinlock, when completing use of the shared resources, a thread having acquired the access right through exclusive access control writes a value indicative of the shared resources being released to the exclusion flag and thereby, declares the completion of use of the shared resources.
In FIG. 15, because the thread #1 has acquired the exclusive access right first at time t1, the thread #0 fails to acquire the exclusive access right at time t2 and therefore repeatedly attempts to acquire the exclusive access right. The thread #1 releases the exclusive access right at time t3. This allows the thread #0 to successfully acquire the exclusive access right at time t4 immediately after the release of the exclusive access right.
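The repeated acquisition attempt of spinlock can be sketched in C as follows, assuming POSIX threads and C11 atomics are available. All names (`spin_flag`, `shared_x`, `spin_acquire`, `spin_release`, `spin_worker`, `run_spin_demo`) are hypothetical and introduced only for this illustration; error handling is reduced to a single check for brevity.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

enum { SPIN_RELEASED = 0, SPIN_IN_USE = 1 };

static atomic_int spin_flag = SPIN_RELEASED;  /* exclusion flag */
static long shared_x = 0;                     /* shared variable x */

/* Repeats the acquisition attempt until the swap reports that the flag
 * previously held the "released" value. The CPU stays busy while waiting. */
static void spin_acquire(atomic_int *flag)
{
    while (atomic_exchange(flag, SPIN_IN_USE) != SPIN_RELEASED) {
        /* busy-wait: nothing to do but try again */
    }
}

/* Writes the "released" value so that a spinning thread can succeed. */
static void spin_release(atomic_int *flag)
{
    atomic_store(flag, SPIN_RELEASED);
}

/* Each worker adds 1 to x the requested number of times under the lock. */
static void *spin_worker(void *arg)
{
    long iterations = *(long *)arg;
    for (long i = 0; i < iterations; i++) {
        spin_acquire(&spin_flag);
        shared_x = shared_x + 1;   /* load, add, store under exclusion */
        spin_release(&spin_flag);
    }
    return NULL;
}

/* Runs two threads and returns the final value of x, or -1 on failure. */
long run_spin_demo(long iterations_per_thread)
{
    pthread_t t0, t1;
    shared_x = 0;
    if (pthread_create(&t0, NULL, spin_worker, &iterations_per_thread) != 0)
        return -1;
    if (pthread_create(&t1, NULL, spin_worker, &iterations_per_thread) != 0) {
        pthread_join(t0, NULL);
        return -1;
    }
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return shared_x;
}
```

With the lock in place, `run_spin_demo(n)` is expected to return `2 * n`, because no increment is lost. The empty loop body in `spin_acquire` is where the processing capacity of the CPU is consumed while the other thread holds the access right.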
FIG. 16 is an explanatory diagram of an example of trylock. As indicated in FIG. 16, in trylock, execution of a thread having failed to acquire access right is suspended until the access right is released. In the case of trylock, when use of shared resources is completed, the value of the exclusion flag is changed to “released” and release of the access right is communicated to the OS. When the access right is released, the OS resumes execution of the suspended thread.
In FIG. 16, the thread #0 attempts to acquire exclusive access right at time t1 but the exclusion flag is already set to “under use” by the thread #1. Consequently, the OS changes the state of the thread #0 to a stand-by state, which brings the exclusive access right to a different thread (thread #1). Afterward, when the thread #1 releases the flag, the OS changes the state of the thread #0 back to an executable state.
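The suspend-and-resume behavior described for trylock can be approximated with a POSIX mutex and condition variable, as in the following sketch. This is an assumption-laden model: the names `blocking_lock`, `blocking_acquire`, `blocking_release`, `blocking_worker`, and `run_blocking_demo` are hypothetical, and the sketch models the suspension described in this text, not the non-blocking POSIX call `pthread_mutex_trylock`, which returns immediately on failure.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Exclusion flag together with the objects used to suspend and resume
 * waiting threads (illustrative structure, not from the original). */
typedef struct {
    pthread_mutex_t guard;     /* protects in_use */
    pthread_cond_t  released;  /* signalled when the flag is released */
    bool            in_use;    /* the exclusion flag */
} blocking_lock;

static blocking_lock res_lock = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false
};
static long res_x = 0;         /* shared variable x */

/* If the flag is "under use", the calling thread is suspended inside
 * pthread_cond_wait until another thread signals the release. */
static void blocking_acquire(blocking_lock *l)
{
    pthread_mutex_lock(&l->guard);
    while (l->in_use) {
        pthread_cond_wait(&l->released, &l->guard);
    }
    l->in_use = true;
    pthread_mutex_unlock(&l->guard);
}

/* Sets the flag to "released" and wakes one suspended thread, if any. */
static void blocking_release(blocking_lock *l)
{
    pthread_mutex_lock(&l->guard);
    l->in_use = false;
    pthread_cond_signal(&l->released);
    pthread_mutex_unlock(&l->guard);
}

static void *blocking_worker(void *arg)
{
    long iterations = *(long *)arg;
    for (long i = 0; i < iterations; i++) {
        blocking_acquire(&res_lock);
        res_x = res_x + 1;
        blocking_release(&res_lock);
    }
    return NULL;
}

/* Runs two threads and returns the final value of x, or -1 on failure. */
long run_blocking_demo(long iterations_per_thread)
{
    pthread_t t0, t1;
    res_x = 0;
    if (pthread_create(&t0, NULL, blocking_worker, &iterations_per_thread) != 0)
        return -1;
    if (pthread_create(&t1, NULL, blocking_worker, &iterations_per_thread) != 0) {
        pthread_join(t0, NULL);
        return -1;
    }
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return res_x;
}
```

Unlike the busy-wait loop of spinlock, a thread waiting in `blocking_acquire` does not consume CPU time, but every suspension and resumption passes through the thread library and, typically, the OS.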
In spinlock, the exclusion flag is repeatedly checked until the shared resources are released. As a result, if the shared resources are not released for a long period, the processing capacity of the CPU may be used wastefully. In the case of trylock, execution of a thread is suspended until the shared resources are released. As the thread is kept suspended, a different thread is executed or, if no executable thread is found, the CPU is suspended. For this reason, wasteful use of the processing capacity of the CPU does not happen. However, thread suspension/resumption by the OS creates overhead. In addition, when multiple threads are operating, it is not guaranteed that the thread released from its suspended state can be executed immediately. Hence, overhead arises during a period between the point of time of the shared resources becoming available and the point of time of the start of actual use of the resources.
It is therefore preferable to perform exclusive access control by the spinlock method for shared resources that are released within a short period, and by the trylock method for shared resources whose release takes a long time. However, the stand-by time until the release of the shared resources varies depending on the state of the resources and the number of threads in use, and is therefore difficult to determine uniquely.
According to a known technique, stand-by times until the release of exclusive access right are recorded as statistical information so that a proper exclusive access control method is adopted at the execution of a thread (see, e.g., Japanese Laid-Open Patent Publication No. 2001-84235). A technique of recording statistical information of stand-by times until the release of exclusive access right at the execution of a thread is also known (see, e.g., Japanese Laid-Open Patent Publication No. H10-312294), and another technique of switching the exclusive access control method at the execution of a thread is also known (see, e.g., Japanese Laid-Open Patent Publication No. H11-85574).
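The general idea of recording stand-by times and selecting a method from them can be sketched as follows. This is only an illustration of the idea and is not the method of any of the cited publications: the types `wait_stats` and `lock_method`, the functions `record_wait` and `choose_method`, the use of a simple mean, and the caller-supplied threshold are all assumptions.

```c
#include <stdint.h>

typedef enum { METHOD_SPINLOCK, METHOD_TRYLOCK } lock_method;

/* Running statistics on stand-by times for one shared resource. */
typedef struct {
    uint64_t total_wait_ns;  /* sum of all recorded stand-by times */
    uint64_t samples;        /* number of recorded stand-by times */
} wait_stats;

/* Records one observed stand-by time, in nanoseconds. */
void record_wait(wait_stats *s, uint64_t wait_ns)
{
    s->total_wait_ns += wait_ns;
    s->samples += 1;
}

/* Chooses spinlock when the mean stand-by time is at or below the
 * threshold, and trylock otherwise. With no history, spinlock is chosen;
 * this default is an assumption of the sketch. */
lock_method choose_method(const wait_stats *s, uint64_t threshold_ns)
{
    if (s->samples == 0) {
        return METHOD_SPINLOCK;
    }
    uint64_t mean_ns = s->total_wait_ns / s->samples;
    return (mean_ns <= threshold_ns) ? METHOD_SPINLOCK : METHOD_TRYLOCK;
}
```

For example, with a threshold of 1000 ns, recorded stand-by times of 200 ns and 400 ns give a mean of 300 ns and select spinlock; adding a 30000 ns sample raises the mean to 10200 ns and selects trylock.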
The conventional techniques, however, pose a problem in that the cost of calling the OS for use in exclusive access control is not considered. A process of simply checking the exclusion flag in spinlock is usually realized by several commands, in which case, with consideration of memory access cost, it takes several tens to several hundreds of nanoseconds to check the exclusion flag once.
When the exclusion flag is updated by the OS, however, the OS must be called to check the exclusion flag. Calling the OS from an application program requires a process of shifting to a privileged mode using an interrupt command, etc., and therefore takes several tens to several hundreds of microseconds, which is roughly three orders of magnitude greater than the time required for the flag check process.
FIG. 17 is an explanatory diagram of calling costs resulting from use of the OS. As indicated in FIG. 17, when a thread checks the exclusion flag (time t1) and immediately thereafter, another CPU releases the flag, if the function of the OS is not used (as indicated by the portion on the left), the thread acquires the flag at time t2 and can start a process using the shared resources. If the function of the OS is used (as indicated by the portion on the right), however, the OS cancels interruption first and then returns to the process on the thread. When the thread makes another exclusive access right acquisition request, the OS takes over the process to acquire exclusive access right, and then returns to the process on the thread again. In this manner, in the case of using the OS, the process is alternately executed at the OS and the thread. As a result, the thread cannot start the process using the shared resources until time t2′.
For example, with the techniques above, if the OS switches between exclusive control processing using trylock and exclusive control processing using spinlock, a problem arises in that OS overhead is incurred by the alternate execution of the process at the OS and the thread.