1. Field of the Invention
The present invention relates to parallel processing in a computer using threads which share a task address space and in which each thread has a separately assigned stack, and more particularly to a method and an apparatus for managing globally declared data privately for each thread.
2. Description of the Background Art
As a model for representing parallel processing in a computer effectively, there is a model in which a usual abstraction of a process is split into two components: the task and the thread.
In this model, the task defines the static execution environment such as an address space, while the thread defines the dynamic execution environment such as a program counter.
In such a model, it is possible for any number of threads to be present simultaneously within a given single task. In other words, it is possible for multiple threads of control to be executed simultaneously by using the same data in the given task.
The use of this model employing the concepts of tasks and threads has the following advantages over the usual parallel execution of multiple processes.
(1) The communication between threads can be made at high speed because it is achieved through shared data.
(2) The thread is assigned a relatively small number of resources compared with a process, so that the overhead due to generation and termination is less for threads than for processes.
In such a model using tasks and threads, the address space is shared among the threads, so that the global data are shared by all the threads.
For example, the data used in a program written in the programming language C can be classified into two categories: global variables and auto variables. In these two categories, the global variables, which correspond to the global data to be shared among the threads in the above described model, are usually allocated in the data space as they are to be accessible from a plurality of functions, while the auto variables are usually allocated on the stacks as they are declared within a function and valid only within that function.
Now, in the multi-threaded environment in which multiple threads are present in a single task, it is necessary to support globally declared data privately for each thread. As an example of such a multi-threaded environment in which thread private data are required, a multi-threaded server will be described below.
In this example, the multi-threaded server implements an environment in which multiple threads that perform the same service are present within a single task, and a plurality of service requests from a plurality of clients can be executed simultaneously. This multi-threaded server can be realized by the following program in the C programming language, for example.
    struct msg m;

    get_job()
    {
        receive(&m);
        job();
    }
In this case, in order to receive a service request, each thread first calls the "get_job" function, calls the "receive" function within "get_job", and then receives the message into the variable "m". This message variable "m" is usually going to be processed by a plurality of functions, so it is necessary to allocate this variable globally. However, under the multi-threaded environment, globally declared data are shared among the threads, so there is a possibility that the message will be destroyed in this program. For example, when a plurality of service requests arrive, the request contents of these service requests are stored in the message variable "m", but this message variable "m" is shared among the threads, so the message that arrived later will overwrite the message that arrived earlier. Therefore, there is a possibility that the message that arrived earlier will be destroyed by the message that arrived later.
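The overwrite hazard described above can be sketched in C as follows. This is an illustrative sketch only: the layout of "struct msg", the request strings, and the two-argument form of "receive" (standing in for the original one-argument "receive") are assumptions made so the sketch is self-contained.

```c
#include <assert.h>
#include <string.h>

/* One global message buffer shared by all threads, as in the
 * multi-threaded server example above.  The struct layout is an
 * assumption made for this sketch. */
struct msg { char text[32]; };

struct msg m;   /* globally declared, hence shared among the threads */

/* Stand-in for the "receive" function: copies a service request into
 * the shared global variable m. */
void receive(struct msg *dst, const char *request)
{
    strncpy(dst->text, request, sizeof dst->text - 1);
    dst->text[sizeof dst->text - 1] = '\0';
}
```

If a second thread's "receive" runs before the first thread's "job" has processed "m", the earlier request is destroyed: after two back-to-back calls, only the later request remains in the buffer.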
For this reason, it is necessary in such a multi-threaded environment to employ a specially devised data management method for supporting global data privately on a per-thread basis.
Conventionally, the following two methods have been proposed as specially devised data management methods.
(I) Method for allocating private data on stack dynamically.
In this method, thread private data are allocated on the stack by declaring the data used in several functions as auto variables in a certain function that is executed by each thread. Here, however, when the allocated data are to be used in functions other than the function that declared the auto variables, it is not possible for those other functions to access the allocated auto variables directly. There arises a need for passing the address of the allocated variable as a function argument from the calling function to the called function.
An exemplary program written in the C programming language adopting this method is shown in FIG. 1. In this program of FIG. 1, after each thread is generated, each thread starts the execution from the function "start". Moreover, in order to support the private data "private_data" for each thread, the private data are declared within the function "start" and allocated on the stack. When calling the functions "func1" and "func2" from the function "start", the private data "private_data" cannot be accessed from either "func1" or "func2", so that there is a need to pass the address of this private data from the function "start" to the functions "func1" and "func2" as a function argument.
Thus, in this conventional data management method for supporting private data for each thread, there is a need to pass the address of the private data as a function argument when calling one function from another function. Therefore, the program entry for the called function which operates on the private data must contain the argument specifying the address of the private data, and consequently the program becomes quite complicated. In addition, in the called function, there is a need to store the address of the private data on the stack. Furthermore, the access to the private data can be made only indirectly through the address, so the execution efficiency is lowered.
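FIG. 1 itself is not reproduced here, but a minimal program in the style it describes might look as follows. The contents of the private data (a single counter field) and the work done by "func1" and "func2" are assumptions made purely for illustration.

```c
#include <assert.h>

/* Illustrative per-thread private data; the single counter field is an
 * assumption made for this sketch. */
struct private_t { int counter; };

/* Every function that operates on the private data must receive its
 * address as an extra argument. */
static void func1(struct private_t *p) { p->counter += 1; }
static void func2(struct private_t *p) { p->counter += 2; }

/* Each thread begins execution at start(); the auto variable
 * private_data is allocated on that thread's own stack. */
int start(void)
{
    struct private_t private_data = { 0 };
    func1(&private_data);   /* address passed explicitly ...        */
    func2(&private_data);   /* ... to every callee that operates on
                             * the private data                     */
    return private_data.counter;
}
```

The extra pointer argument in every call chain is exactly the complication, and the indirect access through it the efficiency loss, noted above.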
(II) Method for expanding variables in correspondence to multiple threads, and accessing variables by using thread IDs.
In this method, each part of the thread private data is globally declared as a global sequence having as many elements as there are threads. An access from each thread is made by using a thread ID which is uniquely assigned to each thread in advance. In this method, if the thread IDs are not supported at the OS level, it becomes necessary to support the thread IDs themselves by using the specially devised method (I) described above. Moreover, even when the thread IDs are supported at the OS level, this method is not applicable to a case in which the number of threads changes dynamically.
An exemplary program written in the C programming language adopting this method is shown in FIG. 2. In this program of FIG. 2, the private data "private_data" is declared as a global sequence. The number of elements allocated to this global sequence is defined by a predetermined constant "THREAD_NUM". The access to the private data is made by using the thread ID "THREAD_SELF".
In this example, the number of elements in the global sequence is fixed at "THREAD_NUM", so that there is a limit to the number of threads that can be accommodated. In addition, in order for each thread to be able to access the private data, it is necessary for the thread ID "THREAD_SELF" of each thread to be supported at the OS level.
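FIG. 2 is likewise not reproduced, but a sketch in its style follows. "THREAD_SELF" is assumed to expand to an OS-supported ID of the calling thread; since no such support exists in this self-contained sketch, it is stubbed with a plain variable, and the element type of the private data is an illustrative assumption.

```c
#include <assert.h>

#define THREAD_NUM 8            /* fixed maximum number of threads */

/* The private data expanded into a global sequence with one element
 * per thread; the int element type is an assumption. */
static int private_data[THREAD_NUM];

/* Stand-in for an OS-supported thread ID; in a real multi-threaded
 * environment THREAD_SELF would identify the calling thread. */
static int thread_self_stub;
#define THREAD_SELF thread_self_stub

/* Each thread accesses only its own element of the sequence. */
void set_private(int value) { private_data[THREAD_SELF] = value; }
int  get_private(void)      { return private_data[THREAD_SELF]; }
```

Because the sequence length is fixed at compile time, a thread whose ID is THREAD_NUM or greater cannot be accommodated, which is the limitation noted above.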
Thus, in this conventional data management method for supporting the global data privately for each thread, there is a need to provide the thread IDs. In addition, this method cannot deal with a situation involving the generation and termination of threads in which the number of threads changes dynamically. Here, it has been proposed to modify this method to be able to deal with the generation and termination of threads by utilizing a structure employing the hashing of the thread IDs, but this in turn lowers the execution efficiency considerably. Furthermore, the maximum number of threads that can be supported is limited by the number of elements defined in the global sequence in this method.
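The hashing variant mentioned above might be sketched as follows. Every name here is hypothetical, and the extra hash computation and chain walk on each access are the source of the efficiency loss just noted.

```c
#include <assert.h>
#include <stdlib.h>

#define TABLE_SIZE 16

/* A chained hash table mapping a thread ID to that thread's private
 * data; entries are added as threads are generated, so the thread
 * count need not be fixed in advance. */
struct entry {
    int tid;             /* thread ID (the hash key)       */
    int data;            /* that thread's private data     */
    struct entry *next;  /* chain for colliding thread IDs */
};

static struct entry *table[TABLE_SIZE];

static struct entry *lookup(int tid)
{
    struct entry *e = table[tid % TABLE_SIZE];
    while (e != NULL && e->tid != tid)
        e = e->next;
    return e;
}

/* Every access pays for a hash and a chain walk; this overhead is what
 * degrades execution efficiency compared with a direct array index. */
void put_private(int tid, int value)
{
    struct entry *e = lookup(tid);
    if (e == NULL) {                       /* thread seen first time */
        e = malloc(sizeof *e);
        e->tid = tid;
        e->next = table[tid % TABLE_SIZE];
        table[tid % TABLE_SIZE] = e;
    }
    e->data = value;
}

int get_private(int tid) { return lookup(tid)->data; }
```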