1. Field of the Invention
The invention relates to a process for improving the performance of a multiprocessor data processing system comprising a job queue and controlled by an operating system of the preemptive type.
The invention particularly relates to a process for the optimized assignment of tasks to a processor in a multiprocessor system of this type so as to obtain said improvement in performance.
The invention also relates to a data processing system architecture for implementing this process.
The invention particularly applies to conventional symmetric multiprocessor systems of the type known as “SMP.” However, it also applies to multiprocessor systems with a nonuniform memory access architecture, known by the name “NUMA.”
Even more particularly, the invention applies to an operating system environment of the “UNIX” (registered trademark) type. It must be clearly understood, however, that the process of the invention also applies to other operating systems of the preemptive type. To illustrate the concept without in any way limiting the scope of the invention, the following description is confined to the “UNIX” environment and to the above-mentioned “NUMA” type of architecture, unless otherwise indicated.
2. Description of Related Art
One of the essential functions of a preemptive operating system is to allocate processor time to each of the various tasks being executed in parallel in the system.
In the prior art, a standard solution to this problem consists of storing the tasks that must be executed in a queue. Each processor draws a task from this queue and executes it until a predetermined event indicates to the processor in question that it should execute another task. The processor then sends a request, which is transmitted to a distributing device, commonly called a “dispatcher.”
This solution has the advantage of ensuring that a processor is only inactive if the queue is empty, i.e., if there is currently no task that can be executed.
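The single-queue scheme described above can be illustrated with a minimal sketch. It is a simplified, deterministic simulation and not the implementation of any particular operating system: the names `Task`, `GlobalRunQueue`, `dispatch`, `requeue` and `simulate` are hypothetical, and the "predetermined event" is assumed here to be the expiry of a fixed time quantum.

```python
from collections import deque
from dataclasses import dataclass
import threading


@dataclass
class Task:
    name: str
    remaining: int  # units of work left (hypothetical measure)


class GlobalRunQueue:
    """One queue shared by all processors, protected by a single lock."""

    def __init__(self, tasks):
        self._lock = threading.Lock()
        self._tasks = deque(tasks)

    def dispatch(self):
        """Return the next runnable task, or None if the queue is empty."""
        with self._lock:
            return self._tasks.popleft() if self._tasks else None

    def requeue(self, task):
        """Put a preempted, unfinished task back at the tail of the queue."""
        with self._lock:
            self._tasks.append(task)


def simulate(tasks, n_cpus, quantum):
    """Return a trace of (tick, cpu, task name), one entry per time slice."""
    queue = GlobalRunQueue(tasks)
    trace = []
    tick = 0
    while True:
        running = []
        for cpu in range(n_cpus):
            task = queue.dispatch()
            if task is None:
                break  # a processor is idle only because the queue is empty
            running.append((cpu, task))
        if not running:
            return trace
        for cpu, task in running:
            trace.append((tick, cpu, task.name))
            task.remaining -= min(quantum, task.remaining)
            if task.remaining > 0:
                queue.requeue(task)  # assumed event: quantum expired
        tick += 1
```

For example, `simulate([Task("A", 3), Task("B", 1), Task("C", 2)], n_cpus=2, quantum=1)` keeps both simulated processors busy at every tick until all three tasks have finished. Every call to `dispatch` and `requeue` goes through the same lock.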
On the other hand, this solution has several drawbacks, including the following:
when the number of processors and the number of tasks to be processed increase, contention in devices known as locks, i.e., devices that protect access to the above-mentioned queue, increases to a substantial degree; and
so-called “level 2” caches are sometimes associated with each processor; it is therefore preferable for a task to be executed on only one processor, in order to benefit from the information stored in the “level 2” cache associated with it.
The above-mentioned standard solution is incapable of naturally keeping a task on a single processor. Additional algorithms that allow this mode of operation are therefore also known. However, these algorithms are not without drawbacks either: they degrade the global performance of the system to an increasing degree as the number of tasks and/or the number of processors increases.
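The cache drawback noted above can be made concrete by counting how often a task resumes on a different processor than the one that last ran it, since each such move forfeits the contents of the previous processor's “level 2” cache. The following is only an illustrative sketch: the function `count_migrations` and the sample trace are hypothetical, and the trace format (tick, processor number, task name) is an assumption made for this example.

```python
def count_migrations(trace):
    """Count slices in which a task runs on a different CPU than its last slice."""
    last_cpu = {}  # task name -> CPU that most recently ran it
    migrations = 0
    for _tick, cpu, task in trace:
        if task in last_cpu and last_cpu[task] != cpu:
            migrations += 1  # task leaves behind a warm cache
        last_cpu[task] = cpu
    return migrations


# Hypothetical trace of a single shared queue serving two processors.
sample_trace = [
    (0, 0, "A"), (0, 1, "B"),
    (1, 0, "C"), (1, 1, "A"),  # task A moves from processor 0 to processor 1
    (2, 0, "C"), (2, 1, "A"),
]
print(count_migrations(sample_trace))  # prints 1
```

In this sample, task A is moved once even though only three tasks and two processors are involved. With a single shared queue, nothing ties a task to the processor that executed it previously.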