1. Field of the Invention
This invention relates generally to the field of managing tasks that an instruction processor is assigned to do within a computer system having multiple instruction processors.
2. Background Information
In multiprocessor computer systems, it can be difficult to balance the work among the processors so that computing tasks are accomplished efficiently, with a minimum of overhead spent on assigning the tasks.
The preferred design should not allow a majority of the available tasks to be assigned to a single processor (nor to any other small subset of all processors). If this occurs, that small subset of processors is kept too busy to accomplish all its tasks efficiently while the other processors wait relatively idle with few or no tasks to do, and the system as a whole does not operate efficiently. To be efficient, the system should therefore have a load leveling or work distribution scheme.
Also, to take advantage of cache memory (which provides quicker access to data because of the cache's proximity to individual processors), the design should assign tasks based on affinity with the processor or processor group most likely to have the needed data already in its local cache memory(ies). As is understood in this art, where a processor has acted on part of a problem (loading a program, running a transaction, or the like), it is likely to reuse the same data or instructions in its local cache, because these will be found there once the problem is begun. By affinity we mean that a task, having executed on a processor, will tend to execute next on that same processor or a processor within that processor's group. (Tasks begun may not complete due to a hardware interrupt or for various other reasons not relevant to our discussion.) Where more than one processor shares a cache, the design for affinity assignment could be complicated, and complexity can be costly, so the preferred design should be simple.
These two goals, affinity and load leveling, seem to be in conflict. Permanently retaining task affinity could lead to overloading some processors or groups of processors. Redistributing tasks to processors to which they have no affinity will yield few cache hits and slow down the processing overall.
These problems only get worse as multiprocessor computer systems grow larger.
Typically, computer systems use switching queues, and associated algorithms for controlling them, to assign tasks to processors. These algorithms are generally considered an Operating System (OS) function. When a processor “wants” (is ready for) a new task, it executes the (usually) re-entrant code that embodies the algorithm that examines the switching queue, determines the next task to do on the switching queue, and does it. However, while it is determining which task to do, other processors that share the switching queue may be waiting on the switching queue, which the first processor will have locked in order to make the needed determination.
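The locking behavior described above can be illustrated with a minimal sketch. It is not taken from any particular operating system: the class name SwitchingQueue, the helper functions, and the priority convention (a lower number is selected first) are all assumptions made for illustration. Each simulated processor must hold the single queue lock while it determines its next task, so any other processor asking at the same moment has to wait.

```python
import heapq
import threading


class SwitchingQueue:
    """Hypothetical model of one switching queue shared by all processors."""

    def __init__(self):
        self._lock = threading.Lock()
        self._heap = []   # entries are (priority, sequence, task_name)
        self._seq = 0     # tie-breaker so equal priorities stay first-in, first-out

    def add_task(self, priority, task_name):
        # Assumption for this sketch: a lower number means a more urgent task.
        with self._lock:
            heapq.heappush(self._heap, (priority, self._seq, task_name))
            self._seq += 1

    def next_task(self):
        """Return the next task name, or None when the queue is empty."""
        # Every processor that shares the queue serializes on this one lock.
        with self._lock:
            if not self._heap:
                return None
            return heapq.heappop(self._heap)[2]


def run_processor(queue, completed):
    """Simulated processor: keep asking the shared queue for work until empty."""
    while (task := queue.next_task()) is not None:
        completed.append(task)  # stand-in for actually executing the task


def demo(num_processors=4, num_tasks=20):
    queue = SwitchingQueue()
    for i in range(num_tasks):
        queue.add_task(priority=i % 3, task_name=f"task-{i}")
    completed = []
    threads = [
        threading.Thread(target=run_processor, args=(queue, completed))
        for _ in range(num_processors)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return completed
```

In this sketch every task is executed exactly once, but the order in which the simulated processors obtain tasks depends on how they contend for the single lock, which is the source of the waiting described above.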
A known solution to the leveling vs. affinity problem is to have a switching queue (SQ) per group and to add an extra switching queue to the switching queues already available. This means that each group exhausts the tasks in its own queue before seeking tasks from the extra SQ. Thus the bottleneck or choke-point is simply moved to a less used SQ, where conflicts develop only when more than one task handler seeks a new task from it at the same time. Of course, as the number of task handlers increases, the lock conflicts for obtaining such an additional queue become a choke-point in the system operations. Also, when the overflow or extra SQ bears no relation to the handler's affinity, the value of cache memory is diminished (cache hits decline) because no affinity advantage accrues to such a system.
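The known arrangement described above, one SQ per group plus one extra SQ, might be sketched as follows. The names LockedQueue and GroupedDispatcher, and the acquisitions counter, are hypothetical and exist only to make the lock traffic visible. The sketch assumes a task handler consults its own group's queue first and turns to the extra queue only when its own queue is empty.

```python
from collections import deque
import threading


class LockedQueue:
    """Hypothetical switching queue guarded by its own lock."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()
        self.tasks = deque()
        self.acquisitions = 0  # number of times take() has acquired the lock

    def put(self, task):
        with self.lock:
            self.tasks.append(task)

    def take(self):
        """Remove and return the oldest task, or None if the queue is empty."""
        with self.lock:
            self.acquisitions += 1
            return self.tasks.popleft() if self.tasks else None


class GroupedDispatcher:
    """One switching queue per processor group, plus one extra shared queue."""

    def __init__(self, group_names):
        self.group_queues = {g: LockedQueue(g) for g in group_names}
        self.overflow = LockedQueue("overflow")

    def next_task(self, group):
        # Exhaust the group's own queue first, which preserves affinity.
        task = self.group_queues[group].take()
        if task is None:
            # Only idle groups reach this point, but all of them contend
            # for the same lock, and the task found here has no affinity
            # with the group that takes it.
            task = self.overflow.take()
        return task


dispatcher = GroupedDispatcher(["A", "B"])
dispatcher.group_queues["A"].put("a1")
dispatcher.overflow.put("x1")
dispatcher.overflow.put("x2")

results = [
    dispatcher.next_task("A"),  # served from group A's own queue
    dispatcher.next_task("A"),  # group A is empty, so falls back to overflow
    dispatcher.next_task("B"),  # group B is empty, so falls back to overflow
    dispatcher.next_task("B"),  # nothing left anywhere
]
```

The counter on the extra queue grows with every request from any idle group, so as more task handlers are added, contention concentrates on that one lock.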
Accordingly, there is a great need for efficient dispatcher programs and algorithmic solutions for this activity.