1. Field of the Invention
This invention relates to computer systems and, more particularly, to efficient multi-threaded operation of computer systems.
2. Description of the Related Art
Modern computer systems often utilize multiple processors executing in parallel to increase overall operating efficiency. A variety of configurations are possible, including separate microprocessors, a single microprocessor that includes multiple cores, or a combination of the two. When an application requires the execution of a complex task, the task may be separated into several threads, with each thread assigned to a different processor or core. As used herein, a thread is a stream of instructions that is executed on a processor.
Many applications are written to use multiple threads to accomplish a single unit of work. For example, a unit of work may be to execute the following loop:
for (i=0; i<N; i++) {a[i]=b[i];}
The range of iterations 0 . . . N-1 may be divided among several processors such that each one completes a portion of the total iterations through the loop. Once a given processor completes its work, it may have to wait for the other processors to complete their work before continuing to the next unit of work. If the given processor begins the next unit of work earlier, values that are needed from the other processors may not yet be available or may not yet be set to the values needed to start the next work unit, leading to potentially erroneous results. Consequently, delays may be introduced in the operation of one or more processors, reducing overall operating efficiency.
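By way of illustration only, the division of the loop described above may be sketched in C using POSIX threads. The names parallel_copy, copy_slice, slice_t, and MAX_THREADS are hypothetical and are introduced solely for this sketch. The choice of POSIX threads, and the use of a join as the point at which each thread's work is awaited, are assumptions rather than a mechanism described herein.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 64 /* hypothetical upper bound chosen for this sketch */

/* One thread's share of the iteration range (hypothetical type). */
typedef struct {
    int *a;        /* destination array */
    const int *b;  /* source array */
    size_t begin;  /* first index handled by this thread */
    size_t end;    /* one past the last index handled by this thread */
} slice_t;

/* Runs this thread's portion of: for (i=0; i<N; i++) {a[i]=b[i];} */
static void *copy_slice(void *arg)
{
    slice_t *s = arg;
    for (size_t i = s->begin; i < s->end; i++) {
        s->a[i] = s->b[i];
    }
    return NULL;
}

/* Divides indices 0 . . . n-1 among num_threads threads, then waits for all.
 * Returns 0 on success, -1 on a bad thread count or a failed thread creation
 * (in which case the copy may be incomplete). */
int parallel_copy(int *a, const int *b, size_t n, size_t num_threads)
{
    pthread_t tid[MAX_THREADS];
    slice_t slice[MAX_THREADS];
    size_t started = 0;
    int status = 0;

    if (num_threads == 0 || num_threads > MAX_THREADS) {
        return -1;
    }

    size_t chunk = n / num_threads;
    size_t extra = n % num_threads; /* the first `extra` threads take one more */
    size_t begin = 0;

    for (size_t t = 0; t < num_threads; t++) {
        size_t len = chunk + (t < extra ? 1 : 0);
        slice[t].a = a;
        slice[t].b = b;
        slice[t].begin = begin;
        slice[t].end = begin + len;
        begin += len;
        if (pthread_create(&tid[t], NULL, copy_slice, &slice[t]) != 0) {
            status = -1;
            break;
        }
        started++;
    }

    /* Synchronization point: the next unit of work must not begin until
     * every started thread has finished its slice. Time spent waiting here
     * is the delay discussed above. */
    for (size_t t = 0; t < started; t++) {
        pthread_join(tid[t], NULL);
    }
    return status;
}
```

For example, a call such as parallel_copy(a, b, 10, 3) would assign slices of four, three, and three iterations to three threads. A thread that finishes early contributes nothing further until the slowest thread has been joined.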
The delays described above may be considered to be a synchronization cost of multi-threaded operation. The degree to which such delays reduce the efficiency of a multi-threaded application depends on the size of the work units that are divided among the available processors. Overall performance may be improved if the synchronization costs are less than the gains available from parallelization of a task. Correspondingly, parallelization of tasks may be effective for any tasks for which the synchronization costs are sufficiently small.
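The trade-off described above may be made concrete with a deliberately simplified model, offered only as a sketch. It assumes that a work unit divides evenly among the threads and that each work unit incurs a single fixed synchronization cost. Neither assumption is drawn from any particular system, and the function name estimated_speedup is hypothetical.

```c
/* Simplified model (assumption): the parallel time for one work unit is
 * work_time / num_threads + sync_cost, with both times in the same units.
 * Returns the ratio of serial time to parallel time; a value below 1.0
 * means the synchronization cost outweighs the gain from parallelization.
 * Returns 0.0 for invalid input. */
double estimated_speedup(double work_time, unsigned num_threads, double sync_cost)
{
    if (num_threads == 0 || work_time <= 0.0 || sync_cost < 0.0) {
        return 0.0;
    }
    double parallel_time = work_time / (double)num_threads + sync_cost;
    return work_time / parallel_time;
}
```

Under this model, a work unit of 100 time units on four threads with a synchronization cost of 25 yields an estimated speedup of 2. A work unit of 2 time units with a synchronization cost of 3 yields a value below 1, indicating that parallelizing so small a unit would reduce performance.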
A variety of technologies may be incorporated in a computer system to support multiple threads running in parallel. Some technologies, such as transactional memory, may be most effective for smaller units of work. Unfortunately, a given synchronization cost is generally proportionately higher for smaller units of work. Therefore, what is needed are systems and methods for reducing synchronization costs in multi-threaded, multi-processor operations.