In a computer, a number of different organizational techniques may be used for increasing execution speed. One technique is execution overlap. Execution overlap is based on the notion of operating a computer like an assembly line with an unending series of operations in various stages of completion. Because the operations occupy different stages at any given moment, several of them can be executed simultaneously.
One commonly used form of execution overlap is pipelining. In a computer, pipelining is an implementation technique which allows a sequence of the same operations to be performed on different arguments. The computation to be done for a specific instruction is broken into smaller pieces, i.e., operations, each of which takes a fraction of the time needed to complete the entire instruction. Each of these pieces is called a pipe stage, or simply a stage. The stages are connected in a sequence to form a pipeline in which instructions enter at one end, are processed through the stages, and exit at the other end.
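The benefit of pipelining can be illustrated with a simple timing model. The following is a minimal sketch, assuming an idealized pipeline in which every stage takes the same time and no stalls occur; the function and parameter names are hypothetical and are used only for illustration.

```python
def unpipelined_time(n_instructions, n_stages, stage_time):
    """Each instruction passes through every stage before the next one starts."""
    return n_instructions * n_stages * stage_time


def pipelined_time(n_instructions, n_stages, stage_time):
    """Assumes equal stage times and no stalls (an idealization).

    The first instruction needs n_stages steps to exit the pipeline;
    each later instruction exits one step after its predecessor.
    """
    if n_instructions == 0:
        return 0
    return (n_stages + n_instructions - 1) * stage_time


if __name__ == "__main__":
    print(unpipelined_time(100, 4, 1))  # 400
    print(pipelined_time(100, 4, 1))    # 103
```

Under these assumptions, the pipelined time for a long instruction sequence approaches one stage time per instruction, which is the sense in which each stage "takes a fraction of the time needed to complete the entire instruction."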
Although pipelining may increase the throughput of certain instructions of the computer, it does have limitations. For example, one drawback of pipelining is that the operation performed by each stage is unvarying, and thus, pipelining does not typically allow execution overlap for all operations performed by the computer. This decreases the overall throughput of the computer. Another limitation of pipelining is that each stage typically performs only one operation.
Achieving execution overlap in a computer where one of the stages performs a variety of concurrent operations is a particularly difficult problem. In such a situation it is typically necessary to schedule the operations to avoid overcommitting computing resources in a subsequent stage. One technique which is taught by the prior art is the use of semaphores. Essentially, a semaphore is a flag which indicates whether a subsequent stage is ready to receive data. By ascertaining the condition of the flag, a preceding stage can transmit results from its concurrent operations in a fashion which allows some execution overlap between the stages. However, the use of semaphores requires relatively complex communication protocols between stages.
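The semaphore technique can be sketched in software as follows. This is an illustrative sketch only, not the mechanism of any particular prior-art system: it assumes two stages modeled as threads that share a one-entry slot, and all names (`run_two_stages`, `ready`, `filled`, `_DONE`) are hypothetical. The `ready` semaphore plays the role of the flag described above; the second semaphore, `filled`, is an added assumption needed so that the subsequent stage knows when data has arrived.

```python
import threading

_DONE = object()  # hypothetical end-of-stream marker


def run_two_stages(items):
    """Pass items from stage one to stage two through a one-entry slot."""
    ready = threading.Semaphore(1)   # flag: stage two can accept data
    filled = threading.Semaphore(0)  # flag: the slot holds fresh data
    slot = [None]
    results = []

    def stage_one():
        for item in list(items) + [_DONE]:
            value = item if item is _DONE else item * item  # stage-one work
            ready.acquire()          # wait until stage two is ready
            slot[0] = value
            filled.release()         # tell stage two that data has arrived

    def stage_two():
        while True:
            filled.acquire()         # wait for data from stage one
            value = slot[0]
            ready.release()          # slot is free; stage one may proceed
            if value is _DONE:
                break
            results.append(value + 1)  # stage-two work

    threads = [threading.Thread(target=stage_one),
               threading.Thread(target=stage_two)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_two_stages([1, 2, 3]))  # [2, 5, 10]
```

Even in this small sketch, two flags and a strict acquire/release ordering are needed to hand over a single value, which suggests why such protocols become relatively complex when a stage performs several concurrent operations.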
Another solution to scheduling is simply to let all of the concurrent operations finish before transferring data to the subsequent stage. This approach provides a very simple mechanism for effecting data transfers between stages. However, it does not allow the operations of various stages to overlap, and thus, it severely reduces the overall throughput of the computer.
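The cost of waiting for every concurrent operation can be illustrated with a small timing model. The following is a rough sketch under stated assumptions: the concurrent operations all start at time zero, the subsequent stage handles one result at a time, and each result takes the same time to handle. The function names and the figures in the example are hypothetical.

```python
def wait_for_all_finish_time(op_times, consume_time):
    """Subsequent stage starts only after every concurrent operation is done."""
    if not op_times:
        return 0
    return max(op_times) + len(op_times) * consume_time


def overlapped_finish_time(op_times, consume_time):
    """Subsequent stage handles each result as soon as the result is
    available and the stage itself is free (an idealized schedule)."""
    t = 0
    for ready_at in sorted(op_times):
        t = max(t, ready_at) + consume_time
    return t


if __name__ == "__main__":
    ops = [2, 5, 9]  # completion times of three concurrent operations
    print(wait_for_all_finish_time(ops, 3))  # 18
    print(overlapped_finish_time(ops, 3))    # 12
```

In this example the wait-for-all approach leaves the subsequent stage idle until time 9, whereas the overlapped schedule has already handled two of the three results by then.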
In summary, there is a need for a mechanism in a computer which will allow execution overlap to the greatest extent possible, regardless of whether there is a plurality of concurrent operations in a particular stage. The mechanism should also be simple and inexpensive to implement.