This invention relates generally to executing instructions in a computer system, and more particularly to a method and apparatus for executing instructions in parallel.
Conventional digital computers process instructions of a program in accordance with fetch and execute cycles in which instructions are fetched from memory and executed by the processor. Each fetch or execute cycle may include a number of intermediate steps. For example, the processing of an instruction manipulating data may require the fetching of additional operands from memory, and the storing of data generated as a result of the execution of the instruction. Or, the processing of an instruction manipulating the control flow of the program may examine the value of the operands and alter the program flow accordingly.
In conventional low speed computers, the fetch and execute cycle are performed for one instruction at the time. That is, the fetch for the next instruction does not occur until the previous instruction has been completely processed. The hardware that is used during the fetching steps remains idle until it is needed for the next instruction.
However, the speed at which computers can process instructions and data has increased much faster than the speed at which memory can supply the instructions and data to the computer. This memory latency can be hidden by processing multiple instructions concurrently. For example, the next instruction is fetched from memory prior to storing the result of the previous instruction. This is a simple form of instruction level parallelism or pipelining.
Also, the size of computer chips has increased faster than the speed of the logic circuits on the chip. Therefore, a further performance advantage can be gained by using the larger available space on the silicon to process multiple instructions in parallel.
The total throughput of the computer can be further increased by pre-processing or accelerating the execution of certain instructions, particularly instructions which require a large number of cycles, or which may be subject to long latencies. For example, a load instruction may attempt to read data which are not stored in a local fast memory such as a cache. In this case a subsequent instruction operating on the data can not be executed until the data are read, possibly from a virtual memory device. It would be an advantage to move such instructions up in the execution sequence so that the data become available earlier.
Pre-processing of instructions is sometimes known as speculative execution. Speculative execution is also known as the condition where all of the dependencies, control and data, for the instruction are not resolvable at the time the instruction is generated.
Various techniques are known for providing speculative execution of instructions. For example, speculative execution is sometimes engineered at run-time by dynamically processing instructions out of their original order. Other techniques, reorder the execution sequence of the instructions of a program at the time that a source program is compiled into a run-time program or object code. Some techniques require the use of complex hardware not readily adaptable to simple processors such as RISC computers.
A serious problem with the known techniques of speculative execution is the correct detection and management of exception conditions that occur, particularly when multiple instructions are executed in parallel. Exception conditions are signals used to indicate unexpected conditions, for example, the processing of the instruction could not be completed, or the result generated requires further attention, due to, for example, address faults, arithmetic inconsistencies, and the like.
An exception condition that occurred for a speculative instruction which should not have been executed must be ignored. On the other hand, an exception condition for a speculative instruction that was supposed to be executed must be signaled. Most known techniques execute instructions speculatively with the understanding that the instructions will not generate exception conditions. Enforcing these restrictions generally constrain the degree of parallelism possible. Should an exception condition occur anyway, known techniques generally abort processing altogether.
Recovery from an excepting speculative instruction should be possible. This is particularly true for instructions which are executed speculatively in parallel. For example, if multiple instructions are executed in parallel, only the instruction which encountered the exception condition should be reprocessed. The data generated by the instructions which were successfully executed in parallel should not be effected.
Therefore, there is a need for a mechanism which allows for the execution of instructions in parallel without undue restrictions, especially when exception conditions are encountered.
There is provided a mechanism for executing computer instructions in parallel. The mechanism includes a compiler for generating and grouping instructions into a plurality of sets of instructions. The sets of instructions to be executed in parallel. The compiler assigns a unique identification to the instructions of each of the sets so that the instructions of the individual sets can be distinguished when executed in parallel.
A computer system is provided for executing the instructions in parallel. The computer system has a real state and a speculative state, the computer system executing a particular set of instructions in the speculative state if the instructions of the particular set have dependencies which can not be resolved until the instructions are actually executed. Unresolvable dependencies can be data and control dependencies. The computer system generates speculative data while executing instructions in the speculative state. The speculative data include identification for associating the speculative data with the particular set of instructions.
A mechanism is provided to detect any exception conditions which occur while executing the particular set in the speculative state. If the particular set is subject to an exception condition, the instructions of the set are re-executed to resolve the exception condition, and to incorporate the speculative data in the real state of the computer system. During the re-execution, only the set which encountered the exception condition is re-executed. Any other sets of instructions, which were executed in parallel with the particular set, and which are not subject to an exception condition are not re-executed.