In computer architecture, a data hazard is a problem that can occur in a pipelined processor. Instructions in a pipelined processor are performed in several stages, so that, at any given time, several instructions are being executed and the instructions may not be completed in the desired order. A data hazard occurs when two or more of these simultaneous, and possibly out-of-order, instructions conflict and cause an error.
Data hazards occur when data is modified. A data hazard can occur in the following situations: 1) Read after Write (RAW): An operand is written and then read soon after. Because the first instruction may not have finished writing to the operand, the second instruction may use incorrect data; 2) Write after Read (WAR): An operand is read and then written soon after. Because the write may finish before the read, the read instruction may incorrectly obtain the newly written value; and 3) Write after Write (WAW): Two instructions write to the same operand. The first one issued may finish second and therefore leave the operand with an incorrect data value. The operands involved in data hazards can reside in memory or in a register.
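The three situations above can be illustrated with a minimal sketch that compares the operands read and written by two instructions. The `Instr` record, the `classify_hazards` function, and the register names and mnemonics are hypothetical and are introduced only for illustration; the sketch reports a potential hazard whenever operands overlap and does not model instruction timing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: the operands it reads and writes."""
    name: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def classify_hazards(first, second):
    """Return the potential hazard kinds between `first` (issued earlier)
    and `second` (issued later), based only on operand overlap."""
    hazards = set()
    if first.writes & second.reads:
        hazards.add("RAW")  # second reads what first writes
    if first.reads & second.writes:
        hazards.add("WAR")  # second writes what first reads
    if first.writes & second.writes:
        hazards.add("WAW")  # both write the same operand
    return hazards


# Illustrative instructions (assumed mnemonics and registers).
load = Instr("load r1, [r2]", reads=frozenset({"r2"}), writes=frozenset({"r1"}))
add = Instr("add r3, r1, r4", reads=frozenset({"r1", "r4"}), writes=frozenset({"r3"}))
mov = Instr("mov r4, r5", reads=frozenset({"r5"}), writes=frozenset({"r4"}))
clr = Instr("clr r3", writes=frozenset({"r3"}))

raw_case = classify_hazards(load, add)  # add reads r1, which load writes
war_case = classify_hazards(add, mov)   # mov writes r4, which add reads
waw_case = classify_hazards(add, clr)   # both write r3
```

In this sketch the load/add pair yields a RAW hazard, the add/mov pair a WAR hazard, and the add/clr pair a WAW hazard. Whether a potential hazard produces an error in practice depends on the order in which the instructions actually complete.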
A pipelined processor's instruction set may contain special instructions which have exceptionally high latencies relative to standard instructions. A primary example would be an instruction which fetches data from memory. The problem of data hazards is relatively easy to avoid for low latency instructions, i.e. instructions that can be completed in a small number of clock cycles, because it is then relatively easy to ensure that the instructions within a particular thread are completed in the order in which they were issued. However, when high latency instructions are included in a thread, the problem of data hazards is more significant because there is a greater likelihood that the instructions in a particular thread will not complete in the order in which they were issued.
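The effect of latency on completion order can be shown with a small sketch. It assumes, purely for illustration, that one instruction is issued per clock cycle and that each instruction completes a fixed number of cycles after issue; the function name `completion_order`, the instruction labels, and the latency values are hypothetical.

```python
def completion_order(instrs):
    """Given (name, latency) pairs issued one per cycle in list order,
    return the names sorted by the cycle in which each completes.
    Ties are broken by issue order."""
    finish = [
        (issue_cycle + latency, issue_cycle, name)
        for issue_cycle, (name, latency) in enumerate(instrs)
    ]
    return [name for _, _, name in sorted(finish)]


# Low latency only: every instruction takes one cycle (assumed value).
short_thread = [("add r1", 1), ("sub r2", 1), ("and r3", 1)]
short_order = completion_order(short_thread)

# One high latency memory fetch (assumed 20 cycles) at the head of the thread.
mixed_thread = [("load r1", 20), ("add r3", 1), ("mul r5", 3)]
mixed_order = completion_order(mixed_thread)
```

With uniform low latencies the instructions complete in the issued order. Once the hypothetical 20-cycle load is included, the two later instructions complete before it, which is the condition under which the hazards described above can arise.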
These problems arise in many circumstances, e.g. in 3D graphics processors, in Central Processing Units (CPUs), including dedicated media CPUs in which real time inputs are being received, and in communication with multi-processor systems.
To deal with the high latency instructions, the processor should ideally provide a mechanism to swap out a thread which is waiting for instructions to complete. However, certain requirements also have to be fulfilled.
Firstly, in a multi-threaded processor, many threads might have potential data hazards, i.e. instructions which depend upon preceding instructions being completed before they are processed.
Secondly, each thread might have a large number of long latency instructions, which could all be adjacent in the stream. It must be possible for the return data from long latency instructions to come back in an order different from that in which the instructions were dispatched. Given that a number of long latency instructions could be in progress at one time, processor stalling due to data hazards from long latency instructions must be reduced as much as possible.
Thirdly, where there is a branch in the thread, it has to be possible to skip over instructions in the thread, especially those which might cause a data hazard because they depend upon preceding instructions being completed before they are processed.
Fourthly, it must be possible to read results in an order different from that in which they were written. Fifthly, there should be no penalty for multiple read accesses of destinations. Sixthly, it must also be permitted for the same destination to be written to and re-used as a destination for another long latency instruction.
Finally, it is preferable that no dedicated or mass storage is needed in processing the long latency instructions and potential data hazard instructions. It is also preferable that gate costs are kept to a minimum.
It is an object of the invention to provide a method and apparatus for processing threads which mitigates or overcomes the problem of data hazards in long latency operations.