1. Field of the Invention
The present invention relates to the field of processors within a computer system. More specifically, the present invention relates to the field of performance enhancements for processors that allow out-of-order execution of memory access instructions.
2. Related Art
Processors (e.g., microprocessors) of computer systems execute instructions that access information stored in memory units or peripheral devices of the computer system. A peripheral device is often called an input/output (I/O) device. A read instruction, also called a "load" instruction, causes the processor to request information from a memory unit or from a peripheral device. Conversely, a write instruction, also called a "store" instruction, causes the processor to write information out to a memory unit or to a peripheral device. Whether it be a memory unit or a peripheral device that is being accessed by the processor, a memory location is often used within the instruction to indicate the particular location of the source or the destination of the information. For instance, if a memory unit is being accessed, the memory location in the instruction typically corresponds to a cell of the memory unit. However, if a peripheral device is being accessed, the memory location in the instruction typically corresponds to a register located within the peripheral device. Within the computer system, memory locations that correspond to memory units are said to be within the "memory space" of the address space of the computer system and memory locations that correspond to peripheral devices are said to be located within the "I/O memory space" of the address space of the computer system.
A performance advantage is achieved when a read instruction is not blocked by previous store instructions. Once the store data is placed in the write buffer, the following read instruction can proceed as long as there is no data dependency between the stored data and the data being loaded. As a result, the system bus shows out-of-order transactions and the bus activity does not follow the exact instruction flow. The instruction pipeline is blocked by a pending read but not by a pending write, so flushing the write buffer and advancing the instruction pipeline can proceed concurrently. Thus, allowing read instructions to bypass write instructions is a desirable feature for enhancing the performance of a microprocessor.
FIG. 1A and FIG. 1B illustrate a read bypass where the instructions are not data dependent. An instruction stream 10 is fetched from memory and the instructions are coded in sequence such that instructions on the top (e.g., 10a) are fetched earlier than instructions on the bottom (e.g., 10n). Instruction stream 10 represents the "in-order" sequence of the instructions. If the instructions were processed in-order, read A instruction 10c would be processed before write C instruction 10d which would be processed before write X instruction 10e, etc. The processor places the write instructions (10d and 10e) into a write buffer 20 located within the processor, thereby allowing read G instruction 10f to bypass these write instructions (10d and 10e) while the write instructions (10d and 10e) perform their relatively slow external memory access operation. The result is shown in FIG. 1B where the modified instruction stream 10' is processed by the pipeline 30 of the processor. As shown, read G instruction 10f is allowed to execute after read A instruction 10c, thereby bypassing the earlier write instructions (10d and 10e) which are pending in the write buffer 20. This bypassing can occur only if the read instruction (read G 10f) does not need the same data as involved in the write instructions (10d and 10e) which are pending in the write buffer 20. Otherwise, the instructions are said to be "data dependent." Therefore, allowing a read instruction to bypass write instructions while the write data is in the write buffer 20 is an effective performance enhancement, as long as the data being read has no data dependency on the pending writes.
FIG. 1C illustrates a condition whereby instructions are data dependent. An instruction stream 40 is shown having a write C instruction 40d that is stored in the write buffer 20. However, the subsequently fetched read C instruction 40f uses the same data involved in the write C instruction 40d. In this case, the read C instruction 40f is not allowed to bypass the write C instruction 40d so that data integrity is maintained. The read C instruction 40f can only execute after the contents of write buffer 20 are all processed. In this case, the instruction order as shown in instruction stream 40 is maintained by the processor.
In view of the above, it is important for the processor that performs read bypassing to have a mechanism for detecting data dependencies between a write instruction in the write buffer 20 and a subsequently fetched read instruction. For memory accesses within the "memory space" (e.g., instructions that access data of memory cells of memory units), data dependency is checked in the prior art by comparing the memory addresses of the read and write instructions. If their addresses match, the instructions have a data dependency, so the read instruction is not allowed to bypass the pending write instruction(s). More specifically, in the prior art, the memory addresses of the write instructions pending within the write buffer are compared to the address of a subsequently fetched read instruction that is a bypass candidate. If an address match occurs, the read instruction is not allowed to bypass the pending write instruction(s).
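By way of illustration only, the address-comparison check described above can be sketched as a small behavioral model. The function name `bus_order`, the tuple encoding of instructions, and the simplification that pending writes drain only on a conflict or at the end of the stream are assumptions made for this sketch; they are not taken from any particular processor.

```python
from collections import deque


def bus_order(stream):
    """Return the order in which accesses reach the bus.

    stream: list of (op, addr) tuples in program order, op is "read" or "write".
    Assumption for this sketch: posted writes stay in the buffer until a
    dependent read forces a drain, or until the stream ends.
    """
    write_buffer = deque()  # addresses of pending (posted) writes
    bus = []
    for op, addr in stream:
        if op == "write":
            write_buffer.append(addr)  # post the write and keep going
            continue
        # Bypass candidate: compare the read address with every pending write.
        if any(pending == addr for pending in write_buffer):
            # Address match means data dependency: drain all pending writes first.
            while write_buffer:
                bus.append(("write", write_buffer.popleft()))
        bus.append(("read", addr))
    # Remaining writes complete after the reads that bypassed them.
    while write_buffer:
        bus.append(("write", write_buffer.popleft()))
    return bus


# Stream resembling FIG. 1A: read G bypasses the pending writes to C and X.
independent = bus_order([("read", "A"), ("write", "C"), ("write", "X"), ("read", "G")])
# Stream resembling FIG. 1C: read C depends on write C, so program order is kept.
dependent = bus_order([("write", "C"), ("read", "C")])
```

In this model the first stream reaches the bus with both reads ahead of both writes, while the second stream keeps its program order because the address comparison detects the match.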
Data dependencies with respect to memory accesses in the I/O memory space (e.g., read and write instructions to and from a peripheral device) cannot be detected by merely checking the memory addresses of the instructions. The memory locations associated with read and write instructions with respect to peripheral devices involve writing to and reading from registers within the peripheral device. Writing to some registers of a peripheral device can change the values of other registers within the same device or within other devices. That is to say, registers within a peripheral device can be data dependent without respect to their memory location. Therefore, merely checking memory location coincidence will not detect all data dependencies between instructions that read from and write to a peripheral device. What is needed is a read bypassing mechanism that does not merely compare addresses of read and write instructions to determine the absence of data dependencies between the instructions.
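The limitation described above can be illustrated with a hypothetical peripheral in which writing one register changes the value returned by another. The device `IndexedDevice`, its register addresses, and its stored values are invented for this sketch, and the `run` function is only a minimal model of an address-comparing write buffer.

```python
class IndexedDevice:
    """Hypothetical peripheral: a write to INDEX selects what DATA returns."""

    INDEX = 0x10  # hypothetical register address
    DATA = 0x14   # hypothetical register address

    def __init__(self):
        self.index = 0
        self.bank = [111, 222]  # arbitrary illustrative contents

    def write(self, addr, value):
        if addr == self.INDEX:
            self.index = value

    def read(self, addr):
        if addr == self.DATA:
            return self.bank[self.index]
        if addr == self.INDEX:
            return self.index
        raise ValueError("unknown register")


def run(device, stream, allow_bypass):
    """Execute (op, addr, value) tuples; reads may bypass buffered writes."""
    pending = []  # write buffer entries: (addr, value)
    results = []
    for op, addr, value in stream:
        if op == "write":
            pending.append((addr, value))
            continue
        # Prior-art style check: only an address match counts as a dependency.
        conflict = any(w_addr == addr for w_addr, _ in pending)
        if conflict or not allow_bypass:
            for w_addr, w_value in pending:
                device.write(w_addr, w_value)
            pending.clear()
        results.append(device.read(addr))
    for w_addr, w_value in pending:
        device.write(w_addr, w_value)
    return results


program = [("write", IndexedDevice.INDEX, 1), ("read", IndexedDevice.DATA, None)]
in_order = run(IndexedDevice(), program, allow_bypass=False)
bypassed = run(IndexedDevice(), program, allow_bypass=True)
```

Because the INDEX and DATA addresses differ, the address comparison reports no conflict and the read bypasses the write. The bypassed run therefore reads the value selected by the old index, whereas the in-order run reads the value selected by the new one.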
One prior art mechanism maintains data integrity with respect to read and write addresses of the I/O memory space. This solution provides a special instruction that is used by a programmer which causes the write buffer 20 to flush out all of its contents (e.g., process all pending write instructions stored in the write buffer 20) before a subsequently fetched read instruction is executed. Some processors refer to this special instruction as the "sync" or synchronizing instruction. With respect to the PowerPC™ processor, this instruction is the "EIEIO" instruction. Basically, when this instruction is received, the read instruction is halted until each of the pending write instructions within the write buffer 20 is fully processed. Although this solution maintains data integrity, it requires extra processor cycles to execute the special instruction. These extra cycles reduce the performance gain that is desired when implementing read bypassing in the first place. What is needed is a system and method for responding to data dependencies within the I/O memory space that does not consume extra processor cycles to implement. Another disadvantage of the above solution is that read bypassing is effectively eliminated in all read and write instructions that involve the I/O memory space because the write buffer 20 is forcibly flushed. What is needed is a system and method for read bypassing that does not eliminate all read bypassing involving the I/O memory space.
Another disadvantage of the above solution is that the programmer is required to insert the special instruction which causes write buffer flushing and this method is subject to human error. Should the programmer forget to insert the special instruction, data integrity with respect to memory transactions within the I/O memory space can be severely compromised. This can lead to a system failure. What is needed is a system and method for read bypassing that responds to data dependencies within the I/O memory space and that is not dependent on a special user inserted instruction.
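The effect of the programmer-inserted flush instruction discussed above can be sketched as follows. The `"sync"` opcode name, the function `bus_order_with_sync`, and the register names are assumptions for illustration; the sketch omits address comparison to isolate the barrier behavior and does not model the exact semantics of any real processor's synchronizing instruction.

```python
def bus_order_with_sync(stream):
    """Return (bus order, number of instructions issued).

    stream: list of (op, addr) tuples; op is "read", "write", or "sync".
    Assumption for this sketch: reads always bypass buffered writes unless
    a "sync" has flushed the buffer first.
    """
    pending = []  # addresses of buffered writes
    bus = []
    issued = 0
    for op, addr in stream:
        issued += 1  # every instruction, including "sync", occupies an issue slot
        if op == "write":
            pending.append(addr)
        elif op == "sync":
            # Flush: every buffered write reaches the bus before anything later.
            bus.extend(("write", a) for a in pending)
            pending.clear()
        else:
            bus.append(("read", addr))
    bus.extend(("write", a) for a in pending)
    return bus, issued


with_sync = bus_order_with_sync(
    [("write", "INDEX"), ("sync", None), ("read", "DATA")]
)
without_sync = bus_order_with_sync(
    [("write", "INDEX"), ("read", "DATA")]
)
```

With the barrier, the write reaches the bus before the read at the cost of one additional issued instruction and no bypass. Without it, which models a programmer who forgot to insert the instruction, the read reaches the bus ahead of the write it may depend on.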
Accordingly, the present invention provides a read bypassing mechanism and method for a processor that does not merely compare addresses of read and write instructions to determine data dependencies. The present invention further provides a read bypass system that responds to data dependencies between instructions that access the I/O memory space and that does not consume extra processor cycles to implement. The present invention additionally provides a system and method for read bypassing that does not eliminate all read bypassing for instructions that involve the I/O memory space. The present invention also provides a read bypass system that responds to data dependencies between instructions that access the I/O memory space and that is not dependent on a special user-inserted instruction. These and other advantages of the present invention not specifically mentioned above will become clear within discussions of the present invention presented herein.