The present invention relates to an apparatus and method for preventing data corruption in a processing unit that speculatively issues load requests. In particular, the present invention relates to an apparatus and method for preventing register data corruption resulting from the speculative issue of load requests without the correct addresses, followed by replay of such load requests.
FIG. 1 illustrates, in block diagram form, the architecture of Processor 20. Processor 20 includes Instruction Fetch Unit 22, Scheduling Unit 24, Execution Pipes 26, Register File 28, Cache Controller 32, and Cache 36. Processor 20 communicates with Main Memory 38 as necessary to obtain information not available in Cache 36. Instruction Fetch Unit 22 fetches instructions from instruction memory (not shown) and couples them to Scheduling Unit 24. Scheduling Unit 24 assigns an identifier (ID) to each instruction prior to coupling it to Execution Pipes 26, where each instruction is executed. Cache Controller 32 tracks outstanding access requests via Request Queue 34. In response to a cache hit, data may be loaded into Register File 28, which includes a number of registers.
Out-of-order (OOO) program execution enables Processor 20 to achieve execution speeds that would not be possible if instructions were executed in program order. The order in which Scheduling Unit 24 couples the instructions to Execution Pipes 26 is determined by the availability of necessary resources of Execution Pipes 26 for a particular instruction, rather than by program order. Issuing load requests requires special handling by Scheduling Unit 24 because of the possibility of address dependence between two load requests. A load request shall be referred to as a "source" if it provides the address to be used in another load request. The load request taking its address from the source shall be referred to as a "consumer". To reduce execution time, Scheduling Unit 24 assumes no register dependencies and speculatively issues a consumer load request to Execution Pipes 26 after issuance of, but prior to completion of, the associated source load request. This practice is referred to as speculative issuance. Speculative issuance causes no problem so long as both a source and its consumer load request hit in Cache 36. When that occurs, the source load request will complete execution prior to its consumer load request, thereby providing it with the correct address. However, if a source load request misses in Cache 36, then its associated speculatively issued consumer load request may complete execution first with an incorrect, or "bad", address for the data to be loaded into a register, thus loading the wrong data into the register.
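The address dependence between a source and its consumer can be illustrated with a minimal sketch. The numeric addresses, the memory contents, and the helper name `consumer_address` are hypothetical and chosen only for illustration; the sketch models registers and memory as dictionaries and is not a description of the actual hardware.

```python
# Hypothetical memory image: address 0x10 holds a pointer (0x30) to the
# data the consumer really wants; 0x20 is a stale pointer left in R1.
MEMORY = {0x10: 0x30, 0x20: 0b0000, 0x30: 0b1111}

def consumer_address(registers):
    """Return the address a consumer load would use if issued right now.

    The consumer takes its address from R1, the register written by the
    source load.
    """
    return registers["R1"]

# Case 1: the source hits in the cache, so R1 is updated before the
# consumer reads it and the consumer gets the correct address.
regs_hit = {"R1": 0x20, "R2": None}
regs_hit["R1"] = MEMORY[0x10]           # source load completes first
addr_hit = consumer_address(regs_hit)

# Case 2: the source misses, and the consumer is speculatively issued
# while R1 still holds the stale value, so it carries a bad address.
regs_miss = {"R1": 0x20, "R2": None}
addr_miss = consumer_address(regs_miss)  # read before the source completes
regs_miss["R1"] = MEMORY[0x10]           # source completes too late

print(hex(addr_hit), hex(addr_miss))
```

In the second case the consumer has already captured the stale address by the time the source completes, which is the situation that replay is meant to correct.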
Scheduling Unit 24 attempts to correct this problem by "replaying" the consumer load with the correct address after completion of its associated source load. (In the interests of brevity, in the following discussion a speculatively issued consumer load with a bad address will be referred to as a "bad consumer" and the replay of the bad consumer will be referred to as a "replay consumer".) A replay consumer resembles its associated bad consumer, including the same ID and register; however, the replay consumer includes a different, correct, address. Even with replay of bad consumer loads, corruption of register data is possible, because there is no guarantee that a replay consumer will complete execution after the bad consumer that gave rise to it.
FIGS. 2A, 2B and 2C illustrate how register data corruption can occur when a bad consumer load and its associated replay consumer both miss in Cache 36. In these figures and the following discussion, the source load request is identified as "I1". While both the consumer load request and its replay are assigned the ID "I2", the two can be distinguished by their addresses. Consumer load request I2 is initially issued with a bad address of B and it is subsequently determined that the correct address should have been C. Thus, in these figures the bad consumer is denoted as "I2 Ld [B], R2" and the replay consumer is denoted "I2 Ld [C], R2". At a time τ1 both I1 and I2 have missed in Cache 36. FIG. 2A illustrates that at time τ1 both the source and bad consumer load requests, I1 and I2, are pending in Request Queue 34. Sometime after τ1 and prior to τ2, source I1 completes, placing the data stored at address A of Main Memory 38 into register R1 of Register File 28. FIG. 2B reflects this, showing a value of "1010b" in R1 at τ2. Subsequently, Scheduling Unit 24 realizes that consumer load I2 was issued with a bad address and so issues a replay request with the correct address; i.e., Scheduling Unit 24 issues "I2 Ld [C], R2". FIG. 2B reflects that at time τ2 the replay consumer has also been forwarded to Request Queue 34. At this time the bad consumer still has not completed. At τ3 the replay consumer load completes, writing the value for address C of Main Memory 38, 1111b, into register R2 of Register File 28, as illustrated in FIG. 2C. This is the correct data. At τ4 the bad consumer completes, corrupting the data in register R2 of Register File 28 by overwriting it with the value for address B of Main Memory 38, 0000b, as illustrated in FIG. 2C.
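The sequence of completions described above can be replayed in a small sketch. The memory values (1010b at A, 0000b at B, 1111b at C) come from the discussion; the dictionary model of Main Memory 38 and Register File 28, and the helper name `run`, are assumptions made only for illustration.

```python
# Values for addresses A, B and C as given in the discussion of FIGS. 2A-2C.
MAIN_MEMORY = {"A": 0b1010, "B": 0b0000, "C": 0b1111}

def run(completions):
    """Apply load completions in the order given.

    Each completion is (load ID, address, destination register). With no
    protection, every completion writes the register file unconditionally.
    """
    register_file = {"R1": None, "R2": None}
    for load_id, address, dest in completions:
        register_file[dest] = MAIN_MEMORY[address]
    return register_file

timeline = [
    ("I1", "A", "R1"),  # source completes between tau1 and tau2
    ("I2", "C", "R2"),  # replay consumer completes at tau3 (correct data)
    ("I2", "B", "R2"),  # bad consumer completes at tau4 (overwrites R2)
]

final = run(timeline)
print(format(final["R2"], "04b"))  # prints 0000, the corrupted value
```

Dropping the last completion from `timeline` leaves 1111b in R2, which shows that the corruption comes solely from the bad consumer completing after its replay.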
While the preceding discussion of register data corruption was based upon a processor with a single-level cache, the problem also occurs under similar circumstances in processors with multiple levels of cache.
Thus, a need exists for a means of preventing register data corruption arising from completion of bad consumer load requests.
Briefly described, the circuitry of the present invention includes a request queue and a bad address handling circuit. The request queue includes an entry for each outstanding load requesting access to a cache. Each request queue entry includes a valid bit, an issue bit and a flush bit. The state of the valid bit indicates whether or not there is a valid request associated with the entry. The issue bit indicates whether the load request has been issued to the cache, and the flush bit indicates whether the data received in response to the request should be forwarded to the register file. The bad address handling circuit responds to a replay load request by manipulating the state of the valid or flush bit of the relevant request queue entry to prevent completion of bad consumer load requests. The bad address handling circuit includes a validation circuit and a flush circuit. The validation circuit alters the state of the valid bit of the relevant request queue entry in response to the replay load request, based upon the state of the issue bit for that request queue entry. If the issue bit indicates that the bad consumer load request has not yet been issued to the cache, then the validation circuit alters the state of the associated valid bit to prevent the issuance of that load request to the cache. On the other hand, if the bad consumer has already been issued to the cache, then the flush circuit responds by altering the state of the flush bit to prevent the data received in response to the bad consumer from being loaded into the register file. Thus, the circuitry of the present invention prevents completion of a speculatively issued consumer load request with a bad address for which a replay has been initiated. By doing so, the circuitry of the present invention prevents corruption of register data that can occur when the bad consumer load request is allowed to complete after its associated replay.
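The behavior of the valid, issue and flush bits can be approximated in software as a minimal sketch. The class and method names (`Entry`, `RequestQueue`, `replay`, `issue`, `complete`), the bit polarities, and the matching of entries by load ID are all assumptions made for illustration; the circuitry itself is hardware and may differ in every one of these details.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    load_id: str
    address: str
    dest: str
    valid: bool = True    # a valid request is associated with this entry
    issued: bool = False  # the request has been issued to the cache
    flush: bool = False   # assumed polarity: set means discard returned data

class RequestQueue:
    def __init__(self):
        self.entries = []

    def add(self, load_id, address, dest):
        entry = Entry(load_id, address, dest)
        self.entries.append(entry)
        return entry

    def replay(self, load_id, address, dest):
        """Neutralize the pending bad consumer, then enqueue its replay."""
        for e in self.entries:
            if e.load_id == load_id and e.valid and not e.flush:
                if not e.issued:
                    e.valid = False  # validation circuit: never issue it
                else:
                    e.flush = True   # flush circuit: drop its data
        return self.add(load_id, address, dest)

    def issue(self, entry):
        """Issue the request to the cache only if the entry is still valid."""
        if entry.valid:
            entry.issued = True
        return entry.issued

    def complete(self, entry, memory, register_file):
        """Forward returned data to the register file unless flushed."""
        if entry.valid and entry.issued and not entry.flush:
            register_file[entry.dest] = memory[entry.address]
        entry.valid = False  # the entry is retired either way

# Usage: the bad consumer was already issued when the replay arrives.
memory = {"A": 0b1010, "B": 0b0000, "C": 0b1111}
regs = {"R1": None, "R2": None}
q = RequestQueue()
bad = q.add("I2", "B", "R2")
q.issue(bad)
good = q.replay("I2", "C", "R2")
q.issue(good)
q.complete(good, memory, regs)  # replay consumer writes 1111b
q.complete(bad, memory, regs)   # bad consumer returns late, data dropped
print(format(regs["R2"], "04b"))  # prints 1111
```

In this sketch a bad consumer that has not yet been issued is simply invalidated, so it never reaches the cache, while one that is already in flight is allowed to return but its data is discarded. Either way R2 keeps the value written by the replay consumer.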