This invention relates in general to re-ordering or advancing load operations in a computer program and in particular to a mechanism for determining whether an advance load has been invalidated by a subsequent store operation.
It is generally desirable to reorder selected instructions in a computer program to improve program execution efficiency. One form of such reordering is moving, or speculating, instructions which load data from certain memory locations, together with instructions which use the loaded data, relative to store instructions. A hazard associated with such reordering may exist where a store instruction, which succeeds the speculated load instructions and the instructions using loaded data (“use” instructions), accesses the same memory location as one or more speculated load instructions. In this case, the speculation will generally have had the effect of placing incorrect data into registers accessed by the speculated instructions. Where such a conflict occurs, execution of the load instruction and any “use” instructions will be invalidated and undone. Recovery will generally be executed, which may include canceling, re-fetching, and re-executing the instructions rendered invalid by the conflict with the store operation.
One prior art approach to responding to such a conflict arising from a speculation is to allow the store instruction which conflicts with the speculated load instruction to become the oldest instruction in a pipeline and retire, while instructions after the store are canceled, re-fetched, and re-executed once the store instruction has been committed to a cache or memory hierarchy.
One problem arising in the prior art is that there is generally no software control over the storing, loading, and reordering operations at run-time. Another problem is that the use of hardware imposes limitations on the instruction window size, thereby limiting the available code optimizations. Furthermore, there is generally a large recovery penalty in the prior art, where the extent of such penalty generally depends upon the way in which the hardware implements the optimization process.
Therefore, it is a problem in the art that hardware optimization implementations must generally perform optimizations within a limited instruction window size.
It is a further problem in the art that a large recovery penalty results in a hardware controlled optimization process.
It is a still further problem in the art that there is generally no software control over the storing, loading, and re-ordering operations at run-time.
These and other objects, features and technical advantages are achieved by a system and method which splits original load instructions into advanced load instructions and check instructions. The advanced load instructions are preferably placed in a more advanced location in a code sequence than corresponding original load instructions and operate to load data. Each check instruction preferably operates to check the validity of advanced load instructions employing a particular register, identifies the most recent advanced load instruction employing that register, and validates the identified most recent advanced load instruction by comparing it to store instruction address information pending in an instruction queue or pipeline. Where no match is found with store instruction address information, the speculation is preferably considered to have succeeded, thereby indicating that the placement of the advanced load instruction did not conflict with any store instruction and that the speculation of this advanced load instruction was therefore successful. Generally, upon splitting an original load instruction, as mentioned above, an advanced load instruction corresponding to the original load instruction is placed before a selected store instruction, and a check instruction corresponding to the original load instruction is kept in the location of the original load instruction in an optimized code sequence.
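The splitting step described above can be illustrated with a minimal sketch. It assumes a toy instruction representation (tuples of strings) and hypothetical opcode names `adv_load` and `check`; neither the representation nor the function name `split_load` comes from the specification, and a real compiler pass would also have to choose which loads and stores to target.

```python
from typing import List, Tuple

# Toy instruction format (an assumption for illustration only):
# ("load", dest_reg, address), ("store", src_reg, address), ("add", dest, src), ...
Instr = Tuple[str, ...]


def split_load(code: List[Instr], load_idx: int, store_idx: int) -> List[Instr]:
    """Split the load at load_idx into an advanced load placed before the
    store at store_idx and a check left at the original load position."""
    if not store_idx < load_idx:
        raise ValueError("the selected store must precede the original load")
    op, reg, addr = code[load_idx]
    if op != "load":
        raise ValueError("instruction at load_idx is not a load")

    out = list(code)
    # The check instruction stays where the original load was.
    out[load_idx] = ("check", reg, addr)
    # The advanced load is placed ahead of the selected store.
    out.insert(store_idx, ("adv_load", reg, addr))
    return out


original = [
    ("store", "r5", "X"),
    ("load", "r1", "Y"),
    ("add", "r2", "r1"),
]
optimized = split_load(original, load_idx=1, store_idx=0)
```

In this sketch the instructions that use the loaded data remain after the check; moving them ahead of the store as well would require the recovery-code response rather than a simple reload.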
Identification of the most recent advanced load instruction and validation of this advanced load instruction against store address information are preferably accomplished independently and in parallel, thereby preferably improving overall cycle time and effecting transmission of conflict information (the “hit” or “miss” status of a comparison with store address information) to an exception handling unit early enough to initiate recovery.
Preferably, one or more tables are employed for storing information associated with advanced load instructions. The tables employed for this purpose are preferably fully associative, thereby enabling comparisons of one datum such as a store instruction memory address with any data entry stored in the table. Fully associative tables also preferably enable register numbers and memory addresses to be stored anywhere in the table, thereby obviating a need to index the table according to register number. In a preferred embodiment, data preserved in association with an advanced load instruction may include the register number to which an instruction loaded data, the memory address from which the data was loaded, and a log of the validity status of the advanced load instruction. Such information may be kept in a single table, or stored in corresponding locations in a plurality of separate tables.
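As a rough software model of such a table, the sketch below keeps one entry per advanced load, holding a register number, a memory address, and a validity flag, and compares an incoming store address against every entry. The class and method names, the fixed capacity, and the oldest-first replacement policy are assumptions made for illustration; hardware would perform the comparisons in parallel rather than in a loop.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    reg: int            # register number the advanced load wrote to
    addr: int           # memory address the data was loaded from
    valid: bool = True  # cleared when a later store hits the same address


class AdvancedLoadTable:
    """Fully associative: any entry may occupy any slot, and lookups
    compare against every entry instead of indexing by register number."""

    def __init__(self, capacity: int = 8) -> None:
        self.capacity = capacity
        self.entries: List[Entry] = []

    def record_advanced_load(self, reg: int, addr: int) -> None:
        if len(self.entries) == self.capacity:
            self.entries.pop(0)  # assumed policy: evict the oldest entry
        self.entries.append(Entry(reg, addr))

    def observe_store(self, addr: int) -> None:
        # Compare the store address with every stored load address.
        for entry in self.entries:
            if entry.addr == addr:
                entry.valid = False

    def most_recent(self, reg: int) -> Optional[Entry]:
        # Search newest-first so the most recent advanced load wins.
        for entry in reversed(self.entries):
            if entry.reg == reg:
                return entry
        return None


table = AdvancedLoadTable()
table.record_advanced_load(reg=1, addr=0x100)
table.record_advanced_load(reg=2, addr=0x200)
table.observe_store(0x200)  # conflicts with the load into register 2
```

After the store to `0x200`, the entry for register 1 is still marked valid while the entry for register 2 is not.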
Generally, two results are possible when an advanced load instruction is checked for conflict with store memory addresses. Specifically, the check may be a “hit” or a “miss.” Herein, a “hit” refers to a case where the advanced load instruction does not conflict with known store instruction addresses, and the advanced or re-ordered load instruction may remain in its modified location without causing any adverse side effects for overall program execution. Herein, a “miss” refers to a case where the memory address associated with an advanced load instruction does conflict with a store instruction memory address. A “miss” generally triggers one of two possible responses. A first response preferably includes issuing a reload of the data as part of the check operation. A second response preferably causes a re-steer to recovery code which recovery code implements a reload of the data and re-execution of instructions which employed loaded data. Herein, the term speculated load instruction generally refers to an advanced load instruction.
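The two outcomes of a check, and the two possible responses to a miss, can be sketched as follows. This is an illustrative model only: the function `run_check`, its parameters, the dictionary-based register file and memory, and the returned status strings are all assumptions, and the terms follow the usage defined above, in which a hit means no conflict.

```python
from typing import Callable, Dict, Optional


def run_check(
    entry_valid: bool,
    reg: str,
    addr: int,
    regs: Dict[str, int],
    memory: Dict[int, int],
    recovery: Optional[Callable[[], None]] = None,
) -> str:
    """Model a check instruction for one advanced load."""
    if entry_valid:
        # Hit: no store conflicted, so the advanced load result stands.
        return "hit"
    if recovery is None:
        # Miss, first response: reload the data as part of the check.
        regs[reg] = memory[addr]
        return "miss-reload"
    # Miss, second response: re-steer to recovery code, which reloads
    # the data and re-executes the instructions that used it.
    recovery()
    return "miss-recovery"


memory = {0x10: 7}
regs = {"r1": 3, "r2": 4}  # r1 holds stale data, r2 was computed from it


def recovery_code() -> None:
    regs["r1"] = memory[0x10]  # reload
    regs["r2"] = regs["r1"] + 1  # re-execute the dependent "use" instruction


status_hit = run_check(True, "r1", 0x10, regs, memory)
status_recovery = run_check(False, "r1", 0x10, regs, memory, recovery_code)
```

The reload response suffices when only the load itself was advanced; the recovery-code response is needed when instructions using the loaded data were also moved ahead of the store.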
Therefore, it is an advantage of a preferred embodiment of the present invention that table storage is fully associative, thereby enabling flexible placement of entries in the table.
It is a further advantage of a preferred embodiment of the present invention that software control of the optimization process enables deployment of an instruction window of unlimited size.
It is a still further advantage of a preferred embodiment of the present invention that software control of the recovery process conserves execution time.
It is a still further advantage of a preferred embodiment of the present invention that parallelizing the comparison processes for addresses and register numbers (or register identifications) enables communication of a hit/miss status for a check operation on an advanced load instruction to an exception handling unit early enough for a processor to contain any errors arising from a use of invalid loaded data.
The foregoing has outlined rather broadly the features and technical advantages of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention. It should be appreciated by those skilled in the art that the conception and specific embodiment disclosed may be readily utilized as a basis for modifying or designing other structures for carrying out the same purposes of the present invention. It should also be realized by those skilled in the art that such equivalent constructions do not depart from the spirit and scope of the invention as set forth in the appended claims.