This invention relates generally to a system and method for updating a fully associative array and more particularly to updating a fully associative array having a plurality of data banks and a plurality of ports connected thereto.
It is generally desirable to reorder selected instructions in a computer program to improve program execution efficiency. One form of such reordering is that of moving or speculating instructions which load data from certain memory locations as well as instructions which may use the data received in the load instructions with respect to store instructions. A hazard associated with such reordering may exist where a store instruction, which succeeds the speculated load instructions and instructions using loaded data (xe2x80x9cusexe2x80x9d instructions), accesses the same memory location as one or more speculated load instructions. In this case, the speculation will generally have had the effect placing incorrect data into to registers accessed by the speculated instructions. Where such a conflict occurs, execution of the load instruction and any xe2x80x9cusexe2x80x9d instructions (instructions using the loaded data) will be invalidated and undone. Recovery will generally be executed which may include canceling, re-fetching, and re-executing the instructions rendered invalid by the conflict with the store operation.
One prior art approach to responding to such a conflict arising from a speculation is to allow the store instruction which conflicts with the speculated load instruction to become the oldest instruction in a pipeline and retire, while instructions after the store are canceled, re-fetched, and re-executed once the store instruction has been committed to a cache or memory hierarchy.
One problem arising in the prior art is that there is generally no software control over the storing, loading, and reordering operations at run-time. Another problem is that the use of hardware imposes limitations on the instruction window size, thereby limiting the available code optimizations. Furthermore, there is a generally a large recovery penalty in the prior art, where the extent of such penalty generally depends upon the way in which the hardware implements the optimization process.
Therefore, it is a problem in the art that hardware optimization implementations must generally perform optimizations within a limited instruction window size.
It is a further problem in the art that a large recovery penalty results in a hardware controlled optimization process.
It is a still further problem in the art that there is there is generally no software control over the storing, loading, and re-ordering operations at run-time.
These and other objects, features and technical advantages are achieved by a system and method which splits original load instructions into advanced load instructions and check instructions. The advanced load instructions are preferably placed in a more advanced location in a code sequence than corresponding original load instructions and operate to load data. Each check instruction preferably operates to check the validity of advanced load instructions employing a particular register, identifies the most recent advanced load instruction employing that register, and validates the identified most recent advanced load instruction by comparing it to store instruction address information pending in an instruction queue or pipeline. Where no match is found with store instruction address information, the speculation is preferably considered to have succeeded, thereby indicating that the placement of the advanced load instruction did not conflict with any store instruction and that the speculation of this advanced load instruction was therefore successful. Generally, upon splitting an original load instruction, as mentioned above, an advanced load instruction corresponding to the original load instruction is placed before a selected store instruction, and a check instruction corresponding to the original load instruction is kept in the location of the original load instruction in an optimized code sequence.
Identification of the most recent advanced load instruction and validation of this advanced load instruction against store address information are preferably accomplished independently and in parallel, thereby preferably improving overall cycle time and effecting transmission of conflict information (the xe2x80x9chitxe2x80x9d or xe2x80x9cmissxe2x80x9d status of a comparison with store address information) to an exception handling unit early enough to initiate recovery.
Preferably, one or more tables are employed for storing information associated with advanced load instructions. The tables employed for this purpose are preferably fully associative, thereby enabling comparisons of one datum such as a store instruction memory address with any data entry stored in the table. Fully associative tables also preferably enable register numbers and memory addresses to be stored anywhere in the table, thereby obviating a need to index the table according to register number. In a preferred embodiment, data preserved in association with an advanced load instruction may include the register number to which an instruction loaded data, the memory address from which the data was loaded, and a log of the validity status of the advanced load instruction. Such information may be kept in a single table, or stored in corresponding locations in a plurality of separate tables.
In a preferred embodiment, a fully associative table (or array) is deployed which includes a plurality of data banks and a plurality of ports able to write to the plurality of data banks, or xe2x80x9cbanks.xe2x80x9d The inventive mechanism thereby preferably enables simultaneous updates of the table by employing separate ports writing to separate banks in parallel. Such parallel operation preferably operates to enable multiple table updates to be effected during a single cycle.
In a preferred embodiment, for each prospective entry at a port, the inventive mechanism employs a set of factors to determine which bank and which entry location within a bank the prospective entry will be written to. Regarding bank selection, the factors generally include whether or not a match exists between the prospective entry and an existing table entry, a default bank connection for the port at which the prospective entry resides, and the operation of randomization logic to substantially equalize data storage among the plurality of banks. Regarding entry location selection, the factors generally include: whether or not a match exists between the prospective entry, a table entry location of a next invalid entry, and a table entry location of a next sequential entry within one bank (in the case where all entries in a bank are valid).
It is an advantage of a preferred embodiment of the present invention that a fully associative table may be updated in a single cycle.
It is a further advantage of a preferred embodiment of the present invention that a plurality of linked fully associative tables may efficiently interact to rapidly update an overall system status.
It is a still further advantage of a preferred embodiment the present invention that more than one update to a fully associative table or array may be performed simultaneously.
The foregoing has outlined rather broadly the features and technical advantages of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention. It should be appreciated by those skilled in the art that the conception and specific embodiment disclosed may be readily utilized as a basis for modifying or designing other structures for carrying out the same purposes of the present invention. It should also be realized by those skilled in the art that such equivalent constructions do not depart from the spirit and scope of the invention as set forth in the appended claims.