1. Field of the Invention
This invention relates generally to data processing systems and, more particularly, to systems and methods for executing load and store instructions.
2. Description of the Related Art
Many modern processors (e.g., microprocessors) include load/store units for executing load instructions and store instructions. In general, a “load instruction” copies data from a specified location in a main memory to a register in a processor, and a “store instruction” copies data from a register in a processor to a specified main memory location.
In order to boost processor performance, the load/store units of many modern processors are adapted to support out-of-order execution of load and store instructions. A memory consistency model typically determines the order in which load and store instructions specifying the same memory locations must be carried out to achieve program correctness. If the ordering of load and store instruction executions is relaxed beyond what the model permits, program correctness problems can occur.
For example, if two load instructions to the same address are executed out of order (i.e., the younger load executes before the older load), and the value of the data at that address is changed between the executions of the two load instructions (e.g., by another processor), the later (i.e., younger) load will obtain an earlier (i.e., old) value, and the earlier (i.e., older) load will obtain a later (i.e., new) value. This situation is termed a “load-load order violation” or a “load-hit-load hazard.” The requirement that, if a younger load instruction obtains old data, an older load instruction to the same address must not obtain new data, is termed “sequential load consistency.” (See, for example, “Power4 System Microarchitecture” by J. M. Tendler et al., IBM Journal of Research and Development, Volume 46, Number 1, January 2002, pp. 5-25.) Some modern processors have dedicated hardware to avoid load-load order violations, thereby achieving sequential load consistency and helping to ensure program correctness.
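The load-load order violation described above can be illustrated with a small software model. The following is a minimal sketch only, not a description of any particular processor's hardware. The names `LoadResult`, `run_loads`, and `violates_sequential_load_consistency` are hypothetical and introduced purely for illustration, and the "version" counter is an assumed stand-in for the successive values written to the address by another processor.

```python
from dataclasses import dataclass


@dataclass
class LoadResult:
    age: int      # program-order position (assumed encoding): smaller means older
    version: int  # which write to the address this load observed (0 = oldest data)


def run_loads(execution_order, store_after_step):
    """Model loads to a single address executing in a given order.

    execution_order: program-order ages in the order they execute,
        e.g. [1, 0] means the younger load (age 1) executes first.
    store_after_step: a store from another processor changes the data
        after this many loads have executed.
    Returns the load results sorted into program order.
    """
    version = 0
    results = []
    for step, age in enumerate(execution_order):
        if step == store_after_step:
            version += 1  # another processor writes new data to the address
        results.append(LoadResult(age=age, version=version))
    return sorted(results, key=lambda r: r.age)


def violates_sequential_load_consistency(results):
    """Return True if a younger load obtained older data than an older load."""
    for older in results:
        for younger in results:
            if older.age < younger.age and younger.version < older.version:
                return True
    return False


if __name__ == "__main__":
    in_order = run_loads([0, 1], store_after_step=1)
    out_of_order = run_loads([1, 0], store_after_step=1)
    print(violates_sequential_load_consistency(in_order))
    print(violates_sequential_load_consistency(out_of_order))
```

In this sketch, executing the loads in program order with an intervening store is harmless, because the older load sees old data and the younger load sees new data. Reversing the execution order with the same intervening store yields the hazard: the younger load holds old data while the older load holds new data.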
A problem arises in that such dedicated hardware is typically complex and adds time delays. In view of the push toward higher processor clock frequencies and performance levels, it would be desirable to have a relatively simple method for executing load instructions that avoids load-load order violations to achieve sequential load consistency and can be implemented using a relatively small amount of additional hardware.