1. Field of the Invention
This invention relates generally to data processing systems and, more particularly, to systems and methods for executing load and store instructions.
2. Description of the Related Art
Many modern processors (e.g., microprocessors) include load/store units for executing load instructions and store instructions. In general, a “load instruction” copies data from a specified location in a main memory to a register in a processor, and a “store instruction” copies data from a register in a processor to a specified main memory location.
To boost processor performance, the load/store units of many modern processors are adapted to support out-of-order execution of load and store instructions. A memory consistency model typically determines the order in which memory operations (e.g., load and store instructions) specifying the same memory locations must be carried out to achieve program correctness. If the ordering of load and store instruction execution is relaxed, program correctness problems can occur.
For example, if two load instructions to the same address are executed out of order, and the value of the data at that address is changed between the executions of the two load instructions (e.g., by another processor), the later (i.e., younger) load will obtain an earlier (i.e., old) value, and the earlier (i.e., older) load will obtain a later (i.e., new) value. This situation is termed a “load-load order violation” or a “load-hit-load hazard.” The requirement that if a younger load instruction obtains old data, an older load instruction to the same address must not obtain new data is termed “sequential load consistency.” In addition, if a later (i.e., younger) load instruction is executed before an earlier (i.e., older) store instruction to the same address (i.e., memory location) is completed, the load instruction will obtain an earlier (i.e., old) value. This situation is termed a “load-store order violation” or a “load-hit-store hazard.” (See, for example, “Power4 System Microarchitecture” by J. M. Tendler et al., IBM Journal of Research and Development, Volume 46, Number 1, January 2002, pp. 5-25.) Some modern processors have dedicated hardware to avoid load-load and load-store order violations, thereby helping to ensure program correctness.
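The two hazards described above can be illustrated with a minimal software sketch. This is not a description of any processor's hardware; the address 0x100, the values "old" and "new", and the function names are assumptions chosen only for illustration, and each memory operation is modeled as taking effect instantly.

```python
ADDRESS = 0x100  # hypothetical memory location used only for illustration


def run_loads(out_of_order):
    """Execute two loads to ADDRESS with a remote store between them.

    Program order is: older load, then younger load.
    Returns the value each load obtained.
    """
    memory = {ADDRESS: "old"}
    order = ["younger", "older"] if out_of_order else ["older", "younger"]
    results = {}
    results[order[0]] = memory[ADDRESS]  # first load to execute
    memory[ADDRESS] = "new"              # store by another processor
    results[order[1]] = memory[ADDRESS]  # second load to execute
    return results


def violates_sequential_load_consistency(results):
    """True if the younger load saw old data while the older load saw new data."""
    return results["younger"] == "old" and results["older"] == "new"


def run_load_hit_store(load_first):
    """Execute an older store and a younger load to ADDRESS.

    Returns the value the load obtained.
    """
    memory = {ADDRESS: "old"}
    if load_first:
        loaded = memory[ADDRESS]  # younger load runs before the store completes
        memory[ADDRESS] = "new"
    else:
        memory[ADDRESS] = "new"   # older store completes first
        loaded = memory[ADDRESS]
    return loaded
```

In this sketch, run_loads(True) reproduces the load-load order violation, and run_load_hit_store(True) reproduces the load-store order violation in which the load obtains the old value.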
Some memory consistency models, including the “weak ordering” memory consistency model, relax ordering constraints involving memory operations specifying the same memory locations. In particular, the weak ordering memory consistency model classifies memory operations into two categories: “data operations” and “synchronization operations.” A programmer typically divides a computer program into sections of code, each including data operations that can be reordered or overlapped without affecting program correctness, with the sections separated by synchronization operations. A synchronization operation is typically not issued until all previous data operations are complete, and subsequent data operations are typically not issued until the synchronization operation is complete.
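The issue rule for weak ordering can be sketched as a simple legality check. The sketch below assumes a simplified model in which each operation completes before the next one is carried out, so overlap is not modeled; the labels "data" and "sync" and the function names are hypothetical.

```python
def phase_ids(program):
    """Assign each operation in program order a phase number.

    Data operations between two synchronization operations share a phase;
    each synchronization operation gets a phase of its own.
    """
    ids = []
    phase = 0
    for kind in program:
        if kind == "sync":
            phase += 1         # sync waits for all previous data operations
            ids.append(phase)
            phase += 1         # later data operations wait for the sync
        else:
            ids.append(phase)
    return ids


def is_legal_order(program, order):
    """Check an execution order (a list of program indices) against the rule.

    The order is legal if it contains every operation exactly once and
    never carries out an operation from a later phase before one from an
    earlier phase. Reordering within a phase is allowed.
    """
    if sorted(order) != list(range(len(program))):
        return False
    ids = phase_ids(program)
    executed = [ids[i] for i in order]
    return executed == sorted(executed)
```

For a program of two data operations, a synchronization operation, and two more data operations, the order [1, 0, 2, 4, 3] is legal under this check because reordering stays within each section, whereas [0, 2, 1, 3, 4] is not because the synchronization operation is carried out before an earlier data operation.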
“Multithreading” refers to the ability of a computer system to execute different parts of a program, called threads of execution or simply “threads,” simultaneously. A programmer typically divides a computer program into multiple “threads” including instructions that can be executed at the same time without interfering with each other.
A problem arises with dedicated hardware added to help ensure program correctness in that such hardware is typically complex and adds time delays. In view of the push toward higher processor clock frequencies and performance levels, it would be desirable to have relatively simple methods for executing instructions that help ensure program correctness and can be implemented using a relatively small amount of additional hardware.