1. Field
The present disclosure pertains to the field of information processing, and, more specifically, to the field of memory access management.
2. Background
In some prior art microprocessors or processing systems, information (data or instructions) may be accessed by a microprocessor using operations such as “load” operations or “store” operations. Furthermore, load and store operations may be performed in response to an instruction (or sub-instruction, such as a micro-operation, or “uop”) being executed by a processor. In some processing architectures, load instructions may be decoded into one uop, whereas store instructions may be decoded into two or more uops, including a store address (STA) uop and a store data (STD) uop. For the purpose of this disclosure both store uops and instructions will be referred to as “store operations” or “stores” and load uops and instructions will be referred to as “load operations” or “loads”.
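By way of illustration only, the decoding described above may be sketched as follows. The names `Uop` and `decode` and the string encoding of instruction kinds are hypothetical and chosen solely for this example; the sketch assumes a store decodes into exactly one STA uop and one STD uop, although a store may decode into more uops in some architectures.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Uop:
    kind: str  # "LOAD", "STA" (store address), or "STD" (store data)
    seq: int   # program-order index of the parent instruction (assumed field)


def decode(instr_kind: str, seq: int) -> List[Uop]:
    """Decode one instruction into its uops (hypothetical helper)."""
    if instr_kind == "load":
        # A load is decoded into a single uop.
        return [Uop("LOAD", seq)]
    if instr_kind == "store":
        # A store is decoded into a store address uop and a store data uop.
        return [Uop("STA", seq), Uop("STD", seq)]
    raise ValueError(f"unsupported instruction kind: {instr_kind!r}")


if __name__ == "__main__":
    for uop in decode("load", 0) + decode("store", 1):
        print(uop)
```

In this sketch both the load uop and the STA/STD pair would be treated as “loads” and “stores” respectively, consistent with the terminology adopted above.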
In some processors or processing systems, a number of load and store operations may be executed, or otherwise pending, concurrently. For example, in a pipelined processor containing multiple processing stages that may each operate on different operations concurrently, there may be several load and store operations being performed concurrently, each at a different stage within the pipeline. However, at various pipeline stages, the address from which data is to be loaded by load instructions or to which data is to be stored by store instructions (collectively referred to as the “target address”) is unknown, or “ambiguous”. This is because the target addresses of load and store instructions or uops are sometimes determined after the load or store has already begun to be executed.
FIG. 1 illustrates a portion of a pipelined processor having a fetch/prefetch stage, one or more rename units to assign registers to appropriate instructions or uops, and one or more scheduling units/reservation station units to schedule and store instructions or uops, such as uops corresponding to loads and stores, until their respective target addresses are determined.
When loads and stores (e.g., STA uops) are dispatched from the reservation station, they may be sent to the address generation unit, which generates a corresponding linear address for the loads and stores to be sent to memory or cache. Load operations are typically dispatched from the reservation station into a load buffer within a memory ordering buffer (MOB), where the loads are checked for conflicts and dependencies with store operations. If no conflicts or dependencies with stores exist, the load may be dispatched to the memory/cache cluster. Otherwise, the load may have to wait in the MOB until the dependencies and/or conflicts are resolved before being dispatched to memory/cache.
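For purposes of illustration, the check performed on a load in the MOB may be sketched as follows. The types `StoreEntry` and `LoadEntry`, the function `load_may_dispatch`, and the `seq` program-order field are hypothetical and are not taken from any particular processor. The sketch assumes that an older store whose address has not yet been generated is treated as a conflict, and it compares whole addresses only, ignoring partial overlaps.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoreEntry:
    seq: int                # program-order index (assumed field)
    address: Optional[int]  # None while the target address is still unknown


@dataclass
class LoadEntry:
    seq: int
    address: int            # linear address from the address generation unit


def load_may_dispatch(load: LoadEntry, stores: List[StoreEntry]) -> bool:
    """Return True if no older pending store conflicts with the load."""
    for store in stores:
        if store.seq > load.seq:
            continue  # stores later in program order cannot feed this load
        if store.address is None or store.address == load.address:
            return False  # unresolved or matching older store: wait in MOB
    return True


if __name__ == "__main__":
    pending = [StoreEntry(0, 0x100), StoreEntry(2, None), StoreEntry(6, 0x200)]
    print(load_may_dispatch(LoadEntry(1, 0x300), pending))  # no conflict
    print(load_may_dispatch(LoadEntry(1, 0x100), pending))  # address match
    print(load_may_dispatch(LoadEntry(3, 0x300), pending))  # unresolved store
```

A load for which the function returns False would remain in the load buffer and be re-checked once the older stores are resolved.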
Once the loads are dispatched to memory/cache, the memory/cache may return data targeted by the load to the execution unit reservation station, which may use the loaded data to generate an address to the next operand of some successive uop to be dispatched from the scheduler/reservation station.
Store operations, which may include STA uops, may follow a path similar to that of loads. However, stores are not typically allowed to be dispatched to the memory/cache out of program order, whereas loads may be dispatched to memory/cache any time no dependencies/conflicts exist between the loads and other store operations.
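The in-order constraint on stores may be illustrated by the following minimal sketch. The function `dispatch_stores_in_order` and the representation of stores as program-order indices are hypothetical; the sketch assumes a store becomes eligible only when it is at the head of the queue and has been marked ready.

```python
from collections import deque
from typing import Deque, List, Set


def dispatch_stores_in_order(queue: Deque[int], ready: Set[int]) -> List[int]:
    """Dispatch stores from the head of the queue while the head is ready.

    `queue` holds program-order indices of pending stores, oldest first.
    A ready store behind a non-ready store is not dispatched.
    """
    dispatched = []
    while queue and queue[0] in ready:
        dispatched.append(queue.popleft())
    return dispatched


if __name__ == "__main__":
    stores = deque([1, 2, 3])
    # Stores 2 and 3 are ready, but store 1 is not, so nothing is dispatched.
    print(dispatch_stores_in_order(stores, {2, 3}))
    # Once store 1 is ready, stores 1, 2 and 3 are dispatched in order.
    print(dispatch_stores_in_order(stores, {1, 2, 3}))
```

Loads, by contrast, would not be held behind the head of such a queue and could be dispatched whenever no dependency or conflict with a store exists.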
In some prior art processors, the MOB is used to store load and store operations in proper order, such that all store operations to write information to a memory location are dispatched and allowed to write their information to memory before load operations that may use information from the same address. Store operations appearing in program order before corresponding load operations (i.e., load operations having the same target address as the earlier store operations) may be referred to as “older” store operations, and the corresponding load operations may be referred to as “newer” load operations relative to the earlier store operations in program order.
Loads may access memory out of program order in relation to stores if no dependencies/conflicts between the loads and stores exist. In some of the prior art, loads being processed before older pending stores were always assumed to correspond to the same target memory address as those stores, in order to prevent an earlier processed load from loading data that was to be updated by the older store and therefore producing an incorrect result, by returning obsolete information, in whatever program the load corresponded to.
However, this assumption may prove to be too conservative, inasmuch as not all loads that are processed before older pending stores in program order correspond to the same memory address as those stores. As a result, loads may be delayed from being issued to memory for numerous cycles until the corresponding older pending stores are processed and stored in the proper order in the MOB. This can, in turn, cause unnecessary delays in memory access time, which can unduly erode processor and system performance.
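The cost of the conservative assumption may be illustrated with a simple count. The function `count_conservative_stalls` and the example operation sequence below are hypothetical; the sketch assumes that every store appearing earlier in program order is still pending when each load is considered, and it does not model cycle timing.

```python
from typing import List, Tuple

# Each operation is (kind, target address), listed in program order.
Op = Tuple[str, int]


def count_conservative_stalls(ops: List[Op]) -> Tuple[int, int]:
    """Return (stalled_loads, needlessly_stalled_loads).

    Under the conservative policy, a load is stalled whenever any older
    store is pending. The stall is needless if no older store actually
    targets the load's address.
    """
    older_store_addrs = set()
    stalled = 0
    needless = 0
    for kind, addr in ops:
        if kind == "store":
            older_store_addrs.add(addr)
        elif kind == "load" and older_store_addrs:
            stalled += 1
            if addr not in older_store_addrs:
                needless += 1
    return stalled, needless


if __name__ == "__main__":
    program = [
        ("store", 0x100),
        ("load", 0x200),   # stalled, but no older store targets 0x200
        ("load", 0x100),   # stalled, and genuinely depends on the store
        ("store", 0x300),
        ("load", 0x400),   # stalled, but no older store targets 0x400
    ]
    print(count_conservative_stalls(program))  # prints (3, 2)
```

In this example two of the three stalled loads have no true dependency on an older store, which is the kind of unnecessary delay described above.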