1. Field of the Invention
This invention is related to the field of processors and, more particularly, to load processing and memory ordering maintenance in processors.
2. Description of the Related Art
Processors generally use memory operations to move data to and from memory. The term “memory operation” refers to an operation which specifies a transfer of data between a processor and memory (although the transfer may be accomplished in cache). Load memory operations specify a transfer of data from memory to the processor, and store memory operations specify a transfer of data from the processor to memory. Load memory operations may be referred to herein more succinctly as “loads”, and similarly store memory operations may be referred to as “stores”. Memory operations may be implicit within an instruction which directly accesses a memory operand to perform its defined function (e.g. arithmetic, logic, etc.), or may be an explicit instruction which performs the data transfer only, depending upon the instruction set employed by the processor.
Some instruction set architectures require strong ordering of memory operations (e.g. the x86 instruction set architecture). Generally, memory operations are strongly ordered if they appear to have occurred in the program order specified. Processors often attempt to perform loads out of (program) order to improve performance. However, if loads are performed out of order, it is possible to violate strong memory ordering rules.
For example, if a first processor performs a store to address A1 followed by a store to address A2 and a second processor performs a load to address A2 (which misses in the data cache of the second processor) followed by a load to address A1 (which hits in the data cache of the second processor), strong memory ordering rules may be violated. Strong memory ordering rules require, in the above situation, that if the load to address A2 receives the store data from the store to address A2, then the load to address A1 must receive the store data from the store to address A1. However, if the load to address A1 is allowed to complete while the load to address A2 is being serviced, then the following scenario may occur: (i) the load to address A1 may receive data prior to the store to address A1; (ii) the store to address A1 may complete; (iii) the store to address A2 may complete; and (iv) the load to address A2 may complete and receive the data provided by the store to address A2. This outcome would be incorrect, because the load to address A2 observes the store to address A2 while the load to address A1, which is later in program order, does not observe the earlier store to address A1.
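The scenario above may be illustrated with a minimal sketch that replays a fixed interleaving of events against a shared memory model. The sketch is illustrative only and does not model any particular processor or cache; the names replay, violates_strong_ordering, out_of_order and in_order, and the values "old" and "new", are hypothetical and are introduced solely for this example.

```python
# Illustrative model only: memory is a dict, and each event is one step of a
# global interleaving of the two processors' memory operations.
# All names and values here are hypothetical, chosen for this example.

def replay(events):
    """Apply events in the given order and return what each load observed."""
    memory = {"A1": "old", "A2": "old"}
    observed = {}
    for kind, address in events:
        if kind == "store":
            memory[address] = "new"              # the store's data becomes visible
        else:
            observed[address] = memory[address]  # the load receives the current data
    return observed


def violates_strong_ordering(observed):
    """The first processor stores A1 then A2; the second loads A2 then A1.

    If the load to A2 receives the store data for A2, the load to A1 must
    also receive the store data for A1. Seeing new A2 with old A1 is a violation.
    """
    return observed["A2"] == "new" and observed["A1"] == "old"


# Steps (i) through (iv) of the scenario: the load to A1 completes early
# while the load to A2 is still being serviced.
out_of_order = [("load", "A1"), ("store", "A1"), ("store", "A2"), ("load", "A2")]

# One interleaving in which the second processor's loads complete in program order.
in_order = [("store", "A1"), ("store", "A2"), ("load", "A2"), ("load", "A1")]

if __name__ == "__main__":
    print(violates_strong_ordering(replay(out_of_order)))
    print(violates_strong_ordering(replay(in_order)))
```

In this sketch, the out-of-order interleaving yields stale data for address A1 together with new data for address A2, which is the combination that strong memory ordering rules prohibit. The in-order interleaving yields new data for both addresses.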