In a typical computing device, load and store instructions handle all data movement between processor registers, memory, and peripherals. Load instructions load data from memory into a processor register. Store instructions, on the other hand, store data from a processor register into memory. Both types of instructions specify a data effective address, which identifies the location in memory from which the data is to be loaded or to which the data is to be stored.
Load-hit-store (LHS) conflicts are a common source of performance issues on POWER™ processors. An LHS conflict occurs when a load instruction instructs a processor to load data from an address before a preceding store instruction has finished storing the data to that address.
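The condition described above can be illustrated with a toy model. The following is a minimal sketch, not a description of any processor mechanism: the `Access` record, the `find_lhs_conflicts` function, and the `window` parameter are all hypothetical, and `window` simply stands in for the number of instructions during which an earlier store is assumed to be still incomplete.

```python
from collections import namedtuple

# Hypothetical trace record: op is "store" or "load", addr is the effective address.
Access = namedtuple("Access", ["op", "addr"])

def find_lhs_conflicts(trace, window=4):
    """Return (store_index, load_index) pairs in which a load reads an address
    written by a store at most `window` instructions earlier (assumed model)."""
    conflicts = []
    last_store = {}  # effective address -> index of the most recent store to it
    for i, access in enumerate(trace):
        if access.op == "store":
            last_store[access.addr] = i
        elif access.op == "load":
            j = last_store.get(access.addr)
            if j is not None and i - j <= window:
                conflicts.append((j, i))
    return conflicts

trace = [
    Access("store", 0x1000),
    Access("load", 0x2000),
    Access("load", 0x1000),   # reads 0x1000 two instructions after the store
    Access("store", 0x3000),
    Access("load", 0x2000),   # no earlier store to 0x2000, so no conflict
]
```

In this model, `find_lhs_conflicts(trace)` reports the single pair formed by the store at index 0 and the load at index 2, while a window of 1 reports none because the store is then assumed to have completed.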
Often, functions that consist of only a few instructions cause LHS conflicts because the function prologue (store instruction) and the function epilogue (load instruction) are temporally close. In many cases, static and dynamic compilers can resolve the LHS conflicts by inlining the function code. Inlining denotes the process of inserting the complete body of a function in every place that the function is called, which eliminates the time overhead associated with the function call. Replacing the function call with the body of the function results in the removal of the function prologue (store instruction) and the function epilogue (load instruction) from the function code, since the function prologue (which saves the data present in the registers to memory before the function executes) and the function epilogue (which loads the saved data back into the registers) are no longer necessary. Removing the function prologue (store instruction) and the function epilogue (load instruction) from the code eliminates the possibility of a load-hit-store conflict occurring between them. Typically, a compiler can eliminate an LHS conflict, via inlining or a similar process, if it can detect the store and load instruction pair that is at the root of the LHS conflict. When the store instruction and the load instruction of the pair are spatially close, the compiler can easily identify them.
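The effect of inlining on the prologue and epilogue can be sketched on a toy instruction list. This is an illustrative model only: the tuple-based instruction format, the opcode names `prologue_store` and `epilogue_load`, and the `inline_call` helper are assumptions made for the example, and a real compiler must also handle register allocation and argument passing.

```python
# Opcodes (hypothetical names) that exist only to support the call itself.
CALL_OVERHEAD = ("prologue_store", "epilogue_load", "return")

def inline_call(caller, callee_name, callee_body):
    """Replace each ("call", callee_name) in `caller` with the callee's body,
    dropping the prologue store, epilogue load, and return."""
    core = [ins for ins in callee_body if ins[0] not in CALL_OVERHEAD]
    result = []
    for ins in caller:
        if ins == ("call", callee_name):
            result.extend(core)      # body is placed directly at the call site
        else:
            result.append(ins)
    return result

add_one = [
    ("prologue_store", "r31"),   # save a register to memory
    ("addi", "r3", "r3", 1),     # the only useful work in the function
    ("epilogue_load", "r31"),    # reload it almost immediately: LHS candidate
    ("return",),
]

caller = [
    ("li", "r3", 41),
    ("call", "add_one"),
    ("mr", "r4", "r3"),
]

inlined = inline_call(caller, "add_one", add_one)
```

After the call is replaced, `inlined` contains only the `li`, `addi`, and `mr` instructions, so the store and load pair that was temporally close in the short function no longer exists.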
For some LHS conflicts, the store instruction and the load instruction are not spatially close but are still temporally close enough in execution to cause an LHS conflict. In such cases, it can be hard to identify the store/load instruction pair causing the LHS conflict. Current POWER™ processors provide mechanisms by which code is profiled in order to identify a load that causes an LHS conflict. However, there are no code profiling mechanisms to identify the corresponding store involved in the LHS conflict. Without this information, the compiler has to examine all previous stores until it finds a store whose address matches the data address specified in the load instruction. Once the compiler finds the matching store, inlining or a similar process may be used to resolve the LHS conflict. However, searching through previous stores is inefficient and can greatly increase the overhead for static and dynamic compilers.
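The cost of the backward search described above can be seen in a small sketch. The trace format of `(op, addr)` tuples and the `find_matching_store` function are hypothetical and assume that the index of the offending load is already known from profiling.

```python
def find_matching_store(trace, load_index):
    """Scan backward from a flagged load and return (store_index, stores_examined).

    `trace` is a list of (op, addr) tuples, with op being "store" or "load".
    store_index is None when no earlier store writes the load's address.
    """
    target = trace[load_index][1]
    examined = 0
    for i in range(load_index - 1, -1, -1):
        op, addr = trace[i]
        if op != "store":
            continue
        examined += 1            # every earlier store must be checked in turn
        if addr == target:
            return i, examined
    return None, examined

trace = [
    ("store", 0x1000),
    ("store", 0x2000),
    ("load", 0x4000),
    ("store", 0x3000),
    ("load", 0x1000),    # flagged load; its store is three stores back
]

store_index, stores_examined = find_matching_store(trace, 4)
```

Here the search examines three stores before reaching the matching store at index 0. In this model the work grows with the number of stores between the pair, which is the inefficiency noted above for pairs that are not spatially close.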