With an ever-widening performance disparity between processors and memory subsystems, hiding memory latency is becoming increasingly important. In general, whenever a processor accesses system memory, there may be a delay between the time a memory request is made (either to read or to write data) and the time the memory access is completed. This delay is generally referred to as “latency” and can significantly limit the performance of the computer. There can be many sources of such latency. For example, operational constraints of DRAM devices may cause latency.
Typically, the speed of a memory circuit may be based upon two timing parameters. The first parameter may be the memory access time, which may be the minimum time required by a memory circuit to set up a memory address and to produce data on, and/or capture data from, a data bus. The second parameter may be the memory cycle time, which may be the minimum time required between two consecutive accesses to the memory circuit. Upon accessing the system memory, today's processors may have to wait a long time (e.g., 100 or more clock cycles) to receive the requested data. During this wait, the processor may be stalled, which can result in a significant reduction in processor performance.
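The cost of such stalls can be illustrated with a rough, back-of-the-envelope sketch. The function name and every figure except the 100-cycle wait are assumptions chosen for illustration only (one cycle per instruction when no stall occurs, and 20 stalling memory accesses per 1000 instructions); they are not measurements of any particular processor.

```python
def total_cycles(instructions, base_cycles_per_instruction,
                 misses_per_thousand_instructions, miss_latency_cycles):
    """Return (total cycles, stall cycles) for a simple in-order model.

    Assumption: every stalling access blocks the processor for the full
    latency, with no overlap between stalls and useful work.
    """
    compute_cycles = instructions * base_cycles_per_instruction
    misses = instructions * misses_per_thousand_instructions // 1000
    stall_cycles = misses * miss_latency_cycles
    return compute_cycles + stall_cycles, stall_cycles


if __name__ == "__main__":
    # Assumed workload: 1000 instructions, 20 of which wait 100 cycles each.
    total, stalled = total_cycles(1000, 1, 20, 100)
    print(f"total={total} cycles, stalled={stalled} cycles")
```

Under these assumed figures, two thirds of the cycles would be spent waiting on memory, which may explain why hiding even a portion of the latency can matter.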
Generally, extracting instruction-level parallelism and/or better utilizing available processor resources is crucial to increasing application performance. However, high memory latency can hinder the use of these techniques. Typically, to avoid stalls due to long-latency memory operations, processor architectures may permit some amount of speculation. The speculation of a load instruction and/or a set of instructions that use the loaded value across a possibly aliasing store is generally referred to as data speculation. The speculation of a load instruction and/or a set of dependent instructions across a conditional control flow edge is generally referred to as control speculation.
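Data speculation may be pictured with a small software model. The following is a minimal sketch, assuming a dictionary as a toy memory and hypothetical function names; it is not a description of any particular processor's mechanism. The load and its dependent instruction are moved above a possibly aliasing store, and a check after the store decides whether the speculated work must be redone.

```python
def run_in_order(mem, store_addr, store_val, load_addr):
    """Reference semantics: the store executes before the load."""
    mem = dict(mem)                  # work on a copy of the toy memory
    mem[store_addr] = store_val
    loaded = mem[load_addr]
    return loaded + 1                # dependent instruction using the loaded value


def run_data_speculated(mem, store_addr, store_val, load_addr):
    """Load hoisted above a possibly aliasing store, checked afterwards."""
    mem = dict(mem)
    loaded = mem[load_addr]          # speculative load, issued early
    result = loaded + 1              # dependent instruction, also speculated
    mem[store_addr] = store_val      # the store the load was moved across
    misspeculated = store_addr == load_addr
    if misspeculated:
        # Recovery: redo the load and everything that depended on it.
        loaded = mem[load_addr]
        result = loaded + 1
    return result, misspeculated


if __name__ == "__main__":
    memory = {0x10: 5, 0x20: 7}
    print(run_data_speculated(memory, 0x10, 99, 0x20))  # addresses differ
    print(run_data_speculated(memory, 0x20, 99, 0x20))  # addresses alias
```

In this model the speculated version always matches the in-order result; the difference is that the aliasing case pays for the load and its dependent work twice.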
Though speculation may help reduce memory latency in most cases, there are still many situations where speculation of a load instruction can end in a misspeculation. This can result in a significant processor performance hit, in terms of executing the recovery code and/or loading an unnecessary value from the memory. In the case of control speculation, an incorrect speculation can lead to page faults as well.
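The page-fault risk of control speculation may be sketched in the same style. The toy memory, the `PageFault` exception, and the function names below are hypothetical and serve only to illustrate the idea: a load that originally executed only when a condition held is moved above the conditional control flow edge, so it executes even on paths that never needed the value.

```python
class PageFault(Exception):
    """Raised by the toy memory when an unmapped address is read."""


def load(mem, addr):
    if addr not in mem:
        raise PageFault(hex(addr))
    return mem[addr]


def guarded(mem, cond, addr):
    """Original order: the load executes only when the condition holds."""
    if cond:
        return load(mem, addr)
    return 0


def hoisted(mem, cond, addr):
    """Control-speculated order: the load executes before the condition."""
    value = load(mem, addr)          # runs regardless of cond
    if cond:
        return value
    return 0                         # the loaded value was unnecessary


if __name__ == "__main__":
    memory = {0x10: 5}
    print(guarded(memory, False, 0x999))     # no load, no fault
    try:
        hoisted(memory, False, 0x999)        # unneeded load touches an unmapped page
    except PageFault as fault:
        print("page fault at", fault)
```

When the condition is false and the address is mapped, the hoisted version merely performs an unnecessary load; when the address is unmapped, it raises a fault that the original ordering would never have encountered.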
Other features of the present embodiments will be apparent from the accompanying drawings and from the detailed description that follows.