1. Technical Field
The present invention relates in general to data processing and in particular to data processing systems and improved memory subsystems and memory controllers for data processing systems. Still more particularly, the present invention relates to a method and system for thread-based speculation in a memory subsystem of a data processing system.
2. Description of the Related Art
Symmetric Multi-Processor (SMP) computer systems have conventionally been implemented with multiple processor chips coupled by a tri-state bus to a single common memory controller controlling access to one or more DIMMs (Dual Inline Memory Modules). Because of the lack of scalability and high access latency associated with this conventional configuration, more recent multiprocessor computer systems have migrated to a system-on-a-chip (SOC) paradigm in which multiple processing units are coupled together by a switch and each processing unit die contains multiple processor cores supported by one or more levels of cache memory and an integrated memory controller coupled to multiple external DIMMs. Because each SOC processing unit die includes its own integrated memory controller, scalability is improved over earlier SMP architectures. However, although absolute memory latency is reduced for the percentage of memory accesses to addresses mapped to physically closer DIMMs, improvements in average memory access latency for current SOC-based system designs still do not scale with ever-increasing processor clock frequencies.
In addition to the foregoing memory subsystem design trends, enhancements have also been made to processor core designs to decrease the average cycles per instruction (CPI) by improving the manner in which the processor core manages memory accesses. In particular, these enhancements include support for highly out-of-order instruction execution, multilevel branch speculation, simultaneous multithreading (SMT), and speculative data and instruction prefetching. The intent of each of these features is to mask apparent memory access latency by initiating retrieval of data from the memory subsystem in advance of need. All of these enhancements reflect a common “consumer-controlled” design philosophy in which an increasing amount of logic in the processor core is devoted to controlling access to the memory subsystem, resulting in more complex and larger processor cores.