Advanced CPUs and embedded processors continue to achieve higher performance. To sustain that performance, however, memory subsystems must deliver lower latency and more bandwidth. Dynamic random access memory (DRAM), for example, is getting faster in clock speed, wider in bus size, and larger in capacity. As a result, DRAM is consuming more power and generating more heat. The wider bus effectively increases the memory subsystem power consumption linearly, whether it is for embedded appliances, desktop or notebook PCs, or high-density server applications.
A CPU is the computing and control hardware element of a computer-based system. In a personal computer, for example, the CPU is usually an integrated part of a single, extremely powerful microprocessor. An operating system is the software responsible for allocating system resources including memory, processor time, disk space, and peripheral devices such as printers, modems, and monitors. All applications use the operating system to gain access to the resources as necessary. The operating system is the first program loaded into the computer as it boots up, and it remains in memory throughout the computing session.
Typical PC systems use either 64-bit or 128-bit DRAM subsystems. In the latter case, the memory subsystem is usually organized as two independent 64-bit memory controllers (MCs). Various types of DRAM may be powered down through either a physical power-down signal, such as a clock enable (CKE) signal, or through a packetized power-down command sent through a high-speed serial bus.
For double data rate (DDR) synchronous DRAM (SDRAM), for example, de-asserting the CKE signal (low) puts the corresponding memory row of the DRAM into a power-down state. Asserting the CKE signal (high) brings the memory row back to a full operating state. The CKE signal may be dynamically toggled on every rising edge of the SDRAM clock.
A typical 64-bit MC may support between two and four SDRAM dual in-line memory modules (DIMMs). Each DIMM has up to two memory rows (each side of a double-sided DIMM is called a memory row), and each memory row may have multiple internal memory banks. Each bank comprises multiple memory pages, one page from each DRAM chip of the memory row.
Typically, if an MC can put each memory row of multiple DIMMs independently and dynamically into and out of the power-down state using the CKE signal, then the MC is said to support dynamic CKE DRAM power management. However, dynamic CKE is typically supported only in power-sensitive appliances such as notebook PCs or PDAs and is not available for desktop PCs for various reasons.
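The per-row CKE behavior described above can be illustrated with a minimal sketch. The class names MemoryRow and MemoryController, and the choice of four rows, are hypothetical and serve only to model the signal semantics (CKE low means power-down, CKE high means full operation); they do not describe any particular controller.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryRow:
    """One memory row (one side of a DIMM) with its own CKE line."""
    cke_high: bool = True  # CKE asserted (high) = full operating state

    @property
    def powered_down(self) -> bool:
        return not self.cke_high


@dataclass
class MemoryController:
    """Hypothetical 64-bit MC driving one CKE signal per memory row."""
    rows: list = field(default_factory=list)

    def power_down(self, row_index: int) -> None:
        # De-assert CKE (low): the row enters the power-down state.
        self.rows[row_index].cke_high = False

    def wake(self, row_index: int) -> None:
        # Assert CKE (high): the row returns to the full operating state.
        self.rows[row_index].cke_high = True

    def active_rows(self) -> list:
        # Indices of rows whose CKE is currently high.
        return [i for i, row in enumerate(self.rows) if row.cke_high]


# Two double-sided DIMMs give four independently controlled memory rows.
mc = MemoryController(rows=[MemoryRow() for _ in range(4)])
mc.power_down(1)
mc.power_down(3)
print(mc.active_rows())  # rows 0 and 2 remain in the full operating state
```

Because each row has its own CKE state, any subset of rows can be powered down while the others continue to serve accesses, which is the property that defines dynamic CKE support.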
Even for mobile designs, system designers have not been aggressive in DRAM power management since it would mean turning on an auto pre-charge option that pre-charges and closes a given DRAM bank after every access if there is no pending access to the bank. However, if the CPU or a bus master initiates an access to the same bank after it has been closed, a longer latency will be incurred due to row-to-column delay. If an access is initiated immediately after the auto pre-charge is started, an additional delay will be incurred due to the pre-charge.
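The latency penalty described above can be sketched as a simple timing model. The function access_latency, the BankState names, and the default cycle counts for t_cas, t_rcd, and t_rp are illustrative assumptions, not values taken from any datasheet; the ROW_OPEN case further assumes the access targets the row that is already open.

```python
from enum import Enum


class BankState(Enum):
    ROW_OPEN = "row open"        # requested row is already open in the bank
    CLOSED = "closed"            # bank was pre-charged and is idle
    PRECHARGING = "precharging"  # auto pre-charge has just started


def access_latency(state, t_cas=3, t_rcd=3, t_rp=3):
    """Return an approximate read latency in clock cycles.

    t_cas: column access latency
    t_rcd: row-to-column delay (needed to re-open a closed bank)
    t_rp:  pre-charge time (must complete before the row is re-opened)
    All defaults are hypothetical example values.
    """
    if state is BankState.ROW_OPEN:
        return t_cas
    if state is BankState.CLOSED:
        # The bank must be activated again, adding the row-to-column delay.
        return t_rcd + t_cas
    if state is BankState.PRECHARGING:
        # The pre-charge must finish first, then the row is re-opened.
        return t_rp + t_rcd + t_cas
    raise ValueError(f"unknown bank state: {state!r}")


for state in BankState:
    print(state.value, access_latency(state))
```

Under these example timings, an access to a bank that auto pre-charge has just closed costs twice as many cycles as an access to an open row, and an access that arrives while the pre-charge is in progress costs three times as many.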
It is known that some MCs perform selective auto pre-charging, using least recently used (LRU) or other algorithms to close only those rows that are least likely to be accessed next, in order to minimize incurred latencies. It is also known that some implementations look into a read/write command FIFO to determine which banks to close to minimize the latency impact. This may be effective but still cannot predict which memory banks will be accessed next. Some power management schemes also use certain statistical and prediction methods to determine which memory banks will be accessed next, but these are not maximally effective.
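One way to picture an LRU-based selective pre-charge policy is the following sketch. It is not the algorithm of any specific MC: the class LruBankTracker, the per-bank granularity, and the max_open_banks limit are assumptions chosen for illustration.

```python
from collections import OrderedDict


class LruBankTracker:
    """Tracks open banks and closes the least recently used one
    whenever more than max_open_banks are open (hypothetical policy)."""

    def __init__(self, max_open_banks):
        self.max_open_banks = max_open_banks
        self._open = OrderedDict()  # bank id -> None, least recent first

    def access(self, bank):
        """Record an access and return the banks to pre-charge (close)."""
        if bank in self._open:
            self._open.move_to_end(bank)  # mark as most recently used
        else:
            self._open[bank] = None       # open the bank
        closed = []
        while len(self._open) > self.max_open_banks:
            victim, _ = self._open.popitem(last=False)  # least recently used
            closed.append(victim)
        return closed

    def open_banks(self):
        return list(self._open)


tracker = LruBankTracker(max_open_banks=2)
closed_log = [tracker.access(bank) for bank in (0, 1, 0, 2)]
print(closed_log)            # bank 1 is closed on the last access
print(tracker.open_banks())  # banks 0 and 2 stay open
```

The sketch also shows the limitation noted above: the policy only ranks banks by past use, so if the next access happens to target the bank that was just closed, the full re-open latency is still incurred.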
An operating system may keep track of the percentage of time that the CPU is idle and write the idle percentage value to a register. For example, the CPU may have been idle for about 40% of the last predefined time period. Different operating systems use different windows of time to compute the idle percentage value. Older operating systems have longer idle loops. Newer operating systems have shorter idle loops in order to accommodate as many simultaneously running tasks as possible.
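The idle percentage computation can be sketched as follows. The function name idle_percentage, the millisecond units, and the 100 ms window are assumptions for illustration; actual operating systems choose their own windows and accounting methods.

```python
def idle_percentage(idle_time_ms, window_ms):
    """Return the share of the measurement window spent idle, in percent.

    Both arguments are hypothetical inputs; a real operating system
    would derive them from its own scheduler accounting.
    """
    if window_ms <= 0:
        raise ValueError("window_ms must be positive")
    if not 0 <= idle_time_ms <= window_ms:
        raise ValueError("idle_time_ms must lie within the window")
    return 100.0 * idle_time_ms / window_ms


# A CPU that idled for 40 ms of a 100 ms window was idle 40% of the time.
print(idle_percentage(40, 100))
```

The window length matters: a shorter window tracks bursts of activity more closely, while a longer window smooths them out, so the same workload can yield different idle values under different operating systems.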
In most systems, the performance of the processor may be altered through a defined “throttling” process and through transitions into multiple CPU performance states.
Certain CPU power management schemes are known which use statistical methods to monitor CPU host interface (sometimes known as the front-side bus, or FSB) activity to determine average CPU percent utilization and set the CPU throttling accordingly. However, advanced CPUs incorporate large cache memories that hide greater than 90% of the CPU activity within the CPU core. Therefore, the FSB percent utilization has little correlation to the actual core CPU percent utilization. As a result, prior implementations cannot correctly predict idle states of CPUs with super-pipelined architectures and integrated caches.
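A toy calculation shows why FSB activity is a poor proxy for core activity. The linear model below, the function name fsb_visible_percent, and the sample numbers are simplifying assumptions made only for illustration; real bus traffic depends on many other factors.

```python
def fsb_visible_percent(core_util_percent, cache_hidden_fraction):
    """Estimate the utilization an FSB monitor would observe.

    Toy model (an assumption): the share of core activity that reaches
    the bus is whatever the caches do not hide.
    """
    if not 0.0 <= cache_hidden_fraction <= 1.0:
        raise ValueError("cache_hidden_fraction must be between 0 and 1")
    return core_util_percent * (1.0 - cache_hidden_fraction)


# The same 80% core utilization, with caches hiding 90% or 95% of activity.
for hidden in (0.90, 0.95):
    print(hidden, round(fsb_visible_percent(80.0, hidden), 2))
```

In this model a core that is 80% busy looks nearly idle from the bus, and a small change in how much activity the caches hide halves the observed figure while the core load stays the same.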
If it cannot be determined reliably when the CPU may be powered down, then it also cannot be determined when the CPU may issue additional read/write accesses to memory. The memory therefore cannot be powered down most effectively, because performance may suffer if the CPU issues a memory access while the memory is powered down.
It is desirable to know, in an efficient manner, when the CPU is idle and the states of various memory-related functions in order to most effectively power down portions of the memory subsystem without compromising system performance.
Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of skill in the art, through comparison of such systems with embodiments of the present invention as set forth in the remainder of the present application with reference to the drawings.