Field of the Invention
The present invention is related to shared memory, and more particularly, to memory shared by multiple processors and efficient allocation and use of the memory by the processors.
Background Description
Semiconductor technology and chip manufacturing advances have resulted in a steady increase of Central Processing Unit (CPU), or processor, processing power and memory performance, allowing more function to be packed in the same or smaller chip area, i.e., increasing density. Generally, these densely packed chips are much more powerful and consume much more power for each given unit of chip area. Although a number of factors determine computer system performance, performance is primarily the result of the particular CPU and memory performance.
In theory, X processors improve performance by a factor of X. So, a typical high performance computer system increases performance by increasing the number of processors, e.g., in a multiprocessor system, sharing correspondingly larger high-performance main memory as well. Both Intel® and Advanced Micro Devices (AMD), Inc., for example, offer off-the-shelf multi-processors (multiple core processors) for PCs and the like, currently with as many as 8 cores. A state of the art high performance PC with such an 8-core multi-processor, for example, might be equipped with 32 gigabytes (32 GB) or more of main memory; some form of non-volatile storage, e.g., a Hard Disk Drive (HDD) or a Solid State Disk Drive (SSDD); a display capability (e.g., integrated on board); and any additional feature cards. These multi-core processors have found use even in what was once considered low end, state of the art mobile applications, such as the iPhone® or iPad® from Apple, Inc.
While state of the art multi-core PCs may dedicate cache memory for each core, on or off chip or module, the cores share a much larger main memory. During normal operation each core may be running one or more applications in one or more threads and/or providing one or more virtual machines. As each application/processor thread opens, the respective processor requests memory from the main memory and usually receives a memory space allocation sufficient to satisfy the request. Although processor speed is the main performance determinant, a fast processor can only take full advantage of its speed with equally fast memory. For example, one rule of thumb is that replacing relatively slow memory in a Personal Computer (PC) with higher performance memory, e.g., 30-50% faster, improves average performance by 10-20%.
A typical memory controller for such main memory (PC or mobile device) is selected/designed to treat all memory in main memory identically. So, if memory on one Dual Inline Memory Module (DIMM) is slower than the others, the controller operates all of the DIMMs at that slower speed. For example, for 4 DIMMs with 3 capable of 800 ns bus speeds and 1 only capable of 500 ns bus speeds, the controller would run all 4 at 500 ns. These state of the art systems also have allocated memory to all processors/cores for all applications/threads regardless of individual application/thread performance requirements. As a system user opened more and more applications, the concurrent activity and memory allocation could rise to a point that tended to stress shared memory capabilities.
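The uniform-speed behavior described above can be illustrated with a minimal sketch. The function name `uniform_bus_speed` and the list representation are hypothetical, introduced only for illustration. The ratings are the figures from the example, kept in the units used there and treated, as in the text, with the larger number being the faster DIMM.

```python
def uniform_bus_speed(dimm_speeds):
    """Return the one speed a controller that treats all DIMMs
    identically would apply, i.e., the slowest rating present.

    Hypothetical helper for illustration; not an actual controller API.
    """
    if not dimm_speeds:
        raise ValueError("at least one DIMM is required")
    return min(dimm_speeds)


# Ratings from the example: three faster DIMMs and one slower DIMM.
dimm_speeds = [800, 800, 800, 500]

# Every DIMM is operated at the slowest DIMM's rating.
common = uniform_bus_speed(dimm_speeds)

# Rated capability each DIMM gives up by running at the common speed.
unused = [speed - common for speed in dimm_speeds]
```

In this sketch the three faster DIMMs each leave part of their rated capability unused, which is the cost of treating all main memory identically.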
Adding memory and function in older technologies also had increased power requirements, much of which increased integration has alleviated. In older PCs, for example, adding many functions required adding system boards, e.g., sound, a Network Interface Card or Circuit (NIC), a modem and a display adapter. These functions have been integrated into single on-board (on motherboard) chips or parts of chips, to reduce overall system power. Also, disk drives have required significant power, much of which can be eliminated by using SSDDs. SSDDs use well known Non-Volatile Random Access Memory (NVRAM) or flash memory as hard disk space. SSDDs have improved non-volatile storage (disk) performance to near Dynamic RAM (DRAM) performance. In mobile devices where both size and power are constrained, among other things by mobile device package size, battery life, and minimal cooling capacity, much lower power NVRAM has replaced high performance, high power DRAM.
While technology has reduced the power required for individual functions, adding more and more function has increased system power requirements. So, for example, an eight core processor consumes on the order of one hundred twenty five watts (125 W) and system RAM consumes another 30 W. While memory chip capacity normally quadruples with each generation, at times system memory requirements have outpaced chip capacity increases. Without a change in technology generation, increasing main memory capacity has involved adding more memory chips/DIMMs to the system.
As noted hereinabove, adding components (DIMMs) increases space requirements and power consumption. The more power that system components consume, the higher the power supply capacity required and the more the system requires costly cooling components. Kingston® Technology, for example, offers water-cooled high-performance DIMMs. This all adds to system cost.
Thus, there is a need for reducing system main memory real estate and power consumption and, more particularly, for increasing system main memory capacity and density while reducing system memory real estate and power consumption.