1. Field of the Invention
This invention generally relates to system-on-chip (SoC) off-SoC memory management and, more particularly, to a system and method for using a SoC hardware core to prefetch the data from a shutdown memory, the data being temporarily stored in a combination of slower memories, and for using a hierarchy of memories and cache to attain power savings with nearly the speed of the shutdown memory.
2. Description of the Related Art
As noted in Wikipedia, a memory controller is a digital circuit that manages the flow of data going to and from the main memory. The memory controller can be a separate chip or integrated into another chip, such as on the die of a microprocessor. Computers using Intel microprocessors have conventionally had a memory controller implemented on their motherboard's Northbridge, but some modern microprocessors, such as DEC/Compaq's Alpha 21364, AMD's Athlon 64 and Opteron processors, IBM's POWER5, Sun Microsystems' UltraSPARC T1, and more recently, Intel's Core i7, have a memory controller on the microprocessor die to reduce the memory latency. While this arrangement has the potential of increasing system performance, it locks the microprocessor to a specific type (or types) of memory, forcing a redesign in order to support newer memory technologies. When the memory controller is not on-die, the same CPU may be installed on a new motherboard, with an updated Northbridge.
The integration of the memory controller onto the die of the microprocessor is not a new concept. Some microprocessors in the 1990s, such as the DEC Alpha 21066 and HP PA-7300LC, had integrated memory controllers, but the purpose was to reduce the cost of systems by removing the requirement for an external memory controller, rather than to increase performance.
Memory controllers contain the logic necessary to read and write dynamic random access memory (DRAM), and to “refresh” the DRAM by sending current through the entire device. Without constant refreshes, DRAM loses the data written to it, as the capacitors leak their charge within a fraction of a second (JEDEC standards call for data retention of not less than 64 milliseconds).
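The refresh burden described above can be illustrated with a short calculation. The following is a minimal sketch, not a description of any particular memory controller: it assumes that every row must be refreshed once within the 64 millisecond retention period and that refresh commands are spread evenly over that period. The function name and the row count of 8192 are hypothetical values chosen for illustration.

```python
def refresh_interval_us(retention_ms: float, rows: int) -> float:
    """Average time between row-refresh commands, in microseconds.

    Assumes (for illustration only) that each of `rows` rows is refreshed
    exactly once per retention period, at evenly spaced intervals.
    """
    if rows <= 0:
        raise ValueError("rows must be positive")
    return retention_ms * 1000.0 / rows


# Hypothetical device with 8192 rows and a 64 ms retention period.
print(refresh_interval_us(64, 8192))  # 7.8125
```

Under these assumptions the controller must issue a refresh command roughly every 7.8 microseconds for as long as the memory is powered, which is part of why an idle but powered DRAM still consumes energy.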
Reading from and writing to DRAM is facilitated by the use of multiplexers and demultiplexers. The correct row and column addresses are selected as the inputs to the multiplexer circuit, so that the demultiplexer on the DRAM can select the correct memory location and return the data (once again passed through a multiplexer to reduce the number of wires necessary to assemble the system).
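The row and column addressing described above can be sketched in software. The example below is an illustrative model only, assuming a linear address whose low-order bits form the column address and whose remaining bits form the row address. The function names and the 10-bit column width are hypothetical, and real devices may map address bits differently.

```python
from typing import Tuple


def split_address(addr: int, col_bits: int) -> Tuple[int, int]:
    """Split a linear address into (row, column) parts.

    Assumes the low `col_bits` bits select the column and the
    remaining high bits select the row.
    """
    if addr < 0 or col_bits < 0:
        raise ValueError("addr and col_bits must be non-negative")
    row = addr >> col_bits
    col = addr & ((1 << col_bits) - 1)
    return row, col


def join_address(row: int, col: int, col_bits: int) -> int:
    """Inverse of split_address: rebuild the linear address."""
    return (row << col_bits) | col


row, col = split_address(0x12345, col_bits=10)
print(row, col)  # 72 837
```

Because the row and column parts are sent one after the other, the same set of address wires can carry both, which is the wire-count saving the paragraph refers to.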
Bus width is the number of parallel lines available to communicate with the memory cell. Memory controller bus widths range from 8-bit in earlier systems, to 512-bit in more complicated systems and video cards. Double data rate (DDR) memory controllers are used to drive DDR SDRAM, where data is transferred on the rising and falling edges of the memory clock. DDR memory controllers are significantly more complicated than Single Data Rate controllers, but allow for twice the data to be transferred without increasing the clock rate or increasing the bus width to the memory cell.
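The doubling effect of DDR can be shown with a simple peak-bandwidth calculation. This is a sketch under stated assumptions: peak bandwidth is taken as clock rate times transfers per clock times bus width, ignoring refresh, command overhead, and other real-world losses. The function name and the 100 MHz, 64-bit example values are hypothetical.

```python
def peak_bandwidth_bytes_per_s(clock_hz: int, bus_width_bits: int,
                               transfers_per_clock: int = 1) -> int:
    """Theoretical peak bandwidth in bytes per second.

    transfers_per_clock is 1 for single data rate (one clock edge)
    and 2 for double data rate (rising and falling edges).
    """
    return clock_hz * transfers_per_clock * bus_width_bits // 8


# Hypothetical 100 MHz memory clock on a 64-bit bus.
sdr = peak_bandwidth_bytes_per_s(100_000_000, 64, transfers_per_clock=1)
ddr = peak_bandwidth_bytes_per_s(100_000_000, 64, transfers_per_clock=2)
print(sdr, ddr)  # 800000000 1600000000
```

The clock rate and bus width are identical in both calls; only the number of transfers per clock changes, which is the point made in the paragraph above.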
Dual channel memory controllers are memory controllers where the DRAM devices are separated onto two different buses, allowing two memory controllers to access them in parallel. This dual arrangement doubles the theoretical amount of bandwidth of the bus. In theory, more channels can be built (a channel for every DRAM cell would be the ideal solution), but due to wire count, line capacitance, and the need for parallel access lines to have identical lengths, more channels are very difficult to add.
Fully buffered memory systems place a memory buffer device on every memory module (called an FB-DIMM when Fully Buffered RAM is used), which, unlike conventional memory controller devices, uses a serial data link to the memory controller instead of the parallel link used in previous RAM designs. This decreases the number of wires necessary to place the memory devices on a motherboard (allowing for a smaller number of layers to be used, meaning more memory devices can be placed on a single board), at the expense of increasing latency (the time necessary to access a memory location). This latency increase is due to the time required to convert the parallel information read from the DRAM cell to the serial format used by the FB-DIMM controller, and back to a parallel form in the memory controller on the motherboard. In theory, the FB-DIMM's memory buffer device could be built to access any DRAM cells, allowing for memory cell agnostic memory controller design, but this has not been demonstrated, as the technology is in its infancy.
A DIMM, or dual in-line memory module, comprises a series of dynamic random access memory integrated circuits. These modules are mounted on a printed circuit board and designed for use in personal computers, workstations and servers. DIMMs began to replace SIMMs (single in-line memory modules) as the predominant type of memory module as Intel's Pentium processors began to gain market share.
The main difference between SIMMs and DIMMs is that DIMMs have separate electrical contacts on each side of the module, while the contacts on SIMMs on both sides are redundant. Another difference is that standard SIMMs have a 32-bit data path, while standard DIMMs have a 64-bit data path. Since Intel's Pentium has (as do several other processors) a 64-bit bus width, it requires SIMMs installed in matched pairs in order to complete the data bus. The processor would then access the two SIMMs simultaneously. DIMMs were introduced to eliminate this practice.
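The matched-pair requirement follows from simple arithmetic on the data path widths. The sketch below uses a hypothetical helper name and assumes only that modules must be installed in groups whose combined width equals the processor's data bus width.

```python
def modules_per_bank(cpu_bus_bits: int, module_bits: int) -> int:
    """Number of identical modules needed to fill the processor data bus."""
    if module_bits <= 0 or cpu_bus_bits % module_bits != 0:
        raise ValueError("bus width must be a positive multiple of module width")
    return cpu_bus_bits // module_bits


print(modules_per_bank(64, 32))  # 2: two 32-bit SIMMs per 64-bit bus
print(modules_per_bank(64, 64))  # 1: a single 64-bit DIMM suffices
```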
Serial ATA (SATA) is a computer bus interface for connecting host bus adapters to mass storage devices such as hard disk drives and optical drives. Serial ATA was designed to replace the older ATA (AT Attachment) standard (also known as EIDE). It is able to use the same low level commands, but serial ATA host-adapters and devices communicate via a high-speed serial cable over two pairs of conductors. In contrast, the parallel ATA (the redesignation for the legacy ATA specifications) used 16 data conductors each operating at a much lower speed. SATA offers several compelling advantages over the older parallel ATA (PATA) interface: reduced cable-bulk and cost (reduced from 80 wires to seven), faster and more efficient data transfer, and hot swapping.
The SATA host adapter is integrated into almost all modern consumer laptop computers and desktop motherboards. As of 2009, SATA has mostly replaced parallel ATA in all shipping consumer PCs. PATA remains in industrial and embedded applications dependent on CompactFlash storage, although the new CFast storage standard will be based on SATA.
Flash memory is a non-volatile computer storage that can be electrically erased and reprogrammed. It is a technology that is primarily used in memory cards and USB flash drives for general storage and transfer of data between computers and other digital products. It is a specific type of EEPROM (Electrically Erasable Programmable Read-Only Memory) that is erased and programmed in large blocks; in early flash the entire chip had to be erased at once. Flash memory costs far less than byte-programmable EEPROM and therefore has become the dominant technology wherever a significant amount of non-volatile, solid state storage is needed. Example applications include PDAs (personal digital assistants), laptop computers, digital audio players, digital cameras and mobile phones. It has also gained popularity in console video game hardware, where it is often used instead of EEPROMs or battery-powered static RAM (SRAM) for game save data.
Since flash memory is non-volatile, no power is needed to maintain the information stored in the chip. In addition, flash memory offers fast read access times (although not as fast as volatile DRAM memory used for main memory in PCs) and better kinetic shock resistance than hard disks. These characteristics explain the popularity of flash memory in portable devices. Another feature of flash memory is that when packaged in a “memory card,” it is extremely durable, being able to withstand intense pressure, extremes of temperature, and even immersion in water.
Although flash memory is technically a type of EEPROM, the term “EEPROM” is generally used to refer specifically to non-flash EEPROM, which is erasable in small blocks, typically bytes. Because erase cycles are slow, the large block sizes used in flash memory erasing give it a significant speed advantage over old-style EEPROM when writing large amounts of data.
In summary, DIMMs are fast, consume relatively large amounts of power, and have a high bit density. Flash memories are slower than DIMMs, but faster than SATA drives or SSDs (solid state drives), consume less power, and have a lower memory density. SATA drives are mechanical, making them the slowest memory. They burn more power than a DIMM when active. However, a SATA drive has a large storage capacity and can be turned on and off, to save power, without losing data.
As noted above, most computing devices are currently built using DIMM type RAM memories. There are many occasions when a computing device is turned on, but not accessing memory. Keeping the DIMM memory “alive” in these conditions is wasteful of power. The power issue can be especially critical if the computing device is battery operated or sensitive to high operating temperatures. Currently, there is no technology able to shut down or replace memory devices on-the-fly. Some problems that prevent such an operation include the possibility of data corruption, operating system inflexibility, and signal integrity or electrostatic discharge (ESD) issues. Even in the case when the operating system (OS) is hibernating, DIMMs cannot be removed, as the OS and the basic input/output system (BIOS) always look for the exact same memory state that existed prior to hibernation.
It would be advantageous if at least some of a computer device's memories could be put into a sleep mode when the system determines limited memory read/write access is required.
It would be advantageous if data from a shutdown memory could be written into cache when there is a high likelihood that it is to be requested by a processor.