Memory bottlenecks significantly limit the speed and processing capacity of modern computer systems. Manufacturers continue to increase processor clock speeds, bus speeds and widths, and memory sizes to improve system performance, but a processor cannot execute commands any faster than the memory can retrieve those commands and transmit them to the processor. Thus, if the memory cannot be accessed as quickly as the processor can execute the commands, the processor will stall in spite of the increased clock speeds, bus speeds, and bus widths, significantly degrading the system's overall performance.
Cache memories, or caches, are used in many such systems to alleviate this problem and increase performance in a relatively cost-effective manner. The two main types of solid-state memory employed today are static random access memory (SRAM) and dynamic random access memory (DRAM). Conceptually, SRAM is implemented using a flip-flop as a storage device and DRAM is implemented using a capacitor as a storage device. SRAM is regarded as static because, unlike DRAM, it does not need to be refreshed. Leakage currents reduce the charge stored in a DRAM cell, so DRAM must be refreshed periodically.
Both SRAM and DRAM offer random access capability and appear substantially identical to other system components; however, SRAM has a significantly lower initial latency. For instance, SRAM may offer initial access times that are approximately ten times faster than the initial access times for DRAM. In addition, the cycle time for SRAM is much shorter than that of DRAM because SRAM does not need to pause between accesses. However, this lower initial latency and shorter cycle time come at the expense of a much lower storage density and much higher power dissipation. SRAM, for example, can have approximately ten times the power dissipation of DRAM. Furthermore, SRAM is significantly more expensive than DRAM, e.g., approximately ten times more expensive per bit.
The choice between SRAM and DRAM is usually a compromise between speed requirements on one side and storage density, physical space, power constraints, and cost on the other. In general, SRAM is implemented where access speed outweighs other considerations, such as in cache memories, and DRAM is implemented where high storage density outweighs other considerations, such as in main memory systems.
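The compromise described above can be illustrated with a small calculation. The following is a minimal sketch, not a definitive model: it takes only the approximate tenfold cost and power ratios stated earlier, normalizes DRAM to 1.0 in arbitrary units, and assumes (the text does not say so) that the power ratio applies per bit in the same way as the cost ratio. The function name `relative_cost_and_power` and its parameter `sram_fraction` are hypothetical.

```python
# Ratios from the text: SRAM is roughly 10x the cost per bit and roughly
# 10x the power dissipation of DRAM. Treating the power ratio as a per-bit
# figure is an assumption made for this sketch.
SRAM_COST_RATIO = 10.0
SRAM_POWER_RATIO = 10.0


def relative_cost_and_power(sram_fraction):
    """Return (cost, power) relative to an all-DRAM design of equal capacity.

    sram_fraction is the share of total bits implemented in SRAM (0.0 to 1.0).
    DRAM cost and power per bit are normalized to 1.0 (arbitrary units).
    """
    if not 0.0 <= sram_fraction <= 1.0:
        raise ValueError("sram_fraction must be between 0 and 1")
    dram_fraction = 1.0 - sram_fraction
    cost = dram_fraction + sram_fraction * SRAM_COST_RATIO
    power = dram_fraction + sram_fraction * SRAM_POWER_RATIO
    return cost, power


if __name__ == "__main__":
    # Compare an all-DRAM design, a design with 1% of bits in SRAM,
    # and an all-SRAM design.
    for fraction in (0.0, 0.01, 1.0):
        cost, power = relative_cost_and_power(fraction)
        print(f"SRAM share {fraction:.2f}: cost x{cost:.2f}, power x{power:.2f}")
```

Under these assumptions, placing a small share of the total capacity in SRAM adds little to overall cost and power, whereas an all-SRAM design multiplies both by roughly ten, which is consistent with reserving SRAM for small, speed-critical memories.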
At present, computer systems, from servers to low-power embedded processors, typically incorporate SRAM as a first-level cache, L1, and often as second and third levels of cache, L2 and L3. L1 and L2 cache memories are typically incorporated on the die of, or near, a processor, enabling storage of frequently accessed data and instructions close to the execution units of the processor to minimize access latency. Ideally, as the time for execution of an instruction nears, the instruction and corresponding data are moved to the L2 cache from the more distant main memory or L3 cache. Incorporation of SRAM as cache, however, sacrifices higher density storage.
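The benefit of keeping frequently accessed data in a cache can be estimated with a simple average access time calculation. The following is a minimal sketch under stated assumptions: a single cache level in front of a slower memory, a miss that pays the cache lookup plus the slower memory access, and hypothetical latencies of 1 and 10 time units chosen only to mirror the approximate tenfold difference in initial access time mentioned earlier. The function name `average_access_time` and the hit rates shown are illustrative, not measured values.

```python
def average_access_time(hit_rate, cache_latency, memory_latency):
    """Average time per access for one cache level backed by slower memory.

    Every access pays cache_latency; the fraction of accesses that miss
    (1 - hit_rate) additionally pays memory_latency. This miss model is an
    assumption of the sketch.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_rate = 1.0 - hit_rate
    return cache_latency + miss_rate * memory_latency


if __name__ == "__main__":
    CACHE_LATENCY = 1.0    # hypothetical units for an SRAM cache access
    MEMORY_LATENCY = 10.0  # hypothetical units, about ten times slower
    for hit_rate in (0.0, 0.5, 0.9, 0.99):
        t = average_access_time(hit_rate, CACHE_LATENCY, MEMORY_LATENCY)
        print(f"hit rate {hit_rate:.2f}: average access time {t:.2f} units")
```

In this model, the average access time approaches the cache latency as the hit rate rises, which is why moving instructions and data into the cache before they are needed matters so much to overall performance.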
Computer systems also incorporate DRAM as the main memory. The main memory acts as a buffer for data and instructions accessed from even higher-latency, large-capacity bulk storage devices such as hard drives, compact disks, remote systems, or the like. While many improvements have increased data bursting speeds from DRAM devices, the initial latencies can be very large, so incorporation of DRAM for the main memory sacrifices lower initial access times for higher density storage, which can significantly impact the overall system performance.