A computer system is a machine that manipulates data according to a list of computer instructions. A list of computer instructions created to solve a particular problem is generally referred to as a computer program. In general, a computer system sequentially processes the individual instructions that may access, manipulate, and store data. A type of computer instruction known as a ‘branch instruction’ allows the flow of the computer program to vary depending on the input data.
A general-purpose computer system has four main sections: a control unit, an arithmetic and logic unit (ALU), a memory system, and some type of input and output system. The control unit is responsible for the overall operation of fetching computer instructions from the memory system and executing those computer instructions. The arithmetic and logic unit generally consists of a set of computer registers that contain data values, which may be combined and compared in various manners according to the specific computer instructions. The results of comparisons may direct the control unit as to which computer instructions should be executed next. The input and output system provides the computer system with a means of interacting with the outside world.
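By way of illustration only, the fetch-and-execute operation described above may be sketched as a toy machine in Python. The instruction names (LOAD, STORE, ADD, CMP_LT, BRANCH_IF, HALT), the four-register layout, and the sample program are hypothetical choices made for this sketch and do not correspond to any particular processor.

```python
def run(program, memory):
    """Fetch and execute instructions one at a time until HALT (toy model)."""
    registers = [0, 0, 0, 0]  # hypothetical ALU register set
    pc = 0                    # program counter maintained by the control unit
    flag = False              # result of the most recent comparison
    while True:
        op, a, b = program[pc]  # fetch the next instruction
        pc += 1                 # default: continue with the following instruction
        if op == "LOAD":        # registers[a] <- memory[b]
            registers[a] = memory[b]
        elif op == "STORE":     # memory[b] <- registers[a]
            memory[b] = registers[a]
        elif op == "ADD":       # registers[a] <- registers[a] + registers[b]
            registers[a] = registers[a] + registers[b]
        elif op == "CMP_LT":    # compare two registers and record the result
            flag = registers[a] < registers[b]
        elif op == "BRANCH_IF": # branch: the comparison result selects the next instruction
            if flag:
                pc = a
        elif op == "HALT":
            return memory
        else:
            raise ValueError(f"unknown instruction: {op}")


# Hypothetical program: sum the integers 1..n (n >= 1).
# memory[0] holds n, memory[1] receives the result, memory[2] holds the constant 1.
SUM_PROGRAM = [
    ("LOAD", 2, 2),       # r2 = 1
    ("LOAD", 3, 0),       # r3 = n
    ("ADD", 0, 2),        # r0 = r0 + 1   (loop counter)
    ("ADD", 1, 0),        # r1 = r1 + r0  (running sum)
    ("CMP_LT", 0, 3),     # flag = (r0 < r3)
    ("BRANCH_IF", 2, 0),  # repeat from instruction 2 while the counter is below n
    ("STORE", 1, 1),      # memory[1] = r1
    ("HALT", 0, 0),
]

result = run(SUM_PROGRAM, [4, 0, 1])
print(result[1])  # prints 10
```

The same instruction list executes a different number of loop iterations for a different value in memory[0], which is the sense in which a comparison result directs the control unit to the next instruction.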
In most modern computer systems, the control unit, the arithmetic and logic unit (ALU), and a small subset of the memory system are combined into a single entity known as a central processing unit (CPU). Central processing units are generally implemented on a single integrated circuit in order to optimize the processing speed of the computer system, that is, the rate at which the computer system can execute instructions.
The small subset of the memory system that is often implemented on the same integrated circuit die as the control unit and ALU can be accessed very quickly for two reasons. First, that subset of the memory system is generally implemented with a high-speed memory design (generally static random access memory devices, also known as SRAM). Second, it is physically close to the control unit and ALU. This small subset of the memory system is generally referred to as an ‘on-chip cache memory system’. However, since modern operating systems and application programs are generally very large, the vast majority of a memory system (the main memory system) is generally implemented on separate memory integrated circuits that are coupled to the processor.
The main memory system of a modern computer system, implemented on separate integrated circuits, generally uses a different memory circuit design that provides much higher memory density (more memory bits stored per unit of integrated circuit layout area) than the on-chip cache memory. For example, dynamic random access memory (DRAM) devices are generally used to construct main memory systems. These DRAM devices are generally not as fast as the SRAM devices used within on-chip cache memory. Furthermore, simply accessing memory on a separate integrated circuit is generally slower than accessing on-chip cache memory, since communication across the much longer conductors to the external memory device cannot operate at the same high frequency as the CPU core. Thus, when a CPU needs to access data from the off-chip main memory system, the CPU may be forced to stall or to operate at a rate slower than its potential maximum operating rate.
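As a rough illustration of the cost of off-chip accesses, the standard average-memory-access-time relation (cache access time plus miss rate multiplied by the miss penalty) may be sketched in Python. The hit rate and cycle counts used below are assumed values chosen purely for the example; they are not measurements of any particular system.

```python
def average_access_cycles(hit_rate, cache_cycles, main_memory_cycles):
    """Average cycles per memory access for a single-level cache (simple model).

    Every access pays the cache access time; accesses that miss in the
    cache additionally pay the main memory access time.
    """
    miss_rate = 1.0 - hit_rate
    return cache_cycles + miss_rate * main_memory_cycles


# Assumed example values: 95% of accesses hit in a 2-cycle on-chip cache,
# and each miss costs an additional 200 cycles to reach off-chip DRAM.
average = average_access_cycles(hit_rate=0.95, cache_cycles=2, main_memory_cycles=200)
print(f"average access time: about {average:.0f} cycles")
```

Under these assumed values, the 5% of accesses that go off-chip raise the average access time from 2 cycles to roughly 12 cycles, which is why even a small fraction of main memory accesses may limit the effective operating rate of the CPU.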
The speed at which central processing units (CPUs) operate has been continually increasing. Specifically, decreasing the size of semiconductor transistors and decreasing the operating voltages of these transistors have allowed processor clocks to run at faster rates. However, the performance of the external memory systems that provide data to these faster processors has not kept pace. Various techniques such as larger on-chip cache memories, greater parallelism, and larger off-chip cache memories have helped mitigate this issue. Nevertheless, there are still many occasions when a CPU does not achieve its full potential because the external main memory system cannot respond to memory requests as fast as the CPU can issue them. Without a sufficiently fast memory system, a very high-speed CPU is starved of instructions and data to process and is forced to stall while waiting for data from the main memory system. It is therefore desirable to improve the speed of memory systems such that memory systems can handle memory read and write operations as fast as possible.
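The widening gap may be illustrated with a short Python sketch that converts a fixed main memory latency into CPU clock cycles at several clock rates. The 50 ns latency and the clock frequencies are assumed figures used only for this example.

```python
def stall_cycles(memory_latency_ns, cpu_clock_ghz):
    """CPU clock cycles that elapse while one main memory access completes.

    One nanosecond at 1 GHz is one clock cycle, so the cycle count is the
    latency in nanoseconds multiplied by the clock rate in gigahertz.
    """
    return memory_latency_ns * cpu_clock_ghz


# Assumed main memory latency of 50 ns, held constant as the CPU clock rises.
for clock_ghz in (1.0, 2.0, 4.0):
    print(f"{clock_ghz} GHz: {stall_cycles(50.0, clock_ghz)} cycles per access")
```

If the memory latency stays fixed while the clock rate quadruples, each main memory access costs four times as many CPU cycles, so a faster CPU loses proportionally more potential work to each stall.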