1. Field of the Invention
This invention relates generally to microprocessor chips having caches and, more particularly, to microprocessor chips having a selectable size cache.
2. Discussion of the Related Art
A cache memory is a small, fast memory designed to store the most frequently accessed words from a larger, slower memory. The concept of a memory hierarchy with a cache as part of the hierarchy has been associated with the computer art from the earliest discussions of computer design. The idea of a memory hierarchy, with memory speed and memory size being the tradeoff, was discussed in John von Neumann's report on the design of the EDVAC (Electronic Discrete Variable Automatic Computer), which was written in the mid-1940s. The necessity for a cache with faster memory became more apparent because of the competition for the communication channel between the processor and memory, which has become known as the "von Neumann bottleneck." This bottleneck and the inherent slowness of large memory components such as disk or tape drives became even more critical in the early 1960s, when CPU (central processing unit) speeds could exceed memory speeds by a factor of 10. The gap has widened further since the 1960s, as CPU speeds have increased by orders of magnitude more than memory speeds.
Early studies and efforts to increase the speed and performance of computers revealed a property of computer programs, called the principle of locality, which has increased both the need for caches and their value in the design and operation of computers. The principle of locality states that programs access a relatively small portion of their address space at any instant of time. There are two different types of locality: (1) temporal locality (locality in time), which states that if an item is referenced, it will tend to be referenced again soon, and (2) spatial locality (locality in space), which states that if an item is referenced, items in memory with nearby addresses will tend to be referenced soon. The design, implementation, and use of a memory hierarchy take advantage of the principle of locality. A major consideration in the design of a memory hierarchy is that faster memories are more expensive per bit than slower memories. Main memory is implemented using DRAM (dynamic random access memory) chips, while levels closer to the CPU, such as caches, use SRAM (static random access memory) chips. DRAM is less costly per bit than SRAM; however, it is substantially slower. The price difference arises because DRAM uses fewer transistors per bit of memory and, thus, DRAMs are less expensive to manufacture. Another factor in the design of memory systems is that SRAM uses more silicon area than DRAM for the same memory capacity.
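The two types of locality described above can be illustrated with a minimal simulation. The sketch below models a direct-mapped cache, which is only one possible cache organization and is not taken from this description; the function name `hit_rate`, the parameters `num_lines` and `words_per_line`, and the three address traces are all hypothetical choices made for illustration.

```python
import random


def hit_rate(addresses, num_lines=64, words_per_line=4):
    """Return the fraction of accesses that hit in a direct-mapped cache.

    Assumed model (illustrative only): memory is divided into blocks of
    `words_per_line` words, and each block maps to exactly one cache line.
    """
    tags = [None] * num_lines              # tag currently stored in each line
    hits = 0
    for addr in addresses:
        block = addr // words_per_line     # memory block holding this word
        index = block % num_lines          # cache line the block maps to
        tag = block // num_lines           # identifies which block is cached
        if tags[index] == tag:
            hits += 1                      # word is already in the cache
        else:
            tags[index] = tag              # miss: load the block into the line
    return hits / len(addresses)


# Temporal locality: the same 32 words are referenced over and over.
temporal = [addr for _ in range(100) for addr in range(32)]

# Spatial locality: 10,000 consecutive words, each referenced once.
spatial = list(range(10_000))

# Little locality: 10,000 references scattered over a large address space.
rng = random.Random(0)
scattered = [rng.randrange(1_000_000) for _ in range(10_000)]

for name, trace in [("temporal", temporal), ("spatial", spatial),
                    ("scattered", scattered)]:
    print(f"{name:10s} hit rate = {hit_rate(trace):.4f}")
```

Under these assumptions, the repeated small working set misses only on its first pass, the sequential sweep hits on three of every four words because each miss brings in a four-word block, and the scattered trace almost never hits. This is the behavior the principle of locality predicts, although real programs and real caches are considerably more complex.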
Because of the differences in cost and access time, it is advantageous to implement a computer architecture with the memory built as a hierarchy of levels with the faster memory close to the microprocessor and the slower, less expensive memory further away from the microprocessor. The goal has been to present the users with as much memory as possible in the least expensive technology, while providing the speed required by the application. This tradeoff between cost and speed and, thus, performance is one of the major considerations in computer system design.
The three major technologies used to construct memory hierarchies are SRAM, DRAM, and magnetic disk. The access time and price per bit vary widely among these technologies as the table below indicates (using values typical for 1993):
______________________________________________________________
Memory           Typical access
Technology       time                        $ per MByte
______________________________________________________________
SRAM             8-35 ns                     $100-$400
DRAM             90-120 ns                   $25-$50
Magnetic Disk    10,000,000-20,000,000 ns    $1-$2
______________________________________________________________
It is clear from an examination of the values in the above table that a computer system design that incorporates unneeded or underutilized SRAM in a cache has a higher cost than required. On the other hand, too little SRAM in a cache will provide a lower cost device, but at the expense of speed. It has been the tendency in the prior art to design a computer system with a microprocessor having a standard cache sized to meet the needs of a broad spectrum of users, without regard to the specific application for which the microprocessor is intended. Generic designs were used because design costs are high. But a microprocessor designed with a standard cache size is difficult to fit effectively to the price-performance requirement of a given application or market. On the other hand, it is too expensive, in terms of design and development costs, to design a separate microprocessor for each price-performance demand of the market. Since the performance of a microprocessor depends upon the cache size and cache parameters, it is desirable to be able to produce microprocessors of varying levels of performance and size from the same basic design without the high cost associated with each individual new design effort.
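The tradeoff between cache cost and speed can be made concrete with a short calculation based on the table values. The following is a rough sketch only: the candidate cache sizes and their hit rates are hypothetical figures chosen for illustration, the weighted-average access time is a simplified model, and applying a per-MByte SRAM price to a cache ignores silicon area and other manufacturing factors.

```python
# Values taken from the table above (typical for 1993).
SRAM_NS = (8, 35)              # SRAM access time range, in ns
DRAM_NS = (90, 120)            # DRAM access time range, in ns
SRAM_USD_PER_MB = (100, 400)   # SRAM price range, in $ per MByte


def sram_cost_usd(cache_kbytes, usd_per_mbyte):
    """Cost of `cache_kbytes` of SRAM at a given price per MByte."""
    return cache_kbytes / 1024 * usd_per_mbyte


def effective_access_ns(hit_rate, cache_ns, memory_ns):
    """Simplified average access time: hits are served at cache speed,
    misses at main-memory speed (assumed model, illustrative only)."""
    return hit_rate * cache_ns + (1 - hit_rate) * memory_ns


# Hypothetical candidates: (cache size in KBytes, assumed hit rate).
candidates = [(8, 0.90), (16, 0.94), (32, 0.96)]

for kbytes, rate in candidates:
    cost = sram_cost_usd(kbytes, SRAM_USD_PER_MB[1])        # high-end price
    time = effective_access_ns(rate, SRAM_NS[0], DRAM_NS[0])
    print(f"{kbytes:3d} KBytes: ${cost:6.3f} of SRAM, "
          f"about {time:5.2f} ns per access")
```

With these assumed hit rates, each doubling of the cache doubles the SRAM cost while yielding a progressively smaller improvement in average access time, which is why the appropriate cache size depends on the price-performance target of the intended market.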
What is needed is a design which will allow the designer to select a cache size which is optimum, in terms of the balance between cost and performance, for a target market. The designer should be able to consider factors such as the size of the application programs that will likely be used, the number of different application programs that will likely be used, the necessity for speed, and the cost constraints of the system. Ideally, the designer would be able to change the cache size by making minimal changes to a basic design, thereby easily creating a new version of that design.