1. Field of the Invention
The present invention pertains to the field of data storage. More particularly, this invention relates to cache memory subsystems.
2. Background
Computer technology is continuously advancing, resulting in microprocessors which operate at faster and faster speeds. In order to take full advantage of these higher-speed microprocessors, data storage capabilities must keep up with the increased speed. High speed memory, however, is very expensive, with the cost being further amplified by the large amount of memory which many modern software programs require.
One solution to the problem of expensive memory is that of a cache memory subsystem. A cache memory subsystem is a memory unit which is generally much smaller than the system memory unit but which operates at a significantly higher speed than the system memory. The goal of the cache memory is to contain the information (whether it be data or instructions) that the microprocessor is going to use next. This information can then be returned to the microprocessor much more quickly, due to the higher speed of the cache memory.
The operation of cache memory subsystems varies; in general, however, data is swapped between the system memory and the cache memory. When the microprocessor requests information from memory (for example, either an instruction it is going to execute or data related to an instruction), it sends the memory address of the desired information to the cache memory. If the cache memory contains the information, it issues a signal to the microprocessor indicating so; this signal is generally termed a "hit." The cache memory then returns the requested information to the microprocessor. Thus, the microprocessor receives the requested information more quickly due to the faster speed of the cache memory.
If, however, the cache memory does not contain the information requested by the microprocessor, then a signal, generally termed a "miss," is returned to the microprocessor. The miss indicates to the microprocessor that it must retrieve the information from the slower system memory. Alternatively, the cache memory controller may retrieve the information from the system memory, and return it to the microprocessor. Regardless of which subsystem retrieves the information from the system memory, the retrieved information is also stored in the cache memory. In order to store this information in the cache memory, however, other data in the cache may need to be overwritten. That is, other information may be contained in the location the new information is to be written into. In some systems, this situation is resolved by transferring the information stored in a particular location of the cache memory into system memory and transferring the information stored in system memory into that particular location of the cache memory.
Whether the cache memory must transfer the information in a particular location to the system memory is also dependent on the cache policy employed. For example, some cache policies transfer the information to the system memory whenever the information in the cache is updated. Thus, when retrieving new information from the system memory, information in the cache need not be transferred to the system memory.
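By way of illustration only, the two kinds of cache policy described above may be sketched as follows. The sketch assumes a hypothetical cache holding a single location, models system memory as a dictionary, and assumes that a write first brings the addressed information into the cache. The names `OneLineCache`, `write_through`, and `demo` are illustrative and do not appear in any actual system. The policy that updates system memory on every write is commonly called write-through, and the policy that defers the transfer until replacement is commonly called write-back.

```python
class OneLineCache:
    """Hypothetical single-location cache, just enough to contrast two write policies."""

    def __init__(self, memory, write_through):
        self.memory = memory              # system memory: address -> value
        self.write_through = write_through
        self.address = None               # address currently held in the cache
        self.value = None
        self.dirty = False                # True if the cache holds a newer value than memory
        self.writebacks = 0               # transfers to memory performed at replacement time

    def _load(self, address):
        if self.address == address:
            return                        # hit: nothing to replace
        if self.address is not None and self.dirty:
            # Deferred policy: the old information must go to memory first.
            self.memory[self.address] = self.value
            self.writebacks += 1
        self.address = address
        self.value = self.memory[address]
        self.dirty = False

    def read(self, address):
        self._load(address)
        return self.value

    def write(self, address, value):
        self._load(address)               # assumption: a write allocates the location
        self.value = value
        if self.write_through:
            self.memory[address] = value  # memory is updated on every write
        else:
            self.dirty = True             # memory is updated only on replacement


def demo(write_through):
    memory = {0x10: 1, 0x20: 2}
    cache = OneLineCache(memory, write_through)
    cache.write(0x10, 99)
    before = memory[0x10]                 # memory contents before the replacement
    cache.read(0x20)                      # forces the location to be replaced
    return before, memory[0x10], cache.writebacks


print(demo(write_through=True))           # (99, 99, 0)
print(demo(write_through=False))          # (1, 99, 1)
```

Under the first policy the replacement requires no transfer to system memory, because system memory was already updated when the write occurred. Under the second policy one transfer is performed when the location is replaced.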
The cache memory is generally much smaller than the system memory. Thus, only a portion of the memory address, generally referred to as the "cache index," is used as an index into the cache memory. Because the index is only part of the address, multiple system memory addresses reference the same slot in the cache memory. A second portion of the memory address, generally referred to as the "tag portion," is used to determine whether the information stored in that slot is the requested information. When the microprocessor requests a memory address that corresponds to a slot in the cache memory already used by another cache line, a conflict occurs.
Cache memory subsystems are frequently divided into multiple cache lines, with the cache index portion of the memory address corresponding to one of these cache lines. Each cache line includes multiple bytes, with the particular byte requested by the microprocessor being indicated in the memory address as an offset. The system memory is also often divided into the same line size as the cache memory. These lines in the system memory are referred to as data lines.
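By way of illustration only, the division of a memory address into a tag portion, a cache index, and an offset may be sketched as follows. The field widths (a 5-bit offset for 32-byte cache lines and an 8-bit index for 256 cache lines) and the name `split_address` are assumptions chosen for the example and are not taken from any particular system.

```python
OFFSET_BITS = 5   # assumed: 32 bytes per cache line
INDEX_BITS = 8    # assumed: 256 cache lines


def split_address(address):
    """Split a memory address into (tag, index, offset)."""
    offset = address & ((1 << OFFSET_BITS) - 1)                 # byte within the line
    index = (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)  # which cache line
    tag = address >> (OFFSET_BITS + INDEX_BITS)                 # remaining upper bits
    return tag, index, offset


print(split_address(0x12345))   # (9, 26, 5)
print(split_address(0x14345))   # (10, 26, 5): same index, different tag
```

The two example addresses select the same cache line and differ only in their tag portions, so they cannot both occupy that line at the same time.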
One type of cache memory subsystem for resolving cache line conflicts is known in the art as a direct-mapped cache. In a direct-mapped cache, when a conflict occurs, the cache line stored in the cache is transferred to system memory and the data line corresponding to the request from the microprocessor is transferred to that location in the cache memory. Such a caching system has several advantages. First, the hardware required to implement the cache is relatively simple. The tag stored at the indexed location is compared to the tag portion of the requested address; the cache line is returned to the microprocessor if they match, and the data is retrieved from system memory if they do not.
Second, the cost of the cache memory subsystem is relatively small. The reason for this is twofold. The reduced logic complexity discussed above reduces the financial cost of the system. In addition, the low complexity allows the cache memory to utilize static random access memory (SRAM) cells. SRAMs are widely available, and are inexpensive relative to many other types of memory cells.
The direct-mapped cache, however, performs poorly under certain circumstances. For example, memory address A and memory address B may both reference the same cache location, location X. If the microprocessor initially requests memory address A, then the data in address A is stored in location X. If the microprocessor requests address B on the next clock cycle, then the data in location X is returned to the system memory (at address A), and the data in address B is stored in location X. The next request by the microprocessor may then be for address A again. Thus, the data in location X is returned to the system memory (at address B), and the data in address A is again stored in location X. Therefore, a very poor hit ratio (that is, the number of cache hits relative to the total number of accesses to the cache) will occur if the microprocessor makes requests in the following order: address A, address B, address A, address B, address A, address B, etc. Thus, it can be seen that the performance of a direct-mapped cache suffers when the microprocessor alternately requests addresses A and B.
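By way of illustration only, the alternating-request behavior described above may be reproduced with a small simulation. The sketch tracks only the tag held at each location rather than the data itself. The cache size of eight lines, the line addresses 3 and 11, and the names `DirectMappedCache` and `run` are assumptions made for the example.

```python
class DirectMappedCache:
    """Tag-only model of a direct-mapped cache (no data is stored)."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.tags = [None] * num_lines   # tag currently held at each location
        self.hits = 0
        self.accesses = 0

    def access(self, line_address):
        """Look up a line address; return True on a hit, False on a miss."""
        index = line_address % self.num_lines
        tag = line_address // self.num_lines
        self.accesses += 1
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag           # miss: the new line replaces the old one
        return False

    def hit_ratio(self):
        return self.hits / self.accesses if self.accesses else 0.0


def run(trace, num_lines=8):
    cache = DirectMappedCache(num_lines)
    for line_address in trace:
        cache.access(line_address)
    return cache.hit_ratio()


A, B = 3, 11                             # both map to location 3 when there are 8 lines
print(run([A, B, A, B, A, B]))           # 0.0: every request is a miss
print(run([3, 4, 3, 4, 3, 4]))           # non-conflicting addresses hit after the first two misses
```

In this model the alternating trace yields a hit ratio of zero, whereas the same number of requests to two addresses that map to different locations hits on four of the six requests.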
A second type of cache memory which resolves this performance disadvantage is a two-way set-associative cache memory. A two-way set-associative cache includes two "ways," which can be thought of as two direct-mapped caches operating together (for this reason, a direct-mapped cache is sometimes referred to as a one-way cache). In a two-way cache, if a conflict occurs, then the data stored in the first way of the cache is transferred to the second way of the cache, and the new data is retrieved from the system memory into the first way of the cache. Thus, both data lines are stored in the cache memory. Therefore, if the microprocessor continuously switches between requesting address A and address B as described above, each request after the initial two misses will hit the cache, resulting in a higher hit ratio. Both data lines remain in the cache until a second conflict occurs; that is, until a third address which accesses the same location is requested.
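By way of illustration only, the two-way behavior described above may be sketched as a tag-only simulation. On a miss, the tag in the first way is moved to the second way and the new tag is placed in the first way, as described. The sketch assumes that a hit leaves both ways unchanged, which the description above does not specify. The set count of eight, the line addresses 3, 11, and 19, and the names `TwoWayCache` and `run_two_way` are likewise assumptions.

```python
class TwoWayCache:
    """Tag-only model of a two-way set-associative cache."""

    def __init__(self, num_sets):
        self.num_sets = num_sets
        # ways[0] is the first way, ways[1] is the second way.
        self.ways = [[None] * num_sets, [None] * num_sets]
        self.hits = 0
        self.accesses = 0

    def access(self, line_address):
        """Look up a line address; return True on a hit, False on a miss."""
        index = line_address % self.num_sets
        tag = line_address // self.num_sets
        self.accesses += 1
        if tag in (self.ways[0][index], self.ways[1][index]):
            self.hits += 1               # assumption: a hit does not reorder the ways
            return True
        # Miss: move the first way's line to the second way, then fill the first way.
        self.ways[1][index] = self.ways[0][index]
        self.ways[0][index] = tag
        return False


def run_two_way(trace, num_sets=8):
    cache = TwoWayCache(num_sets)
    for line_address in trace:
        cache.access(line_address)
    return cache.hits, cache.accesses


A, B, C = 3, 11, 19                      # all three map to set 3 when there are 8 sets
print(run_two_way([A, B, A, B, A, B]))   # (4, 6): only the first two requests miss
print(run_two_way([A, B, C, A]))         # (0, 4): the third address displaces A
```

The second trace shows the limit noted above: once a third address maps to the same location, one of the two resident lines is displaced and a later request for it misses.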
Thus, it can be seen that the two-way cache resolves some of the performance problems in the direct-mapped cache. However, this increased performance has several costs. First, the logic complexity to operate a two-way cache is greater. Additional logic must be included to monitor both cache lines in both ways, and return the proper data when a request from the microprocessor hits the cache. Second, two-way caches generally use customized memory cells, rather than standard SRAMs. Thus, the financial cost of the cache system is increased.
A third cost consideration is that power consumption in a two-way cache is greater than that of a direct-mapped cache. The direct-mapped cache accesses only a single cache line to determine if a hit occurs. However, in a two-way cache, both cache lines are accessed to determine if a hit occurs, with the proper line being returned to the microprocessor if a hit does occur. Thus, it can be seen that the two-way cache utilizes approximately twice the power of the direct-mapped cache for each access, since it is accessing twice as many cache lines.
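By way of illustration only, the comparison above may be expressed as a count of cache lines accessed. The count is a rough proxy for power and not a measurement; the sketch assumes that every way at the indexed location is read on every access, and the name `lines_probed` is hypothetical.

```python
def lines_probed(num_accesses, ways):
    """Cache lines read while checking for hits, assuming every way
    at the indexed location is read on every access."""
    return num_accesses * ways


requests = 6
direct_mapped = lines_probed(requests, ways=1)
two_way = lines_probed(requests, ways=2)
print(direct_mapped, two_way, two_way / direct_mapped)   # 6 12 2.0
```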
Thus, it would be advantageous to provide a cache memory subsystem which operates quickly to take advantage of the increased speed of modern microprocessors. The present invention provides such a solution.
Furthermore, it would be advantageous to provide a cache memory subsystem which has both the advantages of a direct-mapped cache and the increased performance of a two-way cache. The present invention provides a cache memory characterized as being less complex, having a lower cost, and using less power than the two-way cache. The present invention also provides higher hit ratios as compared to conventional direct-mapped caches.