1. Technical Field
This invention relates generally to computer systems having data and/or instructions stored in a memory, and more particularly to such systems in which the data and/or instructions can also be temporarily stored from the memory into a cache.
2. Description of the Prior Art
Most modern computer systems include a processor and memory, among other components. Data and instructions required for processing by the processor are retrieved from the memory. The processor then stores the results of its processing back into the memory. Among different types of processors in a given computer system, there usually is a central processing unit (CPU), which is the main processor for the system.
Memory access by the processor, however, can be slow. Generally, there is a latency associated with each kind of memory, which refers to the length of time from when a processor first requests data or an instruction stored in the memory, to when the processor actually receives the data or the instruction from the memory. Different memory locations within a computer system may have different latencies. Usually the processor itself can process instructions and perform actions faster than the memory can provide data and instructions to the processor. This leads to a bottleneck within the computer system.
To alleviate this problem, many computer systems include one or more caches. A memory cache, or processor cache, is a memory bank that bridges the main memory and the CPU. It is faster than the main memory and allows instructions to be executed and data to be read at higher speeds. Instructions and data may be transferred to the cache in blocks, using a look-ahead algorithm. The more sequential the instructions in the routine being accessed, and the more sequential the order of the data being read, the greater the chance that the next desired item will already be in the cache, and the greater the improvement in performance. Data reuse also contributes to cache effectiveness. The more often data is reused, the higher the probability that it will be in the cache. If data is used infrequently, or if a long time passes between its recurring uses, there is a low probability that it will remain in the cache.
Two common types of caches are known as level 1 (L1) cache and level 2 (L2) cache. An L1 cache is a memory bank built into the processor itself. An L2 cache is a secondary staging area that feeds the L1 cache, and is separate from the actual processor. Increasing the size of the L2 cache may speed up some applications but may have no effect on others. An L2 cache may be built into the same chip as the processor, reside on a separate chip in a multi-chip package module, or be a separate bank of chips. Caches are typically static random-access memory (SRAM), whereas main memory is generally some variety of slower, denser dynamic random-access memory (DRAM). Caches can also be divided into two or more sets. Any given line from memory can typically be stored in only one of the sets. Such an organization limits the number of locations that must be checked to determine whether a line is present in the cache. When a line is added to the cache, a line in the same cache set must be chosen for replacement.
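The restriction of each memory line to a single set can be illustrated with a minimal sketch. The modulo mapping and the names set_index and lookup are assumptions made for illustration only; the text above does not specify how a line is assigned to a set.

```python
def set_index(line_address, num_sets):
    """Map a line address to the one set allowed to hold it.

    Assumption: a simple modulo mapping, chosen only for illustration.
    """
    return line_address % num_sets


def lookup(cache_sets, line_address):
    """Search only the single candidate set rather than the whole cache."""
    candidates = cache_sets[set_index(line_address, len(cache_sets))]
    return line_address in candidates


# Four sets, each listing the line addresses currently cached in it.
cache_sets = [[0, 8], [1, 5], [2], [7, 11]]

found = lookup(cache_sets, 5)    # line 5 maps to set 1, which holds it
missing = lookup(cache_sets, 9)  # line 9 also maps to set 1, which lacks it
```

Only the lines in one set are examined on each lookup, and a newly added line can displace only a line from that same set.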
Standard cache allocation policies usually replace the contents of the cache set without regard to memory utilization or latency. For example, a least recently used (LRU) policy may replace the data or instruction that was least recently used with new data or a new instruction that has been retrieved from memory. Such policies do not concern themselves with how often the newly stored data or instruction may actually be accessed, nor with the latency of retrieving this data or instruction from the memory itself. This can lead to a slowdown in system performance, due to ineffective use of the cache.
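A least recently used policy for one cache set might be sketched as follows. The class name LRUSet and the fetch callback are hypothetical and are not taken from any particular implementation.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set that evicts the least recently used line when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> contents, least recent first

    def access(self, address, fetch):
        """Return the line at address, fetching and caching it on a miss."""
        if address in self.lines:
            self.lines.move_to_end(address)  # mark as most recently used
            return self.lines[address]
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)   # evict the least recently used line
        self.lines[address] = fetch(address)  # hypothetical memory read
        return self.lines[address]


def read_memory(address):
    """Stand-in for a memory read; returns made-up contents."""
    return address * 10


lru = LRUSet(capacity=2)
for address in (1, 2, 1, 3):  # the access to 3 evicts 2, the least recently used
    lru.access(address, read_memory)
```

The eviction decision depends only on recency of use. Neither the expected reuse of the incoming line nor the latency of the memory it came from plays any part.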
For example, in streaming data applications, such as streaming video or streaming audio applications, the data retrieved from the memory is usually used only once, and then not used again. A typical cache will dutifully cache this data, however, as it is retrieved by the processor. This negates the usefulness of the cache, because the cache is meant to hold data that the processor will use often. Online transaction processing also tends to have large sets of data that are used infrequently and a small set of data that is used often.
As another example, some memory of the system may be high-speed memory, with relatively low latencies. Caching the contents of such memory, as compared to caching the contents of higher-latency memory, may cause an overall decrease in system performance. This is because the performance benefits of caching lower-latency memory are less than those of caching higher-latency memory. A typical cache, however, does not discern the latency of memory when caching data or instructions from the memory.
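The difference in benefit can be made concrete with a small calculation. The function cycles_saved and all latency figures below are hypothetical and serve only to illustrate the comparison.

```python
def cycles_saved(hits, memory_latency, cache_latency):
    """Cycles saved when `hits` accesses are served by the cache instead of memory."""
    return hits * (memory_latency - cache_latency)


# Hypothetical figures: the same number of cache hits on two kinds of memory.
saved_low_latency = cycles_saved(hits=1000, memory_latency=20, cache_latency=5)
saved_high_latency = cycles_saved(hits=1000, memory_latency=200, cache_latency=5)
```

Under these assumed figures, a cache line devoted to the higher-latency memory saves thirteen times as many cycles per hit as one devoted to the lower-latency memory.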
For these described reasons, as well as other reasons, there is a need for the present invention.
The invention relates to caching memory contents differently based on the region to which the memory has been partitioned or allocated. In a method of the invention, a first region of a first line of memory to be cached is determined. The memory has a number of regions, including the first region, over which the lines of memory, including the first line, are partitioned. Each region has a first variable, and each first variable has a corresponding second variable. Based on a comparison of the first variable for any region with its corresponding second variable, one such region is selected as a second region. A line from the lines of the memory currently stored in the cache and partitioned to the second region is selected as the second line. The second line is replaced with the first line in the cache, and the first variable for at least one of the regions is changed.
In a system of the invention, there is a cache, a number of regions, a first variable for each region, a corresponding second variable for each first variable, and a mechanism. The cache is for caching lines of memory. A first line of memory is to be substituted for a second line of memory currently stored in the cache. The lines of memory are partitioned over the regions. The regions include a first region to which the first line is partitioned, and a second region to which the second line is partitioned. The first variable for each region tracks the number of lines partitioned to this region that are currently stored in the cache. The corresponding second variable for the first variable for each region indicates a desirable maximum number of lines partitioned to this region that should be stored in the cache. In response to determining that the first variable for any region is greater than its corresponding second variable, the mechanism selects one such region as the second region, and a line from this region as the second line.
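The selection of the second region and the second line can be sketched as below. This is an illustrative sketch under stated assumptions, not the claimed mechanism: the class RegionCache, the region_of callback, the choice of the first over-limit region, the choice of the oldest line within that region, and the fallback used when no region exceeds its limit are all assumptions, since the text above does not specify them.

```python
class RegionCache:
    """Cache whose replacement choice depends on per-region line counts."""

    def __init__(self, capacity, limits, region_of):
        self.capacity = capacity
        self.limits = dict(limits)                 # "second variable" per region
        self.counts = {r: 0 for r in self.limits}  # "first variable" per region
        self.region_of = region_of                 # hypothetical: address -> region
        self.lines = {}                            # cached address -> region, oldest first

    def insert(self, address):
        """Cache the first line; return the replaced second line, if any."""
        if address in self.lines:
            return None  # already cached, nothing to replace
        victim = None
        if len(self.lines) >= self.capacity:
            victim = self._choose_victim()
            self.counts[self.lines.pop(victim)] -= 1
        region = self.region_of(address)
        self.lines[address] = region
        self.counts[region] += 1
        return victim

    def _choose_victim(self):
        over = [r for r in self.counts if self.counts[r] > self.limits[r]]
        if over:
            second_region = over[0]  # assumption: take the first over-limit region
            # Assumption: replace the oldest cached line of that region.
            return next(a for a, r in self.lines.items() if r == second_region)
        # Assumption: no fallback is described here; evict the oldest line.
        return next(iter(self.lines))


# Hypothetical partition: addresses below 100 belong to a "stream" region.
cache = RegionCache(
    capacity=4,
    limits={"stream": 1, "hot": 3},
    region_of=lambda a: "stream" if a < 100 else "hot",
)
for address in (1, 2, 100, 101):
    cache.insert(address)
evicted = cache.insert(102)  # "stream" holds 2 lines against a limit of 1
```

In this run the "stream" region is over its desired maximum, so one of its lines is replaced even though the incoming line belongs to the "hot" region, and both regions' counts are updated.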
An article of manufacture of the invention includes a computer-readable medium and means in the medium. The means in the medium is for substituting a first line of a memory for a second line of the memory currently stored in a cache. The means accomplishes this by determining that a first variable for any of a number of regions over which lines of the memory are partitioned is greater than a corresponding second variable. In response, the means selects as a second region one of the regions for which the first variable is greater than the corresponding second variable, and selects as the second line one of the lines currently stored in the cache and partitioned to the second region.
Other features and advantages of the invention will become apparent from the following detailed description of the presently preferred embodiment of the invention, taken in conjunction with the accompanying drawings.