1. Technical Field
The present invention relates generally to a cache in a data processing system, and more particularly to a method and apparatus for dynamic cache memory allocation among data sets.
2. Description of the Related Art
A computer system typically includes an information processor coupled to a hierarchical staged storage system. The hardware can dynamically allocate parts of memory within the hierarchy for addresses deemed most likely to be accessed soon. The type of storage employed in each staging location relative to the processor is normally determined by balancing requirements for speed, capacity, and cost. Computer processes continually refer to this storage over their executing lifetimes, both reading from and writing to the staged storage system. These references include self-referencing as well as references to every type of other process, overlay or data. It is well known in the art that data storage devices using high-speed random access memories (RAM) can be referenced orders of magnitude faster than high-volume direct-access storage devices (DASDs) using rotating magnetic media. Such electronic RAM storage relies upon high-speed transfer of electrical charges over small distances, while DASDs typically operate mechanically by rotating a data storage position on a magnetic disk with respect to read-write heads. The relative cost of a bit of storage for DASD and RAM makes it necessary to use DASD for bulk storage and electronic RAM for processor internal memory and caching.
A commonly employed memory hierarchy includes a special, high-speed memory known as cache, in addition to the conventional memory which includes main memory and bulk memory. Cache memory reduces the apparent access times of the slower memories by holding the words that the CPU is most likely to access. For example, a computer may use a cache memory that resides between the external devices and main memory, called a disk cache, or between main memory and the CPU, called a CPU cache.
The transfer of operands or instructions between main store and CPU cache, or between bulk storage and the disk cache, is usually effected in fixed-length units called blocks. The block size may be a track, sector, line, byte, etc., as is known in the art. When the necessary data can be retrieved from the cache, the access is called a "hit", and when the necessary data cannot be found in the cache, the access is called a "miss".
A high-speed CPU cache enables relatively fast access to a subset of data and instructions which were previously transferred from main storage to the cache, and thus improves the speed of operation of the data processing system. Cache memory may also be used to store recently accessed blocks from secondary storage media such as disks. This cache memory could be processor buffers contained in main memory or a separate disk cache memory located between secondary and main storage.
A disk cache is a memory device using a semiconductor RAM or SRAM and is designed to eliminate an access gap between a high-speed main memory and low-speed large-capacity secondary memories such as magnetic disk units. The disk cache is typically in a magnetic disk controller arranged between the main memory and a magnetic disk unit, and serves as a data buffer.
The principle of a disk cache is the same as that of a central processing unit (CPU) cache. When the CPU accesses data on disk, the necessary blocks are transferred from the disk to the main memory. At the same time, they are written to the disk cache. If the CPU subsequently accesses the same blocks, they are transferred from the disk cache and not from the disk, resulting in substantially faster accesses.
Since the disk cache capacity is smaller than that of the disk drive, not all data blocks that may be required by the CPU are always stored in the disk cache. In order for a new block to be loaded when the disk cache is full, blocks must be removed from the cache to make room for newly accessed data.
To enable retrieval of information from the cache, a list of entries associated with the cache is maintained in a directory which is an image of the cache. Each block residing in the cache has its tag or address, as well as other useful information, stored in an entry in the directory. Once the cache has been filled with data blocks, a new data block can only be stored in the cache if an old block is deleted or overwritten. Certain procedures are necessary to select blocks as candidates for replacement, and to update the directory after a change of the cache contents.
A well-known and commonly used disk cache replacement algorithm is the Least Recently Used (LRU) algorithm. According to the LRU algorithm, the block which has stayed in the cache for the longest period without being referenced is selected as the least necessary block. If a cache hit occurs as a result of the directory search, the entry in the cache directory corresponding to the "hit" cache block is moved to the Most Recently Used (MRU) position in the list of cache entries maintained by the directory. If a miss occurs in a disk cache having no empty space, cache memory must be assigned for the new staging, so the least necessary data is removed to obtain an empty space. In that case, the LRU entry, which would be in the bottom position of a linked LRU list, is deleted from the list and a new entry is generated in the MRU position, the new entry corresponding to the block loaded into the cache as a result of the cache miss.
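The hit and miss handling described above can be illustrated with a minimal sketch. It assumes an ordered directory whose front is the LRU position and whose back is the MRU position; the names `LRUCache`, `access`, and `load_block` are hypothetical and are not taken from any particular controller.

```python
from collections import OrderedDict


class LRUCache:
    """Cache directory ordered from the LRU entry (front) to the MRU entry (back)."""

    def __init__(self, capacity):
        self.capacity = capacity          # number of blocks the cache can hold
        self.directory = OrderedDict()    # block address -> block data

    def access(self, address, load_block):
        """Return (data, hit). `load_block` is a hypothetical stand-in for staging from disk."""
        if address in self.directory:
            # Hit: move the entry to the MRU position.
            self.directory.move_to_end(address)
            return self.directory[address], True
        if len(self.directory) >= self.capacity:
            # Miss with a full cache: delete the LRU entry to obtain an empty space.
            self.directory.popitem(last=False)
        data = load_block(address)
        # The newly staged block becomes the MRU entry.
        self.directory[address] = data
        return data, False


if __name__ == "__main__":
    cache = LRUCache(capacity=2)
    stage = lambda addr: "data-" + addr
    for addr in ["A", "B", "A", "C"]:
        print(addr, cache.access(addr, stage)[1])
    print(list(cache.directory))
```

In this run, block "B" is the one replaced when "C" is staged, because "A" was referenced more recently.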
Although the LRU scheme performs well and is widely accepted, it has limitations. To effectively control the use of memory, it is necessary to distinguish among the various individual groups or types of data that may attempt to use the data cache. For example, in the extreme case, the cache may be "flushed" by a rapid succession of misses to data that has no locality. In this case, new data that does not benefit from the use of cache memory replaces older data which may have profited from the cache storage. Such a situation can arise under an LRU scheme and it tends to limit the effectiveness of the cache in cases where poor locality is present, especially if the cache size is small.
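The flushing effect can be shown with a small simulation. This is only an illustrative sketch under assumed conditions: a four-block cache, a hypothetical "hot" data set that re-references the same four blocks, and a hypothetical "scan" data set that never touches a block twice.

```python
from collections import OrderedDict


def lru_hits_by_data_set(trace, capacity):
    """Replay (data_set, block) references through one shared LRU cache.

    Returns a dict mapping each data set to its number of hits.
    """
    cache = OrderedDict()   # (data_set, block) keys, LRU first and MRU last
    hits = {}
    for data_set, block in trace:
        key = (data_set, block)
        hits.setdefault(data_set, 0)
        if key in cache:
            cache.move_to_end(key)         # hit: move to the MRU position
            hits[data_set] += 1
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # miss: replace the LRU block
            cache[key] = None
    return hits


# The "hot" data set re-references the same 4 blocks in every round.
hot_round = [("hot", b) for b in range(4)]
alone = hot_round * 3

# The "scan" data set touches 4 new blocks per round and has no locality.
mixed = []
for i in range(3):
    mixed += hot_round
    mixed += [("scan", 4 * i + j) for j in range(4)]

if __name__ == "__main__":
    print(lru_hits_by_data_set(alone, capacity=4))
    print(lru_hits_by_data_set(mixed, capacity=4))
```

With the cache to itself, the "hot" data set hits on every reference after the first round. Once the scan is interleaved, each round of scan misses replaces all four hot blocks before they are referenced again, so neither data set obtains a hit.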
Since contention for memory and staging path resources can interfere with the effectiveness of the cache, cache controllers must manage these resources so as to mitigate the effects of contention. This is accomplished in some prior art by deciding, for each defined group of data which reaches the cache, whether this group will be allowed to use the cache memory. The groups of data as used in the prior art and as are used in the present invention are called "data sets" and are merely any logical grouping of data which facilitates memory allocation. As examples, data sets can be defined in terms of files of application data, ranges of device cylinders, a number of tracks, sectors, or lines, groups of data utilized by a single file or application, or by a functional distinction between groups, such as between instructions and data.
In some prior art storage controllers, data sets which are benefitting from use of the cache are allowed access to the memory cache, and data sets which are not benefitting from the cache are not staged. However, in realistic environments, there is a wide range in cache locality behavior even in groups of data which benefit from using the cache. This makes it highly desirable to control, not just whether a specific group of data will be permitted to use the cache memory, but how much cache memory that group will be permitted to use. In this way, larger amounts of memory can be provided to support those data sets which most benefit from the extra cache storage.
Various techniques have been proposed to accomplish this type of memory control, but their complexity makes them impractical to implement. These techniques control the cache memory by partitioning it, so that each group of data is assigned the use of a particular partition. The partitions may be permanent, in which case complex analysis is required in advance in order to set the partition sizes, or the partition sizes may be dynamically controlled, in which case complex statistical data gathering and boundary-adjustment algorithms are required.
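For illustration only, the permanent-partition approach might be sketched as follows, assuming each data set has its own LRU list and a size fixed in advance. The class name `PartitionedCache` and the partition sizes are hypothetical. The sketch leaves out the hard part that the text identifies, namely choosing the sizes or adjusting the boundaries.

```python
from collections import OrderedDict


class PartitionedCache:
    """Cache split into fixed-size partitions, one per data set, each managed by LRU."""

    def __init__(self, partition_sizes):
        # partition_sizes: data set name -> number of blocks, chosen in advance.
        self.sizes = dict(partition_sizes)
        self.partitions = {name: OrderedDict() for name in self.sizes}

    def access(self, data_set, block):
        """Return True on a hit, False on a miss."""
        size = self.sizes[data_set]
        partition = self.partitions[data_set]
        if block in partition:
            partition.move_to_end(block)      # hit: move to the MRU position
            return True
        if size == 0:
            return False                      # this data set is never staged
        if len(partition) >= size:
            partition.popitem(last=False)     # replace only within this partition
        partition[block] = None
        return False


if __name__ == "__main__":
    cache = PartitionedCache({"hot": 4, "scan": 1})
    for b in range(4):
        cache.access("hot", b)
    for b in range(100):
        cache.access("scan", b)               # cannot displace any "hot" block
    print([cache.access("hot", b) for b in range(4)])
```

Because replacement is confined to each partition, the scan no longer flushes the other data set. The benefit depends entirely on the sizes being well chosen for the workload.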
Therefore, it would be desirable to provide a memory allocation technique for a cache controller which dynamically adjusts the amount of cache memory a particular data set will be permitted to use, but which requires neither partitioning of memory nor the associated complex statistical data gathering and boundary-adjustment calculations.