1. Technical Field
The present invention relates generally to an improved data processing system, and in particular, to an improved method and apparatus for caching data in a memory. Still more particularly, the present invention relates to a method and computer system design for handling bad victim selection during LRU victim selection at a caching mechanism.
2. Description of Related Art
Most early data processing systems consisted basically of a central processing unit, a main memory, and some sort of secondary input/output (“I/O”) capability. In these earlier systems, the main memory was the limiting element. Typically, the main memory was designed first and the CPU was then created to match the speed of the memory. This matching was performed to optimize the processing speed and is necessary even with today's high speed computers. Over time, logic circuit speeds increased along with the capacity requirements of main memory. With the need for increasing capacity in the main memory, the speed of the main memory could not keep up with the increasing speed of the CPU. Consequently, a gap developed between the main memory and the processor cycle time, which resulted in un-optimized processing speeds. As a result, a cache memory was developed to bridge the gap between the memory and the processor cycle time.
Using a cache to bridge the performance gap between a processor and main memory has become important in data processing systems of various designs, from personal computers to work stations to data processing systems with high performance processors. A cache memory is an auxiliary memory that provides a buffering capability through which a relatively slow main memory can interface with a processor at the processor's cycle time to optimize the performance of the data processing system. Requests are first sent to the cache to determine whether the data or instructions requested are present in the cache memory. A “hit” occurs when the desired information is found in the cache. A “miss” occurs when a request or access to the cache does not produce the desired information. In response to a miss, one of the cache “lines” is replaced with a new one. The method to select a line to replace is called a replacement policy.
A number of different schemes for organizing a cache memory exist. For example, a fully associative mapping organization may be employed whereby a data address may exist in any location in the cache, or a direct mapping scheme may be employed in a cache memory whereby a data address may exist in only one location in the cache. A set associative scheme may be employed by partitioning the cache into distinct classes of lines, wherein each class contains a small fixed number of lines. This approach is somewhere between a direct mapped and a fully associative cache. The classes of lines are usually referred to as “congruence classes.” The lines within a congruence class are usually referred to as sets; the number of sets indicates the number of locations in which an address can reside within a congruence class of a set associative cache.
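As a minimal sketch of the set associative organization described above, the following Python fragment splits a byte address into a tag, a congruence class index, and a byte offset. The line size, class count, and number of members per class are hypothetical values chosen only for illustration, as is the function name split_address; none of them come from the text.

```python
# Hypothetical cache geometry, chosen only for illustration.
LINE_SIZE = 128     # bytes per cache line
NUM_CLASSES = 512   # number of congruence classes
WAYS = 4            # lines (members) per congruence class


def split_address(addr):
    """Return (tag, congruence class index, byte offset) for a byte address."""
    offset = addr % LINE_SIZE                 # position within the line
    line_number = addr // LINE_SIZE           # which line-sized block of memory
    class_index = line_number % NUM_CLASSES   # the only class that may hold it
    tag = line_number // NUM_CLASSES          # distinguishes lines sharing a class
    return tag, class_index, offset


tag, class_index, offset = split_address(0x12345678)
print(hex(tag), class_index, offset)
```

Under these assumptions, a line may occupy any of the WAYS member positions of its congruence class. Setting WAYS to 1 corresponds to a direct mapped cache, and setting NUM_CLASSES to 1 corresponds to a fully associative cache.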
One generally used type of replacement policy is the least recently used (LRU) policy. An LRU policy is built upon the premise that the least recently used cache line in a congruence class is the least worthy of being retained. So, when it becomes necessary to evict a cache line to make room for a new one, an LRU policy chooses as a victim a cache line which is the least recently accessed set (or member) within a congruence class.
For an LRU policy, two types of operations must be carried out against the LRU state (which is maintained for each congruence class in a cache).
A most recently used-update (MRU-update) operation typically occurs due to a cache hit. It adjusts the LRU state such that the “hit” member is ordered ahead of all other members in that congruence class, establishing the cache line in that member position as the most worthy member in the congruence class.
A least recently used-victim-selection (LRU-victim-selection) operation typically occurs when a cache miss requires that a member be allocated to hold a cache line arriving from elsewhere in the storage hierarchy. The operation determines which cache line is the least worthy of being retained in the congruence class, evicts that cache line, and places the newly arriving cache line in its member position.
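The two operations can be sketched for a single congruence class as follows. This is a simplified software model, not a hardware design: the LRU state is held here as an ordered list of member positions rather than as the chronology vector discussed later, and the class and method names are hypothetical.

```python
class CongruenceClass:
    """Toy model of one congruence class under an LRU policy."""

    def __init__(self, ways):
        self.tags = [None] * ways        # tag currently held by each member
        self.order = list(range(ways))   # member positions, MRU first, LRU last

    def mru_update(self, member):
        # Order the given member ahead of all other members.
        self.order.remove(member)
        self.order.insert(0, member)

    def select_victim(self):
        # The member at the tail of the order is the least recently used.
        return self.order[-1]

    def access(self, tag):
        """Return True on a hit, False on a miss (after allocating a member)."""
        if tag in self.tags:
            self.mru_update(self.tags.index(tag))   # hit: MRU-update only
            return True
        member = self.select_victim()               # miss: LRU-victim-selection
        self.tags[member] = tag                     # evict and install new line
        self.mru_update(member)                     # single state update per access
        return False


cc = CongruenceClass(4)
print([cc.access(t) for t in "abcd"])   # four misses fill the class
print(cc.access("a"))                   # hit; "a" becomes MRU
print(cc.access("e"))                   # miss; evicts "b", the LRU line
```

In this model each access performs at most one update of the LRU state, in line with the single-update property discussed in the following paragraphs.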
Often, favorable operating characteristics and reduced complexity implementations for a cache can be achieved when the victim selection and state update portions of a cache allocation policy are tightly integrated with a common pipeline for accessing the cache arrays, directory arrays, and allocation policy (e.g. LRU) state arrays.
Further, in such implementations, additional benefits are typically derived when the victim selection occurs as early as possible in the common pipeline, and when for each operational use of the pipeline, at most one cache allocation policy state update is performed.
Selection of Bad Victims:
(1) Unresolved/Unassigned Chronology State Bit Combinations
Various types of errors may occur while performing LRU victim selection from the cache. One error in particular occurs when, as with most conventional caching mechanisms, chronology vectors are utilized to select the LRU victim member. With the use of chronology vectors, an N bit vector yields 2^N possible combinations of the N bits. For example, a 6 bit chronology vector (ordering cache members ABCD) provides 64 possible combinations. However, only a subset of the total number of vector combinations is actually valid. In the 6 bit chronology vector example, only 24 of the 64 combinations are actually valid combinations for ordering cache members ABCD.
The list of possible permutations with the 6 bits and indication of the valid permutations for victim selection are illustrated by the table of FIG. 11. As shown therein, the table provides a total of 32 correct states and 32 error states. In actuality, there are only 24 correct encodings/states for LRU victim selection and 40 error states. The other 8 states labeled as “correct” states refer to non-LRU victims, i.e. referring to one of the other 3 members that are not actually the LRU member.
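The counts quoted above can be checked with a short enumeration. The sketch below assumes one common encoding, in which each of the 6 bits records which member of a pair (AB, AC, AD, BC, BD, CD) was used more recently. That encoding is an assumption; the actual bit assignment of FIG. 11 is not reproduced here, and the function names are hypothetical.

```python
from itertools import combinations, product

MEMBERS = "ABCD"
PAIRS = list(combinations(MEMBERS, 2))   # AB, AC, AD, BC, BD, CD: one bit each


def more_recent(rel, x, y):
    """True if x was used more recently than y under the pairwise bits."""
    return rel[(x, y)] if (x, y) in rel else not rel[(y, x)]


def lru_candidates(bits):
    """Members that every other member is more recent than (at most one)."""
    rel = dict(zip(PAIRS, bits))
    return [m for m in MEMBERS
            if all(more_recent(rel, o, m) for o in MEMBERS if o != m)]


def is_total_order(bits):
    """True if the pairwise bits are consistent with one ordering of ABCD."""
    rel = dict(zip(PAIRS, bits))
    wins = sorted(sum(more_recent(rel, m, o) for o in MEMBERS if o != m)
                  for m in MEMBERS)
    return wins == [0, 1, 2, 3]   # one member beats 3 others, one beats 2, ...


vectors = list(product((False, True), repeat=len(PAIRS)))
total_orders = [v for v in vectors if is_total_order(v)]
with_victim = [v for v in vectors if len(lru_candidates(v)) == 1]
print(len(vectors), len(total_orders), len(with_victim))
```

Under this assumed encoding the script prints 64 24 32: 64 combinations, 24 consistent orderings, and 32 combinations in which some member is unambiguously oldest. The 8 combinations that identify an oldest member without forming a consistent ordering are those in which the remaining three members are ordered cyclically. Whether these correspond to the 8 additional “correct” entries of FIG. 11 cannot be confirmed from this sketch.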
While the chronology vectors (LRU state bits) are stored within the LRU state array, one or more of the LRU state bits may be flipped (i.e., value changed from 1 to 0 or vice versa), such that the resulting combination of bits does not yield one of the 24 valid permutations (i.e., the chronology vector does not point to one of the members of the congruence set) or the resulting combination points to a deleted member (i.e., a member in the D-state, as described below). This flipping of the bit within the array may be caused by an alpha particle hitting the array, for example. When this invalid/unassigned combination is fed into the conventional LRU victim selection process, an 8-bit null output vector (i.e., all 0s) is provided from the LRU victim selection logic. This null output causes the victim selection mechanism to break down.
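The effect of a flipped bit can be illustrated with a small model. As before, the pairwise encoding of the 6 bits is an assumption, the function names are hypothetical, and the output vector here has one bit per member of a 4-member class, so it is 4 bits wide rather than the 8 bits mentioned above.

```python
from itertools import combinations

MEMBERS = "ABCD"
PAIRS = list(combinations(MEMBERS, 2))   # AB, AC, AD, BC, BD, CD


def encode(order):
    """Encode an MRU-to-LRU ordering as 6 bits (1: first of the pair is newer)."""
    rank = {m: i for i, m in enumerate(order)}
    return [1 if rank[x] < rank[y] else 0 for x, y in PAIRS]


def victim_vector(bits):
    """One-hot vector over MEMBERS marking the member older than all others.

    Returns all 0s when no member is older than every other member.
    """
    rel = dict(zip(PAIRS, bits))
    out = []
    for m in MEMBERS:
        older_than_all = all(
            rel[(o, m)] == 1 if (o, m) in rel else rel[(m, o)] == 0
            for o in MEMBERS if o != m)
        out.append(1 if older_than_all else 0)
    return out


good = encode("ABCD")                      # A is MRU, D is LRU
null_case = list(good)
null_case[PAIRS.index(("A", "D"))] ^= 1    # simulated soft error on the A/D bit
wrong_case = list(good)
wrong_case[PAIRS.index(("C", "D"))] ^= 1   # simulated soft error on the C/D bit

print(victim_vector(good))        # selects D
print(victim_vector(null_case))   # no member qualifies: null vector
print(victim_vector(wrong_case))  # still a valid state, but now selects C
```

In this model a flip on the A/D bit produces a cyclic, unassigned combination and a null output, while a flip on the C/D bit produces a different valid combination that silently redirects the selection to another member. The second case is how a flipped bit could end up pointing at a deleted member.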
(2) D-State Members
As microprocessor chip fabrication technology advances toward smaller and smaller feature sizes, defect tolerance becomes more and more of a primary concern. Occasionally, the physical structure of the chip at which a cache line is located becomes corrupted and is not able to be allocated to an incoming cache line.
One method for tolerating defects in these cells is to identify cache line compartments in the cache that have manufacturing defects, and mark those compartments as “deleted”, so they will not be used, and hence, will not introduce errors into the data that would have been stored therein. One technique for marking compartments as “deleted” is to define a cache state (which is called “D”, meaning deleted) that will be stored in the cache directory entry corresponding to a given defective compartment. Unlike normal cache states, such as those included in standard MESI or similar protocols, which describe the coherence attributes of the cache line contained in a given compartment, the D-state indicates that any data contained in the compartment is invalid, and further indicates to the cache replacement policy logic that the compartment is unavailable for allocation.
During typical LRU victim allocation, however, cache lines in the D state are still represented within the LRU state array and may easily be selected as the LRU victim since the line is not being used and thus appears to be stale (or LRU). However, selection of a Deleted line causes a fault condition at the cache and may result in a crash of the entire processing system.
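The drift of a deleted member toward the LRU position can be seen in a toy model. In the sketch below the directory contents, the hit sequence, and the helper names are all hypothetical; the LRU state is again modeled as an ordered list rather than as hardware state bits.

```python
DELETED = "D"   # directory state marking a defective compartment


def mru_update(order, member):
    """Move the given member to the MRU end of the order (index 0)."""
    order.remove(member)
    order.insert(0, member)


def naive_lru_victim(order):
    """Pick the LRU member without consulting the directory state."""
    return order[-1]


def is_deleted(victim, directory):
    """True if the selected member's directory entry is in the D state."""
    return directory[victim] == DELETED


# Member 1 is marked deleted; the other members hold lines in MESI states.
directory = {0: "S", 1: DELETED, 2: "M", 3: "E"}
order = [0, 1, 2, 3]   # MRU first, LRU last

# Hits can only land on usable members, so member 1 is never refreshed.
for hit in (3, 2, 0, 3):
    mru_update(order, hit)

victim = naive_lru_victim(order)
print(order, victim, is_deleted(victim, directory))
```

Because the deleted member never receives an MRU-update, it ends up at the tail of the order and is chosen by the naive selection, which is the situation a D-state-aware selection mechanism has to avoid.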
A few methods/mechanisms have therefore been proposed to prevent the selection of a line in the D state during LRU victim selection. However, most of these techniques do not directly address or correct the selection of a Deleted member as the victim, where the selection is due to errors resulting from the chronology bits within the LRU state array being flipped to point to the Deleted member.
Selection of either an unassigned combination of LRU state bits or a member in the deleted state is referred to as bad victim selection, which is an undesirable condition. When a bad victim is selected, an error state is registered, and the system records a fault, which may be fatal and cause the system to crash. Therefore, it would be advantageous to have an improved method, apparatus, and computer for effectively handling selection of a bad victim during the victim selection process at the cache.