The present invention relates to data processing systems having cache memory units, and, more particularly, to the operation of a write request in a microprocessor unit with a memory hierarchy.
Microprocessor units and systems that use microprocessor units have attained widespread use throughout many industries. A goal of any microprocessor system is to process information quickly. One technique that increases the speed with which the microprocessor system processes information is to provide the microprocessor system with an architecture that includes a fast local memory called a cache.
A cache is used by the microprocessor system to temporarily store instructions and data. A cache that stores both instructions and data is referred to as a unified cache; a cache that stores only instructions is an instruction cache, and a cache that stores only data is a data cache. Providing a microprocessor architecture with a unified instruction and data cache or with an instruction cache and a data cache is a matter of design choice. Both data and instructions are represented by data signal groups. In the following discussion, both instruction signal groups and data signal groups will be referred to as data groups or simply as data. More specifically, data in memory units are stored in data lines, each data line having a plurality of sets of data. The sets of data can be bytes (8 bits) of logic signals or words (16 bits) of data.
A number of characteristics describe the function and operation of the cache. These characteristics include where a block of data can be placed in the cache, how a block of data can be found or accessed when the block is in the cache, which block of data should be replaced on a cache miss, and what happens when a write operation stores data in the cache.
Three categories describe where a block of data (i.e., line of data) can be stored in a cache: a fully associative cache, a set associative cache, and a direct mapped cache. In a fully associative cache, a block of data can be stored anywhere in the cache. In a set associative cache, a block of data can be placed in a restricted set of places in the cache. In the cache architecture, the set is a group of two or more blocks in the cache. A block of data is mapped onto a set and then the block can be stored anywhere within the set. When there are n blocks in a set, the cache is referred to as an n-way set associative cache. With a direct mapped cache, each block of data has only one place in which it may be stored in the cache. The mapping in a direct mapped cache is usually related to the address of the block frame.
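By way of illustration only, the three placement categories can be sketched in Python as follows. The function name `candidate_slots`, its parameters, and the modulo mapping from block address to set are assumptions made for this sketch and are not taken from the description above. Under these assumptions, a direct mapped cache is the case of one block per set and a fully associative cache is the case of a single set that spans the whole cache.

```python
def candidate_slots(block_addr: int, num_blocks: int, ways: int) -> list[int]:
    """Return the cache slots in which a block may be stored.

    ways == 1          -> direct mapped (exactly one candidate slot)
    1 < ways < blocks  -> n-way set associative (n candidate slots)
    ways == num_blocks -> fully associative (every slot is a candidate)
    """
    num_sets = num_blocks // ways
    # Assumed mapping: the block address modulo the number of sets picks the set.
    set_index = block_addr % num_sets
    first_slot = set_index * ways
    return list(range(first_slot, first_slot + ways))


direct = candidate_slots(12, num_blocks=8, ways=1)
two_way = candidate_slots(12, num_blocks=8, ways=2)
fully = candidate_slots(12, num_blocks=8, ways=8)
```

In this sketch, block 12 has one candidate slot in an 8-block direct mapped cache, two in a 2-way set associative cache, and eight in a fully associative cache.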
In a set associative design, the address is split into the equivalent of a prefix and a suffix at a location determined by the size and architecture of the cache. The set associative approach takes advantage of the spatial locality of sequential instructions by placing them at sequential cache locations rather than at entirely random cache locations. A set associative cache is constrained: the lower order address bits of the cache location must match the lower order address bits of the corresponding main memory address.
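The split of an address into a prefix and a suffix can be sketched as follows. This is a minimal illustration that assumes power-of-two line sizes and set counts; the function name `split_address` and the field names (tag, set index, offset) are hypothetical labels chosen for the sketch.

```python
def split_address(addr: int, line_bytes: int, num_sets: int) -> tuple[int, int, int]:
    """Split an address into (tag, set_index, offset).

    Assumes line_bytes and num_sets are powers of two, so each field
    occupies a whole number of address bits.
    """
    offset_bits = line_bytes.bit_length() - 1
    index_bits = num_sets.bit_length() - 1
    offset = addr & (line_bytes - 1)                    # lowest order bits: position within the line
    set_index = (addr >> offset_bits) & (num_sets - 1)  # next bits: which set the line maps to
    tag = addr >> (offset_bits + index_bits)            # remaining high order bits (the prefix)
    return tag, set_index, offset


tag, set_index, offset = split_address(0x1234, line_bytes=16, num_sets=64)
```

Because the set index is taken from the low order bits just above the offset, consecutive lines in main memory fall into consecutive sets, which is the constraint described above.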
Those familiar with the design of a cache will be familiar with the organization of a cache in which each storage location has a tag field associated therewith and stored in the cache. Each data group has at least an address associated therewith. A portion of the address determines a location of an associated data group and the remaining portion of the address is contained, along with other data fields, in the tag field. Typically, an address of a required data group, when applied to a cache, accesses a memory location storing the data and a memory location storing a tag field, the locations being determined by a first portion of the applied address. The tag field is compared with a second portion of the applied address and, when the two fields are the same, the data in the accessed location is the data associated with the applied address (commonly referred to as a cache hit). When a cache hit occurs, the cache can supply a copy of the data in the main memory location.
The tag field can include fields that determine whether the data associated with the address (i.e., both the first and second portions) is valid. The tag field can also indicate whether the data associated with the complete address has been modified, a modified data group frequently being referred to as a “dirty” data group and the indicating field as a “dirty bit”. In addition, the data stored in the cache typically has a plurality of sequential data groups, e.g., a plurality of words, for a given address. For purposes of retrieving data from a cache, the plurality of neighboring data groups, also referred to as a block, renders the lowest order address bit(s) redundant.
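The tag comparison and the valid and dirty fields can be sketched as follows for a direct mapped organization. The class names `TagEntry` and `DirectMappedCache`, and the use of block addresses with the low order offset bits already removed, are assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class TagEntry:
    tag: int = 0
    valid: bool = False   # does this location hold meaningful data?
    dirty: bool = False   # has the data been modified while in the cache?


class DirectMappedCache:
    def __init__(self, num_lines: int):
        self.num_lines = num_lines
        self.tags = [TagEntry() for _ in range(num_lines)]
        self.data = [None] * num_lines

    def _split(self, block_addr: int) -> tuple[int, int]:
        # First portion selects the location; second portion is kept as the tag.
        return block_addr % self.num_lines, block_addr // self.num_lines

    def fill(self, block_addr: int, value) -> None:
        index, tag = self._split(block_addr)
        self.tags[index] = TagEntry(tag=tag, valid=True, dirty=False)
        self.data[index] = value

    def lookup(self, block_addr: int):
        """Return (hit, value); a hit needs a valid entry whose tag matches."""
        index, tag = self._split(block_addr)
        entry = self.tags[index]
        if entry.valid and entry.tag == tag:
            return True, self.data[index]
        return False, None


cache = DirectMappedCache(num_lines=4)
cache.fill(5, "x")
```

In this sketch, block addresses 5 and 9 select the same location but carry different tags, so only the block that was filled produces a hit.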
A cache replacement algorithm determines which data block or data line to replace on a cache miss. In a direct mapped cache, there is no need for a replacement algorithm because only one block location (or address) is checked for a hit and only the associated data block is replaced. However, for a fully associative or a set associative cache, the replacement algorithm selects which data block to replace. Two examples of cache replacement algorithms are a random replacement algorithm and a least recently used (LRU) replacement algorithm. With the random replacement algorithm, locations for replacement are randomly selected. With a least recently used replacement algorithm, the block location that is replaced is the block location that has been unused for the longest time.
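A least recently used policy for a single set can be sketched as follows. The class name `LRUSet` and its `access` method are hypothetical; the sketch tracks only tags, not data, and uses Python's `collections.OrderedDict` to keep the blocks in order of last use.

```python
from collections import OrderedDict


class LRUSet:
    """One set of an n-way cache with least recently used replacement."""

    def __init__(self, ways: int):
        self.ways = ways
        self.blocks = OrderedDict()  # tag -> None, least recently used first

    def access(self, tag: int):
        """Touch a tag; return the tag that was replaced, or None."""
        if tag in self.blocks:
            self.blocks.move_to_end(tag)  # hit: mark as most recently used
            return None
        victim = None
        if len(self.blocks) >= self.ways:
            # Miss in a full set: replace the block unused for the longest time.
            victim, _ = self.blocks.popitem(last=False)
        self.blocks[tag] = None
        return victim


lru = LRUSet(ways=2)
history = [lru.access(t) for t in (1, 2, 1, 3)]
```

In this usage, the second access to tag 1 makes tag 2 the least recently used block, so the miss on tag 3 replaces tag 2.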
A cache write policy determines what happens when a write operation occurs to the cache. Two basic options are available when writing to a cache: a write through operation and a write back operation. In a write through operation, the information is written to the block location of the cache as well as to the block location in the lower level memory, typically main memory. In a write back operation, the information is written only to the block in the cache. The modified cache block is written to the lower level memory only when the block is replaced. Write back cache blocks are clean or dirty depending on whether the information in the cache differs from that in the lower level memory. To reduce the frequency of writing back blocks on replacement, the dirty bit, which indicates whether or not the block was modified while in the cache, is provided.
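The difference between the two write policies can be sketched by counting writes that reach the lower level memory. The class name `TinyCache` is hypothetical, and the sketch assumes, for brevity, a direct mapped cache in which each write replaces a whole block, so that no block needs to be fetched on a write miss.

```python
class TinyCache:
    """Direct mapped cache that counts writes to the lower level memory."""

    def __init__(self, memory: dict, num_lines: int, write_back: bool):
        self.memory = memory          # block address -> value (lower level memory)
        self.num_lines = num_lines
        self.write_back = write_back
        self.lines = {}               # index -> [block_addr, value, dirty]
        self.memory_writes = 0

    def _store_to_memory(self, block_addr: int, value) -> None:
        self.memory[block_addr] = value
        self.memory_writes += 1

    def write(self, block_addr: int, value) -> None:
        index = block_addr % self.num_lines
        line = self.lines.get(index)
        if line is not None and line[0] != block_addr and line[2]:
            # A dirty block is being replaced: only now is it written back.
            self._store_to_memory(line[0], line[1])
        if self.write_back:
            self.lines[index] = [block_addr, value, True]    # mark dirty, defer the write
        else:
            self.lines[index] = [block_addr, value, False]
            self._store_to_memory(block_addr, value)         # write through immediately


wb = TinyCache({}, num_lines=4, write_back=True)
wt = TinyCache({}, num_lines=4, write_back=False)
for value in ("a", "b", "c"):
    wb.write(0, value)
    wt.write(0, value)
```

After three writes to the same block, the write through cache in this sketch has written to memory three times, while the write back cache has not written to memory at all and will do so only when the dirty block is replaced.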
As microprocessor architectures have matured, microprocessor systems have been designed that include both a primary (or internal) cache and a secondary (or external) cache. The primary cache is also referred to as a level 1 (L1) cache, while the secondary cache is also referred to as a level 2 (L2) cache. In some microprocessor systems, the microprocessor might include two internal caches, which are referred to as a level 0 (L0) cache and a level 1 cache. It should be appreciated that the terminology of the cache designations is meant to refer to a cache level and not necessarily to the cache location. Cache designations may be generalized as a level i cache, a level i+1 cache, a level i+2 cache, etc. The levels denote, in general, the accessibility of the data stored in the cache: the lower the level, the more readily accessible to the microprocessor are the blocks stored therein. A plurality of caches necessarily involves a series of design considerations. For example, each successive level of memory requires an increasing amount of time to access. The primary (i.e., L0 or L1) cache can typically be designed to be accessible in 1 central processing unit cycle, while the main memory can typically be designed to be accessed in times of the order of 100 central processing unit cycles.
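The effect of the access times at each level can be sketched with an average access time calculation. Only the 1 cycle and 100 cycle figures come from the discussion above; the 10 cycle secondary cache latency and the miss rates used below are hypothetical values chosen for illustration, as is the function name `average_access_cycles`.

```python
def average_access_cycles(hit_cycles: list[float], miss_rates: list[float]) -> float:
    """Average cycles per access for a hierarchy of caches above main memory.

    hit_cycles[i] is the access time of level i; the last entry is main
    memory, which is assumed always to hold the data.
    miss_rates[i] is the fraction of accesses reaching cache level i that miss.
    """
    total = hit_cycles[-1]
    # Work upward from main memory: each level costs its own access time
    # plus, on a miss, the average cost of the level below it.
    for cycles, miss in zip(reversed(hit_cycles[:-1]), reversed(miss_rates)):
        total = cycles + miss * total
    return total


# Hypothetical hierarchy: 1-cycle primary, 10-cycle secondary, 100-cycle main memory.
cycles = average_access_cycles([1, 10, 100], [0.1, 0.5])
```

Under these assumed miss rates, the average cost per access stays far closer to the primary cache access time than to the main memory access time, which is the motivation for a multilevel hierarchy.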
A consideration of multilevel cache systems is whether all data in a lower level cache is always included in the next higher level cache. If so, the lower level cache is said to be inclusive. Inclusion allows consistency between the caches to be determined merely by checking the lower level cache. Another consideration of multilevel cache systems is whether to provide the second level cache with higher associativity than the lower level cache.
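The inclusion property described above can be sketched as a subset test on the line addresses held at two adjacent levels. The function name `is_inclusive` and the representation of each cache as a collection of line addresses are assumptions made for the sketch.

```python
def is_inclusive(lower_level_lines, higher_level_lines) -> bool:
    """True when every line address held in the lower level cache
    is also held in the next higher level cache."""
    return set(lower_level_lines) <= set(higher_level_lines)


inclusive_pair = is_inclusive({0x10, 0x20}, {0x10, 0x20, 0x30})
non_inclusive_pair = is_inclusive({0x10, 0x40}, {0x10, 0x20, 0x30})
```

In the second pair, line 0x40 is held only in the lower level cache, so the pair is non-inclusive.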
Whatever cache strategies are chosen, the interaction between the microprocessor and the cache(s) can determine the performance of the data processing system. A need has been felt for a memory hierarchy that improves the performance of the data processing system. In particular, a write instruction should have the feature of improving the performance of the data processing system.
The aforementioned and other features are accomplished, according to the present invention, by providing a data processing system having a primary cache and at least a secondary cache memory unit, the cache memory units being non-inclusive as contrasted with inclusive cache memory units. In addition, the tag field in the cache includes a dirty (i.e., modified) bit field. In a write request operation, when the address of the data that is the subject of the write request is found in the primary cache, the data is placed in the data line defined by the address associated with the write operation and the dirty bit is set. When the address associated with the write operation is not found in the primary cache, a determination is made whether an address is available in the primary cache for storing the data line that includes the address of the write request data. When a data line address is available in the primary cache, the procedure examines the lower level caches in order until the data line is found. When the data line is found, the address of that data line in the cache memory unit in which it is found is invalidated and the data line itself is placed in the available cache line address in the primary cache. When the data line address is not found in any of the cache memory units, the data line is retrieved from main memory and stored in the primary cache, and the dirty bit is set. When an address is not available for the storage of the write request data, a data line address, referred to as a write back address, is selected according to a replacement algorithm. The write back address is then sought in the lower level caches and in main memory. The write back line is then stored in the first available location. The write request data is stored in the primary cache address that has been selected for replacement and that stores the data line associated with that address.
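The write request operation summarized above can be sketched as follows. This is one possible reading under stated assumptions, not a definitive implementation: the caches are modeled as fully associative, the oldest line in the primary cache stands in for the replacement algorithm, the write back line is placed in the first lower level cache with room and otherwise in main memory, and the line size `LINE_WORDS` and all class and function names are hypothetical.

```python
from dataclasses import dataclass, field

LINE_WORDS = 4  # hypothetical number of words per data line


@dataclass
class Line:
    words: list
    dirty: bool = False


@dataclass
class Cache:
    capacity: int                               # number of lines (fully associative, assumed)
    lines: dict = field(default_factory=dict)   # line address -> Line

    def has_room(self) -> bool:
        return len(self.lines) < self.capacity


def fetch_line(line_addr, lower_caches, main_memory) -> Line:
    """Search the lower level caches in order, then main memory."""
    for cache in lower_caches:
        if line_addr in cache.lines:
            # Non-inclusive: the copy in the lower level cache is invalidated.
            return cache.lines.pop(line_addr)
    return Line(list(main_memory.get(line_addr, [0] * LINE_WORDS)))


def evict(primary, lower_caches, main_memory) -> None:
    """Store a write back line in the first available location."""
    victim_addr = next(iter(primary.lines))     # oldest line stands in for the policy
    victim = primary.lines.pop(victim_addr)
    for cache in lower_caches:
        if cache.has_room():
            cache.lines[victim_addr] = victim
            return
    main_memory[victim_addr] = victim.words


def write(addr, value, primary, lower_caches, main_memory) -> None:
    line_addr, offset = divmod(addr, LINE_WORDS)
    if line_addr not in primary.lines:
        if not primary.has_room():
            evict(primary, lower_caches, main_memory)
        primary.lines[line_addr] = fetch_line(line_addr, lower_caches, main_memory)
    line = primary.lines[line_addr]
    line.words[offset] = value
    line.dirty = True                           # the dirty bit is set on every write


primary, secondary = Cache(capacity=1), Cache(capacity=1)
memory = {0: [1, 2, 3, 4], 1: [5, 6, 7, 8]}
write(2, 99, primary, [secondary], memory)      # miss: line 0 comes from main memory
write(4, 7, primary, [secondary], memory)       # miss, no room: line 0 moves to the secondary cache
```

In this usage, the first write brings line 0 from main memory into the primary cache and marks it dirty; the second write finds no available address, so line 0 is moved to the secondary cache before line 1 is brought in and modified.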