The present invention relates to methods and apparatus for controlling a cache memory and, more particularly, to a control technique in which storage of floating point data into the cache memory is prohibited when such storage would overwrite valid integer data.
In recent years, demand for higher data throughput in computer processing has grown steadily, because cutting-edge computer applications are becoming more and more complex and are placing ever increasing demands on microprocessing systems. Conventional microprocessing systems (which employ a microprocessor and an associated memory) have very rapid cycle times (i.e., the unit of time in which a microprocessor is capable of manipulating data), such as one nanosecond. The time required to access data stored in main memory, however, may be considerably longer than the cycle time of the microprocessor. For example, the access time required to obtain a byte of data from a main memory (implemented utilizing dynamic random access memory, DRAM, technology) is on the order of 60 nanoseconds.
In order to ameliorate the bottleneck imposed by the relatively long access time to DRAM memory, those skilled in the art have utilized cache memories. A cache memory augments the main memory in order to improve the throughput of the system. While the main memory is often implemented utilizing relatively inexpensive, slow, DRAM memory technology, the cache memory is typically implemented utilizing more expensive, fast, static random access memory (SRAM) technology. Given that the cache memory is implemented utilizing a high-cost technology, it is usually of a much smaller size than the main memory.
The cache memory may be disposed "on-chip" with the microprocessor, which is called a level-one (L1) cache memory, or it can be disposed separate, or off-chip, from the microprocessor, which is called a level-two (L2) cache memory. L1 cache memories usually have a much faster access time than L2 cache memories. A combined L1, L2 cache memory system also may be formed where both an on-chip cache memory and an off-chip cache memory are employed.
Due to the relatively small size of cache memories, conventional algorithms have been employed to determine what data should be stored in the cache memory at various times during the operation of the microprocessing system. These conventional algorithms may be based on, for example, the theoretical concept of "locality of reference," which takes advantage of the fact that relatively small portions of an executable program are used by the microprocessor at any particular point in time. Thus, in accordance with the concept of locality of reference, only small portions of the executable program are stored in cache memory at any particular point in time. These or other algorithms may also be employed to control the storage and retrieval of data (which may be used by the executable program) in the cache memory.
The particularities of the known algorithms for taking advantage of locality of reference, or any other concept, for controlling the storage of executable programs and/or data in a cache memory are too numerous to present in this description. Suffice it to say, however, that any given algorithm may not be suitable in all applications as the data processing goals of various applications may differ significantly.
In some cases, a microprocessor system employing a cache memory may be required to process both integer and floating point data. Applications employing such floating point data may require very large floating point data arrays, which are much larger than the capacity of an L1 cache memory and which have very low address locality. Loading floating point data from such a large array may pollute the L1 cache memory, particularly when the L1 cache memory includes integer data, such as address pointers and the like. Conventional methods and apparatus that are designed to avoid corruption of the integer data of the L1 cache memory dictate that the L1 cache memory is not accessed when loading floating point data; rather, an L2 cache memory (or main memory) is directly accessed. Further details regarding this type of control may be found in U.S. Pat. No. 5,510,934, the entire disclosure of which is hereby incorporated by reference.
Unfortunately, reduced access to the L1 cache memory as dictated by these conventional control techniques results in an overall lower throughput for the microprocessing system. Indeed, use of the very high speed of the L1 cache memory is not optimized and, in fact, such use is entirely avoided in favor of the slower L2 cache memory when loading floating point data.
Accordingly, there are needs in the art for new methods and apparatus for controlling a cache memory, which may include an L1 cache memory, an L2 cache memory and/or a combination thereof, in order to improve memory efficiency, increase processing throughput and improve the quality of the overall data processing performed by the system.
When a memory access request for floating point data cannot be satisfied by accessing the cache memory, i.e., when a cache miss occurs, it is desirable to execute a data refill sequence in which the floating point data is obtained from main memory and stored in the cache memory. When an L1 cache memory is employed and a memory access request for floating point data is made, it would be desirable to first access the L1 cache memory to satisfy the request and, if the request cannot be satisfied, to access an L2 cache memory or main memory to refill the L1 cache memory. In any case, when the microprocessing system is operating on both integer and floating point data, it is desirable to avoid overwriting valid integer data with floating point data in the L1 cache memory.
To this end, in accordance with one or more aspects of the present invention, a method for controlling a cache memory includes: receiving an address for at least one of storing data into and retrieving data from the cache memory, the address including tag bits and index bits; accessing one or more cache lines of the cache memory corresponding to the index bits of the address, each cache line including an address tag, a data valid flag, and a data type flag; determining whether data of at least one of the one or more cache lines is valid based on the data valid flag; determining what type of data have been stored in the at least one cache line based on the data type flag; and prohibiting overwriting floating point data into the at least one cache line when the data therein are valid, and the data that have been stored in the at least one cache line are integer data.
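By way of illustration only, the address decomposition and the prohibition test recited above may be sketched in software as follows. The cache geometry (32-byte lines, 128 sets), the flag encoding, and the names `CacheLine`, `split_address` and `float_overwrite_prohibited` are assumptions made for this sketch and are not taken from the description.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed geometry (hypothetical): 32-byte lines and 128 sets.
OFFSET_BITS = 5
INDEX_BITS = 7

# Assumed one-bit encoding of the data type flag.
INTEGER = 0
FLOAT = 1


@dataclass
class CacheLine:
    tag: int = 0            # address tag
    valid: bool = False     # data valid flag
    dtype: int = INTEGER    # data type flag


def split_address(addr: int) -> Tuple[int, int]:
    """Return the (tag bits, index bits) of an address."""
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index


def float_overwrite_prohibited(line: CacheLine) -> bool:
    """Floating point data may not overwrite a line holding valid integer data."""
    return line.valid and line.dtype == INTEGER
```

Under the assumed geometry, `split_address(0x12345678)` yields a tag of `0x12345` and an index of 51; the index selects the cache line whose flags are then examined.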
Preferably the method further includes: (i) setting the valid flag for the at least one cache line to indicate that the data therein are valid; and (ii) setting the data type flag for the at least one cache line to indicate that the data in the at least one cache line are of an integer type, when the overwriting of floating point data into the at least one cache line is prohibited.
It is noted that the data valid flag and/or the data type flag may be one-bit flags.
Preferably, the method further includes: permitting overwriting integer data into the at least one cache line when the data valid flag indicates that the at least one cache line does not contain valid data; setting the valid flag for the at least one cache line to indicate that the integer data are valid; and setting the data type flag for the at least one cache line to indicate that the integer data are of an integer type.
Alternatively, or in addition, the method may further include: permitting overwriting floating point data into the at least one cache line when the data valid flag indicates that the at least one cache line does not contain valid data; setting the valid flag for the at least one cache line to indicate that the floating point data are valid; and setting the data type flag for the at least one cache line to indicate that the floating point data are of a floating point type.
Alternatively, or in addition, the method preferably further includes permitting overwriting integer data into the at least one cache line when: (i) the data valid flag indicates that the at least one cache line contains valid data, and (ii) the data type flag indicates that the data of the at least one cache line are of an integer type. In this case, the method preferably further includes: setting the valid flag for the at least one cache line to indicate that the integer data are valid; and setting the data type flag for the at least one cache line to indicate that the integer data are of an integer type.
Alternatively, or in addition, the method preferably further includes permitting overwriting integer data into the at least one cache line when: (i) the data valid flag indicates that the at least one cache line contains valid data, and (ii) the data type flag indicates that the data of the at least one cache line are of a floating point type. In this case, the method preferably further includes: setting the valid flag for the at least one cache line to indicate that the integer data are valid; and setting the data type flag for the at least one cache line to indicate that the integer data are of an integer type.
Alternatively, or in addition to the above, the method preferably further includes permitting overwriting floating point data into the at least one cache line when: (i) the data valid flag indicates that the at least one cache line contains valid data, and (ii) the data type flag indicates that the data of the at least one cache line are of a floating point type. In this case, the method preferably further includes: setting the valid flag for the at least one cache line to indicate that the floating point data are valid; and setting the data type flag for the at least one cache line to indicate that the floating point data are of a floating point type.
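The permit and prohibit cases enumerated above for a single cache line reduce to one rule: a write is refused only when floating point data would replace valid integer data. A minimal sketch of that rule, including the flag updates, is given below. The names `CacheLine` and `try_refill`, the `data` field, and the flag encoding are hypothetical and are introduced only for illustration.

```python
from dataclasses import dataclass

# Assumed one-bit encoding of the data type flag.
INTEGER = 0
FLOAT = 1


@dataclass
class CacheLine:
    tag: int = 0
    valid: bool = False
    dtype: int = INTEGER
    data: object = None


def try_refill(line: CacheLine, tag: int, data: object, dtype: int) -> bool:
    """Attempt to write incoming data into the line; return True if written."""
    if line.valid and line.dtype == INTEGER and dtype == FLOAT:
        # Prohibited case: the line is left untouched, so its flags
        # continue to indicate valid data of the integer type.
        return False
    # All other cases are permitted: invalid line (either type),
    # integer over integer, integer over float, float over float.
    line.tag = tag
    line.data = data
    line.valid = True     # the new data are valid
    line.dtype = dtype    # record the type of the data just stored
    return True
```

In this sketch the caller is responsible for handling a `False` result, for example by routing the floating point data elsewhere.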
When the cache memory is implemented utilizing N-way set associative technology, in which many cache lines may be invoked by a single memory access request for a piece of data, and in accordance with one or more further aspects of the present invention, the method preferably further includes: accessing all cache lines of the cache memory corresponding to the index bits of the address; determining which of the cache lines is valid based on the data valid flag; and permitting overwriting floating point data or integer data into any of the cache lines in which the corresponding data valid flag indicates that such cache lines do not contain valid data.
Alternatively, or in addition to the above, the method preferably further includes: accessing all cache lines of the cache memory corresponding to the index bits of the address; determining which of the cache lines is valid based on the data valid flag; and permitting overwriting integer data into any of the cache lines when all of the data valid flags indicate that the cache lines contain valid data.
The method preferably further includes: determining what type of data have been stored in each of the cache lines based on the respective data type flags; and permitting overwriting floating point data into any of the cache lines in which the corresponding data type flags indicate that the type of data that have been stored in such cache lines is floating point data when all of the data valid flags indicate that the cache lines contain valid data.
In addition, the method preferably further includes prohibiting overwriting floating point data into any of the cache lines when all of the data valid flags indicate that the cache lines contain valid data and all of the data type flags indicate that the cache lines contain integer data.
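For an N-way set associative cache, the selection of a cache line to overwrite within the indexed set may be sketched as follows. This is an illustrative sketch under stated assumptions: the names `CacheLine` and `choose_way` are hypothetical, and because the description does not specify how to choose among several permitted lines, the sketch simply takes the first permitted way rather than applying any particular replacement policy (such as least recently used).

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed one-bit encoding of the data type flag.
INTEGER = 0
FLOAT = 1


@dataclass
class CacheLine:
    tag: int = 0
    valid: bool = False
    dtype: int = INTEGER


def choose_way(ways: List[CacheLine], incoming: int) -> Optional[int]:
    """Return the index of a way that may be overwritten, or None if prohibited."""
    # Any line that does not contain valid data may take either data type.
    for i, line in enumerate(ways):
        if not line.valid:
            return i
    # All lines are valid from here on.
    if incoming == INTEGER:
        # Integer data may overwrite any line; way 0 is a placeholder choice.
        return 0
    # Floating point data may overwrite only lines already holding float data.
    for i, line in enumerate(ways):
        if line.dtype == FLOAT:
            return i
    # All lines hold valid integer data: the floating point write is prohibited.
    return None
```

A result of `None` corresponds to the case in which every data valid flag indicates valid data and every data type flag indicates integer data.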
In accordance with one or more further aspects of the present invention, when the overwriting of the floating point data into any of the cache lines is prohibited, the method preferably further includes transferring the floating point data to a load/store unit or another cache memory. The other cache memory may include only one cache line.
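The handling of prohibited floating point data may be sketched as shown below, taking the option in which the data are transferred to another cache memory having only one cache line. The names `CacheLine`, `refill_float` and `side`, and the returned strings, are hypothetical; the sketch assumes the other cache memory is always overwritten when used, which the description does not state.

```python
from dataclasses import dataclass
from typing import List

# Assumed one-bit encoding of the data type flag.
INTEGER = 0
FLOAT = 1


@dataclass
class CacheLine:
    tag: int = 0
    valid: bool = False
    dtype: int = INTEGER
    data: object = None


def _store(line: CacheLine, tag: int, data: object) -> None:
    line.tag, line.data, line.valid, line.dtype = tag, data, True, FLOAT


def refill_float(ways: List[CacheLine], side: CacheLine, tag: int, data: object) -> str:
    """Place floating point refill data; return "l1" or "side" to show where it went."""
    # Prefer a line that does not contain valid data.
    for line in ways:
        if not line.valid:
            _store(line, tag, data)
            return "l1"
    # Otherwise a line already holding floating point data may be overwritten.
    for line in ways:
        if line.dtype == FLOAT:
            _store(line, tag, data)
            return "l1"
    # Every line holds valid integer data: overwriting is prohibited, so the
    # data are transferred to the one-line cache (assumed always replaceable).
    _store(side, tag, data)
    return "side"
```

In either case the requested floating point data remain available to the processor while the valid integer data in the set are preserved.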
In accordance with one or more further aspects of the present invention, the methods for controlling a cache memory described thus far, and/or described later in this document, may be achieved utilizing suitable hardware, such as that shown in the drawings. Such hardware may be implemented utilizing any of the known technologies, such as standard digital circuitry, analog circuitry, any of the known processors that are operable to execute software and/or firmware programs, one or more programmable digital devices or systems, such as programmable read only memories (PROMs), programmable array logic devices (PALs), any combination of the above, etc.