1. Field of the Invention
The present invention relates to caches in computer processors that have short cycles for faster performance. More specifically, the present invention relates to reading and writing a cache line in a computer processor and generating and checking parity to verify data integrity for the cache.
2. Description of Related Art
Research and development teams continually design and redesign computer processors to process instructions faster. Computer processors perform tasks by executing a series of instructions that are supplied from a memory source. Thus, faster instruction processing generally means higher performance. Clock cycles are used to define boundaries for instruction execution. One way to increase performance is to reduce the period of each clock cycle so that the computer processes instructions at a higher rate. However, shortening the clock period is not always achievable because limits imposed by microprocessor fabrication technology require a minimum time period for many operations. For example, 18 nanoseconds (ns) may be the minimum time necessary for the hardware to execute a common instruction for a given technology. If the clock period is shortened from 20 ns to 10 ns, then two clock cycles will be required to execute the 18 ns instruction instead of one; the instruction still occupies 20 ns, and no time savings will have been realized. Thus, reduction of the clock period is advantageous only if the instructions can fit within the shorter time constraints.
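The arithmetic of the 18 ns example can be checked with a short sketch. The function names `cycles_needed` and `elapsed_ns` are hypothetical and chosen only for illustration; the sketch assumes an operation always occupies a whole number of clock cycles.

```python
import math


def cycles_needed(op_ns: float, period_ns: float) -> int:
    """Whole clock cycles required to cover an operation lasting op_ns."""
    return math.ceil(op_ns / period_ns)


def elapsed_ns(op_ns: float, period_ns: float) -> float:
    """Wall-clock time the operation occupies, rounded up to full cycles."""
    return cycles_needed(op_ns, period_ns) * period_ns


# The 18 ns instruction from the text, at a 20 ns and a 10 ns clock period.
for period in (20, 10):
    print(period, cycles_needed(18, period), elapsed_ns(18, period))
```

Under these assumptions the instruction occupies one 20 ns cycle or two 10 ns cycles, so the elapsed time is 20 ns in both cases.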
Another way to increase performance is to reduce the number of clock cycles necessary to execute common computer operations. Memory accesses, which can consume three or more clock cycles, are common operations. In a memory access, an instruction may instruct the processor to read data from memory, or to store data in memory. If the processor executes instructions faster than it accesses memory, then memory access times could substantially delay computer operation, because often the processor must stall other operations while waiting to receive the data. Furthermore, because instructions are themselves stored in memory, computer operation will be delayed if fetching an instruction from memory takes more clock cycles than the average number of clocks per instruction. In order to reduce the time of memory access, a "cache" may be utilized to store and supply often used instructions and data. In most caches, at most one or two clock cycles are necessary to retrieve data from the cache, in comparison to three or more cycles to retrieve data from memory. If the processor is faster than memory, for example if one instruction is executed per clock, substantial time savings and large increases in performance can result from use of a cache that can perform one cache access per clock.
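The benefit described above can be illustrated with a simple weighted average. This is a minimal sketch under stated assumptions: the hit rate is a hypothetical parameter not given in the text, a miss is assumed to cost exactly the memory access time, and the name `average_access_cycles` is illustrative only.

```python
def average_access_cycles(hit_rate: float,
                          cache_cycles: float,
                          memory_cycles: float) -> float:
    """Average cycles per access when a fraction hit_rate is served by the cache.

    Simplified model (an assumption): a miss costs memory_cycles and nothing more.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must lie between 0 and 1")
    return hit_rate * cache_cycles + (1.0 - hit_rate) * memory_cycles


# One-cycle cache and three-cycle memory, as in the text; the hit rates are assumed.
for rate in (0.0, 0.5, 1.0):
    print(rate, average_access_cycles(rate, 1, 3))
```

In this model the average falls from three cycles toward one as more accesses are served by the cache.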
Caches are organized in "lines". A cache may include hundreds of cache lines, each line including a selected block of memory which may be many bytes in length. There are many types of caches. In a fully associative cache, data can be stored in any cache line, regardless of its address. In a set associative cache, the cache lines are organized into "sets". Each set is assigned to hold data that has common lower address bits (the set address), and the cache lines in a particular set can hold data only if the lower bits match the set address. Because the set address uses the lower bits of an address, a long block of data can be stored in a series of sets. This is advantageous because data is usually read or written sequentially from a large block of memory. There are further advantages to a set associative cache. In a set associative cache, searching for a data match is simplified because the cache lines from only one set need be checked.
Each cache line is divided into fields that include a tag field indicative of the upper portion of the address of the memory block, and a data field that stores the data at the memory location specified by the tag field. An exemplary address used to access a cache includes a tag field indicative of the upper portion of the address of the memory block, a set field indicative of the lower portion of the address, and a byte offset field that defines the byte to be taken from the data. If a memory access occurs at a given address, then the computer usually looks first to the cache to determine if a match (i.e., a "hit") can be found. If a hit occurs during execution of a read operation, then the data can be read from the cache line in the same cycle without the time-consuming memory access. During a write operation, the data is written to the cache line and the upper address is stored in the tag.
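The address fields and the set associative lookup described above can be sketched as follows. The field widths (4 offset bits, 7 set bits), the two lines per set, and all names are assumptions made for illustration; the text does not specify them. The caller selects the line to overwrite, because no replacement policy is described.

```python
OFFSET_BITS = 4   # assumed: 16-byte cache lines
SET_BITS = 7      # assumed: 128 sets
WAYS = 2          # assumed: two cache lines per set


def split_address(addr: int) -> tuple[int, int, int]:
    """Split an address into (tag, set index, byte offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    set_index = (addr >> OFFSET_BITS) & ((1 << SET_BITS) - 1)
    tag = addr >> (OFFSET_BITS + SET_BITS)
    return tag, set_index, offset


class SetAssociativeCache:
    def __init__(self) -> None:
        # Each set holds WAYS lines; a line is (tag, data) or None when empty.
        self.sets = [[None] * WAYS for _ in range(1 << SET_BITS)]

    def read(self, addr: int):
        """Return the addressed byte on a hit, or None on a miss."""
        tag, set_index, offset = split_address(addr)
        # Only the lines of one set need to be checked for a tag match.
        for line in self.sets[set_index]:
            if line is not None and line[0] == tag:
                return line[1][offset]
        return None

    def write_line(self, addr: int, data: bytes, way: int = 0) -> None:
        """Write the data to a line and store the upper address in its tag."""
        tag, set_index, _ = split_address(addr)
        self.sets[set_index][way] = (tag, bytes(data))


cache = SetAssociativeCache()
cache.write_line(0x12345670, bytes(range(16)))
print(cache.read(0x12345675))   # same tag and set: a hit
print(cache.read(0x92345675))   # same set, different tag: a miss
```

Because the set index comes from the lower address bits, consecutive memory blocks map to consecutive sets, which matches the sequential access pattern noted earlier.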
Often, it is desirable to verify the integrity of information stored in the cache, to guard against the small but distinct possibility that the stored data may have been altered in some way. Parity may be used for this purpose. The "parity" of computer data is defined by the number of set bits in a binary representation of the data. If the data has an even number of set bits, then an "even parity" results. But if the data has an odd number of set bits, then the data has an "odd parity". A "parity bit" is usually appended to the computer data to provide a preselected parity. For example, if the parity is predetermined to be "even" for each line of computer data in the cache, then the parity bit gives the data an even parity by either setting or clearing the parity bit according to the number of set bits in the data.
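Parity generation and checking as defined above can be sketched in a few lines. The function names `parity_bit` and `check_parity` are hypothetical, and the data word is assumed to be a non-negative integer; hardware would typically compute the same result with a tree of exclusive-OR gates.

```python
def parity_bit(data: int, even: bool = True) -> int:
    """Parity bit that gives data plus the bit the preselected parity."""
    ones = bin(data).count("1")   # number of set bits in the data
    odd = ones & 1                # 1 if the data alone has odd parity
    return odd if even else odd ^ 1


def check_parity(data: int, stored_bit: int, even: bool = True) -> bool:
    """True if the stored parity bit is consistent with the data."""
    return parity_bit(data, even) == stored_bit


word = 0b1011                     # three set bits
bit = parity_bit(word)            # generated when the word is written
print(bit, check_parity(word, bit), check_parity(word ^ 0b0001, bit))
```

A single altered bit changes the count of set bits by one, so the check fails; an even number of altered bits would go undetected by a single parity bit.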
Parity checks are useful for both stored data (including instructions) and tags in a cache. If the stored data and tags are housed in separate arrays in the cache, then a location in the respective array is usually reserved for the parity bits, so that the data parity bit is stored together with the data in a data array, and the tag parity bit is stored together with the tag in a tag array. During a write to the cache, this configuration can slow cache operation because, although the data and the tag are available before the parity information, they cannot be written until after the parity information is calculated and becomes available. Parity information is not data, and provides no benefit other than data verification. It would be an advantage to provide a cache that allows immediate writing of the data and tag to their respective arrays, while still providing the advantages of parity verification.
It is advantageous if only one cycle is consumed by each cache operation, whether a read or a write, together with the associated parity checking. This is particularly so if the processor speed is one clock per instruction. It would therefore be advantageous to provide a cache and a method for performing cache operations that require only one clock cycle per cache operation, and yet fit within the constraints of the short clock cycle of high speed computer processors.