In most modern computer systems, the memory system may have a significant impact on the performance of the computer system. A poorly designed memory system may cause a modern computer system with a state-of-the-art processing unit to perform no better than an older computer system with an average processing unit. This may be due to slow access to programs, applications, data, and so forth, stored in the memory system, which may effectively negate the performance advantages inherent in the state-of-the-art processing unit. For example, the speed at which the processing unit may process instructions and data may depend on the ability of the memory system to provide the instructions and data to the processing unit. A slow memory system may make the processing unit wait unnecessarily.
FIG. 1 is a diagram illustrating a computer system 100. Computer system 100 includes a processing unit 105 and a memory system 110. Memory system 110 may be partitioned into multiple levels, with a first level of memory being registers 115 located in processing unit 105, a second level of memory being a cache memory 120, a third level of memory being main memory 125, and a fourth level of memory being mass storage devices 130. In general, the amount of memory available at each level increases with the level, while the speed decreases. Generally, the memory at a given level may be about an order of magnitude slower than the memory at the level immediately preceding it (i.e., the level closer to processing unit 105). For example, registers 115 in the first level of memory may amount to only tens or hundreds of bytes, while a mass storage device 130 may be on the order of terabytes in size, but registers 115 may be hundreds if not thousands of times faster than mass storage device 130.
Cache memory 120 is a relatively small but fast memory that may be used to temporarily hold instructions and/or data. Since cache memory 120 is relatively small, it may be unlikely that all of the instructions and/or data of a program can be stored in cache memory 120. Rather, the instructions and/or data stored in cache memory 120 may be those that are predicted to be used by processing unit 105 within a relatively short amount of time. If the instructions and/or data required by processing unit 105 reside in cache memory 120, then they may be retrieved from cache memory 120 with relatively small delay. However, if the required instructions and/or data are not in cache memory 120, then they will need to be retrieved from main memory 125 or mass storage device 130 with significant delay.
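By way of illustration only, the effect of hits and misses on delay may be expressed with the commonly used average access time relation (hit time plus miss rate times miss penalty). The function name, parameter names, and the numeric values in the usage note are assumptions introduced for this sketch and do not come from the system of FIG. 1.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average delay per access, in the same unit as hit_time and miss_penalty.

    hit_time     -- delay when the item resides in the cache (assumed value)
    miss_rate    -- fraction of accesses not found in the cache, 0.0 to 1.0
    miss_penalty -- extra delay to fetch from a lower, slower level
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty
```

For instance, calling average_access_time(1, 0.05, 100) with these hypothetical figures shows that even a small miss rate may dominate the average delay when the miss penalty is large.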
Cache memory 120 includes a cache data memory 135 that may be used to store instructions and/or data, a cache tag memory 140 that may be used to store high order address bits of the instructions and/or data stored in cache data memory 135, and a cache controller 145 that may be used to manage the reading and writing of instructions and/or data in cache memory 120, to predict instructions and/or data that will be needed in the future, and to retrieve them from lower levels of memory.
When a memory address of an instruction or data is presented to cache memory 120, high order bits (the tag) of the memory address may be used to index into cache tag memory 140 to determine if the instruction or data is stored in cache data memory 135. If the tag indicates that the instruction or data is stored in cache data memory 135, then cache controller 145 may be used to retrieve the instruction or data from cache data memory 135. If the tag indicates that the instruction or data is not stored in cache data memory 135, then the instruction or data may be retrieved from main memory 125 or mass storage 130. As the instruction or data is retrieved from main memory 125 or mass storage 130, it may also be stored in cache data memory 135. Storing the instruction or data may force out instructions and/or data already stored in cache data memory 135. Additionally, cache tag memory 140 may also be updated to reflect the newly stored instruction and/or data.
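The lookup and fill sequence described above may be sketched as follows. This is a minimal model that assumes a direct-mapped organization, in which low order index bits select one entry of the tag memory and the high order tag bits are compared against it. The class name, the field widths, and the main_memory callable are hypothetical and chosen only for illustration.

```python
OFFSET_BITS = 4  # assumed: 16-byte cache lines
INDEX_BITS = 6   # assumed: 64 entries in the tag and data memories


class DirectMappedCache:
    """Toy model of a cache tag memory, cache data memory, and controller."""

    def __init__(self):
        entries = 1 << INDEX_BITS
        self.tag_memory = [None] * entries   # models cache tag memory 140
        self.data_memory = [None] * entries  # models cache data memory 135

    @staticmethod
    def split(address):
        """Return (tag, index) extracted from a memory address."""
        index = (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
        tag = address >> (OFFSET_BITS + INDEX_BITS)
        return tag, index

    def read(self, address, main_memory):
        """Return (line, hit). On a miss, fetch the line and update the cache."""
        tag, index = self.split(address)
        if self.tag_memory[index] == tag:
            return self.data_memory[index], True  # cache memory hit
        # Cache memory miss: fetch from the lower level, forcing out
        # whatever line previously occupied this entry.
        line_base = address & ~((1 << OFFSET_BITS) - 1)
        line = main_memory(line_base)
        self.data_memory[index] = line
        self.tag_memory[index] = tag
        return line, False
```

In this sketch, two addresses that share index bits but differ in tag bits compete for the same entry, which models how storing a new line may force out a line already held in the cache data memory.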
Some computer systems also make use of error correcting codes (ECC) to protect the integrity of instructions and/or data stored in their memory systems. The use of an ECC adds additional bits of information to each byte, word, and so on, in the memory system to detect (and in some cases, correct) errors in the instructions and/or data stored in the memory system. The addition of ECC may increase the complexity of determining cache memory hits (i.e., the requested instruction or data is stored in the cache memory) or cache memory misses (i.e., the requested instruction or data is not stored in the cache memory), since additional operations may need to be performed to verify that there is an ECC match in addition to a memory address match. This may slow down cache memory performance as well as increase cache memory design complexity.
FIG. 2a is a diagram illustrating a portion of a cache memory 200 wherein cache memory 200 makes use of a prior art technique for determining cache memory hit or miss and wherein the memory system makes use of ECC. Cache memory 200 includes an ECC generator 205 that may be used to generate an ECC from an m-bit CPU-tag input (i.e., high order bits of a memory address from the processing unit). The ECC, as generated by ECC generator 205, may be provided to cache tag memory 207 (which may be similar to cache tag memory 140, for example), where, if the CPU-tag is in cache tag memory 207, the cache-tag associated with the CPU-tag may be retrieved.
The cache-tag and the ECC may then be provided to a single error correction-double error detection (SEC-DED) ECC unit 209 that may be used to correct the cache-tag using the ECC if necessary (and if possible). SEC-DED ECC unit 209 may produce a corrected cache-tag, which may then be compared with the CPU-tag in an equality comparator 211. If the corrected cache-tag and the CPU-tag are equal, then equality comparator 211 may assert that a cache memory hit has occurred. If the corrected cache-tag and the CPU-tag are not equal, then equality comparator 211 may assert that a cache memory miss has occurred.
FIG. 2b is a diagram illustrating a portion of a cache memory 250 wherein cache memory 250 makes use of a prior art technique for determining cache memory hit or miss and wherein the memory system makes use of ECC. Cache memory 250 attempts to shorten delays incurred by the addition of ECC by computing a fast hit value, which may or may not represent an actual cache memory hit. Cache memory 250 includes cache tag memory 207, SEC-DED ECC unit 209, and equality comparator 211 which serve substantially the same purpose in cache memory 250 as they do in cache memory 200, which is determining the occurrence of a cache memory hit or a cache memory miss. The result produced by these units is referred to as being a true hit.
However, to shorten delays due to the inclusion of the ECC, cache memory 250 also computes a fast hit. As discussed above, a fast hit may not accurately represent a cache memory hit. A memory address that results in a fast hit may actually end up being a cache memory miss, so a fast hit is not guaranteed to be correct. Cache memory 250 includes an ECC generator 255. Similar to ECC generator 205, ECC generator 255 generates an ECC from the CPU-tag of the memory address provided by the processing unit. ECC generator 255 outputs an ECC that may be compared with an ECC retrieved from cache tag memory 207. The comparison may be performed by an equality comparator 257. If the two ECCs are equal, then a fast hit is asserted. Since the delay through ECC generator 255 and equality comparator 257 is typically shorter than the delay through cache tag memory 207, SEC-DED ECC unit 209, and equality comparator 211, the fast hit will usually be available before the true hit.
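The reason a fast hit may differ from a true hit can be illustrated with a short sketch. It assumes a toy ECC generator built from Hamming check bits plus an overall parity bit over an 8-bit tag. The actual code used by ECC generator 255 is not specified here, and the names generate_ecc, fast_hit, and find_alias are hypothetical. Because the ECC has fewer bits than the tag, distinct tags may share the same ECC.

```python
TAG_BITS = 8  # assumed tag width, for illustration only


def generate_ecc(tag, m=TAG_BITS):
    """Toy ECC generator: Hamming check bits followed by an overall parity bit."""
    check, pos, i = 0, 1, 0
    while i < m:
        if pos & (pos - 1):       # non-power-of-two positions carry tag bits
            if (tag >> i) & 1:
                check ^= pos      # XOR in the position of each set tag bit
            i += 1
        pos += 1
    overall = (bin(tag).count("1") + bin(check).count("1")) & 1
    return (check << 1) | overall


def fast_hit(cpu_tag, stored_ecc):
    """Models the ECC-only comparison: no tag is read or corrected."""
    return generate_ecc(cpu_tag) == stored_ecc


def find_alias(tag, m=TAG_BITS):
    """Search for a different tag that produces the same ECC as `tag`."""
    target = generate_ecc(tag, m)
    for other in range(1 << m):
        if other != tag and generate_ecc(other, m) == target:
            return other
    return None
```

With this toy code, a tag returned by find_alias asserts a fast hit against the ECC stored for the original tag even though a full tag comparison would report a miss, which is why the true hit path may still be needed to confirm the result.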
FIG. 2c is a diagram illustrating a detailed view of SEC-DED ECC unit 209. SEC-DED ECC unit 209 includes a SEC-DED syndrome unit 265 that may be used to compute an error syndrome from the cache-tag. The error syndrome may be a pattern that is representative of any errors present in the cache-tag. The error syndrome may be provided to an error bit decoder 267 that may be used to determine which bit (or bits) in the cache-tag contains an error. An error bit corrector 269 may then be used to correct the erroneous bit (bits) in the cache-tag.
The error syndrome from SEC-DED syndrome unit 265 may also be provided to a SED/DED decoder 271. SED/DED decoder 271 may determine from the error syndrome whether a single error (single error detection, or SED) or a double error (double error detection, or DED) has been detected. If a single error was detected, then error bit corrector 269 may be able to correct the error in the cache-tag. If a double error was detected, then error bit corrector 269 may not be able to correct the error in the cache-tag, and a cache memory miss is indicated. The output of error bit corrector 269 may be compared with the CPU-tag in equality comparator 211 to determine if a cache memory hit or a cache memory miss occurred.
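The operation of a SEC-DED unit of this kind may be sketched in software as follows. The sketch assumes an extended Hamming code (Hamming check bits plus one overall parity bit) over an 8-bit tag, which is one common SEC-DED construction and is not necessarily the code used by SEC-DED ECC unit 209. All function names and the tag width are hypothetical, and the comments map each step onto the units of FIG. 2c for orientation only.

```python
TAG_BITS = 8  # assumed tag width, for illustration only


def data_positions(m=TAG_BITS):
    """1-based codeword positions that carry tag bits (the non-powers of two)."""
    positions, pos = [], 1
    while len(positions) < m:
        if pos & (pos - 1):
            positions.append(pos)
        pos += 1
    return positions


def parity(value):
    return bin(value).count("1") & 1


def hamming_check(tag, m=TAG_BITS):
    """XOR of the codeword positions of all set tag bits."""
    check = 0
    for i, pos in enumerate(data_positions(m)):
        if (tag >> i) & 1:
            check ^= pos
    return check


def generate_ecc(tag, m=TAG_BITS):
    """Return (check bits, overall parity bit) for a tag."""
    check = hamming_check(tag, m)
    return check, parity(tag) ^ parity(check)


def sec_ded(cache_tag, check, overall, m=TAG_BITS):
    """Return (tag, status); status is 'none', 'single', or 'uncorrectable'."""
    # Syndrome computation (compare SEC-DED syndrome unit 265).
    syndrome = hamming_check(cache_tag, m) ^ check
    parity_error = parity(cache_tag) ^ parity(check) ^ overall
    # Single versus double error decision (compare SED/DED decoder 271).
    if syndrome == 0 and parity_error == 0:
        return cache_tag, "none"
    if parity_error == 0:
        return cache_tag, "uncorrectable"  # even number of flipped bits
    # Locate the erroneous bit (compare error bit decoder 267).
    positions = data_positions(m)
    if syndrome in positions:
        # Flip the erroneous tag bit (compare error bit corrector 269).
        return cache_tag ^ (1 << positions.index(syndrome)), "single"
    if syndrome & (syndrome - 1) == 0:
        return cache_tag, "single"  # error lies in an ECC bit; tag is intact
    return cache_tag, "uncorrectable"  # syndrome points at no valid position


def true_hit(cpu_tag, cache_tag, check, overall):
    """Correct the stored tag, then compare (compare equality comparator 211)."""
    corrected, status = sec_ded(cache_tag, check, overall)
    return status != "uncorrectable" and corrected == cpu_tag
```

Under these assumptions, a stored tag with one flipped bit still yields a hit after correction, whereas a stored tag with two flipped bits is reported as uncorrectable and therefore treated as a miss.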