1. Technical Field
Embodiments generally relate to providing memory sparing information—i.e. information for use in implementing memory sparing. More particularly, certain embodiments provide techniques for storing memory sparing information in a memory controller, wherein the memory sparing information is specific to a line of data of a memory controlled by that memory controller.
2. Background Art
In existing memory sparing techniques, a memory controller which controls multiple ranks of random access memory (RAM) will designate one of the memory ranks to operate as an alternate for another of the memory ranks. Typically, a rank which is to be made available for operation as a spare rank is designated ahead of time when the memory system is initialized and configured. The designated rank of memory may subsequently operate as a spare rank for another rank—e.g. in response to that other rank being identified as having failed to satisfy some performance criteria.
For example, when a memory controller accesses a first memory rank to read a given chunk of data, one or more check bits in an associated check byte may also be accessed to determine the validity of the data chunk. If the chunk of data is corrupt, but not too corrupt, the memory controller may resort to using data in the check byte to reconstruct the data in the chunk. This reconstruction results in a corrected error correction code (ECC) error.
For the purposes of memory sparing, a memory controller may keep track of the number of ECC errors which are corrected over time for a given memory rank. Some spike or other upward trend in the number of corrected ECC errors may be detected by the memory controller as being predictive of that rank of memory returning excessively corrupted, even uncorrectable, data in future.
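The per-rank bookkeeping described above can be illustrated with a minimal sketch. It assumes a sliding window over the most recent reads and a fixed threshold as the "spike" criterion. The class name RankErrorTracker, its parameters, and the threshold rule are hypothetical choices made for illustration only; the text does not specify how a spike or trend is detected.

```python
from collections import deque


class RankErrorTracker:
    """Hypothetical tracker of corrected ECC errors per memory rank.

    Assumption: a rank is flagged once the number of corrected errors
    within the last `window` reads reaches `threshold`.
    """

    def __init__(self, window=1000, threshold=5):
        self.window = window
        self.threshold = threshold
        self.history = {}  # rank id -> deque of 0/1 flags, one per read

    def record_read(self, rank, corrected):
        """Record one read of `rank`; `corrected` is True if ECC fixed it."""
        flags = self.history.setdefault(rank, deque(maxlen=self.window))
        flags.append(1 if corrected else 0)

    def corrected_count(self, rank):
        """Corrected ECC errors seen for `rank` within the current window."""
        return sum(self.history.get(rank, ()))

    def predicted_to_fail(self, rank):
        """True if the corrected-error count has reached the threshold."""
        return self.corrected_count(rank) >= self.threshold
```

Note that a tracker of this kind keeps one counter per rank, so it can only say that a rank as a whole looks unreliable; it carries no information about which lines within the rank produced the errors.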
Currently, memory sparing is a preventative measure so that when such a spike/trend indicates this unreliability of a first memory rank in future, a second memory rank is used as a spare rank to substitute for that first memory rank. Such memory sparing techniques only provide very gross, low-resolution sparing. More particularly, memory sparing is done on a rank level, where one or more memory ranks are allocated collectively to be the backup spare memory operating as a substitute for another one or more memory ranks which have failed, or are predicted to fail. However, the memory rank(s) which has been classified as failing or predicted to fail nevertheless may have comparatively few individual failed bits, and indeed may include billions of bits which are operative.
Typically, current memory sparing techniques set aside one or more unused memory ranks for the possibility of their eventual use as spare memory ranks for other memory ranks which are in use. In such techniques, memory controllers generally implement a “track switch”, e.g. to switch from using memory rank X to using (previously unused) memory rank Y, from using dual in-line memory module (DIMM) A to using (previously unused) DIMM B, etc. This unused memory has to be allocated memory space and power, which is a source of resource waste.
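The rank-level "track switch" described above can be sketched as a simple remapping table. This is an illustrative model only: the class RankSparingController and its methods are hypothetical, and the copying of data from the failing rank to the spare rank is deliberately omitted.

```python
class RankSparingController:
    """Hypothetical model of conventional rank-level sparing."""

    def __init__(self, active_ranks, spare_ranks):
        # Each in-use rank initially maps to itself.
        self.rank_map = {rank: rank for rank in active_ranks}
        # Spare ranks are reserved at initialization and sit unused.
        self.spares = list(spare_ranks)

    def spare_rank(self, failing_rank):
        """Redirect an entire failing rank to a reserved spare rank.

        Returns the spare rank now in use, or None if no spare remains.
        Data migration is not modeled here.
        """
        if failing_rank not in self.rank_map:
            raise KeyError(f"unknown rank: {failing_rank!r}")
        if not self.spares:
            return None
        spare = self.spares.pop(0)
        self.rank_map[failing_rank] = spare
        return spare

    def resolve(self, rank):
        """Return the physical rank that currently serves accesses to `rank`."""
        return self.rank_map[rank]
```

For instance, after RankSparingController(["X"], ["Y"]).spare_rank("X"), every access to rank X resolves to rank Y, including accesses to the many lines of X that were still operative. Until that switch occurs, rank Y is reserved but does no useful work.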