This invention relates to computer systems having a DRAM memory and, more particularly, to computer systems having DRAM memories in which failed chips, chip sets, or chip I/Os can be replaced with good chips, chip sets, or chip I/Os without having to replace the entire memory or reboot the system.
As the amount of memory on a DIMM or SIMM card utilized in a computer system increases, the number of chips which make up the memory card also increases. With the increasing number of DRAM chips on a DIMM or SIMM card, there is an increased possibility of failure of one or more of these chips. In order to obviate the need for discarding an entire memory card if a single chip fails, there have been proposals for so-called "Chip Sparing", in which one or more "spare" chips, i.e. chips that are not normally used for data storage, are provided on the SIMM or DIMM card. When a chip fails, the "spare" chip is inserted into the system in place of the failed chip, sometimes referred to as chip kill, and the memory is allowed to continue operating. In the past, this chip replacement has been accomplished in one of two ways. In the first, the system is completely shut down, the spare chip is switched into the memory system in place of the failed chip while the system is powered down, and the system is then rebooted. In the alternative, with some more sophisticated systems, the memory card is switched off-line and reinitialized with the spare chip. Both the rebooting of the system and the re-initialization of the memory card are time consuming and disrupt continuous operation of the computer system.
According to the present invention, an improved chip sparing system and method of operation are provided in which one or more spare chips are provided within the system and a failed chip is detected even if there are multiple errors on a single chip. Upon detection of a failed chip or chip I/O, spare chips or spare chip I/Os are dynamically inserted into the system without the necessity of shutting down and rebooting the system, and even without the necessity of re-initializing the memory.
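By way of illustration only, the dynamic insertion described above can be modeled in software as a steering map from logical chip positions to physical chips. The following is a minimal sketch under stated assumptions, not the claimed hardware: the class name SparedRank, the method spare_in, and the steering-map representation are hypothetical, and the sketch assumes that error correction has already recovered the correct contents of the failed chip position (passed in as corrected_rows).

```python
class SparedRank:
    """Toy model of a memory rank with several data chips and one spare chip.

    Hypothetical illustration: names and structure are assumptions,
    not taken from the specification.
    """

    def __init__(self, data_chips=4, rows=8):
        self.rows = rows
        # Physical chips: the data chips followed by one spare chip.
        self.chips = [[0] * rows for _ in range(data_chips + 1)]
        # Physical index of the unused spare (None once it is consumed).
        self.spare = data_chips
        # Steering map: logical chip position -> physical chip index.
        self.steer = list(range(data_chips))

    def write(self, row, values):
        # Each logical position is routed through the steering map.
        for pos, value in enumerate(values):
            self.chips[self.steer[pos]][row] = value

    def read(self, row):
        return [self.chips[self.steer[pos]][row]
                for pos in range(len(self.steer))]

    def spare_in(self, failed_pos, corrected_rows):
        """Steer the spare into a failed position while the rank stays in service.

        corrected_rows holds the contents of the failed position as
        recovered by error correction (assumed to be available).
        """
        if self.spare is None:
            raise RuntimeError("no spare chip left")
        # Copy the recovered contents into the spare chip.
        for row, value in enumerate(corrected_rows):
            self.chips[self.spare][row] = value
        # Redirect the logical position to the spare; no reset or re-init.
        self.steer[failed_pos] = self.spare
        self.spare = None
```

In this model, reads and writes issued after spare_in are routed to the spare chip through the updated steering map, and the contents of the other chips are never touched, which corresponds to continuing operation without rebooting or re-initializing the memory.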