As the cost of computational hardware has decreased, computers with ever-larger memory systems have proliferated. Systems with hundreds of Mbytes are common, and systems with a few Gbytes of memory are commercially available. As the size of the memory increases, problems arising from bad memory cells become more common.
Memory failures may be divided into two categories: those resulting from bad memory cells that are detected at the time of manufacture, and those that arise from cells that fail during the operation of the memory. At present, problems arising from defective memory cells that are detected during the manufacturing process are cured by replacing the bad cells. The typical memory array is divided into blocks. Each memory chip has a predetermined number of spare blocks fabricated thereon. If a block in the memory is found to have a defective memory cell, the block in question is disconnected from the appropriate bus and one of the spares is connected to the bus in its place. However, once the part is packaged, there is no means for replacing a block with a spare, since the replacement process requires hard wiring of the spares to the bus.
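The prior art replacement scheme described above can be illustrated with a minimal sketch. The class name, the method names, and the representation of the hard-wired connections as a lookup table are assumptions made only for illustration; an actual part makes the substitution in wiring, not in software.

```python
class SpareBlockMap:
    """Toy model of a memory array with a fixed pool of spare blocks.

    Hypothetical illustration only: real parts hard-wire the spares to
    the bus rather than keeping a software table.
    """

    def __init__(self, num_blocks: int, num_spares: int) -> None:
        # Logical block i is initially served by physical block i.
        self.mapping = list(range(num_blocks))
        # Spares occupy the physical indices after the main array.
        self.free_spares = list(range(num_blocks, num_blocks + num_spares))
        self.packaged = False

    def replace(self, logical_block: int) -> int:
        """Connect the next free spare in place of a defective block."""
        if self.packaged:
            raise RuntimeError("no replacement is possible after packaging")
        if not self.free_spares:
            raise RuntimeError("no spare blocks left")
        spare = self.free_spares.pop(0)
        self.mapping[logical_block] = spare
        return spare

    def package(self) -> None:
        """Model packaging, after which the mapping is frozen."""
        self.packaged = True
```

In this sketch, a call to `replace` made before `package` succeeds as long as a spare remains, whereas any call made afterwards raises an error, mirroring the limitation noted above.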
The cost of testing the memory chips is a significant factor in the cost of the chips. The rate at which memory cells can be tested is limited by the internal organization of the memory blocks and the speed of the buses that connect the memory blocks to the test equipment. The various buses are limited to speeds of a few hundred MHz. Data is typically written and read as blocks of 64 bits or fewer. Since a write operation followed by a read operation requires several clock cycles, the rate at which memory can be tested is limited to about 100 million tests per second. Extensive testing requires each memory cell to be tested a large number of times under different conditions such as temperature and clock speed. Hence, a 1 Gbyte memory chip would require minutes, if not hours, to test thoroughly. The cost of such testing would be prohibitive; hence, prior art memory chip designs do not permit extensive testing at the 1 Gbyte level and beyond.
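The arithmetic behind this estimate can be sketched as follows. The 64-bit block width and the rate of 100 million tests per second are taken from the paragraph above; the function name, the reading of 1 Gbyte as 2^30 bytes, and the figure of 1000 passes are assumptions chosen only for illustration.

```python
def estimate_test_time(capacity_bytes: int,
                       word_bits: int = 64,
                       tests_per_second: float = 100e6,
                       passes: int = 1) -> float:
    """Return the seconds needed to write and read back every word `passes` times."""
    words = capacity_bytes * 8 // word_bits   # number of 64-bit blocks in the memory
    return words * passes / tests_per_second


ONE_GBYTE = 2 ** 30  # assumption: 1 Gbyte taken as 2**30 bytes

single_pass = estimate_test_time(ONE_GBYTE)               # about 1.34 seconds
many_passes = estimate_test_time(ONE_GBYTE, passes=1000)  # hypothetical 1000 passes
print(f"one pass: {single_pass:.2f} s, 1000 passes: {many_passes / 60:.1f} min")
```

Under these assumptions a single write-and-read pass takes somewhat more than one second, so a test program that repeats the pass many times across temperatures and clock speeds reaches the range of minutes to hours stated above.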
Even when the obviously bad memory blocks have been removed, the memory will sooner or later fail because of the failure of one or more cells in a block. The probability that such a failure will cause a system failure depends on the lifetime of the system, the size of the memory, and the type of memory, and it increases with both the lifetime of the system and the size of the memory. While system lifetimes are not increasing, the size of memory is increasing. Accordingly, more system failures are expected.
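The dependence on memory size and system lifetime can be illustrated with a simple model. The model is an assumption introduced here and is not taken from the text: each cell is taken to fail independently at a constant rate, so that the probability of at least one cell failure is 1 - exp(-rate x cells x hours). The function name and the numerical failure rate are hypothetical.

```python
import math


def prob_any_cell_failure(num_cells: int,
                          lifetime_hours: float,
                          rate_per_cell_hour: float) -> float:
    """Probability that at least one cell fails during the lifetime.

    Assumes independent cells with a constant failure rate (a
    simplification adopted for this sketch only).
    """
    expected_failures = rate_per_cell_hour * num_cells * lifetime_hours
    # -expm1(-x) evaluates 1 - exp(-x) accurately even when x is tiny.
    return -math.expm1(-expected_failures)


HYPOTHETICAL_RATE = 1e-15        # failures per cell per hour, illustrative only
FIVE_YEARS_HOURS = 5 * 365 * 24

for mbytes in (64, 256, 1024):
    cells = mbytes * 2 ** 20 * 8   # one cell per bit
    p = prob_any_cell_failure(cells, FIVE_YEARS_HOURS, HYPOTHETICAL_RATE)
    print(f"{mbytes:5d} Mbytes: probability of a cell failure = {p:.3f}")
```

With the lifetime held fixed, the computed probability grows with each increase in memory size, which is the trend relied upon above.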
In addition, some types of memory cells have higher failure rates than others. For example, EEPROM and flash memories can only be written a relatively small number of times compared to conventional DRAMs and static RAMs. In the case of EEPROMs and flash memories, the limited number of write cycles imposes severe restrictions on the possible applications of these memories. Similarly, memories based on ferroelectrics have shorter lifetimes than these conventional memories; however, the ferroelectric memories can be written many more times than EEPROMs and flash memories.
In principle, all of these types of memories would benefit from having some form of reconfiguration system built directly into the memory. Such a system would replace blocks of memory that fail during the operational life of the system, thereby extending the lifetime of the system. However, prior to a memory cell actually failing, there is often a period of time in which the memory cell operates, but with a high error rate. Such a memory cell can cause intermittent system failures that may be very difficult to diagnose. Hence, any form of block replacement system that depends on detecting the failure of a block may not be able to operate successfully.
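The difficulty of detecting such a cell can be illustrated numerically. In the sketch below, the function name, the assumption that errors on successive accesses are independent, and the error rate of one error per thousand accesses are all hypothetical; the sketch shows only why a small number of pass/fail checks may miss a cell that errs intermittently.

```python
def escape_probability(error_rate: float, num_checks: int) -> float:
    """Probability that a marginal cell passes every one of `num_checks` checks.

    Assumes each check fails independently with probability `error_rate`
    (a simplification adopted for this sketch only).
    """
    return (1.0 - error_rate) ** num_checks


HYPOTHETICAL_ERROR_RATE = 1e-3   # one error per thousand accesses, illustrative only

for checks in (1, 100, 10_000):
    p = escape_probability(HYPOTHETICAL_ERROR_RATE, checks)
    print(f"{checks:6d} checks: probability the cell goes undetected = {p:.4f}")
```

Under these assumptions a cell that errs once per thousand accesses is still more likely than not to pass a hundred consecutive checks, even though at that error rate it would corrupt data regularly in normal operation.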
Broadly, it is the object of the present invention to provide an improved memory system.
It is a further object of the present invention to provide a memory system that can be reconfigured after the parts have been packaged.
It is a still further object of the present invention to provide a memory system that can detect memory cells with high error rates and replace these cells prior to the error rates causing system failures.
These and other objects of the present invention will become apparent to those skilled in the art from the following detailed description of the invention and the accompanying drawings.