This invention relates generally to data storage systems for computers, and more particularly to a solid-state disk memory storage system.
As is known in the art, computer systems generally use several types of memory systems. For example, computer systems generally use so-called main memory comprised of semiconductor devices, which typically can be randomly written to and read from with comparable and very fast access times and thus are commonly referred to as random access memories. However, since semiconductor memories are relatively expensive, other higher density and lower cost memories are often used. For example, other memory systems include magnetic disk storage systems. With magnetic disk storage systems, access times are generally on the order of tens of milliseconds, whereas main memory access times are on the order of hundreds of nanoseconds. Disk storage is used to store large quantities of data which can be sequentially read into main memory as needed. Another type of disk-like storage is solid-state disk storage. Solid-state disk storage is comprised of solid-state memory devices which are accessed as though they were magnetic disks. One advantage of solid-state disks, however, is that they are much faster than magnetic disk memory systems. So-called solid-state disk memory systems are also more expensive than magnetic disk devices.
As is also known in the art, memory systems, including the aforementioned solid-state disk memory storage system, traditionally use storage devices with no "hard" faults. That is, the system is constructed to account for temporary or "soft" errors, but if the storage device has a hard or permanent fault, the device is replaced, or addressing is configured to map around the hard or permanent fault location. Since dynamic memory devices (DRAMs) are susceptible to radiation-induced soft failures, simple error detection and correction techniques are often used to withstand single-bit-per-word errors. Such non-permanent failures can be "scrubbed" by writing the corrected data back into the memory location. In such a memory system, the typical failure mechanism is the combination of a hard error and a soft failure in the same memory location.
In the DRAM manufacturing process, any given lot yields some percentage, typically large but less than 100%, of devices with no defects. The bulk of the failing devices have only a single storage location that is defective, although many other failures that affect more than one location are possible. The failing devices are typically sent to a crusher and destroyed, although some uses for such devices have developed in digital audio, e.g., so-called Audio RAMs.
Several methods have been proposed to allow the use of DRAMs having defective bits. For example, in U.S. Pat. No. 4,992,984, a method of identifying and mapping around bad locations is disclosed. The overhead imposed by the need for finding the faults and then translating addresses for mapping around faulty locations in a memory results in lower performance in the overall system.
A goal of semiconductor integrated circuit designers for many years has been so-called "wafer scale integration." Enormous economies could result from using an entire wafer in undivided form, rather than scribing and breaking it into individual chips, then packaging the chips. However, the likelihood of an entire wafer having no defective cells is minuscule, so some mechanism is needed to account for defective parts of the wafer. Usually, the method of accounting for defects is to locate the defective cells by testing, then to map around the defects. That is, the addresses of the defective locations are stored in the memory system, for example in a section of non-volatile memory, and attempts to access these locations are mapped to spare locations.
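The mapping-around approach described above can be illustrated with a minimal sketch. It is not taken from any cited reference; the class name `RemappedMemory`, its parameters, and the choice to place spare locations at the end of the array are all assumptions made for illustration.

```python
class RemappedMemory:
    """Toy memory that redirects known-defective addresses to spare locations.

    Hypothetical illustration only; names and layout are assumptions.
    """

    def __init__(self, size, spare_count, defective_addresses):
        self.size = size
        # Main array followed by a small pool of spare locations.
        self.cells = [0] * (size + spare_count)
        # Defect list found by testing; a real system would keep it in
        # non-volatile storage and load it at start-up.
        defects = sorted(set(defective_addresses))
        if len(defects) > spare_count:
            raise ValueError("not enough spare locations for the defect list")
        # Each defective address is assigned one spare location.
        self.remap = {bad: size + i for i, bad in enumerate(defects)}

    def _translate(self, address):
        if not 0 <= address < self.size:
            raise IndexError("address out of range")
        # Every access performs this lookup, which is the translation overhead.
        return self.remap.get(address, address)

    def write(self, address, value):
        self.cells[self._translate(address)] = value

    def read(self, address):
        return self.cells[self._translate(address)]
```

In this sketch every read and write passes through `_translate`, which is where the performance cost of address translation would arise in such a scheme.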
A technique of this type was disclosed by Anamartic, Ltd. for dealing with wafer-scale memories. The technique is to break the memory into subsections, called "tiles," where the size of a tile is a tradeoff between overhead circuitry and the amount of storage lost in the case of a memory fault (typical sizes are 8 KB to 64 KB). The memory system includes sufficient spare tiles so that a given capacity can be guaranteed in the face of errors. Each tile contains logic that can be used to disable that tile if a fault is found in the tile. This configuration logic is set up on initialization from a map that is stored in a non-volatile medium such as magnetic disk or EEPROM.
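A tile-based configuration of this general kind might be sketched as follows. This is an illustrative assumption, not the Anamartic design; the function names `configure_tiles` and `translate`, and the policy of assigning logical tiles to the lowest-numbered good tiles, are hypothetical.

```python
def configure_tiles(total_tiles, faulty_tiles, required_tiles):
    """Build a logical-to-physical tile map that skips disabled tiles.

    faulty_tiles stands in for the fault map read from non-volatile
    storage at initialization (hypothetical format: a list of tile indices).
    """
    faulty = set(faulty_tiles)
    usable = [t for t in range(total_tiles) if t not in faulty]
    if len(usable) < required_tiles:
        # Too few good tiles remain to guarantee the advertised capacity.
        raise ValueError("too many faulty tiles to guarantee capacity")
    return usable[:required_tiles]


def translate(tile_map, tile_size, address):
    """Convert a logical byte address into a physical byte address."""
    logical_tile, offset = divmod(address, tile_size)
    return tile_map[logical_tile] * tile_size + offset
```

For example, with ten 8 KB tiles of which tiles 2 and 5 are faulty, `configure_tiles(10, [2, 5], 8)` still yields eight usable tiles, and logical tile 2 is served by physical tile 3.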
Error correcting codes have been used in memory sub-systems to enhance the reliability of memory. That is, ostensibly perfect memory elements have associated failure rates (usually soft errors caused by alpha particle hits in DRAMs) which could cause the memory system to return faulty data, were it not for the error correcting code. The single-bit error correcting Hamming code normally used in memory has the virtue that it is not too complex to implement in a way that minimizes its impact on memory performance, while substantially improving the main memory reliability.
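As a small illustration of single-bit error correction, the sketch below uses the textbook Hamming(7,4) code, which protects 4 data bits with 3 parity bits. The code size is an assumption chosen for brevity; memory systems typically apply a Hamming code to much wider words, and the function names are hypothetical.

```python
def hamming74_encode(nibble):
    """Encode a 4-bit value as a 7-bit Hamming codeword (list of bits).

    Codeword positions are numbered 1..7; parity bits sit at 1, 2 and 4.
    """
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the least significant bit
    code = [0] * 8                             # index 0 is unused
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]      # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]      # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]      # covers positions 4, 5, 6, 7
    return code[1:]


def hamming74_decode(bits):
    """Correct up to one flipped bit; return (nibble, error_position).

    error_position is 0 when no error was detected, else 1..7.
    """
    code = [0] + list(bits)
    # XOR of the positions of all set bits is 0 for a valid codeword and
    # equals the position of the flipped bit after a single-bit error.
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    if syndrome:
        code[syndrome] ^= 1                    # flip the faulty bit back
    d = [code[3], code[5], code[6], code[7]]
    nibble = sum(bit << i for i, bit in enumerate(d))
    return nibble, syndrome
```

Writing the corrected value back to the memory location after a nonzero syndrome would correspond to the "scrubbing" of soft errors; this code corrects only one bit per codeword, so a hard fault plus a soft error in the same word would exceed its capability.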
Often in memory systems, complex error correcting techniques are used. In particular, so-called non-binary block codes such as Reed-Solomon codes are used. With these codes, data are grouped in "symbols," each of which comprises data from several data words, such as the i-th bit of each of N words. Data formatters are used to convert from data word format to symbol format. One problem is that such formatters are not generally configurable. With solid-state disks, it often occurs that different DRAMs are used in a disk.
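The word-to-symbol conversion performed by a data formatter can be sketched as a bit transposition. This is a minimal illustration under assumed conventions, not a description of any particular formatter: the function names are hypothetical, and placing word 0 in the least significant bit of each symbol is an arbitrary choice.

```python
def words_to_symbols(words, word_width):
    """Group bit i of each of the N words into symbol i."""
    symbols = []
    for i in range(word_width):
        symbol = 0
        for n, word in enumerate(words):
            # Bit i of word n becomes bit n of symbol i.
            symbol |= ((word >> i) & 1) << n
        symbols.append(symbol)
    return symbols


def symbols_to_words(symbols, word_count):
    """Inverse conversion: rebuild the N data words from the symbols."""
    words = []
    for n in range(word_count):
        word = 0
        for i, symbol in enumerate(symbols):
            # Bit n of symbol i is restored as bit i of word n.
            word |= ((symbol >> n) & 1) << i
        words.append(word)
    return words
```

With this grouping, errors confined to bit position i of the N words all fall within symbol i. Here the word width and word count are ordinary parameters, whereas a fixed hardware formatter would typically have them built in.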