1. Technical Field
The invention relates generally to the field of dynamic random access memory (DRAM) design, and more particularly to a DRAM architecture that combines on-chip error correction code (ECC) circuitry, bit line redundancy, and word line redundancy so as to optimize the ability of the DRAM to correct different types of errors.
2. Background Art
From the very early stages of DRAM development in the 1970's, designers have recognized the need for some sort of on-chip error recovery circuitry. That is, given the large number of processing steps needed to make a memory chip, and given the large number of discrete transistor-capacitor memory cells to be fabricated, from a practical standpoint it is inevitable that at least some memory cells will not function properly.
One of the first on-chip error recovery techniques utilized in the industry was the general idea of redundancy. In redundancy, one or more spare lines of cells are added to the chip. These can be either spare word lines (i.e. lines of cells having their FET gate electrodes interconnected) or spare bit lines (i.e. lines of cells having their FET drain electrodes interconnected on a common line coupled to a sense amplifier that senses the state of the selected memory cell). Typically, a standard NOR address decoder is provided for each redundant line. After the memory chip is manufactured, it is tested to determine the addresses of faulty memory cells. These addresses are programmed into the address decoder for the redundant lines, by controllably blowing fuses, setting the state of a RAM or EEPROM, etc. When the address sent to the memory chip is for the line on which the faulty cell resides, the address decoder for the redundant line activates the redundant line instead. In this manner, if discrete cells in the memory chip are inoperative, redundant cells can be substituted for them. Among the earliest patents directed to redundancy are U.S. Pat. No. 3,753,244, entitled "Yield Enhancement Redundancy Technique," issued Aug. 28, 1973 to Sumilas et al and assigned to IBM (word line redundancy), and U.S. Pat. No. 3,755,791, entitled "Memory System With Temporary or Permanent Substitution of Cells For Defective Cells," issued Aug. 28, 1973 to Arzubi and assigned to IBM (bit line redundancy).
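The address substitution described above can be illustrated with a small behavioral sketch. It is not taken from the cited patents; the class name `RedundantRowDecoder`, its methods, and the dictionary standing in for blown fuses are all hypothetical, and only word line redundancy is modeled.

```python
class RedundantRowDecoder:
    """Toy model of word line redundancy (hypothetical names throughout).

    A dictionary stands in for the fuse-programmed decoders: each entry maps
    one faulty row address to the spare line that replaces it.
    """

    def __init__(self, num_rows, num_spares):
        self.num_rows = num_rows
        self.num_spares = num_spares
        self.fuse_map = {}  # faulty row address -> spare line index

    def program(self, faulty_row):
        """Assign the next free spare line to a row found faulty at test."""
        if not 0 <= faulty_row < self.num_rows:
            raise ValueError("row address out of range")
        if faulty_row in self.fuse_map:
            return self.fuse_map[faulty_row]
        if len(self.fuse_map) >= self.num_spares:
            raise RuntimeError("no spare lines left")
        spare = len(self.fuse_map)
        self.fuse_map[faulty_row] = spare
        return spare

    def select(self, row):
        """Return which physical line an incoming row address activates."""
        if row in self.fuse_map:
            return ("spare", self.fuse_map[row])
        return ("main", row)
```

Once the spares are exhausted, `program` raises an error, which mirrors the limitation that only a small number of faulty lines can be repaired this way.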
One of the drawbacks associated with redundancy is that it can only rectify a relatively small number of randomly located faulty cells. That is, as the number of faulty cells increases, the number of redundant lines needed to correct these cells increases, to the point where the chip carries a large amount of spare memory capacity that ordinarily is not used (and may itself incorporate faulty cells, such that even more redundant lines are needed to correct errors in the remaining redundant lines). Therefore, typically a relatively small number of redundant lines are provided on-chip, such that if an entire subarray or array of cells is faulty, redundancy can no longer be used for correction.
This problem is addressed by the use of partially-good chips. Two or more chips having large numbers of faulty cells are mounted and stacked together in a multi-chip package. In one technique, the chips are selected such that they complement one another in terms of which arrays are good and which arrays are faulty. For example, if a given array on a first memory chip is bad, a second chip is selected wherein that same array is good. Thus, the two partially-good chips operate as one all-good chip. See U.S. Pat. No. 3,714,637, entitled "Monolithic Memory Utilizing Defective Storage Cells"; U.S. Pat. No. 3,735,368, entitled "Full Capacity Monolithic Memory Utilizing Defective Storage Cells"; and U.S. Pat. No. 3,781,826, entitled "Monolithic Memory Utilizing Defective Storage Cells", all issued to W. Beausoleil and assigned to IBM.
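The selection of complementary chips can be sketched as a simple matching problem. The sketch below is only an illustration under assumed conditions: each chip is represented by the set of its faulty array indices, and the greedy strategy and the function name `pair_partially_good` are hypothetical rather than drawn from the Beausoleil patents.

```python
def pair_partially_good(chips):
    """Greedily pair chips whose faulty arrays do not overlap.

    chips: dict mapping a chip id to the set of faulty array indices.
    Returns a list of (chip, partner) pairs; chips with no partner are dropped.
    """
    unpaired = dict(chips)
    pairs = []
    for chip_id in list(chips):
        if chip_id not in unpaired:
            continue  # already used as a partner for an earlier chip
        bad = unpaired.pop(chip_id)
        # Two chips complement each other when no array is faulty on both.
        partner = next(
            (other for other, other_bad in unpaired.items()
             if not (bad & other_bad)),
            None,
        )
        if partner is not None:
            del unpaired[partner]
            pairs.append((chip_id, partner))
    return pairs
```

A greedy pass does not guarantee the largest possible number of pairs; it is used here only to keep the sketch short.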
Over time, some workers in the art have come to understand that the error recovery techniques discussed may not efficiently rectify all of the possible errors that may occur during DRAM operation. Specifically, a memory cell that initially operates properly may operate improperly once it is in use in the field. This may be either a so-called "soft error" (e.g. a loss of stored charge due to an alpha particle radiated by the materials within which the memory chip is packaged) or a "hard error" (a cycle-induced failure in the metallization or other material in the chip that occurs after prolonged use in the field). Because both of these types of errors occur after initial testing, they cannot be corrected by redundancy or by the use of partially-good chips. In general, this problem has been addressed by the use of error correction codes (ECC) such as Hamming codes or horizontal-vertical (HV) parity. These techniques are typically used in larger computer systems wherein data is read out in the form of multi-bit words.
The Hamming ECC double error detect, single error correct (DED/SEC) system of the prior art will now be briefly described. The data is stored as an ECC word having both data bits and check bits. The check bits indicate the correct logic states of the associated data bits. The ECC logic tests the data bits using the check bits, to generate syndrome bits indicating which bits in the ECC word are faulty. Using the syndrome bits, the ECC logic then corrects the faulty bit, and the ECC word as corrected is sent on to the processor for further handling.
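The DED/SEC flow just described (check bits, syndrome generation, correction) can be sketched with a textbook extended Hamming code. This is a generic illustration, not the circuitry of any cited reference; the bit layout (check bits at power-of-two positions, one overall parity bit at index 0) and the function names `encode` and `decode` are assumptions made for the example.

```python
def _is_data_position(pos):
    """Positions that are not powers of two hold data bits."""
    return pos & (pos - 1) != 0


def encode(data_bits):
    """Build an ECC word: index 0 is overall parity, 1..n is a Hamming code."""
    m = len(data_bits)
    r = 0
    while (1 << r) < m + r + 1:  # smallest r with 2**r >= m + r + 1
        r += 1
    n = m + r
    word = [0] * (n + 1)
    bits = iter(data_bits)
    for pos in range(1, n + 1):
        if _is_data_position(pos):
            word[pos] = next(bits)
    for i in range(r):
        p = 1 << i
        parity = 0
        for pos in range(1, n + 1):
            if pos & p:
                parity ^= word[pos]
        word[p] = parity  # makes the parity group covered by p even
    word[0] = sum(word[1:]) % 2  # overall parity enables double error detect
    return word


def decode(word):
    """Return (status, data_bits) after checking and, if possible, correcting."""
    n = len(word) - 1
    syndrome = 0
    for pos in range(1, n + 1):
        if word[pos]:
            syndrome ^= pos  # syndrome equals the position of a single bad bit
    overall_odd = sum(word) % 2 == 1
    fixed = list(word)
    if syndrome == 0 and not overall_odd:
        status = "ok"
    elif overall_odd and syndrome <= n:
        fixed[syndrome] ^= 1  # syndrome 0 here means the parity bit itself
        status = "corrected"
    elif overall_odd:
        status = "uncorrectable"
    else:
        status = "double error detected"
    data = [fixed[pos] for pos in range(1, n + 1) if _is_data_position(pos)]
    return status, data
```

For 8 data bits this layout uses 4 Hamming check bits plus the overall parity bit. The data returned alongside a "double error detected" or "uncorrectable" status should not be trusted.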
As previously stated, in the prior art ECC circuitry was typically used in large systems and embodied in separate functional cards, etc. While this type of system-level ECC is now being used in smaller systems, it still adds a degree of both logic complexity and expense (due to added circuit cost and decreased data access speed) that makes it infeasible for less complicated systems. In these applications, memory performance/reliability suffers because there is no system-level ECC to correct for errors that occur after initial test.
The solution to this problem is to incorporate ECC circuitry on the memory chip itself. This reduces the expense associated with ECC, while at the same time increasing the effective memory performance. U.S. Pat. No. 4,335,459, entitled "Single Chip Random Access Memory With Increased Yield and Reliability," issued Jun. 15, 1982 to Miller, relates to the general idea of incorporating Hamming code ECC on a memory chip. The stored data is read out in ECC words consisting of 12 bits (8 data bits, 4 check bits) that are processed by the ECC circuitry. The corrected 8 data bits are sent to an 8-bit register. The register receives address signals that select one of the 8 bits for output through a single bit I/O. U.S. Pat. No. 4,817,052, entitled "Semiconductor Memory With An Improved Dummy Cell Arrangement And With A Built-In Error Correcting Code Circuit," issued Mar. 28, 1989 to Shinoda et al and assigned to Hitachi, discloses a particular dummy cell configuration as well as the general idea of interdigitating the word lines so that adjacent failing cells on a word line will appear to the ECC system as single-bit fails (and thus be correctable), because they will appear in different ECC words.
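The benefit of placing adjacent cells in different ECC words can be shown with a short sketch. The modulo interleave used here is an assumed, generic mapping chosen for illustration; it is not the specific arrangement disclosed by Shinoda et al, and the function names are hypothetical.

```python
def interleaved_word(column, num_words):
    """Assumed interleave: neighboring columns belong to different ECC words."""
    return column % num_words


def contiguous_word(column, word_length):
    """Baseline: each ECC word occupies a contiguous run of columns."""
    return column // word_length


def fails_per_word(failing_columns, num_words, word_of):
    """Count how many failing cells land in each ECC word."""
    counts = [0] * num_words
    for column in failing_columns:
        counts[word_of(column)] += 1
    return counts


# Three adjacent failing cells on one word line, 4 ECC words of 12 bits each.
cluster = [5, 6, 7]
spread = fails_per_word(cluster, 4, lambda c: interleaved_word(c, 4))
packed = fails_per_word(cluster, 4, lambda c: contiguous_word(c, 12))
```

With the interleave, no ECC word sees more than one failing bit, so a single error correct code can repair every word. With the contiguous layout, all three failures fall in the same word and exceed what such a code can correct.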
Yet other workers have recognized that the optimum solution to error correction is to incorporate both ECC circuitry and redundancy on the same memory chip. Examples of such arrangements include U.S. Pat. No. 4,688,219, entitled "Semiconductor Memory Device Having Redundant Memory and Parity Capabilities," issued Aug. 18, 1987 to Takemae and assigned to Fujitsu (bit line redundancy incorporated with HV parity by use of a switching circuit that generates the parity bits for the redundant column line separately from the generation of the parity bits for the remaining cells); U.S. Pat. No. 4,768,193, issued Aug. 30, 1988 to Takemae and assigned to Fujitsu (an array contiguous to the main memory array provides both word line and bit line redundancy for an HV ECC system, wherein fuses are used to disconnect the faulty word line and/or bit line from the horizontal and/or vertical parity generators, respectively); and an article by Furutani et al, "A Built-In Hamming Code ECC Circuit for DRAM's," IEEE Journal of Solid-State Circuits, Vol. 24, No. 1, Feb. 1989, pp. 50-56 (new ECC circuitry for an on-chip Hamming code system with redundancy; the article does not discuss redundancy in any detail).
In all of the above references, bit line and word line redundancy techniques are used that are not optimized for on-chip ECC. In the '219 Takemae patent, conventional bit line redundancy is used, with separate parity generation for the redundant line. In the '193 Takemae patent, a single array provides both bit line and word line redundancy. Since Furutani does not describe a redundancy system, it appears that he simply assumes that conventional redundancy can be used. This assumption is not incorrect; as shown by the Takemae patents, conventional redundancy techniques can be used. However, we have found that as a practical matter conventional redundancy will decrease the overall effectiveness of the total error correction system. For example, by having one array provide both bit and word redundancy, the error correction system itself becomes more susceptible to errors, because the redundant cells are physically all in one place. Moreover, the use of ideas such as fuses to physically disconnect the faulty main memory rows/columns from the ECC circuitry, and/or incorporating an entirely separate set of ECC circuitry for the redundant elements, adds extra logic to the design that takes up more room on the chip while adding yet another failure mechanism.
Also, none of these references takes into account the use of ECC as a tool to aid in process learning during the early stages of design and development of a memory chip. Due to the complexity and uniqueness of the myriad process steps that make up a given manufacturing process for a memory chip, when the chips are first being made (i.e., early in the production cycle) many different failure mechanisms are encountered. At this early stage, it is critical to produce some sort of working hardware that can be tested, so as to gain a greater understanding of these failure mechanisms. ECC can be used as a tool to gain a greater appreciation of these mechanisms, because it can be used to rectify a large quantity of errors, both hard and soft. However, later in the production cycle of the chip, sufficient process learning may occur such that the number of errors is greatly reduced. In this situation, it may be advisable to completely do away with the ECC system, so as to reduce the chip size and increase access speeds. In the prior art, no provision is made for designing the overall chip architecture such that the ECC system can be deleted from product chips without a major redesign of the support circuitry.
Accordingly, a need exists in the art for a memory chip architecture that incorporates redundancy (as well as other features) optimized for on-chip ECC. Moreover, there is a need in the art for a memory architecture that supports early process learning without increasing the expense or decreasing the performance of memory chips made in production volumes.