Memory cell defects and memory array defects have many sources and, as a result, many signatures. While single, isolated cell failures may be spread throughout the array, very often multiple cells in the same vicinity fail. When multi-cell failures occur, the failure may be characterized as a word line failure (i.e., failing cells with the same word line address), a bit line (or column) failure (i.e., failing cells with the same bit address), or both. The sources of these multi-cell failures vary. In particular, bit line failures may be caused by open bit lines, shorted bit lines, missing field oxide, excess oxide, intercell leakage, or various other causes. Consequently, memory arrays are tested extensively to identify defective cells.
Very often, chips with defective cells can be repaired. Once identified, defective cells can be replaced, electrically, with spare cells, provided spare cells are included in the array. Providing on-chip spare cells to repair cell failures is known in the art as on-chip redundancy. A typical state of the art redundancy scheme has one or more spare rows (row redundancy) and/or one or more spare columns (column redundancy). These spare rows/columns have fuse programmable decoders that can be programmed to be responsive to the address of the defective row/column, while simultaneously disabling selection of the defective cell. Electrically, a repaired chip cannot be distinguished from a completely good chip.
Testing a memory chip to identify failed cells is complicated, requiring special test patterns, designed for identifying each type of failure. Because each of several test patterns must be written to and read from the array at least once, testing a memory chip can be time consuming. For example, on a 16 Mb RAM chip with a single (by one) Data In/Data Out (DI/DO) and an access time of 70 ns, testing a single test pattern on the over 16 million cells could take several seconds. Thoroughly exercising the array may take several minutes since testing requires many different test pattern variations. With hundreds of chips on a single semiconductor wafer, testing a single wafer may take hours. Furthermore, this testing is done more than once, at each step between the initial wafer screen (functional wafer test of each completed RAM site) and final shipment.
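The single-pattern figure above can be checked with a rough estimate. The following is a minimal sketch, assuming that each read or write cycle takes exactly the 70 ns access time (real cycle times are typically longer, so this is a lower bound) and that a pattern is written once and read once. The function name and parameters are illustrative only.

```python
def pattern_test_time_s(cells: int, io_width: int, cycle_s: float) -> float:
    """Estimate the time to write one test pattern to every cell and read it back.

    Assumption: every write or read cycle takes `cycle_s` seconds and
    transfers `io_width` bits.
    """
    cycles_per_pass = cells // io_width      # one pass touches every cell once
    return 2 * cycles_per_pass * cycle_s     # one write pass plus one read pass


CELLS_16MB = 16 * 2**20                      # 16 Mb = 16,777,216 cells
t = pattern_test_time_s(CELLS_16MB, io_width=1, cycle_s=70e-9)
print(f"{t:.2f} s per pattern")              # about 2.35 s under these assumptions
```

Under these assumptions one pattern takes roughly 2.35 seconds, so a few dozen pattern variations already put the per-chip test time in the range of minutes.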
Additionally, test time increases as chips become denser, e.g. 64 Mb or 256 Mb. With each chip generation, density increases by a factor of four, i.e., 4×. Typically, that 4× increase corresponds to a 4× increase in addressable locations. However, each generation's performance improvement is usually less than 2×. So, with each generation, test time becomes longer and, therefore, even more of a problem.
For a variety of reasons, including a shortened array test time, these ultra dense RAMs are being organized with wide data paths of 32 bits (×32) or wider. This wide Input/Output (I/O) organization reduces array test time significantly because more cells are accessed during each cycle. Since more cells are accessed per cycle, fewer read/write cycles are needed for each test pattern. For example, a single test pattern on a 64 Mb chip organized by 1 requires over 64 million write cycles to load the array and, then, over 64 million read cycles to verify that the array contains the stored test pattern. On the other hand, a wide I/O organization of 512 k × 128 b requires only 512 thousand write and 512 thousand read cycles, one read/write cycle for each 128 bits. Thus, a wide I/O organization, because it requires only a fraction of the number of test cycles, significantly reduces test time.
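The cycle counts quoted for the 64 Mb example follow directly from dividing the array size by the I/O width. A minimal sketch, with a hypothetical helper name and assuming one access transfers exactly one I/O word:

```python
from typing import Tuple

MBIT = 2**20  # bits per Mb


def cycles_per_pattern(total_bits: int, io_width: int) -> Tuple[int, int]:
    """Return (write_cycles, read_cycles) needed to load and verify one pattern.

    Assumption: each cycle reads or writes exactly `io_width` bits.
    """
    accesses = total_bits // io_width
    return accesses, accesses


# Compare several organizations of the same 64 Mb array.
for width in (1, 8, 32, 128):
    writes, reads = cycles_per_pattern(64 * MBIT, width)
    print(f"x{width:<3} {writes:>10,} writes + {reads:>10,} reads")
```

For the by-1 organization this gives 67,108,864 cycles per pass, and for the by-128 organization 524,288 (512 k) cycles per pass, a 128-fold reduction.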
Besides reduced test time, modern system requirements provide additional impetus towards a wide I/O organization. State of the art microprocessors typically employ a 32 bit or 64 bit data word. A computer system built around one of these microprocessors usually requires 4-8 MBytes (MB) of Dynamic RAM (DRAM). 8 MB of memory for such a system, organized 2 M by 32, can be made from four 16 Mb (2 M × 8) chips fairly simply. For example, a 2 M by 32 Single In-line Memory Module (SIMM) would use four 2 M by 8 chips in parallel. However, a 64 Mb chip organized 8 M by 8 cannot be reconfigured so simply. Instead, a ×32 SIMM organization from an 8 M by 8 chip requires additional complex logic at a substantial loss in performance. By contrast, a wide I/O organization provides the optimum 64 Mb chip organization for use in a typical state of the art microprocessor based system, whether organized 2 M by 32, 1 M by 64 or 512 k by 128. In fact, a 512 k by 128 organization provides concurrent access to four 32 bit words. Even as chip densities increase to 256 Mb and beyond, new wider word architectures, such as the Very Long Instruction Word (VLIW) architecture with instructions 256 bits wide or wider, are coming to the forefront.
Still another reason dense chips tend toward a wide I/O DRAM organization is the performance requirement for DRAMs used with high performance microprocessors. Typical prior art DRAMs cannot meet this performance requirement. One state of the art approach to increasing Synchronous DRAM (SDRAM) throughput is known as "Prefetch". A Prefetch architectured SDRAM has a wider on-chip data path than its off-chip I/O, e.g. 64 bit on-chip paths vs. 32 bit off-chip. All array (on-chip) operations occur simultaneously (i.e., 64 bit array reads and writes) with off chip transfers done sequentially, i.e. two 32 bit transfers. Consequently, because wide I/O RAMs reduce test time, simplify memory system design and improve RAM performance, wide I/O RAMs are needed.
Unfortunately, prior art redundancy techniques are inadequate for wide I/O RAMs. There are several prior art approaches to providing column redundancy in RAM chips. In one prior art approach, spare columns are isolated in a small (redundant) array. Whenever the column address points to a defective column, a preprogrammed one of the spare columns is selected from the redundant array instead. See, for example, U.S. Pat. No. 4,727,516 entitled "Semiconductor Memory Device Having Redundancy Means" to Yoshida et al. incorporated herein by reference. However, Yoshida's approach is slow and requires a significant amount of extra logic. The extra logic is needed to determine whether the column address is pointing to a defective column and, if so, to bypass the defective column and select the preprogrammed spare column. A delay must be added to cell access time to allow the redundancy detect logic to determine whether the column address is pointing to a defective column and, if so, to select the correct spare column instead. While this redundancy approach was acceptable for narrow I/O chips (<8 I/O), it is too slow, inflexible and cumbersome for use on wide I/O architecture.
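The address-compare behavior of this first approach can be illustrated with a small behavioral model. This is only a sketch under stated assumptions and is not taken from the Yoshida patent: the class name, its methods, and the dictionary standing in for the programmed fuses are all hypothetical.

```python
from typing import Dict, Tuple


class RedundantColumnMap:
    """Hypothetical model of a separate spare-column array with programmed fuses."""

    def __init__(self, num_spares: int) -> None:
        self.num_spares = num_spares
        # Stand-in for blown fuses: defective column address -> spare index.
        self._fuses: Dict[int, int] = {}

    def program(self, defective_col: int) -> bool:
        """Assign the next free spare to a defective column; False if none remain."""
        if defective_col in self._fuses:
            return True
        if len(self._fuses) >= self.num_spares:
            return False
        self._fuses[defective_col] = len(self._fuses)
        return True

    def select(self, col_addr: int) -> Tuple[str, int]:
        """Resolve a column address. The compare runs on every access,
        which is the source of the added delay described above."""
        spare = self._fuses.get(col_addr)
        if spare is not None:
            return ("spare", spare)
        return ("main", col_addr)


remap = RedundantColumnMap(num_spares=2)
remap.program(5)
print(remap.select(5), remap.select(6))
```

Because every access must pass through the compare in `select` before a column can be chosen, the lookup sits directly in the access path, which is the timing penalty the text attributes to this approach.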
Another redundancy approach is used when, as with denser arrays, the RAM array is organized hierarchically such that it is a group of smaller subarrays, e.g., the array might be divided into quadrants. In this second prior art redundancy approach, redundant columns are included with, and dedicated to, each subarray. Instead of substituting data from a separate subarray whenever a defective column is addressed, as in the first approach, a redundant column line within the subarray is selected.
FIG. 1 is a schematic representation of this second prior art redundancy scheme for a wide I/O, 16 Mb DRAM chip. The chip 100 is organized with two Redundant Bit Lines (RBL) 102 and 104 providing two spare columns in each subarray 106. Each subarray 106 includes 2^n Bit Lines (BL) 108 (where n is typically between 5 and 8) and redundant bit lines (2 in this example). Each of the subarrays 106 is part of a subarray block 110. All of the subarray blocks 110, collectively, form the entire RAM array. So, for example, a 16 Mb RAM has 16 blocks 110 of 1 Mb each. Block size, subarray size and the number of subarrays 106 per block 110 are interdependent and are selected based on performance and logic objectives.
This second prior art redundancy approach is not as slow as the first, but it is also not as flexible. With the first prior art approach, any spare column in the block of redundant columns could be substituted for any defective column. With this second prior art approach, defective columns can only be replaced by spare columns in the same subarray. So, there must be at least one spare column for each subarray just to ensure full chip coverage. Although the coverage afforded by this second approach may provide for replacing more than two defective columns, i.e. in different subarrays, two spare columns per subarray 106 only guarantee that two defective columns per chip are repairable. Three defective columns in the same subarray 106 are unrepairable.
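The difference in flexibility between the two approaches can be sketched as a repairability check. The function names, the default of two spares per subarray, and the representation of defects as a list of subarray indices (one entry per defective column) are illustrative assumptions.

```python
from collections import Counter
from typing import Sequence


def repairable_dedicated(defect_subarrays: Sequence[int],
                         spares_per_subarray: int = 2) -> bool:
    """Second approach: spares serve only the subarray that contains them."""
    counts = Counter(defect_subarrays)
    return all(n <= spares_per_subarray for n in counts.values())


def repairable_shared(defect_subarrays: Sequence[int], total_spares: int) -> bool:
    """First approach: any spare in the redundant array can replace any column."""
    return len(defect_subarrays) <= total_spares


spread = [0, 1, 2, 3]      # four defective columns, each in a different subarray
clustered = [7, 7, 7]      # three defective columns in the same subarray

print(repairable_dedicated(spread))       # True: at most one defect per subarray
print(repairable_dedicated(clustered))    # False: exceeds two spares in subarray 7
print(repairable_shared(clustered, 4))    # True: a shared pool of four spares suffices
```

With dedicated spares the chip tolerates many defects only if they are spread out, while a cluster of three in one subarray defeats the scheme regardless of how many spares sit unused elsewhere.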
Furthermore, in addition to being inflexible, this second approach does not eliminate the redundancy related timing delay. A subarray 106 is accessed when one word line 112 is selected and driven high. Data from accessed cells are provided simultaneously to the bit lines 108 and redundant bit lines 102 and 104. After a predetermined minimum delay, sufficient to allow the redundancy decoder to determine whether a spare column is addressed, a single bit line 108 or a redundant bit line 102, 104 is selected in each subarray 106. In each subarray, the selected bit line 108 or redundant bit line 102, 104 is coupled to a Local Data Line (LDL) 114. LDLs 114 are coupled to Master Data Lines (MDLs) 116. The MDLs 116 couple corresponding subarrays 106 in each subarray block 110. Data is transferred between the subarrays 106 and the chip I/O's on the MDLs 116.
Normally, bit select logic is faster than redundancy decode logic. However, even if both circuits were equally fast, with this second approach, bit line selection would have to be delayed to avoid timing conflicts known as race conditions. When a race condition occurs, the spare bit line 102 or 104 and the defective bit line are, for a short period, both connected to the LDL simultaneously and, thereby, shorted together. Problems from race conditions vary from slowing data sensing (i.e. sensing whether a "1" or a "0" was stored), to inadvertently switching data stored in the array or causing wrong data to be read or written. To avoid race conditions, prior to bit line selection, a slight delay must be added to the chip timing. While significantly smaller than the delay required using the first prior art approach, this slight delay still requires intentionally slowing chip access time to include redundancy. Slowing chip access is counter to the high performance objectives for most RAMs.
Besides being inflexible and slowing chip access, this second prior art redundancy scheme is inefficient. In the 16 Mb chip of the example above, for every 2^5 = 32 bit lines 108, there are two redundant bit lines 102 and 104. At least 6.25% of the array area is dedicated to spare cells (this percentage is higher if row redundancy is included). However, three defective columns in the same subarray 106 cannot be repaired, even though spare columns 102, 104 may remain unused in every other subarray 106. Thus, three defective columns in the same subarray render an otherwise acceptable chip unrepairable and, therefore, unusable.
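The 6.25% figure is the ratio of spare bit lines to regular bit lines in a subarray. A minimal sketch of that arithmetic, with a hypothetical function name and the assumption that area scales with the number of columns:

```python
def spare_column_overhead(n: int, spares: int = 2) -> float:
    """Spare bit lines as a fraction of the 2**n regular bit lines in a subarray.

    Assumption: array area is proportional to column count, and row
    redundancy is ignored.
    """
    return spares / 2**n


# n is described above as typically between 5 and 8.
for n in range(5, 9):
    print(f"n={n}: {2**n:>3} bit lines, overhead {spare_column_overhead(n):.4f}")
```

For n = 5 the ratio is 2/32 = 0.0625, matching the 6.25% stated above. Larger subarrays lower the overhead but also mean that each pair of spares must cover more columns.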
Prior art redundancy schemes for wide I/O array chips are extensions of the above-described prior art schemes. These prior art redundancy schemes, which had limited advantages for narrow I/O RAMs, are inadequate for wide I/O RAMs or for prefetch type SDRAMs. As noted above, a wide I/O chip organization becomes even more necessary for ultra high density RAMs. Thus, there is a need for a wide I/O RAM architecture with flexible redundancy and improved test performance.