1. Technical Field
This invention relates to integrated circuit devices and, more particularly, to techniques for testing and repairing them.
2. Related Art
Real time image processing and two dimensional data processing, in general, require enormously high data throughput capabilities. Concurrent or parallel processing appears to be the most feasible route to achieve these extraordinary processing rates. There are additional constraints if these systems are to find widespread use. For example, they must be moderate in size and cost. While general purpose supercomputers such as the Cray XMP, Cyber 205, Hitachi S-810 and the like exhibit relatively fast processing speeds, they fall far short of the size and cost constraints that would make them economically attractive.
A cellular array three-dimensional computer (3-D computer) with one processor assigned to each pixel or matrix element of the input data is described in the following documents: Grinberg et al., "A Cellular VLSI Architecture", Computer, pp. 69-81 (Jan. 1984); Little et al., "A Three Dimensional Computer For Image and Signal Processing", Proceedings of Workshop on Generic Signal Processing, Maryland, pp. 88-92 (July 1987) and commonly assigned U.S. Pat. No. 4,275,410 entitled "Three-Dimensionally Structured Microelectronic Device" by Grinberg et al., issued June 23, 1981; U.S. Pat. No. 4,507,726 entitled "Array Processor Architecture Utilizing Modular Elemental Processors" by Grinberg et al., issued Mar. 26, 1985; U.S. Pat. No. 4,239,312 entitled "Parallel Interconnect for Planar Arrays" by Myer et al., issued Dec. 16, 1980; U.S. Pat. No. 4,524,428 entitled "Modular Input-Programmable Logic Circuits for Use in a Modular Array Processor" by Grinberg et al., issued June 18, 1985; U.S. Pat. No. 4,498,134 entitled "Segregator Functional Plane for Use in a Modular Array Processor" by Hansen et al., issued Feb. 5, 1985; and U.S. Pat. No. 4,745,546 entitled "Column Shorted and Full Array Shorted Functional Plane for Use in a Modular Array Processor and Method for Using Same" by Grinberg et al., issued May 17, 1988. The 3-D computer employs a very high degree of integration in its construction. This level of integration is made possible by, among other things, the development of technologies that permit massively parallel communication channels between silicon wafers and through silicon wafers. These channels enable the wafers to be stacked one on top of another to form a three dimensionally integrated computer.
FIG. 1 illustrates a schematic view of the aforementioned 3-D computer. The computer 10 generally consists of a plurality of stacked silicon wafers, each containing an array of processing elements (PEs). All PEs on a particular wafer are identical, yet they may differ from wafer to wafer. For example, the PEs on wafers 12 and 26 are an array of shift registers or shifters, the PEs on wafers 14, 16 and 20 are an array of accumulators, the PEs on wafer 24 are an array of comparators, the PEs on wafer 18 are an array of replicators, and the PEs on wafer 22 are an array of counters. Signals are passed vertically through the stack of wafers by bus lines composed of feed-throughs (signal channels through the wafers) and microbridges (signal channels between the wafers). In FIG. 1, the vertical lines represent the massively parallel communication channels. The vast majority of these communication channels are used for the N×N data bus lines. The remaining communication channels are used for a control bus and an address bus.
In the aforementioned 3-D computer 10, only the shifter wafers 12, 26 and the replicator wafer 18 need to provide for lateral communication between the PEs on each wafer, i.e., neighborhood connections between PEs. This interprocessor communication requirement increases circuit complexity, thereby highlighting the need for increased yield and testability.
Since the aforementioned 3-D computer uses a wafer-scale integration approach, serious consideration must be given to the issue of yield. The use of redundant circuitry has been suggested in the past to improve functional yields. A redundant circuit is a spare circuit identical to the primary circuit which can be used in place of a faulty primary circuit.
Many redundancy approaches for array-type circuits have been proposed in the literature for both yield enhancement and fault tolerance. A good discussion of such techniques is found in Moore, "A Review of Fault-Tolerant Techniques for the Enhancement of Integrated Circuit Yield", Proc. of the IEEE, Vol. 74, No. 5, pp. 684-698 (May 1986). One of the earliest approaches used by memory designers is to include spare rows and/or columns as discussed in Cenker et al., "A Fault Tolerant 64K Dynamic Random Access Memory", IEEE Trans. Electron Devices, Vol. ED-27, No. 6, pp. 853-860 (June 1979). A complete row (column) is discarded when that row (column) contains defects. This approach, however, is very inefficient when applied to large two dimensional arrays such as the aforementioned 3-D computer, because a single faulty PE causes every working PE in its row (column) to be discarded as well. Recent developments involve more elaborate schemes of global reconfiguration, bypassing faulty PEs individually rather than the whole row or column. See, for example, L. Bentley and C. R. Jesshope, "The Implementation of a Two Dimensional Redundancy Scheme in a Wafer-Scale High-Speed Disk Memory", Wafer Scale Integration, C. R. Jesshope and W. R. Moore, Eds. Bristol, U.K.: Adam Hilger, 1986, pp. 187-197; R. A. Evans et al., "Wafer Scale Integration Based on Self-Organization", Wafer Scale Integration, C. R. Jesshope and W. R. Moore, Eds. Bristol, U.K.: Adam Hilger, 1986, pp. 101-112; and M. Sami and R. Stefanelli, "Reconfigurable Architectures for VLSI Processing Arrays", Proc. of the IEEE, Vol. 74, No. 5, pp. 712-722, May 1986. These approaches usually do not require a large number of spare PEs (less than 50 percent of the number of primary PEs). Unfortunately, they do carry high area overhead in the form of complex switches and interconnects, while also requiring rather sophisticated global reconfiguration algorithms.
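The cost of discarding whole rows can be illustrated with a minimal Python sketch that counts how many working PEs are lost under each policy. The array size, the defect coordinates, and the function names are hypothetical and are not taken from the cited references. The sketch also ignores the switch and interconnect overhead that individual bypassing carries.

```python
def good_pes_lost_by_row_discard(defects, n):
    """Working PEs thrown away when every row containing a defect is discarded."""
    bad_rows = {row for row, _col in defects}
    return len(bad_rows) * n - len(defects)


def good_pes_lost_by_individual_bypass(defects, n):
    """Bypassing only the faulty PEs discards no working PE."""
    return 0


# Hypothetical 32x32 array with four defective PEs given as (row, column).
N = 32
DEFECTS = {(0, 3), (5, 7), (5, 20), (17, 1)}

print(good_pes_lost_by_row_discard(DEFECTS, N))
print(good_pes_lost_by_individual_bypass(DEFECTS, N))
```

In this hypothetical case three rows are discarded, so 92 working PEs are given up in order to remove four faulty ones.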
Another approach is to provide interstitial redundancy where the redundant PEs are more uniformly distributed in the array and are local to the faulty elements. See, for example, A. D. Singh, "An Area Efficient Redundancy Scheme for Wafer Scale Processor Arrays", Proc. of IEEE Int'l Conf. on Computer Design, pp. 505-509 Oct. 1985.
In the 3-D computer referenced above, any redundancy scheme that requires complex overhead circuitry such as multiple-position switches and long interconnects would not be very area-efficient. Moreover, because of the vertical interconnecting structure between wafers, each system node consists of circuits from the corresponding locations of PEs in all of the wafers in the stack. Thus, those prior redundancy schemes which would require physical abandonment of any node would not be very feasible.
In the papers disclosing the aforementioned 3-D computer, a 1:1 redundancy scheme was discussed, i.e., one redundant PE for each primary PE. As the integration level increases (for example, towards a 128×128 PE array), the overhead of a 1:1 redundancy scheme is a heavy price to pay in terms of area, testing time, and the complexity of the testing apparatus. Thus, a 1:1 redundancy scheme may no longer be cost-effective as the integration level increases. In addition, the availability of only one spare PE per node may not be adequate to provide high yields, particularly if cluster defects exist.
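The trade-off between spare count and yield can be sketched with a simple model. The Python fragment below is an illustration under assumptions that do not come from the cited papers: PE defects are statistically independent, every node of the array must work, switching to a spare is perfect, and the per-PE yield value is hypothetical. Cluster defects violate the independence assumption, so the model is optimistic precisely in the case noted above.

```python
def array_yield(pe_yield, n, spares_per_node):
    """Probability that all n*n nodes have at least one working PE.

    Assumes independent defects and perfect switching (simplifying assumptions).
    """
    copies = 1 + spares_per_node
    node_ok = 1.0 - (1.0 - pe_yield) ** copies
    return node_ok ** (n * n)


def area_overhead(spares_per_node):
    """Spare PE area as a fraction of primary PE area (switches ignored)."""
    return float(spares_per_node)


# Hypothetical per-PE yield of 99 percent on a 128x128 array.
for spares in (0, 1, 2):
    print(spares, area_overhead(spares), array_yield(0.99, 128, spares))
```

Under these assumptions each added spare per node raises the array yield but costs a further 100 percent of PE area, which is the overhead concern raised for the 1:1 scheme.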
As the level of integration increases, it also becomes more difficult to test each of the PEs due to their small size and more complex interconnection. This is especially true where they do not have an easily accessible external contact point which can be tested.