Electronic devices and capabilities have grown extremely common in daily life. Along with personal computers in the home, many individuals carry more than one productivity tool for various and sundry purposes. Most personal productivity electronic devices include some form of non-volatile memory. Cell phones utilize non-volatile memory in order to store and retain user programmed phone numbers and configurations when the power is turned off. PCMCIA cards utilize non-volatile memory to store and retain information even when the card is removed from its slot in the computer. Many other common electronic devices also benefit from the long-term storage capability of non-volatile memory in un-powered assemblies.
Non-volatile memory manufacturers that sell to the electronic equipment manufacturers require testers to exercise and verify the proper operation of the memories that they produce. Due to the volume of non-volatile memories that are manufactured and sold at consistently low prices, it is very important to minimize the time it takes to test a single part. Purchasers of non-volatile memories require memory manufacturers to provide high shipment yields because of the cost savings associated with the practice of incorporating the memory devices into more expensive assemblies with minimal or no testing. Accordingly, the memory testing process must be sufficiently efficient to identify a large percentage of non-conforming parts and preferably all non-conforming parts in a single test process.
As non-volatile memories become larger, denser and more complex, the testers must be able to handle the increased size and complexity without significantly increasing the time it takes to test them. Memory testers frequently run continuously, and test time is considered a major factor in the cost of the final part. As memories evolve and improve, the tester must be able to easily accommodate the changes made to the device. Another issue specific to testing non-volatile memories is that repeated writes to cells of the memories can degrade the overall lifetime performance of the part. Non-volatile memory manufacturers have responded to many of the testing issues by building special test modes into the memory devices. These test modes are not used at all by the purchaser of the memory, but may be accessed by the manufacturer to test all or significant portions of the memories in as little time as possible and as efficiently as possible. Some non-volatile memories are also capable of being repaired during the test process. The tester, therefore, should be able to identify: a need for repair; a location of the repair; the type of repair needed; and, must then be able to perform the appropriate repair. Such a repair process requires a tester that is able to detect and isolate a specific nonconforming portion of the memory. In order to take full advantage of the special test modes as well as the repair functions, it is beneficial for a tester to be able to execute a test program that supports conditional branching based upon an expected response from the device.
From a conceptual perspective, the process of testing memories is an algorithmic process. As an example, typical tests include sequentially incrementing or decrementing memory addresses while writing 0's and 1's into the memory cells. It is customary to refer to a collection of 1's and 0's being written or read during a memory cycle as a “vector”, while the term “pattern” refers to a sequence of vectors. It is conventional for tests to include writing patterns into the memory space such as checkerboards, walking 1's and butterfly patterns. A test developer can more easily and efficiently generate a program to create these patterns with the aid of algorithmic constructs. A test pattern that is algorithmically coherent is also easier to debug, since logical methods can be used to isolate portions of the pattern that do not perform as expected. A test pattern that is generated algorithmically using instructions and commands that are repeated in programming loops consumes less space in tester memory. Accordingly, it is desirable to have algorithmic test pattern generation capability in a memory tester.
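As an illustration of why loops are more compact than stored vectors, the following is a minimal sketch of two of the patterns named above. The function names and the row/column cell layout are assumptions made for this example only and are not taken from any particular tester.

```python
def checkerboard(rows, cols):
    """Return {address: bit} for a checkerboard over a rows x cols cell array.

    Assumes (for illustration) that address = row * cols + col.
    """
    return {r * cols + c: (r + c) & 1 for r in range(rows) for c in range(cols)}


def walking_ones(width):
    """Yield `width` vectors, each holding a single 1 that advances one bit position."""
    for i in range(width):
        yield 1 << i


pattern = list(walking_ones(8))   # a "pattern" is a sequence of vectors
board = checkerboard(4, 4)        # neighbouring cells hold opposite values
```

A few lines of looping code stand in for what would otherwise be one stored vector per memory cycle.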
Precise signal edge placement and detection is also a consideration in the effectiveness of a non-volatile memory tester. In order to capture parts that are generally conforming at a median while not conforming within the specified margins, a non-volatile memory tester must be able to precisely place each signal edge relative in time to another signal edge. It is also important to be able to precisely measure at which point in time a signal edge is received. Accordingly, a non-volatile memory tester should have sufficient flexibility and control of the timing and placement of stimuli to, and responses from, the Device Under Test (the memory).
Memory testers are said to generate transmit vectors that are applied (stimulus) to the DUT (Device Under Test), and receive vectors that are expected in return (response). The algorithmic logic that generates these vectors can generally do so without troubling itself about how a particular bit in a vector is to get to or from a particular signal pad in the DUT. At this level it is almost as if it were a certainty that adjacent bits in the vector would end up as physically adjacent signals on the DUT. Life should be so kind!
In reality, the correspondence between bits in a vector at the “conceptual level” and the actual signals in the DUT is apt to be rather arbitrary. If nothing were done to prevent it, it might be necessary to cross one or more probe wires as they descend from a periphery to make contact with the DUT. Such crossing is most undesirable, and it is conventional to incorporate a mapping mechanism in the path of the transmit vector to rearrange the bit positions in the transmit vector before they are applied to the DUT, so that the task of making physical contact is not burdened with crossings. Receive vectors are correspondingly applied to a reverse mapping mechanism before being considered. In this way the algorithmic vector generation and comparison mechanisms can be allowed to ignore this entire issue. As another example of what such mappers and reverse mappers can do, consider the case when a different instance of the same type of DUT is laid out on the same wafer, but with a rotation or some mirrored symmetry, in order to avoid wasting space on the wafer. These practices also have an effect on the correspondence between vector bit position and physical signal location, but one that can be concealed by the appropriate mappings and reverse mappings. It will be appreciated that the mappings and reverse mappings needed for these situations are, once identified for a particular DUT, static, and need not change during the course of testing for that particular DUT.
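The mapping and reverse mapping just described amount to a fixed permutation of bit positions and its inverse. The sketch below shows the idea in software; the four-entry `PIN_MAP` is a hypothetical example, and a real tester would perform this in hardware.

```python
def make_reverse(mapping):
    """Build the inverse permutation used on receive vectors."""
    reverse = [0] * len(mapping)
    for src, dst in enumerate(mapping):
        reverse[dst] = src
    return reverse


def permute(vector, mapping):
    """Move bit `src` of `vector` to bit position mapping[src]."""
    out = 0
    for src, dst in enumerate(mapping):
        if (vector >> src) & 1:
            out |= 1 << dst
    return out


# Hypothetical static map: conceptual bit 0 goes to physical pin 2, and so on.
PIN_MAP = [2, 0, 3, 1]
REVERSE_MAP = make_reverse(PIN_MAP)

transmit = permute(0b0001, PIN_MAP)       # what the DUT pins actually see
received = permute(transmit, REVERSE_MAP) # back in conceptual bit order
```

Because both tables are computed once per DUT type, the vector generation and comparison logic never has to see the physical ordering.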
Memory testers have interior test memory that is used to facilitate the test process. This interior test memory may be used for several purposes, among which are storing transmit vectors ahead of time, as opposed to generating them in real time, storing receive vectors, and storing a variety of error indications and other information concerning DUT behavior obtained during testing. (There are also housekeeping purposes internal to the operation of the memory tester that use SRAM and that may appear to fall within the purview of the phrase “interior memory.” These are private to the internal operation of the tester, tend to not be visible at the algorithmic level, and are comparable to internal control registers. That memory is described as “interior control memory,” and is excluded from what is meant herein by the term “interior test memory,” which we use to describe memory used to store bit patterns directly related to the stimulus of, and response from, the DUT.) It is easy to appreciate that this interior test memory needs to operate at least as fast as the tests being performed; a very common paradigm is for the interior test memory (or some portion thereof) to be addressed by the same address (or some derivative thereof) as is applied to the DUT. What is then stored at that addressed location in interior test memory is something indicative of DUT behavior during a test operation performed on the DUT at that address. Algorithmic considerations within the test program may mean that the sequence of addresses associated with consecutive transmit vectors can be arbitrary. Thus, the interior memory needs to have the dual attributes of high speed and random addressability. SRAM comes to mind immediately as being fast, easy to control and tolerant of totally random addressing. Indeed, conventional memory testers have used SRAM as their interior test memory.
Unfortunately, SRAM is quite expensive, and this has limited the amount of interior test memory with which memory testers have had to work. The result is limits on memory tester functionality that are imposed by a shortage of memory. DRAM is significantly less expensive, but cannot tolerate random addressing and still perform at high speed. DRAM is internally organized to require the lengthy pre-charging of an addressed “row” with RAS (Row Address Strobe), followed by specifying an addressed “column” with CAS (Column Address Strobe). A memory controller converts a unified address into row and column components to be applied with RAS and CAS. DRAM is often suitably fast if, once a row has been pre-charged, further addressing can be confined to columns along that row (i.e., further instances of CAS, but none of RAS). However, such an algorithmic restriction on tester operation (which interferes with the ability to arbitrarily address the DUT) is generally unacceptable, and therefore cannot be relied on to provide the high speed operation needed for use as interior test memory within a memory tester. It would be desirable if by using DRAM the size of the interior test memory could be both increased and its costs reduced, which benefits could be realized if there were a way to operate DRAM's with arbitrary addressing at the same rate as commonly expected of the more expensive SRAM's.
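The cost of arbitrary addressing can be made concrete by counting how often a sequence of unified addresses forces a new row to be opened. The following sketch assumes, purely for illustration, an eight-bit column field in the low-order address bits; actual DRAM parts and controllers differ.

```python
COL_BITS = 8  # hypothetical column-field width, chosen only for this example


def split_address(addr):
    """Split a unified address into (row, column) as a memory controller would."""
    return addr >> COL_BITS, addr & ((1 << COL_BITS) - 1)


def count_row_changes(addresses):
    """Count accesses that need a new row (a RAS cycle), not just a new column."""
    changes = 0
    open_row = None
    for addr in addresses:
        row, _ = split_address(addr)
        if row != open_row:
            changes += 1
            open_row = row
    return changes


sequential = count_row_changes(range(512))         # well-behaved addressing
scattered = count_row_changes([0, 256, 1, 257])    # every access changes row
```

Sequential addressing of 512 locations opens only two rows, while the four scattered accesses each pay the row penalty.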
DRAM can replace SRAM as the interior test memory in a memory tester. As described in greater detail below, the problem of increasing the speed of DRAM operation for use as interior test memory can be solved by increasing the amount of DRAM used, in place of increasing its speed. Numbers of identical Banks of DRAM are treated as Groups. A combination of interleaving signals for different Banks of memory in a Group thereof and multiplexing between those Groups of Banks slows the memory traffic for any one Bank down to a rate that can be handled by the Bank. (For the reader's convenience, we include a brief summary of this technique here, since much of its architectural aspects and associated terminology are useful in the explanation that follows.)
A three-way multiplexing between three Groups of four Banks each, combined with a flexible four-fold interleaving scheme for signal traffic to a Group, produces an increase in operating speed approaching a factor of twelve, while requiring only three memory busses. A round robin strategy for choosing the next Group for the multiplexer is simple and assures that the interleaving mechanism for each Group has the time it needs to complete its most recently assigned task. All interleaved accesses within a Group are performed upon a next Bank (within that Group), also selected by a simple round robin selection. In this configuration, each of the twelve Banks represents a duplicate instance of the entire available address space, and any individual write cycle might end up accessing any one of the twelve Banks. An implication is that, at the conclusion of testing, all twelve Banks must be investigated to learn what failures happened during testing of the DUT, since the history of any address or collection of addresses of interest will be spread out across all twelve Banks. A particular channel is thus represented by twelve bits (one bit from each Bank and whose bit position within the word for that Bank is determined by the channel).
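The two nested round robin selections can be sketched as follows. This is a simplified model under the assumption that every access is multiplexed and interleaved strictly in turn; the "flexible" aspects of the interleaving scheme are not modeled, and the function name is hypothetical.

```python
GROUPS = 3
BANKS_PER_GROUP = 4


def bank_schedule(n_accesses):
    """Return the (group, bank) chosen for each of n consecutive accesses."""
    next_bank = [0] * GROUPS          # per-Group round robin pointer
    schedule = []
    for i in range(n_accesses):
        group = i % GROUPS            # round robin across Groups (multiplexing)
        bank = next_bank[group]       # round robin within the Group (interleaving)
        next_bank[group] = (bank + 1) % BANKS_PER_GROUP
        schedule.append((group, bank))
    return schedule


first_twelve = bank_schedule(12)
```

In this model each Bank is visited once in every twelve accesses, which is the source of the roughly twelve-fold relief in per-Bank traffic, and also the reason the history of any one address ends up scattered over all twelve Banks.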
It would, however, be awkward to have to (manually, as it were) individually consult all twelve Banks to discover failure information, so a utility mechanism has been provided to automatically “compose” (merge) results of all twelve Banks during a read cycle at an address into a unified result that can be stored in one or all twelve Banks. This allows composed data to later be read at full speed. Full speed in one embodiment is a 100 MHz rate for randomly addressed memory transactions.
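A composition of this kind can be pictured as a bitwise merge of the twelve words read at one address. The sketch below assumes that a zero bit marks a failure, so that the merge is a bitwise AND; under the opposite convention it would be an OR. The function name and the eight-bit words are illustrative only.

```python
from functools import reduce


def compose(bank_words):
    """Merge the words read from every Bank at one address.

    Assumes 0 = failure, so a channel passes only if it passed in every Bank.
    """
    return reduce(lambda a, b: a & b, bank_words)


# Twelve Banks, eight channels each; failures were recorded in two different Banks.
banks = [0xFF] * 12
banks[5] = 0xFB   # channel 2 failed during an access that landed in Bank 5
banks[9] = 0x7F   # channel 7 failed during an access that landed in Bank 9
unified = compose(banks)
```

The unified word carries both failures, so a later read of a single Bank suffices.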
If 33 MHz is fast enough, then random access can be supported with just the interleaving and no multiplexing, in which case the composition mechanism and the memory addressing scheme are suitably adjusted. The addressing scheme changes to include extra Group selection bits that allow the memory to be three times deeper than for random 100 MHz operation. These two modes of operation are called R100 and R33, respectively. There is also an L100 mode of 100 MHz operation to single Banks that relies on well behaved addresses being sent to the DRAM (an absolute minimum of row address changes).
At the top level of interior test memory organization there are four Memory Sets, each having its own separate and independent address space and performing requested memory transactions. Two are of SDRAM as described above, and two are of SRAM. Each Memory Set has its own controller to which memory transactions are directed. As to externally visible operational capabilities, all four Memory Sets are essentially identical. They differ only in their size of memory space and how they are internally implemented: The SRAM Memory Sets do not employ multiplexing and interleaving, since they are fast enough to begin with. Despite their independence, Memory Sets of the same type (of SRAM or of DRAM) may be “stacked,” which is to say treated as one larger address space. This is done at the level of control above the Memory Sets themselves, in the algorithmic generation of the addresses and the decision as to which Memory Set to actually send a memory transaction. It is not as automatic as the way in which the Memory Sets and their controllers can stack Groups to triple the address space as between the R100 and R33 modes of operation. Each Memory Set controller has no clue that there even is such a thing as another Memory Set with another controller.
Thus it is that the interior test memory of the tester is divided into four Memory Sets, two of which are “internal” SRAM's and two of which are “external” DRAM's. To be sure, all this memory is physically inside the memory tester; the terms “internal” and “external” have more to do with a level of integration. The SRAM's are integral parts of a VLSI (Very Large Scale Integration) circuit associated with the tester's central functional circuitry, while the DRAM's are individual packaged parts mounted adjacent the VLSI stuff. The amount of SRAM is fairly small, (say, around a megabit per Memory Set) while the amount of DRAM is substantial and selectable (say, in the range of 128 to 1024 megabits per Memory Set). The SRAM Memory Sets are always present, and may be used for any suitable purpose, such as storing the expected content of a DUT that is a ROM (Read Only Memory). The DRAM Memory Sets are actually optional, and are typically used for creating a trace for subsequent analysis leading to repair, although there are also other uses. The tester does not enforce a distinction between the SRAM and DRAM Memory Sets, as to different purposes for which they may be used. Those distinctions arise mostly as a matter of size. The SRAM Memory Sets are small, while the DRAM Memory Sets are potentially huge. The person or persons creating the test programming make the decisions concerning how the various Memory Sets are to be used.
It was mentioned above that the DUT may well be susceptible of repair. This is often true even for undiced memory chips that are still part of a wafer. How this is actually achieved on the circuit level is well understood by those who manufacture such devices, so it is sufficient for us to simply say that incorporated into those devices are some number of selectably destroyable elements whose destruction enables gating that in turn alters the internal logic of an associated circuit. This ability is used to route internal signals to replacement circuits that substitute for defective ones. This capability cannot be economically worthwhile unless the repair can be made with less time and effort than would be required to make a new part; otherwise it would be more cost effective to simply jettison the bad part into the scrap barrel. In particular, it is undesirable to involve a human technician in the processes of understanding the particular failures in the bad parts within a production stream and of being responsible for deciding how to repair them. Instead, an algorithmic mechanism (program and associated hardware) in the memory tester can be developed to analyze the failure and attempt its repair. The repaired part can be re-tested on the spot, and its fate decided.
Such a mode of operation has certain implications for the design of the memory tester. Real time detection of failures can be used to set flags and alter test algorithms to refine the understanding of the failure. That is, tests performed to verify proper operation might not be the ones best suited to discover why the part is failing in the first place. The memory tester needs to be able to create a trace (that is, a usable record) of test data for an automated analysis (whether performed immediately or at the conclusion of a larger test process) that determines whether to attempt a repair, and if so, what actions to take in making the repair. Typically, the attempt at repairs is postponed until after at least a preliminary testing reveals the scope or number of probable failures. The number of replacement circuits available is limited (say, half a dozen or so, as determined by an odds-driven cost benefit analysis), and there is no point in attempting to fix a part that can be shown to need more help than is available. All of this takes place in light of the understanding that “Time on the tester is $$!” and that what memory manufacturers need are testers that test thoroughly, but in an absolute minimum of time. As a consequence, the phrase “create a trace of test data for an automated analysis” describes a process that, far from being considered a unified activity, has itself been subjected to extensive analysis to minimize the time required to test a part, and if indicated, repair it. The simple conception of a memory tester as a general purpose programmable mechanism (e.g., a CPU and memory) interfaced to some controllable test bed for exercising a DUT has long since ceased to be economically viable for the high volume testing of memories. Too large a percentage of time is spent in CPU execution and the overhead logic required to generate stimuli and evaluate their responses. Much dedicated hardware has been incorporated into memory testers to enable them to run fast, and the general purpose programmable mechanism is now generally relegated to tasks concerning control at the supervisory level.
If the testing of the DUT is to be performed at high speed and without unnecessary pauses, it is clear that the tester's interior test memory used to create a trace describing failures has to operate at the same high speeds used to test the DUT. In memory testers of the sort to be described herein, a portion of interior test memory that stores test response data in addresses corresponding to those tested in the DUT is called an ECR (Error Catch RAM). It is easy to see why the content of an ECR can be thought of as a trace of test results. However, it would be a mistake to construe tester operation as simply trace creation through stimulus followed by after-the-fact trace analysis that dwells on every address. While indeed useful for certain aspects of DUT testing, such a model is too slow, and for certain tasks is simply too cumbersome for high speed production testing. One central theme found in ways to augment the notion of a trace captured in an ECR is the use of dedicated hardware to categorize and index (think: recognize and then store) various errors, in real time as they happen. These various errors occur along the organizing architectural principles that are internal to the particular DUT being tested. This strategy significantly reduces the complexity of the analysis task, as well as reducing test time. This strategy uses interior test memory called Tag RAM's to store an indexed collection of detected events for later inspection.
A conventional memory tester can have many uses for interior test memories of its own, of which ECR's and Tag RAM's are but two. We now examine the nature of some of these uses for internal memories, and will arrive at the conclusion that improvement in the architecture of a conventional memory tester's interior test memory is desirable.
In operation an ECR: (1) is addressed by the same address as, or by an address derived from, the address that is applied to the DUT; and (2) has a native data word width in bits at least that of the DUT. The effective word width is adjustable along powers of two (eight, sixteen, thirty-two), with such adjustability accompanied by a corresponding inverse change in addressability. This feature is termed “narrow word”.
When a test channel for the DUT (a bit in an output word, or some other signal of interest) compares or fails to compare to expected results a corresponding bit at that address in the ECR is either set or cleared, according to the convention in use. We store a zero to represent a failure to compare. As thus organized, the ECR has not got a multi-bit value for each address/channel combination, and can instead store just a single bit's worth of information for each such combination, no matter how many times that combination may be accessed during a test. Test strategy enters into what the bit means and how it is maintained. The bit might represent the dichotomy “it never failed/it failed at least once” for an entire multi-access test, or it might represent the outcome of the last access (i.e., test) only, even if that is at variance with earlier tests. If quantity information is desired about failures for a certain address/channel, some additional resource (a counter) must be allocated to record it.
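The two interpretations of the stored bit can be sketched as follows, using the convention stated above that a zero represents a failure to compare. The class and its `sticky` option are hypothetical names introduced for this illustration; they do not describe an actual tester interface.

```python
class ErrorCatchRam:
    """Toy ECR holding one bit per address/channel combination (0 = failure)."""

    def __init__(self, depth, width, sticky=True):
        self.mask = (1 << width) - 1
        self.words = [self.mask] * depth  # all ones: no failure recorded yet
        self.sticky = sticky

    def record(self, address, pass_bits):
        """Store the per-channel compare result for one access."""
        pass_bits &= self.mask
        if self.sticky:
            # "It never failed / it failed at least once": zeros accumulate.
            self.words[address] &= pass_bits
        else:
            # Outcome of the last access only: overwrite.
            self.words[address] = pass_bits


accumulating = ErrorCatchRam(depth=16, width=4, sticky=True)
last_only = ErrorCatchRam(depth=16, width=4, sticky=False)
for ecr in (accumulating, last_only):
    ecr.record(3, 0b1110)  # channel 0 fails on the first access
    ecr.record(3, 0b1111)  # all channels pass on the second access
```

Neither form can say how many times a channel failed, which is why a separate counter is needed when quantity matters.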
Tag RAM's are another way of recording in a tester's interior test memory information about how the DUT responds while being tested. A Tag RAM generally has a much smaller address space than an ECR does, and is typically addressed by a “classified address” that is derived from the one applied to the DUT. The derivation reflects the existence of some organizational principle inside the DUT, and is termed “address classification” in this Specification. The data stored in the Tag RAM is formed by the detection of some condition occurring in the DUT's response to some stimulus, and is again usually derived by application of knowledge about internal DUT operation by a process called “data classification”. The idea is to recognize some condition or event, which is probably a member of a whole family (universe) of possible occurrences wherein different members of the family have different addresses, and then store information about test results. This produces a Tag RAM whose contents are useful abstractions related to DUT organization and that are indexed according to aspects of that DUT organization. Different families of failure types are represented by different Tag RAM's.
As an example of a Tag RAM, consider that an address applied to the DUT might be separable into X, Y and Z components that relate to internal organization of the DUT. The address applied to the DUT has the X, Y and Z addresses embedded therein, but perhaps not in an obvious or convenient way. But suitable gating circuits can extract, say, the Y address and apply it as an address to a Tag RAM. We can now store information that is indexed according to Y address. That information might be a single bit whose end-of-test meaning is that a failure occurred at least once at that Y address, or it might be a multi-bit value having some other interpretation. By having Tag RAM's for X, Y and Z one can obtain useful information about the failures in a DUT whose internal organization includes the notions of X, Y and Z addresses. Furthermore, a significant reduction in memory requirements is realized, as the needed Tag RAM's consume a number of locations equal to only the sum of the X, Y and Z address spaces, rather than equal to their product, which is what an ECR would have to have.
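The sum-versus-product saving can be seen in a small sketch of address classification. The field widths and their positions within the address are assumptions chosen for illustration; in a real DUT the X, Y and Z bits may be scattered, which is what the gating circuits are for.

```python
X_BITS, Y_BITS, Z_BITS = 4, 3, 2  # hypothetical field widths


def classify(addr):
    """Extract the (x, y, z) components embedded in a DUT address.

    Assumes x occupies the low bits, then y, then z.
    """
    x = addr & ((1 << X_BITS) - 1)
    y = (addr >> X_BITS) & ((1 << Y_BITS) - 1)
    z = (addr >> (X_BITS + Y_BITS)) & ((1 << Z_BITS) - 1)
    return x, y, z


# One single-bit Tag RAM per address component.
tag_x = [False] * (1 << X_BITS)
tag_y = [False] * (1 << Y_BITS)
tag_z = [False] * (1 << Z_BITS)


def record_failure(addr):
    """Mark 'a failure occurred at least once' under each classified address."""
    x, y, z = classify(addr)
    tag_x[x] = tag_y[y] = tag_z[z] = True


record_failure(0b10_101_0011)  # z = 2, y = 5, x = 3

tag_locations = len(tag_x) + len(tag_y) + len(tag_z)  # sum of the address spaces
ecr_locations = len(tag_x) * len(tag_y) * len(tag_z)  # product of the address spaces
```

With these widths the three Tag RAM's need 28 locations in total, against 512 for an ECR covering the same address space.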
To continue with this example, data classification can further increase the usefulness of Tag RAM's. Suppose that the DUT is an eight bit wide memory having internal X and Y address mechanisms. Internally the DUT is eight one-bit memories, each having the same X and Y addressing mechanisms, and each providing its output data to a different one of eight pins. It is useful to ask “For each Y address, was there ever a failure on any of those eight pins?” That is, we desire data classification that OR's those eight pins together. Our term for this mode of behavior is “compression”, and clearly it needs to be configurable. Next, suppose that we have a tester that has a native word width of thirty-two bits; we may wish to test four of these eight-bit DUT's at a time by partitioning that native word width into four eight-bit segments, one for each DUT. Now we want to do the OR'ing four times, but on the four different segments as if each were the only segment, and respectively send the results to four different Tag RAM's. Our term for this mode of behavior is “masking”, and clearly it also needs to be configurable to match the different DUT's that may be tested. Finally, the examples set out here show the need for four Tag RAM's (one for each DUT) that are addressable by Y (however many bits that needs), and are eight bits wide. If conventional Tag RAM's for X and Z were also desired, then there would need to be twelve such Tag RAM's. Conventional Tag RAM's have been stand-alone separate memories included in the tester. As such, they are dedicated to, and configured ahead of time for, particular tasks and are not easily adapted for use in differing circumstances, if indeed they can be adapted at all.
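Compression and masking, as used in the example above, can be sketched together. For readability this sketch takes a comparison result in which a one bit marks a failing channel (the opposite polarity to the zero-for-failure convention used for stored ECR bits), and it records a single compressed flag per Y address; the names and sizes are illustrative assumptions.

```python
SEGMENT_BITS = 8   # width of each DUT
SEGMENTS = 4       # DUT's tested side by side on a thirty-two bit word
Y_DEPTH = 16       # hypothetical size of the Y address space


def compress_segments(fail_word):
    """OR the channels of each segment together, treating each segment alone.

    `fail_word` is a 32-bit compare result in which 1 marks a failing channel.
    """
    seg_mask = (1 << SEGMENT_BITS) - 1
    return [((fail_word >> (s * SEGMENT_BITS)) & seg_mask) != 0
            for s in range(SEGMENTS)]


def record(tag_rams, y, fail_word):
    """Send each segment's compressed result to that DUT's own Tag RAM."""
    for s, failed in enumerate(compress_segments(fail_word)):
        tag_rams[s][y] |= failed


tag_rams = [[False] * Y_DEPTH for _ in range(SEGMENTS)]
record(tag_rams, 5, 0x01000001)  # one failing pin on DUT 0 and one on DUT 3
```

After the call, the Tag RAM's for DUT 0 and DUT 3 show a failure at Y address 5, while those for DUT 1 and DUT 2 remain clear.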
Buffer Memory is another kind of interior test memory usage often found in a memory tester. It can be used to store an image of either stimuli or responses that can be found or given ahead of time. An example is the content of a ROM (Read Only Memory). Buffer Memory can be used either in place of or in conjunction with algorithmically generated test patterns.
In a conventional memory tester these different kinds of interior test memory usage have been realized by including separate memory mechanisms in the tester, each dedicated to its own particular purpose. This is both aggravating and wasteful, since it will often be the case that unused memory will be present but will not be available for a different desired function. Furthermore, such conventional ECR's, Tag RAM's and buffer memories have heretofore been realized with SRAM which, while fast and easy to control, is relatively expensive. SRAM is accessed using a single unified address, and it is faster than DRAM when arbitrarily addressed, but is also considerably more expensive. It would be desirable if by using DRAM the sizes of the ECR, Tag RAM and buffer memories could be increased and their costs reduced, which benefits could be realized if there were a way to operate DRAM's with arbitrary addressing at the same rate as commonly expected of the more expensive SRAM's.
How to replace SRAM with DRAM in the interior test memory of a memory tester was briefly described above, and is the subject of considerable material below. The technique described herein emphasizes an ECR as a principal example, but is by no means limited to DRAM used as an ECR. It will become abundantly clear that the DRAM Memory Sets can also be used to provide high speed, low cost, reconfigurable interior test memory that can be used to provide Tag RAM's and buffer memories. That done, it would be desirable if arbitrarily many different instances of all these different uses of interior test memory within a memory tester could be allocated and reconfigured as needed from a central collection of memory, rather than existing as separate pre-configured memory mechanisms.
Furthermore, having such a fair-sized pile of low-cost, fast and re-configurable interior test memory at our disposal raises the question of what else we might do to improve memory tester operation. In particular, it would be desirable if there were a way of reducing the effort needed to write and execute a test program. For example, could we reduce the specificity and complexity that attaches to a test program that needs to generate both stimuli and their expected responses for a device to be tested? We are, after all, testing memories, and we now have at our disposal a fair amount of interior test memory. Just as the notions of address and data classification driving Tag RAM's remove an analysis burden from the test program, perhaps there is an additional way to simplify some of the various memory test programs if sufficient interior tester memory is available.
What to do?