Electronic devices and capabilities have grown extremely common in daily life. Along with personal computers in the home, many individuals carry more than one productivity tool for various and sundry purposes. Most personal productivity electronic devices include some form of non-volatile memory. Cell phones utilize non-volatile memory in order to store and retain user programmed phone numbers and configurations when the power is turned off. PCMCIA cards utilize non-volatile memory to store and retain information even when the card is removed from its slot in the computer. Many other common electronic devices also benefit from the long-term storage capability of non-volatile memory in un-powered assemblies.
Non-volatile memory manufacturers that sell to the electronic equipment manufacturers require testers to exercise and verify the proper operation of the memories that they produce. Due to the volume of non-volatile memories that are manufactured and sold at consistently low prices, it is very important to minimize the time it takes to test a single part. Purchasers of non-volatile memories require memory manufacturers to provide high shipment yields because of the cost savings associated with the practice of incorporating the memory devices into more expensive assemblies with minimal or no testing. Accordingly, the memory testing process must be sufficiently efficient to identify a large percentage of non-conforming parts and preferably all non-conforming parts in a single test process.
As non-volatile memories become larger, denser and more complex, the testers must be able to handle the increased size and complexity without significantly increasing the time it takes to test them. Memory testers frequently run continuously, and test time is considered a major factor in the cost of the final part. As memories evolve and improve, the tester must be able to easily accommodate the changes made to the device. Another issue specific to testing non-volatile memories is that repeated writes to cells of the memories can degrade the overall lifetime performance of the part. Non-volatile memory manufacturers have responded to many of the testing issues by building special test modes into the memory devices. These test modes are not used at all by the purchaser of the memory, but may be accessed by the manufacturer to test all or significant portions of the memories in as little time as possible and as efficiently as possible. Some non-volatile memories are also capable of being repaired during the test process. The tester, therefore, should be able to identify a need for repair, the location of the repair, and the type of repair needed, and must then be able to perform the appropriate repair. Such a repair process requires a tester that is able to detect and isolate a specific nonconforming portion of the memory. In order to take full advantage of the special test modes as well as the repair functions, it is beneficial for a tester to be able to execute a test program that supports conditional branching based upon an expected response from the device.
From a conceptual perspective, the process of testing memories is an algorithmic process. As an example, typical tests include sequentially incrementing or decrementing memory addresses while writing 0's and 1's into the memory cells. It is customary to refer to a collection of 1's and 0's being written or read during a memory cycle as a "vector", while the term "pattern" refers to a sequence of vectors. It is conventional for tests to include writing patterns into the memory space such as checkerboards, walking 1's and butterfly patterns. A test developer can more easily and efficiently generate a program to create these patterns with the aid of algorithmic constructs. A test pattern that is algorithmically coherent is also easier to debug and facilitates the use of logical methods to isolate portions of the pattern that do not perform as expected. A test pattern that is generated algorithmically using instructions and commands that are repeated in programming loops consumes less space in tester memory. Accordingly, it is desirable to have algorithmic test pattern generation capability in a memory tester.
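As a rough illustration of why loops are a compact way to express such patterns, the following sketch generates a checkerboard and a walking 1's sequence. The function names and the row-major address layout are assumptions made for this example only; they are not taken from any particular tester's programming language.

```python
def checkerboard(rows, cols):
    """Yield (address, bit) pairs forming a checkerboard over a rows x cols array.

    Addresses are assumed to be row-major; adjacent cells get opposite bits.
    """
    for r in range(rows):
        for c in range(cols):
            yield r * cols + c, (r + c) & 1


def walking_ones(width):
    """Yield data words in which a single 1 walks from bit 0 up to bit width-1."""
    for i in range(width):
        yield 1 << i


if __name__ == "__main__":
    for address, bit in checkerboard(2, 4):
        print(f"addr {address}: write {bit}")
    print([bin(word) for word in walking_ones(4)])
```

Two short loops stand in for what would otherwise be a stored list of every vector, which is the space saving the paragraph above refers to.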
Precise signal edge placement and detection is also a consideration in the effectiveness of a non-volatile memory tester. In order to capture parts that are generally conforming at a median while not conforming within the specified margins, a non-volatile memory tester must be able to precisely place each signal edge relative in time to another signal edge. It is also important to be able to precisely measure at which point in time a signal edge is received. Accordingly, a non-volatile memory tester should have sufficient flexibility and control of the timing and placement of stimuli to, and responses from, the Device Under Test (the memory).
Memory testers are said to generate "transmit" vectors that are applied (stimulus) to the DUT (Device Under Test), and "receive" vectors that are expected in return (response). The algorithmic logic that generates these vectors can generally do so without troubling itself about how a particular bit in a vector is to get to or from a particular signal pad in the DUT, as the memory tester contains mapping arrangements to route signals to and from pins.
Memory testers have interior test memory that is used to facilitate the test process. This interior test memory may be used for several purposes, among which are storing transmit vectors ahead of time, as opposed to generating them in real time, storing receive vectors, and storing a variety of error indications and other information concerning DUT behavior obtained during testing. (There are also housekeeping purposes internal to the operation of the memory tester that use SRAM and that may appear to fall within the purview of the phrase "interior memory." These are private to the internal operation of the tester, tend to not be visible at the algorithmic level, and are comparable to internal control registers. That memory is described as "interior control memory," and is excluded from what is meant herein by the term "interior test memory," which we use to describe memory used to store bit patterns directly related to the stimulus of, and response from, the DUT.) It is easy to appreciate that this interior test memory needs to operate at least as fast as the tests being performed; a very common paradigm is for the interior test memory (or some portion thereof) to be addressed by the same address (or some derivative thereof) as is applied to the DUT. What is then stored at that addressed location in interior test memory is something indicative of DUT behavior during a test operation performed on the DUT at that address. Algorithmic considerations within the test program may mean that the sequence of addresses associated with consecutive transmit vectors can be arbitrary. Thus, the interior memory needs to have the dual attributes of high speed and random addressability. SRAM comes to mind immediately as being fast, easy to control and tolerant of totally random addressing. Indeed, conventional memory testers have used SRAM as their interior test memory.
Unfortunately, SRAM is quite expensive, and this has limited the amount of interior test memory with which memory testers have had to work. The result is limits on memory tester functionality that are imposed by a shortage of memory. DRAM is significantly less expensive, but cannot tolerate random addressing and still perform at high speed.
DRAM can replace SRAM as the interior test memory in a memory tester. As briefly described in a simplified overview below, the problem of increasing the speed of DRAM operation for use as interior test memory can be solved by increasing the amount of DRAM used, in place of increasing its speed. Numbers of identical Banks of DRAM are treated as Groups. A combination of interleaving signals for different Banks of memory in a Group thereof and multiplexing between those Groups of Banks slows the memory traffic for any one Bank down to a rate that can be handled by the Bank. (For the reader's convenience, we include a very abbreviated summary of this technique here, since many of its architectural aspects and much of its associated terminology are useful in the explanation of the inventive subject matter that follows.) A three-way multiplexing between three Groups of four Banks each, combined with a flexible four-fold interleaving scheme for signal traffic to a Group, produces an increase in operating speed approaching a factor of twelve, while requiring only three memory busses. A round robin strategy for choosing the next Group for the multiplexer is simple and assures that the interleaving mechanism for each Group has the time it needs to complete its most recently assigned task. All interleaved accesses within a Group are performed upon a next Bank (within that Group), also selected by a simple round robin selection. In this configuration, each of the twelve Banks represents a duplicate instance of the entire available address space, and any individual write cycle might end up accessing any one of the twelve Banks. An implication is that, at the conclusion of testing, all twelve Banks must be investigated to learn what failures happened during testing of the DUT, since the history of any address or collection of addresses of interest will be spread out across all twelve Banks.
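The two nested round robin selections can be modeled in a few lines. The sketch below is only an illustrative model of the selection order, not the tester's control logic; the name `bank_schedule` and the constants are assumptions introduced for the example.

```python
GROUPS = 3            # three-way multiplexing between Groups
BANKS_PER_GROUP = 4   # four-fold interleaving within a Group


def bank_schedule(n_accesses):
    """Return the (group, bank) pair chosen for each of n consecutive accesses.

    The Group is chosen round robin on every access; within the chosen
    Group, the Bank is chosen round robin on each visit to that Group.
    """
    next_bank = [0] * GROUPS
    schedule = []
    for i in range(n_accesses):
        group = i % GROUPS
        bank = next_bank[group]
        next_bank[group] = (bank + 1) % BANKS_PER_GROUP
        schedule.append((group, bank))
    return schedule


if __name__ == "__main__":
    for cycle, (group, bank) in enumerate(bank_schedule(13)):
        print(f"cycle {cycle}: Group {group}, Bank {bank}")
```

In this model any given Bank is revisited only once every twelve accesses, which is the source of the near twelve-fold relaxation in the rate each Bank must sustain. It also shows why consecutive writes to the same address land in different Banks.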
A particular channel is thus represented by twelve bits (one bit from each Bank and whose bit position within the word for that Bank is determined by the channel). It would be, however, awkward to have to (manually, as it were) individually consult all twelve Banks to discover failure information, so a utility mechanism has been provided to automatically "compose" (merge) results of all twelve Banks during a read cycle at an address into a unified result that can be stored in one or all twelve Banks. This allows composed data to later be read at full speed. Full speed in one embodiment is a 100 MHz rate for randomly addressed memory transactions.
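A minimal sketch of the composing step follows. The text does not say how the twelve results are merged, so this example assumes that a failure is recorded as a 1 bit and that merging is a bitwise OR; both are assumptions, as are the function name `compose` and the use of dictionaries to stand in for Banks.

```python
from functools import reduce
from operator import or_


def compose(banks, address):
    """Merge the words stored at `address` in every Bank into one word.

    Assumption: bit k of a word is the failure flag for channel k, and a
    channel has failed at this address if any Bank recorded a 1 for it.
    """
    return reduce(or_, (bank.get(address, 0) for bank in banks), 0)


if __name__ == "__main__":
    banks = [dict() for _ in range(12)]
    banks[2][0x40] = 0b0001    # channel 0 failed, logged in Bank 2
    banks[9][0x40] = 0b0100    # channel 2 failed, logged in Bank 9
    print(bin(compose(banks, 0x40)))
```

Writing the composed word back into a Bank means later reads of that address need to consult only that one Bank.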
If 33 MHz is fast enough, then random access can be supported with just the interleaving and no multiplexing, in which case the composition mechanism and the memory addressing scheme are suitably adjusted. The addressing scheme changes to include extra Group selection bits that allow the depth of the memory to be three times deeper than for random 100 MHz operation. These two modes of operation are called R100 and R33, respectively. There is also an L100 mode of 100 MHz operation to single Banks that relies on well behaved addresses being sent to the DRAM (an absolute minimum of row address changes).
At the top level of interior test memory organization there are four Memory Sets, each having its own separate and independent address space and performing requested memory transactions. Two are of DRAM as described above, and two are of SRAM. Each Memory Set has its own controller to which memory transactions are directed. As to externally visible operational capabilities, all four Memory Sets are essentially identical. They differ only in their size of memory space and how they are internally implemented: the SRAM Memory Sets do not employ multiplexing and interleaving, since they are fast enough to begin with. Despite their independence, Memory Sets of the same type (of SRAM or of DRAM) may be "stacked," which is to say treated as one larger address space. This is done at the level of control above the Memory Sets themselves, in the algorithmic generation of the addresses and the decision as to which Memory Set to actually send a memory transaction. It is not as automatic as the way in which the Memory Sets and their controllers can stack Groups to triple the address space as between the R100 and R33 modes of operation. Each Memory Set controller has no clue that there even is such a thing as another Memory Set with another controller.
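Stacking can be pictured as a routing decision made above the controllers. The sketch below assumes, purely for illustration, that the upper part of a stacked address selects the Memory Set and the remainder is the address within it; the name `route` and the parameter `set_depth` are hypothetical.

```python
def route(address, set_depth, n_sets=2):
    """Split a stacked address into (Memory Set index, local address).

    Assumption: Memory Sets of equal depth are laid end to end, so the
    quotient selects the set and the remainder addresses within it.
    """
    set_index, local_address = divmod(address, set_depth)
    if not 0 <= set_index < n_sets:
        raise ValueError("address outside the stacked address space")
    return set_index, local_address


if __name__ == "__main__":
    depth = 1024
    print(route(5, depth))          # lands in the first Memory Set
    print(route(depth + 5, depth))  # lands in the second Memory Set
```

Each controller still sees only its own local address, consistent with its being unaware of the other Memory Set.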
Thus it is that the interior test memory of the tester is divided into four Memory Sets, two of which are "internal" SRAM's and two of which are "external" DRAM's. To be sure, all this memory is physically inside the memory tester; the terms "internal" and "external" have more to do with a level of integration. The SRAM's are integral parts of a VLSI (Very Large Scale Integration) circuit associated with the tester's central functional circuitry, while the DRAM's are individual packaged parts mounted adjacent to the VLSI circuit. The amount of SRAM is fairly small (say, around a megabit per Memory Set), while the amount of DRAM is substantial and selectable (say, in the range of 128 to 1024 megabits per Memory Set). The SRAM Memory Sets are always present, and may be used for any suitable purpose, such as storing the expected content of a DUT that is a ROM (Read Only Memory). The DRAM Memory Sets are actually optional, and are typically used for creating a trace for subsequent analysis leading to repair, although there are also other uses. The tester does not enforce a distinction between the SRAM and DRAM Memory Sets as to the purposes for which they may be used. Those distinctions arise mostly as a matter of size. The SRAM Memory Sets are small, while the DRAM Memory Sets are potentially huge. The person or persons creating the test programming make the decisions concerning how the various Memory Sets are to be used.
The memory tester we have been describing is heavily pipelined. By pipelined we mean that an overall task or functionality is spread over an essentially serial path comprising some number of mechanisms (stages of the pipeline), each of which can accept input circumstances and then produce corresponding output at a common rate. So, for example, at the top of the pipeline leading "downward" to the DUT is a Micro-Controller Sequencer that is the algorithmically programmable origin of test program execution. It provides "raw" or "algorithmic level" addresses that are eventually applied to the DUT, but perhaps only after some considerable manipulation. Those addresses, and sometimes data, pass through some ALU's and (for addresses only) an Address Mapper, may be applied to an Interior Test Memory and/or a Data MUX, and proceed thence to an ADDRESS Bit Select circuit, a TRANSMIT VECTOR MAPPER/SERIALIZER and RECEIVE VECTOR COMPARE DATA Circuit, a Vector FIFO, and eventually to a Timing/Formatting and Comparison Circuit, where transmit vectors leave to be applied to the DUT via some Pin Electronics. Not all stages of the pipeline have the same delay. More complex operations in the pipeline toward the DUT might take more time within their associated stages. But, as with all the various stages, these are fixed delays that, once incurred, do not interfere with the overall end-to-end rate for the pipeline. We may say that there is yet another (and different) pipelined path that conveys receive vectors back "upward" toward the environment of the executing test program.
When the memory tester is configured a certain way for a particular segment of a test program, as opposed to a different segment in that same program, the change in configuration can add or delete stages in the pipeline or alter the length of a pipeline stage in use at that time, any of which can affect the overall combined pipeline delay for that associated segment of the test program. Those combined delays will be known, however, and they will not change for the duration of their associated test program segments.
We mentioned earlier that an algorithmic capability was desirable in developing and maintaining the test programs that are to be executed with the memory tester. The Micro-Controller Sequencer supports a compact form of test program with loops nested within loops and branching on test results. The use of a pipelined architecture, however, complicates certain capabilities that we are interested in, particularly in the way that errors caused by DUT behavior are understood and handled.
Consider the basic need to know that "at this point in the program is where such and such a malfunction/unexpected result occurred." Let's say that we ran the test program/DUT cycle rate slow enough that the pipeline delay from the Micro-Controller Sequencer to the Pin Electronics and then back again could simply be bundled into the slow rate of program execution. We would not even need to know that there was a pipeline. (And, it would be a pretty slow DUT, too!) Under these (mythical) circumstances, the current state of the test program and the activity at the DUT are occurring not only in some time/rate relation to one another (synchronized), but also in a state of "sequential unison" where a cause and its effect are never separated by an instruction fetch for the test program. Under these rosy circumstances, if the DUT fails, the next step in the program can evaluate a response and decide then and there that the preceding step in the program is where the error occurred. What is done next would depend upon the programmer's intent. He may wish to simply gather (report or preserve) for later consideration or analysis the basic information, such as what were the applied addresses and other relevant stimulus information, in addition to knowing that "it was that step in the program" (as indicated by a program pointer and perhaps some loop indices) where the evil occurred. Alternatively, the programmer may have anticipated that this might happen and has already provided yet other test program segments whose execution is intended to deal with whatever it is. So the program would branch to someplace. Both of these desirable actions are indeed possible in this contrived example, because the property of "sequential unison" ensures that a cause (some stimulus in the test program) and its effect (some corresponding result from the DUT) are never separated by an instruction fetch for the test program.
But the DUT's to be tested are not that slow, and the operator of the memory tester wants to test them at the speeds at which they will be used. Furthermore, the test program might not enquire immediately as to whether or not an earlier stimulus produced an associated error. Aside from building super fast hardware to retain the property of sequential unison, we have no other real choice but to allow the delays of the pipelines to become visible. But the price we pay is that the answer to the basic question "where in the program did it fail . . ." is a good deal tougher to provide, as are the mechanisms needed to allow the task of branching on an error. In any case, the test program at the top of the pipeline will have already advanced beyond the place where it provided the stimulus. It may have already provided several such stimuli, and it may have undergone conditional branching since then. Even when an error flag does eventually get set and is later tested by the program, just which event is it correlated with? And what were the various addresses, etc. at that point in time? They are most likely different now than they were then. It is a lot of trouble to back all that stuff up and find out. How can we deal effectively with the fact that events in the test program are running ahead of the effects they are causing in the DUT? If this cannot be resolved it will be a rather significant wart on the memory tester.
What to do?
The problem is to branch back to an appropriate location within a memory tester test program, and also restore its state of algorithmic control, when an error associated therewith occurs later in time at the DUT. Delays in a transmit vector pipeline connecting address and data stimuli from the program execution environment to the DUT, together with further delays in a receive vector pipeline connecting responses from the DUT back to the execution environment of the test program, allow the program to arbitrarily advance beyond where the stimulus was given. The arbitrary advance makes it difficult to determine the exact circumstances that were associated with the error. A branch based on the error signal can restart a section of the test program, but it is likely only a template needing further test algorithm control information that varies dynamically as the test program executes. The solution is to equip the memory tester with various History FIFO's whose depths are adjusted to account for the sum of the delays of the transmit and receive vector pipelines, relative to the location of that History FIFO. When the error flag is generated, the desired program state and algorithmic control information is present at the bottom of an appropriate History FIFO, and can be used as desired. This technique is readily applicable to the case where the test program uses an ALU to generate its own DUT stimuli (is entirely self-contained), as well as to the case where the test program/ALU addresses an intermediate Buffer Memory whose contents are central to the nature of the testing the DUT is to undergo. In the first case there is an ALU History FIFO, while in the second there is a Buffer Memory History FIFO. (The term "History FIFO" is a generalization and the name of a class of particular FIFO's. There is no FIFO named "History FIFO"; there are only the specific members of the class.)
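The depth-matching idea can be illustrated with a small cycle-by-cycle model. This is a sketch under stated assumptions, not the tester's hardware: the class name `HistoryFifo`, the function `run`, the dictionary used as "program state", and the convention that the error flag for a stimulus arrives exactly `tx_delay + rx_delay` cycles after it was issued are all introduced here for illustration.

```python
from collections import deque


class HistoryFifo:
    """Fixed-depth FIFO holding the most recent program states."""

    def __init__(self, depth):
        self._entries = deque(maxlen=depth)  # oldest entry falls out when full

    def push(self, state):
        self._entries.append(state)

    def bottom(self):
        """Return the oldest retained state."""
        return self._entries[0]


def run(program_states, faulty_step, tx_delay, rx_delay):
    """Return (cycle the error flag is seen, state recovered from the FIFO)."""
    delay = tx_delay + rx_delay
    fifo = HistoryFifo(depth=delay)
    for cycle, state in enumerate(program_states):
        # The flag seen now belongs to the stimulus issued `delay` cycles ago.
        if cycle - delay == faulty_step:
            return cycle, fifo.bottom()
        fifo.push(state)
    return None


if __name__ == "__main__":
    states = [{"pc": i, "addr": 0x100 + i} for i in range(20)]
    print(run(states, faulty_step=7, tx_delay=3, rx_delay=2))
```

Because the depth equals the round-trip delay, the entry at the bottom when the flag arrives is the state that issued the failing stimulus, even though the program has since moved on by several steps.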
Furthermore, the writer of test programs would most certainly prefer not to be bothered with discovering the various pipeline depths that ensue for different memory tester configurations that are arranged by the program for different tests that are performed. Accordingly, there is a mechanism to track configuration as it occurs and adjust the depths of the various History FIFO's to match. There also needs to be some "start" mechanism at the test program level to indicate when a stimulus is being issued for which there will later be a check for an associated error, as the degree to which a History FIFO is filled to a desired depth-in-use is determined subsequent to that "start" indication.
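One way to picture the configuration tracking is as a table of per-stage delays that is summed over whichever stages are active between a given History FIFO and the point where the error flag becomes available. The stage names and cycle counts below are invented purely for illustration and do not describe any real tester; `history_depth` is likewise a hypothetical helper.

```python
# Hypothetical per-stage delays in cycles. These values are made up for
# the example and are not taken from the source.
STAGE_DELAYS = {
    "alu": 2,
    "address_mapper": 1,
    "buffer_memory": 3,
    "vector_fifo": 4,
    "timing_format": 2,
    "receive_compare": 3,
}


def history_depth(active_stages):
    """Sum the delays of the stages in use for the current configuration."""
    return sum(STAGE_DELAYS[stage] for stage in active_stages)


if __name__ == "__main__":
    self_contained = ["alu", "vector_fifo", "timing_format", "receive_compare"]
    with_buffer = self_contained + ["address_mapper", "buffer_memory"]
    print(history_depth(self_contained), history_depth(with_buffer))
```

Recomputing the depth whenever the configuration changes spares the test program author from ever working out these sums by hand.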
The test program might not enquire if there was an error for quite some time after the associated stimulus. If a History FIFO is allowed to continue to store new stimuli in the interim, but after an error occurs, the desired correspondence will be lost. Accordingly, there is a mechanism to freeze the contents of a History FIFO upon the generation of an error.
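The effect of freezing can be seen in a toy model. As before, this is only a sketch: the class `FreezableHistoryFifo`, the helper `bottom_at_query`, the depth of four, and the choice of step 10 as the moment the error flag is raised are all assumptions made for the example.

```python
from collections import deque


class FreezableHistoryFifo:
    """History FIFO that stops accepting entries once frozen."""

    def __init__(self, depth):
        self._entries = deque(maxlen=depth)
        self._frozen = False

    def push(self, state):
        if not self._frozen:          # pushes are ignored after a freeze
            self._entries.append(state)

    def freeze(self):
        self._frozen = True

    def bottom(self):
        return self._entries[0]


def bottom_at_query(freeze_on_error):
    """Run 20 steps with an error flagged at step 10; query only at the end."""
    fifo = FreezableHistoryFifo(depth=4)
    for step in range(20):
        if step == 10 and freeze_on_error:
            fifo.freeze()
        fifo.push(step)
    return fifo.bottom()


if __name__ == "__main__":
    print("with freeze:   ", bottom_at_query(True))
    print("without freeze:", bottom_at_query(False))
```

With the freeze, the bottom entry still identifies the step that corresponds to the error when the program finally asks; without it, later stimuli have pushed that entry out.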
ECR's (Error Catch RAM's) are often filled during actual DUT testing and then later investigated with further test program activity that does not actually exercise the DUT. The various pipeline delays can make it awkward to determine the address that was applied to the DUT that produced a particular error indication logged in the ECR. The conventional utility operations provided for this are too slow.
The History FIFO mechanism can be applied to ECR investigations and an ECR History FIFO will provide the answer without incurring any time penalty.
Finally, the overhead needed to implement a History FIFO can be extended to prevent a branching instruction in the test program from responding to an error flag prematurely, that is, before the pipeline delay needed for that error flag's value to be determined by a cause located within the test program has elapsed.