Technical Field of the Invention
The present invention relates to a configurable memory architecture with a built-in testing mechanism integrated in a memory core.
Description of Related Art
Technology trends in semiconductor memories follow Moore's Law, and thus circuit geometries shrink by roughly a factor of 1.5 every year. Memories are also increasingly used as embedded structures in Application Specific Integrated Circuits (ASICs).
Today's embedded memories are full-custom volatile or nonvolatile memories that are designed and optimized for high speed, low power and small area. There are two types of embedded memories: synchronous and asynchronous. The operations inside asynchronous memories are not synchronized to a clock. In contrast, synchronous memories operate on the edge of the clock. Volatile memories can also be divided, based upon the nature of the storage element and the detection technique, into two categories: Static Random Access Memories (SRAMs) and Dynamic Random Access Memories (DRAMs). SRAMs contain storage elements that retain the values stored in them until either power is removed or the data is overwritten. DRAMs, on the other hand, contain storage elements that require dynamic refreshing at regular intervals to retain the value stored therein. The present discussion focuses on synchronous single-port SRAMs and DRAMs. Nevertheless, the same fundamentals can also be applied to asynchronous memories.
A typical memory contains a data input port, an address port, a clock, a memory select port, an output enable port, a write enable port and an output port. The block diagram of a RAM is shown in FIG. 1. It contains a decoder section 1, a core section 2, a control/clock generation section 3 and an I/O section 4.
Various blocks such as an X Decoder, Y Decoder, Pre-Decoder, Core, I/O/Precharge/Read/Write Logic, Clock Generation Circuits, Dummy Paths, Refresh Logic and Pipelines are made in the form of leaf cells that are abutted to form a memory block of a desired “word x bit” configuration. Full memory development thus consists of building these leaf cells in a new technology and tuning them across the full range of words and bits provided. The normal decoder of the memory contains latches to latch the address inputs. The addresses are statically decoded and the valid decoded value is clocked to finally select a row or column (or both) to write into or read from the respective cells. In the case of dynamic decoders the row select signal, called the “wordline,” is subjected to a precharge mechanism using an internally generated memory clock. The write unit of the memory can be as simple as a latch that stores the value at the data input port on the positive edge of the clock, together with a strong buffer that pulls the column select signal (architecture specific) or the bitline to the latched value. A DRAM write involves a more complex structure, as the data to be written to a row is first copied into a row register and then written. Periodic refresh cycles are simply read-write operations on the DRAM memory cell. The read circuit can be as simple as a digital inverter that senses the discharge of the precharged bitline, or as complex as a differential or latch type, single or double ended sense amplifier, which is analog in nature. For a DRAM, the process of reading is complex since the read is destructive. In this case again a latch can be used to hold the value read out from the memory in a particular cycle until the next operation at the next positive edge of the memory clock. The core can be a group of cells, which can be single latches, four-, six- or eight-transistor latches, or a capacitor.
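The read and write behavior described above can be illustrated with a minimal behavioral sketch. It assumes a synchronous single-port SRAM in which one call represents one positive clock edge; the class name `SyncSram`, the method `clock` and its parameters are hypothetical names chosen for illustration, and the sketch ignores precharge, sense amplification and timing entirely.

```python
class SyncSram:
    """Behavioral model of a synchronous single-port SRAM (illustrative only)."""

    def __init__(self, words, bits):
        self.words = words              # number of addressable words
        self.mask = (1 << bits) - 1     # keeps stored data within the word width
        self.core = [0] * words         # storage core, one entry per word
        self.q = 0                      # output latch, holds the last value read

    def clock(self, addr, data=0, write_enable=False, select=True):
        """Model one positive clock edge; returns the output latch value."""
        if not select:
            # Memory not selected: no operation, output latch keeps its value.
            return self.q
        if not 0 <= addr < self.words:
            raise ValueError("address out of range")
        if write_enable:
            # Write: the latched data input is driven into the selected word.
            self.core[addr] = data & self.mask
        else:
            # Read: the selected word is captured in the output latch.
            self.q = self.core[addr]
        return self.q
```

For example, `SyncSram(words=8, bits=4)` followed by a write cycle and a read cycle to the same address returns the written value truncated to four bits, and a deselected cycle leaves the output latch unchanged.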
Testing of embedded RAMs is increasingly becoming a bottleneck in the overall quality of the end-user ASIC. Current implementations employ a built-in self test (BIST) circuit external to the RAM structure. This external BIST or design for test (DFT) mechanism increases the setup, hold and access times of the RAM, which makes timing estimates made before BIST insertion unreliable.
A second type of memory structure, known as a “register file,” typically contains data input ports and address ports for writing, data output ports and address ports for reading (in the case of FIFOs or stacks there are no address ports), an I/O section, decoders or counters, pipelines, clock generation circuits in the case of a self-timed architecture, and refresh logic. The block diagram of such a memory is shown in FIG. 2. In addition to the blocks of a normal memory, these memories typically contain: 1. READ DECODER/COUNTERS; 2. WRITE DECODER/COUNTERS; 3. PRE DECODERS.
The normal decoder of the register file contains latches for the address and data inputs and for the write enable or read enable signals. In the case of FIFOs or LIFOs there are no address ports, as the memory self-generates the addresses sequentially. Where address ports are present, addresses are statically or dynamically decoded and the valid decoded value is clocked to finally select a row or column (or both) to write into or read from the respective cells. Dynamic decoders work in a similar fashion to that described above with reference to single-port memories.
Pointer generation in the register file memory is a complex operation. In the case of a FIFO there are generally two pointers, one for read and the other for write. At the beginning, when there is no data in the FIFO, the position of the write pointer coincides with the position of the read pointer. Once a write operation is performed, the write pointer moves ahead of the read pointer, making space for the read pointer to advance. The FIFO becomes full when the write pointer is exactly one location behind the read pointer. In this condition no further write operation can be performed.
In some cases half-full or half-empty flags are also generated. To generate the Empty flag, the read pointer and the write pointer are compared for equality. To generate the Full flag, the write pointer is compared to the read pointer to determine whether the write pointer value is one less than the read pointer value. The major difficulty arises from the fact that the clocks for the read and write pointers are normally different. In that case, the flags act as synchronization signals between the two clock domains.
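The Empty and Full comparisons described above can be sketched as follows. This is a simplified single-clock behavioral model, so it does not address the two-clock-domain synchronization difficulty; the class name `Fifo` and its members are hypothetical, and the wraparound at the end of the address range is an assumption about how the pointers are compared. Because Full is defined as the write pointer being one location behind the read pointer, a FIFO with `depth` locations holds at most `depth - 1` values in this sketch.

```python
class Fifo:
    """Behavioral FIFO with read/write pointers and Empty/Full flags."""

    def __init__(self, depth):
        self.depth = depth              # number of storage locations
        self.cells = [None] * depth
        self.wr_ptr = 0                 # next location to be written
        self.rd_ptr = 0                 # next location to be read

    @property
    def empty(self):
        # Empty flag: read and write pointers are equal.
        return self.wr_ptr == self.rd_ptr

    @property
    def full(self):
        # Full flag: write pointer is one location behind the read pointer
        # (modulo the depth, an assumption about pointer wraparound).
        return (self.wr_ptr + 1) % self.depth == self.rd_ptr

    def write(self, value):
        """Store a value; returns False if the FIFO is full."""
        if self.full:
            return False
        self.cells[self.wr_ptr] = value
        self.wr_ptr = (self.wr_ptr + 1) % self.depth
        return True

    def read(self):
        """Return the oldest value, or None if the FIFO is empty."""
        if self.empty:
            return None
        value = self.cells[self.rd_ptr]
        self.rd_ptr = (self.rd_ptr + 1) % self.depth
        return value
```

Reserving one location is one common way to distinguish the full and empty conditions with a plain pointer comparison; other schemes, such as an extra wrap bit on each pointer, are not shown here.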
In the case of a LIFO, the behavior is reversed. The read pointer always moves along with the write pointer when a write operation happens. Once a read operation happens, both the write pointer and the read pointer move backwards. In a stack the clocks for read and write cannot be different unless some complex arrangement is made. Thus there are many complexities in designing such high-speed, best-in-class register file memories.
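The LIFO pointer movement described above can be sketched in the same way. This is a minimal single-clock behavioral model under stated assumptions: the class name `Lifo` and its members are hypothetical, the write pointer is taken to address the next free location, and the read pointer is taken to address the most recently written location (with -1 meaning the stack is empty).

```python
class Lifo:
    """Behavioral LIFO (stack) in which both pointers move together."""

    def __init__(self, depth):
        self.depth = depth
        self.cells = [None] * depth
        self.wr_ptr = 0     # next free location (assumed convention)
        self.rd_ptr = -1    # most recently written location; -1 means empty

    def push(self, value):
        """Write a value; returns False if the stack is full."""
        if self.wr_ptr == self.depth:
            return False
        self.cells[self.wr_ptr] = value
        # On a write, the read pointer moves along with the write pointer.
        self.wr_ptr += 1
        self.rd_ptr += 1
        return True

    def pop(self):
        """Read the most recent value, or None if the stack is empty."""
        if self.rd_ptr < 0:
            return None
        value = self.cells[self.rd_ptr]
        # On a read, both pointers move backwards.
        self.wr_ptr -= 1
        self.rd_ptr -= 1
        return value
```

Because both pointers change on every operation, a single shared clock is the natural fit for this model, which matches the observation that separate read and write clocks are difficult for a stack.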
Thus, although the aforesaid memories offer high density, high speed and low power, they have the following disadvantages:
1. These memories cannot test themselves. They need a BIST or some other external circuit on chip or off chip to test them.
2. The fact that they are tested by some external means increases the effort on the user end in terms of the following: a. Place & Route; b. Timing Analysis before and after the test insertion; c. Clock management.
3. The routing congestion and placement constraints are sometimes burdensome for the person integrating the test with the memory.
4. If the user wants the integrated test to reach a particular speed so that an at-speed test can be performed, extra area is needed, which also increases the routing congestion due to extra clock buffering and routing.
5. The test algorithms in the external test mechanism are most often coded to get the best fault coverage, based upon a fault model chosen according to various criteria such as memory architecture, process, technology and memory size. But as the process changes or matures, the need for a very exhaustive fault model decreases. Thus there must be a mechanism to alter the algorithm of the test mechanism.
6. Most of the time, external test patterns are needed to test the BIST or the test circuitry itself, before using it to test the other blocks. These tests are an additional cost over the existing set of test patterns, in terms of extra time as well as extra tester memory/register file.
7. These memories are present on chip in large numbers and are sometimes scattered around the chip. Thus sharing the test mechanism is a tedious job, and it also increases routing congestion.
8. The sizes at which these memories are used are very small. Thus they are high speed and occupy very little area. Any test mechanism dedicated to these memories is normally 200% to 800% the size of the memory itself. There are some serial test mechanisms that use scan registers to test the memories, but writing and reading the memory through the scan chains takes a lot of time, as the scan chains on SOCs are very long. Also, the test cannot be a very high-speed test, as serial scan chains are routed for a very low speed.
9. The test logic has to ensure the correctness of general functionality from all the ports as well as the at-speed functionality. The fault models particular to dual-port operation include disturb faults as well as memory cell stability tests.
10. An external test mechanism added in this way does not increase the observability of the system pins that connect to the register file.
There is accordingly a need in the art to provide a memory which solves the foregoing problems and addresses the above-mentioned disadvantages.
There is also a need in the art for a register file or FIFO solution that solves the foregoing problems and addresses the above-mentioned disadvantages.