Reference is made to FIG. 1 which shows a schematic diagram of a standard six transistor (6T) static random access memory (SRAM) cell 10. The cell 10 includes two cross-coupled CMOS inverters 12 and 14, each inverter including a series connected p-channel and n-channel transistor pair. The inputs and outputs of the inverters 12 and 14 are coupled to form a latch circuit having a true node 16 and a complement node 18. The cell 10 further includes two transfer (passgate) transistors 20 and 22 whose gate terminals are coupled with a wordline node and are controlled by the signal present at the wordline node (WL). Transistor 20 is source-drain connected between the true node 16 and a node associated with a true bitline (BLT). Transistor 22 is source-drain connected between the complement node 18 and a node associated with a complement bitline (BLC). The source terminals of the p-channel transistors in each inverter 12 and 14 are coupled to receive a high supply voltage (for example, VDD) at a high voltage node VH, while the source terminals of the n-channel transistors in each inverter 12 and 14 are coupled to receive a low supply voltage (for example, GND) at a low voltage node VL. The high voltage VDD at the node VH and the low voltage GND at the node VL comprise the power supply set of voltages for the cell 10.
In an integrated circuit including the SRAM cell 10, this power supply set of voltages may be received at pins of the integrated circuit, or may instead be generated on chip by a voltage regulator circuit which receives some other set of voltages from the pins of the chip. The power supply set of voltages at the nodes VH and VL are conventionally applied to the SRAM cell 10 at all times that the cell/integrated circuit is operational. It will be recognized that separate low voltage values at node VL may be provided for the sources of the n-channel MOS transistors in the inverters 12 and 14 while separate high voltage values at node VH may be provided for the sources of the p-channel MOS transistors in the inverters 12 and 14.
The reference above to a six transistor SRAM cell 10 of FIG. 1 for use as the data storage element is made by way of example only, it being understood by those skilled in the art that the cell 10 could alternatively comprise a different data storage element. The use of the term SRAM cell 10 will accordingly be understood to refer to any suitable memory cell or data storage element, with the circuitry, functionality and operations presented herein in the exemplary context of a six transistor SRAM cell.
Reference is now made to FIG. 2 which shows a block diagram of a self-timed memory 30, for example of the static random access memory (SRAM) type using memory cells 10, with “w” words and “b” bits organized as a column mux of “m”. Those skilled in the art understand that self-timed memories need to support a high dynamic operating voltage range. In other words, these memories need to be functional over a wide range of supply voltages, from a very high operating voltage down to a very low operating voltage. In most cases, in the low operating voltage range, it is considered acceptable if the memory achieves a lower performance (i.e., it is slower). In the nominal operating voltage range, the memory needs to support a higher performance (i.e., it needs to be faster).
The memory 30 includes a first section 32 comprising a plurality of memory (such as SRAM) cells 10 arranged in a matrix format and which function to store data. The first section 32 includes “b” sub-sections 34 corresponding to the “b” bits per word stored by the memory. The first section 32 is arranged to store “w” words and is organized as a column mux of “m”. Thus, each of the “b” sub-sections 34 is organized in “w/m” rows with “m” columns in each row. In the first section 32, all cells 10 in a same row share a common wordline (WL) coupled to an output of a row decoder circuit 60 (well known to those skilled in the art), and all cells 10 in a same column share a common true bitline (BLT) and a common complement bitline (BLC) coupled to column circuitry 62 (which includes bitline precharge and equalization circuitry, column mux circuitry, write driver circuitry, column address decoder circuitry and input/output circuitry, each of which is well known to those skilled in the art).
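By way of illustration only, the organization described above may be sketched as follows. The class and field names are hypothetical and are not taken from the figures; the sketch merely restates the arithmetic that “w” words of “b” bits with a column mux of “m” yield “w/m” rows and “b*m” columns, and it assumes that “w” is an exact multiple of “m”.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayGeometry:
    """Illustrative geometry of the first section 32 (names are hypothetical)."""

    words: int  # "w": number of words stored
    bits: int   # "b": bits per word, i.e. number of sub-sections 34
    mux: int    # "m": column mux factor, i.e. columns per sub-section 34

    def __post_init__(self) -> None:
        if min(self.words, self.bits, self.mux) <= 0:
            raise ValueError("w, b and m must be positive")
        # Assumption of this sketch: w divides evenly into rows of m words.
        if self.words % self.mux != 0:
            raise ValueError("w must be a multiple of m")

    @property
    def rows(self) -> int:
        """Number of wordlines WL: w/m."""
        return self.words // self.mux

    @property
    def total_columns(self) -> int:
        """Number of BLT/BLC pairs loading each wordline: b*m."""
        return self.bits * self.mux

    @property
    def total_cells(self) -> int:
        """Number of memory cells 10 in the first section 32."""
        return self.rows * self.total_columns


geom = ArrayGeometry(words=1024, bits=32, mux=4)
print(geom.rows, geom.total_columns, geom.total_cells)
```

The product of rows and columns equals “w*b”, which is the number of data bits stored, as expected.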
To write data to the first section 32, the wordline of the row selected according to the row address is driven high by the row decoder circuitry 60, a column is selected in each sub-section 34 by the column address decoder and column mux in the column circuitry 62 based on the column address to connect the selected column's true bitline and complement bitline to the input/output circuitry (which, for example, will typically utilize bitline write drivers), and both the true bitline and complement bitline of the selected column in each sub-section 34 are made floating by the precharge and equalization logic in the column circuitry 62. One of the true bitline and complement bitline discharges in each sub-section 34 depending on the output of the bitline write driver circuitry, and the bitline voltages are transferred to the corresponding internal true node 16 and complement node 18 of the memory cell 10 in the selected row and column in the sub-section 34 so as to write and store the proper data state.
The memory 30 includes a second section 46 including a plurality of memory cells 10 arranged in a matrix format, but these cells do not function to store data. Indeed, these cells are only required, if desired, in order to have a regular layout of the memory array. The wordline ports of the memory cells 10 in both rows of this section are connected to the ground reference voltage (GND).
The memory 30 includes a third section 36 including a plurality of memory cells 10 arranged in a matrix format, and these cells also do not function to store data. Rather, these cells in the third section 36 are used to emulate the same load on a reference wordline (REFWL), which is coupled to a reference row decoder 64 within the section 36, as is present on the actual wordlines (WL) of the first section 32. In other words, the purpose of section 36 is to emulate a total load of “b*m” columns of memory cells 10 on the reference wordline REFWL. It will be noted that the REFWL signal generated by the reference row decoder circuit 64 passes through the second section 46 to the third section 36 without being connected to cells 10 included in section 46.
The section 36 includes “b” sub-sections 38. Each sub-section 38 includes two rows of “m” memory cells 10. All memory cells 10 within the third section 36 either have their true bitlines and complement bitlines connected to a power supply voltage (for example, at node VH) or have them floating. The wordline ports of the memory cells 10 within one of the two rows of the first half of the total “b” sub-sections 38 (i.e., of the first “b/2” sub-sections 38) are coupled to the reference wordline signal generated by the reference row decoder circuit 64 and arriving in section 36 after having passed through the second section 46. This is done to emulate the same propagation delay corresponding to “b*m/2” columns on REFWL as is present on all the WL signals in propagating from row decoder 60 to the middle of section 32. Further, the REFWL signal which has thus reached a point at or about the center of the section 36 is twisted back and returned towards the reference row decoder circuit 64. This returning REFWL signal is connected to the other of the two rows of the first half of the total “b” sub-sections 38 (i.e., of the first “b/2” sub-sections 38), eventually reaching the second section 46 again after experiencing a propagation delay corresponding to travelling across “b*m” columns, the same as that experienced by the signal WL in propagating from row decoder 60 to the column farthest from the row decoder 60 at the end of section 32. The wordline ports of the memory cells 10 in both rows within the other “b/2” sub-sections 38 (i.e., the latter “b/2” sub-sections 38) are coupled to a ground supply voltage (for example, at the node VL). These sub-sections 38 are present in the memory only for maintaining the regularity and rectangular shape of the array of the memory cells 10, and so the memory cells 10 in these sub-sections 38 are permanently deactivated.
The memory 30 further includes a fourth section 40 including a plurality of write timer cells 42 and load cells 44 arranged in a matrix format: “w/m” rows and one column. The write timer cells 42 and load cells 44 each have a configuration similar to a memory cell 10 (like the SRAM cell shown in FIG. 1).
The timer cells 42 are essentially memory (for example, SRAM) cell like elements that are built from the same devices as used by the memory cells 10 in section 32. These timer cells 42 operate to write a logic low “0” data state from the reference true bitline (REFBLT) and reference complement bit line (REFBLC), in response to arrival of a reference wordline (REFWL) signal, into the internal true node “REFIT” (with the data write time being indicative of time required to write data from an actual bitline in the memory cell 10 of section 32). The load cells 44 are elements similar to write timer cells 42, with the difference that their reference wordline (REFWL) ports are grounded, so that they serve to match the load of actual bitlines (BLT/BLC) on REFBLT and REFBLC. The wordlines WL generated in the row decoder circuitry 60 simply pass through this section 40 in order to reach the first section 32.
There are a total of “w/m” write timer cells 42 and load cells 44, in order to emulate same load on the reference true and complement bitlines within section 40 as is present on the true and complement bitlines within first section 32. A certain number of these “w/m” cells are timer cells 42, and the remaining are load cells 44. The internal nodes REFIT and REFIC of the timer cells 42 are connected together in order to improve their load driving capability as well as reduce the statistical variability of write time of the internal nodes REFIT and REFIC, in turn reducing the statistical variability of the write cycle time. Thus, the write timer cells 42 are designed to store data in the latch circuitry (i.e., write a logic “0” on the true node REFIT followed by a rising to logic “1” on the complement node REFIC) with a write time which is substantially the same as that required for the latch circuitry of a selected actual memory cell 10 to have data written in the true and complement nodes during a write operation. The write time (i.e., rate of data storage) of the timer cells 42 is desired to be about the same as the write time of the actual internal latch nodes of the memory cells 10 so that the complement node (REFIC) is able to rise to a logic high level detectable by a detector circuit (such as an inverter circuit) contained within the column circuitry 62 in the same time in which a memory cell 10 with a statistically worst write time is able to have actual data written into it and its latch circuit set accordingly. Multiple write timer cells 42 with their internal nodes REFIT and REFIC shorted together help in improving the load driving capability of the internal nodes and reducing statistical variability of the rise time of REFIC and in turn the cycle time of the write operation (as explained above). 
This detection of REFIC state change is propagated by subsequent logic to generate an end of write cycle reset “WRITERST” signal which triggers the beginning of various internal reset events of the memory such as wordline off, bitline precharge on and write driver off to prepare the memory to receive the next command. Thus, the intention of this operation is to time the start of write cycle reset events inside the memory at an optimum time permitting a certain memory cell 10 with a statistically worst write time in section 32 to be successfully written with data corresponding to its data bit (I/O) in any write cycle.
A more detailed description of memory operation is now provided. Before any write cycle begins, all memory bitlines and the reference bitlines are precharged to logic high (VDD), all memory wordlines (WL) and the reference wordline (REFWL) are driven to logic low (GND) and the timer cells 42 are initialized in the state with REFIT storing logic “1” and REFIC storing logic “0”. At the start of a valid write operation characterized by the “clock” edge when the “chip select” signal is asserted for enabling the memory and the “write enable” signal is asserted for the write operation, a clock generator triggers the internal clock signal at the arrival of the “clock” edge (either rising or falling edge depending on the functionality of the memory). The internal clock signal triggers the following operations (more or less concurrently): a) drive a selected one of the “w/m” wordlines WL (depending on row address) to logic high; b) drive the reference wordline (REFWL) to logic high; c) turn off precharge of the reference bit lines (REFBLT, REFBLC), and turn off precharge of the bit lines (BLT, BLC) of a selected one of the “m” columns in each of “b” bits (depending on column address); d) trigger the write driver circuitry in the column circuitry 62 in each of the “b” bits to drive one of the “m” bit line pairs of the first section 32 in each bit (I/O) (depending on column address) with either logic “1-0” or logic “0-1” based on data to be written onto corresponding bit (as indicated by the input/output circuitry); and e) trigger the reference write driver circuitry of the column circuitry 62 to drive a logic “0” onto the reference bit line true REFBLT node which will eventually lead to a flip of the original data maintained at the internal true and complement nodes (REFIT, REFIC) in the write timer cells 42 (i.e., the logic “1” on REFIT would be flipped to logic “0” and the logic “0” on REFIC would be flipped to logic “1”).
The above operations in turn start the following operations (performed more or less concurrently): a) the rising of the selected wordline and the driving of logic “0-1” or logic “1-0” on to the bit line (BLT, BLC) pairs of the selected column of any bit (I/O) begins the write operation on the memory cell in the selected row and selected column for each bit (in the first section 32); and b) the rising of the reference wordline (REFWL) and the driving of a logic “0” onto the true reference bit line “REFBLT” begins a reference write operation on the multiple write timer cells 42 of the fourth section 40, causing the internal true node “REFIT” to fall to logic “0” and the internal complement node “REFIC” to start rising towards logic “1”.
It will be noted that there is only a single memory cell 10 in each bit (I/O) which is being written by the true and complement bit lines (BLT, BLC), but there are multiple write timer cells 42 in parallel in a column that are being written with an opposite data by the reference true and complement bit lines (REFBLT, REFBLC). Thus, the time period required for the parallel connected latches of the timer cells 42 to change state is expected to be the same as the time required for the latch of a nominal memory cell 10 in any bit (I/O) to change state, because the multiple number of timer cells 42 acting on the internal nodes REFIT and REFIC would reduce the statistical variability of the time taken by the timer cells 42 to change state resulting in a time almost equal to that taken by a nominal memory cell 10. Thus, it will be accurate to say that the time it takes for data to be completely written onto any memory cell 10 of the section 32 is statistically much more variable than what it takes to write data onto the write timer cells 42 connected in parallel in the section 40.
In the memory of FIG. 2, as per the prior art, both the wordline WL and reference wordline REFWL are driven by similarly sized drivers, to a full logic high, while the bit lines (either BLT or BLC depending on data to be written in any bit (I/O)) as well as the true reference bit line REFBLT are driven to full logic low, by similarly sized bitline write drivers of similar fanout. The change in logic state at the internal node REFIC of the write timer cells 42 generates an end of write cycle reset WRITERST signal. The generated WRITERST signal activates the control circuitry of the memory to trigger the beginning of various internal reset events of the memory such as wordline off, bitline precharge on and write driver off to prepare the memory to receive the next command. By this time, the write data of any bit (I/O) is latched by the selected memory cells 10 for the respective bits (I/Os), thus completing the write operation.
It is desirable to have the write timer cells 42 designed and their number chosen such that, in about the same time that a memory cell 10 with a statistically worst write time takes to latch data corresponding to the true and complement bitlines (BLT, BLC), on any process (P), voltage (V) and temperature (T) condition, the multiple timer cells 42 are able to latch data with the reference internal node REFIC rising to a level detectable by a simple detector circuit (such as an inverter) in the column circuitry 62. That way, the rising of the REFIC node can be detected by the column circuitry 62 to generate the end of write cycle reset WRITERST signal at an optimum time for performing the write operation successfully and with best (i.e., least) write cycle time. The WRITERST signal turns off the wordline WL, reference wordline REFWL, write drivers and reference write driver (in the column circuitry 62), precharges the bit lines BLT/BLC and reference bit lines REFBLT/REFBLC, and resets the write timer cells 42 and internal clock generator. A new write operation may then be initiated.
Reference is now made to FIG. 3 which presents a timing diagram illustrating the write operation. From FIG. 3, it can be observed that in order to design a robust memory (i.e., a memory that also yields well under corner case conditions), it is important to tune the delay period “TREFWRITE” (measuring the delay from initiation of the reference write operation to completion of state change for the reference internal node REFIC) in such a way that a write to a memory cell with a statistically worst write time equal to “TWRITE” (measuring the delay from initiation of the array write operation to completion of state change for the internal nodes 16 and 18) is able to be completed before the signal WRITERST is generated (and the wordline and bit lines are reset). The delay “TREFWRITE_WRRST” measures the delay between completion of state change for the reference internal node REFIC and the active state of the WRITERST signal. Thus, if “TWRITE” is the write time of a memory cell with a statistically worst write time (indicated by completion of change in the internal true and complement nodes 16 and 18), and if “TREFWRITE” is the write time of the “n” write timer cells 42 connected in parallel and included within the section 40, it would be ideal to have “TWRITE” and “TREFWRITE” have substantially the same value across different process (P), voltage (V) and temperature (T) conditions. The value of “TREFWRITE” should preferably be such that, for any P, V and T condition, the time it takes for the write timer cells 42 to flip, causing the internal complement node REFIC to rise beyond a level which is detected by a detector circuit (such as a simple inverter) to generate the WRITERST signal and terminate the write operation, is always longer than, but as close as possible to, the time TWRITE required for the completion of a write to a memory cell 10 with a statistically worst write time.
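The timing relationship described above may be expressed as a simple check, offered here only as a minimal sketch. The function names are hypothetical, and the sketch assumes for simplicity that the array write and the reference write are initiated at the same instant, whereas the description states only that they begin more or less concurrently. All times are in the same arbitrary unit.

```python
def write_reset_margin(t_write, t_refwrite, t_refwrite_wrrst):
    """Time between completion of the worst-case cell write and WRITERST.

    t_write          -- TWRITE, write time of the statistically worst cell
    t_refwrite       -- TREFWRITE, write time of the parallel timer cells 42
    t_refwrite_wrrst -- TREFWRITE_WRRST, delay from REFIC flip to WRITERST

    Assumption of this sketch: both writes start at time zero.
    """
    t_writerst = t_refwrite + t_refwrite_wrrst
    return t_writerst - t_write


def is_write_safe(t_write, t_refwrite, t_refwrite_wrrst):
    """True if the worst-case cell is written before WRITERST is asserted."""
    return write_reset_margin(t_write, t_refwrite, t_refwrite_wrrst) >= 0


# Hypothetical values in picoseconds, chosen only to exercise the check.
print(write_reset_margin(500, 450, 100))
print(is_write_safe(500, 350, 100))
```

A positive margin indicates a successful write. A large positive margin indicates write cycle time that is longer than necessary, which is why the margin is preferably kept small but never negative.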
The issue, however, is that the statistically worst write time (TWRITE, which is to be qualified in the design) for a worst case memory cell is primarily a function of the pull up device in that memory cell 10, whose threshold voltage “vtpu” is much higher than the threshold voltage of the pullup device of a nominal memory cell 10. By contrast, the time it takes for the write timer cells 42 to change state is a function of the presence of multiple write timer cells in the section 40, which results in a reduction of their statistical variability (i.e., the standard deviation of the equivalent threshold voltage of “n” MOSFET devices connected in parallel equals “sigma/√n”, where “sigma” is the standard deviation of the threshold voltage of a single device, such as that of a single memory cell 10). Thus, with “n” MOSFET devices connected in parallel, the overall threshold voltage within the section 40 for the timer cells 42 is nearly the same as that of a pullup device having the nominal threshold voltage, and hence is not representative of the memory cell 10 including a pullup device with a statistically worst threshold voltage and hence the worst write time. Because of this difference in threshold voltages between the pullup device of the memory cell 10 with a statistically worst write time and the equivalent threshold voltage of the parallel connected pullup devices of the multiple timer cells 42, the voltage scaling characteristics of their write times are very different.
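The 1/√n reduction in standard deviation noted above can be checked numerically. The following is a minimal sketch under stated assumptions: the equivalent threshold voltage of “n” parallel devices is modeled as the arithmetic mean of “n” independent, normally distributed single-device threshold voltage deviations, which is a simplification and not a device-level model. The function names and the numeric values are hypothetical.

```python
import math
import random
import statistics


def equivalent_sigma(sigma_single, n):
    """Analytic standard deviation of the mean of n independent devices."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sigma_single / math.sqrt(n)


def simulated_equivalent_sigma(sigma_single, n, trials=20000, seed=0):
    """Monte Carlo estimate of the same quantity.

    Assumption: each device's threshold voltage deviation is an independent
    Gaussian with zero mean and standard deviation sigma_single.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        total = sum(rng.gauss(0.0, sigma_single) for _ in range(n))
        means.append(total / n)
    return statistics.pstdev(means)


# Hypothetical single-device sigma of 30 mV and n = 16 timer cells.
print(equivalent_sigma(30.0, 16))
print(simulated_equivalent_sigma(30.0, 16))
```

With sixteen timer cells the equivalent spread is one quarter of the single-device spread, so the timer column behaves much like a nominal device, whereas the statistically worst memory cell 10 sits several single-device sigma away from nominal.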
To address this issue, the prior art teaches the selective coupling of capacitive loads on the internal reference complement node REFIC as a function of operating voltage. For example, different capacitive loads, having capacitance values of C1 and C2 (where C1>C2) are connected to the internal reference complement node REFIC through two passgates whose control terminals receive a control signal (say LV). The passgates respond to the control signal, whose logic value is a function of operating voltage, by connecting the C1 load to REFIC when a lower operating voltage is used, and conversely connecting the C2 load to REFIC when a higher operating voltage is used.
The prior art further teaches a solution which instead selectively couples different logic delays with respect to the propagation of the WRITERST signal as a function of operating voltage. For example, different timing delays, having values of D1 and D2 (where D1>D2) are connected between the generation of the WRITERST signal and its control over downstream reset operations. The delays are selectable in response to a control signal (LV), whose logic value is a function of operating voltage, by connecting the D1 delay when a lower operating voltage is used, and conversely connecting the D2 delay when a higher operating voltage is used.
Limitations of these prior art solutions include: a more complicated system design resulting from having to generate and process the low voltage control signal input (LV) based on operating voltage; and voltage scaling of memory write cycle time that is not seamless across the entire operating voltage range because there will be an abrupt change in write cycle performance when the operating voltage changes across the threshold point for low voltage control signal (LV) and the low voltage control signal transitions in response thereto.
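The abrupt change in performance at the switching point can be illustrated with a minimal sketch of the two-setting delay selection described above. The switching voltage “v_switch”, the function name and the numeric values are hypothetical and are used only for illustration; the description states only that the logic value of LV is a function of operating voltage.

```python
def select_reset_delay(vdd, v_switch, d1, d2):
    """Return the WRITERST delay selected by the LV control signal.

    d1 is the longer delay used at lower operating voltage and d2 the
    shorter delay used at higher operating voltage (D1 > D2).
    Assumption of this sketch: LV is asserted when vdd is below v_switch.
    """
    if d1 <= d2:
        raise ValueError("D1 must be greater than D2")
    lv_asserted = vdd < v_switch
    return d1 if lv_asserted else d2


# Sweep the supply across a hypothetical 0.80 V switching point.
for millivolts in (700, 790, 800, 900):
    vdd = millivolts / 1000
    print(vdd, select_reset_delay(vdd, v_switch=0.80, d1=200, d2=100))
```

The selected delay takes only two values across the whole supply range, so the write cycle time steps discontinuously at the switching point rather than scaling smoothly with operating voltage.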
More generally speaking, the prior art solutions consider two options.
In a first option, the designer may decide the capacitance on the internal complement node REFIC and the logic delay subsequent to asserting a change of state in the timer cells 42 based on the required “wordline pulse width” for a statistically worst memory cell 10 with respect to write time, at the minimum required memory functional operating voltage, and live with the same setting on other voltages within the desired range of operating voltages.
In a second option, the designer may decide the parameters as in the first option at multiple voltage points for change in state of the internal nodes across the required memory functional voltage range, and then tune those parameters to achieve a required “wordline pulse width” at the lowest voltage point of each decided voltage range and control the selection of a respective setting with control pins (LV) required to be asserted/deasserted corresponding to the voltage range of operation at any point of time.
The reason for the difference in voltage scaling characteristics mentioned above is that the current of a MOSFET as a function of operating voltage is such that the change in current is much greater with change in voltage when the operating voltage of the memory is nearer to the transistor threshold voltage, as compared to when the operating voltage of the memory is much higher than the transistor threshold voltage. Consider an operating voltage that is near the threshold voltage and the design of the write timer cells 42 so as to ensure qualification of the memory with respect to worst memory cell write time by, in accordance with the prior art techniques discussed above, slowing down operation of the write timer cells (for example, by loading the internal complement node REFIC with some capacitance) or alternatively introducing a logic delay with respect to the signal WRITERST. If the write timer cells 42 are then operated at a voltage that is much higher than the threshold voltage (nominal operation), those skilled in the art will recognize that the same write timer cells 42 would exhibit a much slower write time than is needed at that higher voltage, thereby allowing a much longer write time than is necessary for the memory cell with a statistically worst write time. In other words, the performance (write cycle time) will be sub-optimal at higher operating voltages. The amount of extra margin that is undesirably introduced at those higher operating voltages increases as the designed-to minimum operating voltage is lowered.
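The voltage dependence described above may be illustrated with a rough numerical sketch. The description does not name a device model; the sketch below assumes the commonly used alpha-power law, in which drive current is proportional to (V - Vt) raised to a power alpha, and assumes that write delay is proportional to V divided by that current. The function names, the exponent and the threshold voltage values are all hypothetical.

```python
def drive_current(vdd, vt, alpha=1.3):
    """Relative drive current under an assumed alpha-power law model."""
    overdrive = vdd - vt
    if overdrive <= 0:
        raise ValueError("supply must exceed the threshold voltage")
    return overdrive ** alpha


def relative_delay(vdd, vt, alpha=1.3):
    """Relative switching delay, assumed proportional to V / I."""
    return vdd / drive_current(vdd, vt, alpha)


def worst_to_nominal_ratio(vdd, vt_nominal, vt_worst, alpha=1.3):
    """Ratio of worst-case cell delay to nominal (timer-like) delay."""
    worst = relative_delay(vdd, vt_worst, alpha)
    nominal = relative_delay(vdd, vt_nominal, alpha)
    return worst / nominal


# Hypothetical thresholds: 0.40 V nominal, 0.55 V statistically worst.
for vdd in (1.2, 1.0, 0.8, 0.7):
    print(vdd, round(worst_to_nominal_ratio(vdd, 0.40, 0.55), 2))
```

Under these assumptions the ratio grows as the supply approaches the threshold voltage. A fixed slowdown of the timer cells that is sized to cover the ratio at the minimum operating voltage is therefore larger than required at nominal voltage, which is the source of the excess margin discussed above.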
A need exists in the art to address the foregoing problems with respect to self-timed memory operation over a wide range of supply voltages. Such a memory will support optimum write cycle time in the nominal (higher) voltage range required during high frequency operations while still remaining functional for write with a lower operating voltage, without any control signal requirements from the system, even though operating frequency may be lower at the lower operating voltage.