Reference is made to FIG. 1 which shows a schematic diagram of a standard six transistor (6T) static random access memory (SRAM) cell 10. The cell 10 includes two cross-coupled CMOS inverters 12 and 14, each inverter including a series-connected p-channel and n-channel transistor pair. The inputs and outputs of the inverters 12 and 14 are coupled to form a latch circuit having an internal true latch node 16 and an internal complement latch node 18. The cell 10 further includes two transfer (passgate) transistors 20 and 22 whose gate terminals are coupled to a wordline node and are controlled by the signal present on the wordline (WL). Transistor 20 is source-drain connected between the true latch node 16 and a node associated with a true bitline (BLT). Transistor 22 is source-drain connected between the complement latch node 18 and a node associated with a complement bitline (BLC). The source terminals of the p-channel transistors in each inverter 12 and 14 are coupled to receive a high supply voltage (for example, VDD) at a high voltage node VH, while the source terminals of the n-channel transistors in each inverter 12 and 14 are coupled to receive a low supply voltage (for example, GND) at a low voltage node VL. The high voltage VDD at the node VH and the low voltage GND at the node VL comprise the power supply set of voltages for the cell 10.
In an integrated circuit including the SRAM cell 10, this power supply set of voltages may be received at pins of the integrated circuit, or may instead be generated on chip by a voltage regulator circuit which receives some other set of voltages from the pins of the chip. The power supply set of voltages at the nodes VH and VL are conventionally applied to the SRAM cell 10 at all times that the cell/integrated circuit is operational. It will be recognized that separate low voltage values at node VL may be provided for the sources of the n-channel MOS transistors in the inverters 12 and 14 while separate high voltage values at node VH may be provided for the sources of the p-channel MOS transistors in the inverters 12 and 14.
The reference above to a six transistor SRAM cell 10 of FIG. 1 for use as the data storage element is made by way of example only, it being understood by those skilled in the art that the cell 10 could alternatively comprise a different data storage element. The use of the term SRAM cell 10 will accordingly be understood to refer to any suitable memory cell or data storage element, with the circuitry, functionality and operations presented herein in the exemplary context of a six transistor SRAM cell.
Reference is now made to FIG. 2 which shows a block diagram of a self-timed memory 30, for example of the static random access memory (SRAM) type using memory cells 10 shown in FIG. 1, with “w” words and “b” bits organized as a column mux of “m”. Those skilled in the art understand that self-timed memories need to support a high dynamic operating voltage range. In other words, these memories need to be functional over a wide range of supply voltages, from a very high operating voltage down to a very low operating voltage. In most cases, in the low operating voltage range, it is considered acceptable if the memory achieves a lower performance (i.e., it is slower). In the nominal operating voltage range, the memory needs to support a higher performance (i.e., it needs to be faster).
The memory 30 includes a first section 32 comprising a plurality of memory cells 10 arranged in a matrix/array format and which function to store data. The first section 32 includes “b” sub-sections 34 corresponding to the “b” bits per word stored by the memory. The first section 32 is arranged to store “w” words and is organized as a column mux of “m”. Thus, each of the “b” sub-sections 34 is organized in “w/m” rows with “m” columns in each row. In the first section 32, all cells 10 in a same row share a common wordline (WL) coupled to an output of a row decoder circuit 60 (well known to those skilled in the art), and all cells 10 in a same column share a common true bitline (BLT) and a common complement bitline (BLC) coupled to column circuitry 62 (which includes bitline precharge and equalization circuitry, column mux circuitry, write driver circuitry, column address decoder circuitry and input/output circuitry, each of which is well known to those skilled in the art).
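By way of illustration only, the organization of the first section 32 described above may be sketched in Python as follows. The class name ArrayGeometry, its method names, and in particular the mapping of a word address onto a row and a mux column (low-order part selecting the column) are assumptions made for this sketch; the description above does not specify how addresses are split.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ArrayGeometry:
    """Hypothetical model of a memory with w words, b bits and column mux m."""
    words: int  # "w"
    bits: int   # "b"
    mux: int    # "m"

    def __post_init__(self) -> None:
        if self.words % self.mux != 0:
            raise ValueError("w must be a multiple of m")

    @property
    def rows(self) -> int:
        # Each sub-section has w/m rows.
        return self.words // self.mux

    @property
    def columns(self) -> int:
        # b sub-sections of m columns each share every wordline.
        return self.bits * self.mux

    def decode(self, address: int) -> Tuple[int, int]:
        # ASSUMED mapping: row = address // m, mux column = address % m.
        if not 0 <= address < self.words:
            raise ValueError("address out of range")
        return address // self.mux, address % self.mux

    def physical_columns(self, address: int) -> List[int]:
        # One column is selected in each of the b sub-sections.
        _, col = self.decode(address)
        return [bit * self.mux + col for bit in range(self.bits)]
```

For example, with w = 1024, b = 32 and m = 4, this model gives 256 rows and 128 columns per row, and a write selects one row and one column in each of the 32 sub-sections.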
Before beginning a write operation, the wordlines are driven low by the row decoder circuitry 60, and the true bitlines and complement bitlines are driven high. To write data to the first section 32, the wordline of the row selected according to the row address is driven high by the row decoder circuitry 60 and a column is selected in each sub-section 34 by the column address decoder and column mux in the column circuitry 62 based on the column address to connect the selected column's true bitline and complement bitline to the input/output circuitry (which, for example, will typically utilize bitline write drivers). At this point both the true bitline and complement bitline of the selected column in each sub-section 34 are made floating by the precharge and equalization logic in the column circuitry 62. In dependence on the bits of the dataword being written, one of the true bitline and complement bitline in the selected column of each sub-section 34 is driven low by the bitline write driver circuitry. The bitline voltages are transferred to the corresponding internal true latch node 16 and complement latch node 18 of the memory cell 10 in the selected row and column in the sub-section 34 so as to write and store the proper data state.
The memory 30 includes a second section 46 including a plurality of memory cells 10 arranged in a matrix format, but these cells do not function to store data. Indeed, these cells are only required, if desired, in order to have a regular layout of the memory array. The wordline ports of the memory cells 10 in the rows of this section are connected to the ground reference voltage (GND).
The memory 30 includes a third section 36 including a plurality of memory cells 10 arranged in a matrix format, and these cells also do not function to store data. Rather, these cells in the third section 36 are used to emulate the same load on a reference wordline (REFWL), which is coupled to a reference row decoder 64, as is present on the actual wordlines (WL) of the first section 32. In other words, the purpose of section 36 is to emulate a total load of “b*m” columns of memory cells 10 on the reference wordline REFWL.
The section 36 includes “b” sub-sections 38. Each sub-section 38 includes two rows of “m” memory cells 10. All memory cells 10 within the third section 36 either have their true bitlines and complement bitlines connected to a power supply voltage (for example, at node VH) or have them floating. The wordline ports of the memory cells 10 within one of the two rows of the first half of the total “b” sub-sections 38 (i.e., of the first “b/2” sub-sections 38) are coupled to the reference wordline signal generated by the reference row decoder circuit 64 (the reference wordline arriving in section 36 after having passed through the second section 46). This is done to emulate the same propagation delay corresponding to “b*m/2” columns on REFWL as is present on all the WL signals in propagating from row decoder 60 to the middle of section 32. Further, the REFWL signal, having thus reached a point at or about the center of the section 36, is twisted back and returned towards reference row decoder circuit 64. This returning REFWL signal is connected to the other of the two rows of the first half of the total “b” sub-sections 38 (i.e., of the first “b/2” sub-sections 38), eventually reaching the second section 46 (for connection to the wordline ports of the cells 10 therein) after experiencing a propagation delay corresponding to travelling across “b*m” columns, the same as that experienced by the signal WL in propagating from row decoder 60 to the column farthest from the row decoder 60 at the end of section 32 in one row of memory cells.
The reference wordline of the memory cells 10 in both rows within other “b/2” sub-sections 38 is coupled to a ground supply voltage (at the node VL) because these sub-sections 38 are present in the memory only for maintaining regularity and rectangular shape of the array of the memory cells 10, and so the memory cells 10 in these sub-sections 38 are deactivated permanently by connecting their wordline ports to a ground supply voltage (for example, at the node VL).
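The column count seen by the reference wordline in the third section 36 can be checked with a short calculation. The following Python sketch is illustrative only; the function name refwl_emulated_columns is hypothetical, and the sketch assumes that “b” is even so that exactly “b/2” sub-sections 38 are active.

```python
from typing import Tuple


def refwl_emulated_columns(bits: int, mux: int) -> Tuple[int, int]:
    """Return (columns at the turn-back point, total columns loaded on REFWL).

    ASSUMPTION: b is even, so the first b/2 sub-sections carry REFWL and the
    remaining b/2 sub-sections have their wordline ports tied to ground.
    """
    if bits % 2 != 0:
        raise ValueError("this sketch assumes an even number of bits")
    active_subsections = bits // 2
    outbound = active_subsections * mux    # first row: decoder to array center
    returning = active_subsections * mux   # second row: center back to decoder
    return outbound, outbound + returning
```

With b = 32 and m = 4, the outbound run covers 64 columns (b*m/2) and the full round trip covers 128 columns (b*m), matching the load on an actual wordline WL spanning the first section 32.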
The memory 30 further includes a fourth section 40 including a plurality of write timer cells 42 and load cells 44 arranged in a matrix format: “w/m” rows and one column. The write timer cells 42 and load cells 44 each have a configuration similar to a memory cell 10 (like the SRAM cell shown in FIG. 1). The REFWL signal further propagates into fourth section 40 for connection to the wordline ports of the included timer cells 42 and load cells 44.
The timer cells 42 are essentially memory cell like elements that are built from the same devices as used by the memory cells 10 in section 32 (for example, also SRAM type cells). These timer cells 42 operate to write a selected logic value data state (for example, logic low) from the reference true bitline (REFBLT) into the internal true node “REFIT” in response to arrival of the reference wordline (REFWL) signal to the wordline ports. The data write time for this operation to be completed is indicative of the time required to write data from an actual bitline in the memory cell 10 of section 32 to the internal true latch node 16 (also referred to as the REFIT node). The change in logic state at the internal complement latch node 18 (also referred to as the REFIC node) or reference complement bitline (REFBLC) coupled thereto may be sensed to detect an end of a write cycle for the memory.
The load cells 44 are elements similar to write timer cells 42, with the difference that their reference wordline (REFWL) ports are grounded, so that they serve to match the load of actual bitlines (BLT/BLC) on REFBLT and REFBLC.
The wordlines WL generated in the row decoder circuitry 60 simply pass through this section 40 in order to reach the first section 32.
There are a total of “w/m” write timer cells 42 and load cells 44, in order to emulate the same load on the reference true and complement bitlines within section 40 as is present on the true and complement bitlines within first section 32. A certain number of these “w/m” cells are timer cells 42, and the remainder are load cells 44. The internal latch nodes REFIT and REFIC of the timer cells 42 are respectively connected together in order to improve their load driving capability as well as reduce the statistical variability of write time of the internal nodes REFIT and REFIC, in turn reducing the statistical variability of the write cycle time. Thus, the write timer cells 42 are designed to store data in the latch circuitry (i.e., write a logic “0” on the true latch node 16 REFIT followed by a rising to logic “1” on the complement latch node 18 REFIC) with a write time which is substantially the same as that required for the latch circuitry of a selected actual memory cell 10 to have data written in the true and complement nodes during a write operation. The write time (i.e., rate of data storage) of the timer cells 42 is desired to be about the same as the write time of the actual internal latch nodes of the memory cells 10 so that the complement latch node (REFIC) is able to rise to a logic high level detectable by a detector circuit (such as an inverter circuit) contained within the column circuitry 62 in the same time in which a memory cell 10 with a statistically worst write time is able to have actual data written into it and its latch circuit set accordingly. Multiple write timer cells 42 with their internal latch nodes REFIT and REFIC shorted together help in improving the load driving capability of the internal nodes and reducing statistical variability of the rise time of REFIC and in turn the cycle time of the write operation (as explained above).
This detection of REFIC state change is propagated by subsequent logic to generate an end of write cycle reset “WRITERST” signal which triggers in the control circuit the beginning of various internal reset events of the memory such as wordline off, bitline precharge on and write driver off to prepare the memory to receive the next command. Thus, the intention of this operation is to time the start of write cycle reset events inside the memory at an optimum time permitting a certain memory cell 10 with a statistically worst write time in section 32 to be successfully written with data corresponding to its data bit (I/O) in any write cycle.
A more detailed description of memory operation is now provided. Before any write cycle begins, all memory bitlines and the reference bitlines are precharged to logic high (VDD), all memory wordlines (WL) and the reference wordline (REFWL) are driven to logic low (GND) and the timer cells 42 are initialized in the state with REFIT storing logic “1” and REFIC storing logic “0”. At the start of a valid write operation characterized by the “clock” edge when the “chip select” signal is asserted for enabling the memory and the “write enable” signal is asserted for the write operation, a clock generator triggers the internal clock signal at the arrival of the “clock” edge (either rising or falling edge depending on the functionality of the memory). The internal clock signal triggers the following operations (more or less concurrently): a) drive a selected one of the “w/m” wordlines WL (depending on row address) to logic high; b) drive the reference wordline (REFWL) to logic high; c) turn off precharge of the reference bit lines (REFBLT, REFBLC), and turn off precharge of the bit lines (BLT, BLC) of a selected one of the “m” columns in each of “b” bits (depending on column address); d) trigger the write driver circuitry in the column circuitry 62 in each of the “b” bits to drive one of the “m” bit line pairs of the first section 32 in each bit (I/O) (depending on column address) with either logic “1-0” or logic “0-1” based on data to be written onto corresponding bit (as indicated by the input/output circuitry); and e) trigger the reference write driver circuitry of the column circuitry 62 to drive a logic “0” onto the reference bit line true REFBLT node which will eventually lead to a flip of the original data maintained at the internal true and complement nodes (REFIT, REFIC) in the write timer cells 42 (i.e., the logic “1” on REFIT would be flipped to logic “0” and the logic “0” on REFIC would be flipped to logic “1”).
The above operations in turn start the following operations (performed more or less concurrently): a) the rising of the selected wordline and the driving of logic “0-1” or logic “1-0” onto the bit line (BLT, BLC) pairs of the selected column of any bit (I/O) begins the write operation on the memory cell in the selected row and selected column for each bit (in the first section 32); and b) the rising of reference wordline (REFWL) and driving of a logic “0” onto the true reference bit line “REFBLT” begins a reference write operation on the multiple write timer cells 42 of the fourth section 40, causing the internal true node “REFIT” to fall to logic “0” and the internal complement node “REFIC” to start rising towards logic “1”.
It will be noted that there is only a single memory cell 10 in each bit (I/O) which is being written by the true and complement bit lines (BLT, BLC), but there are multiple write timer cells 42 in parallel in a column that are being written with an opposite data by the reference true and complement bit lines (REFBLT, REFBLC). Thus, the time period required for the parallel connected latches of the timer cells 42 to change state is expected to be the same as the time required for the latch of a nominal memory cell 10 in any bit (I/O) to change state, because the multiple number of timer cells 42 acting on the internal nodes REFIT and REFIC would reduce the statistical variability of the time taken by the timer cells 42 to change state resulting in a time almost equal to that taken by a nominal memory cell 10. Thus, it will be accurate to say that the time it takes for data to be completely written onto any memory cell 10 of the section 32 is statistically much more variable than what it takes to write data onto the write timer cells 42 connected in parallel in the section 40.
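The effect of paralleling several write timer cells 42 on statistical variability can be illustrated with a small Monte Carlo sketch in Python. This is a deliberately simplified model and not a circuit simulation: it assumes that each cell's write time is an independent Gaussian sample and that “n” cells with shorted internal nodes behave like the average of “n” such samples. The function name and the default mean and sigma values are hypothetical.

```python
import random
import statistics


def timer_write_time_spread(n_cells: int, trials: int = 5000,
                            mean: float = 1.0, sigma: float = 0.1,
                            seed: int = 0) -> float:
    """Standard deviation of the modeled write time of n paralleled cells.

    ASSUMPTION: the shorted cells are approximated by the average of n
    independent Gaussian write times (arbitrary time units).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        total = sum(rng.gauss(mean, sigma) for _ in range(n_cells))
        samples.append(total / n_cells)
    return statistics.pstdev(samples)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, round(timer_write_time_spread(n), 4))
```

Under these assumptions the spread shrinks roughly as one over the square root of “n”, so the modeled write time of the paralleled timer cells clusters around the nominal value while a single memory cell 10 retains the full spread.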
In the memory of FIG. 2, both the wordline WL and reference wordline REFWL are driven by similarly sized drivers, to a full logic high, while the bit lines (either BLT or BLC depending on data to be written in any bit (I/O)) as well as the true reference bit line REFBLT are driven to full logic low, by similarly sized bitline write drivers of similar fanout. The change in logic state at the internal latch node REFIC of the write timer cells 42 generates an end of write cycle reset WRITERST signal. The generated WRITERST signal activates the control circuitry of the memory to trigger the beginning of various internal reset events of the memory such as wordline off, bitline precharge on and write driver off to prepare the memory to receive the next command. By this time, the write data of any bit (I/O) is latched by the selected memory cells 10 for the respective bits (I/Os), thus completing the write operation.
It is desirable to have the write timer cells 42 designed and their number chosen such that, in about the same time that a memory cell 10 with a statistically worst write time takes to latch data corresponding to the true and complement bitlines (BLT, BLC), on any process (P), voltage (V) and temperature (T) condition, the multiple timer cells 42 are able to latch data with the reference internal latch node REFIC rising to a level detectable by a simple detector circuit (such as an inverter) in the column circuitry 62. That way, the rising of the REFIC node can be detected by the column circuitry 62 to generate the end of write cycle reset WRITERST signal at an optimum time for performing the write operation successfully and with best (i.e., least) write cycle time. The WRITERST signal turns off the wordline WL, reference wordline REFWL, write drivers and reference write driver (in the column circuitry 62), precharges the bit lines BLT/BLC and reference bit lines REFBLT/REFBLC, and resets the write timer cells 42 and internal clock generator. A new write operation may then be initiated at the end of this write cycle.
It is accordingly important to ensure that the entire delay generated by the write self-timing logic described is tuned in order to guarantee a successful write operation.
Reference is now made to FIG. 3 which presents a timing diagram illustrating one cycle of the write operation. From FIG. 3, it can be observed that in order to design a robust memory (i.e., a memory that yields well even under corner case conditions), it is important to tune the delay period “TREFWRITE” (measuring the delay from initiation of the reference write operation to completion of state change for the reference internal node REFIC) in such a way that a write to a memory cell with a statistically worst write time equal to “TWRITE” (measuring the delay from initiation of the array write operation after delay t1 to completion of state change for the internal nodes 16 and 18) is able to be completed before the signal WRITERST is generated (and the wordline and bit lines are reset). The delay “TREFWRITE_WRRST” measures the delay between completion of state change for the reference internal node REFIC and the active state of the WRITERST signal. Thus, if “TWRITE” is the write time of a memory cell with a statistically worst write time (indicated by completion of change in the internal true and complement latch nodes 16 and 18), and if “TREFWRITE” is the write time of the “n” write timer cells 42 connected in parallel and included within the section 40, it would be ideal to have “TWRITE” and “TREFWRITE” have substantially the same value across different process (P), voltage (V) and temperature (T) conditions. The value of “TREFWRITE” should preferably be such that for any P, V and T condition, the time it takes for the write timer cell 42 to flip, causing the internal complement node REFIC to rise beyond a level which is detected by a detector circuit (such as a simple inverter) to generate the WRITERST signal and terminate the write operation, is always longer than, but as close as possible to, the time TWRITE required for the completion of a write to a memory cell 10 with a statistically worst write time.
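The timing requirement described with respect to FIG. 3 may be expressed as a margin check per P, V and T corner. The Python sketch below is illustrative only. It assumes that the reference write operation and the delay t1 both start from the same internal clock event, so that the array write completes at t1 + TWRITE and the WRITERST signal becomes active at TREFWRITE + TREFWRITE_WRRST. The type and function names are hypothetical, and any numbers supplied to it are made-up values rather than measured data.

```python
from typing import Dict, List, NamedTuple


class CornerTiming(NamedTuple):
    """Delays for one PVT corner, in picoseconds (hypothetical values)."""
    t1: int
    twrite: int
    trefwrite: int
    trefwrite_wrrst: int


def write_margin(c: CornerTiming) -> int:
    # ASSUMPTION: both paths are timed from the same internal clock event.
    writerst_time = c.trefwrite + c.trefwrite_wrrst
    array_write_done = c.t1 + c.twrite
    return writerst_time - array_write_done


def failing_corners(corners: Dict[str, CornerTiming]) -> List[str]:
    # A negative margin means WRITERST fires before the worst cell is written.
    return [name for name, c in corners.items() if write_margin(c) < 0]
```

A positive but small margin at every corner corresponds to the stated goal of a reset that is always later than, but as close as possible to, completion of the worst-case array write.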
Reference is now made to FIG. 4 which illustrates the phases of the write operation presented in connection with the writing of a logic low to the internal true latch (REFIT) node 16 and a logic high to the internal complement latch (REFIC) node 18. The issue is how long the write cycle needs to last in order to ensure proper data retention in the latch of the memory cell.
The first phase of operation 100, referred to as the retention risk phase, starts when the internal true node 16 starts to fall and the internal complement node 18 starts to rise. The write cycle must last longer than the first phase, because the wordline WL cannot be turned off and the bitlines cannot be precharged during this first phase time period without risking leading the internal true node 16 back to logic high and the internal complement node 18 back to logic low.
The second phase of operation 102, referred to as the write consolidation phase, occurs as the internal true node 16 has fallen almost completely to logic low and the internal complement node 18 continues to rise. Again, the write cycle must last longer than the second phase: the wordline WL cannot be turned off and the bitlines cannot be precharged during this second phase time period, because doing so would delay the rise of the internal complement node 18, since the complement bitline (BLC) is still supporting that rise through the connected passgate transistor.
The third phase of operation 104, referred to as the read stability risk phase, occurs as the internal complement node 18 continues to rise. At this point, the wordline WL may be turned off and bitline precharge may be turned on because the passgate connected to the complement bitline (BLC) is no longer needed to support rise of the internal complement node 18. However, the write cycle must still last longer than the third phase time period, because the same wordline cannot be turned on again for a read operation on any column of that row since the internal complement node 18 has not risen to a high enough level so as to ensure sufficient drive in the pull down device of the latch. There is accordingly a stability issue.
The fourth phase of operation 106, referred to as the next read operation risk, occurs as the internal complement node 18 continues to rise. At this point a next read operation may be performed on any column in the row, except for the column on which the write operation was performed. Thus, for operations on other columns in the row on which the write operation was performed, the write cycle is complete, but for operations on the same column in that row, the write cycle is not complete. The reason for this is that the internal complement node 18 is continuing to rise, and thus the gate connected pull down device of the latch is still weak. This leads to read current degradation on the associated column of cells, requiring that the write cycle not be complete until the end of the fourth phase time period.
With respect to the phases of operation illustrated in FIG. 4, those skilled in the art will recognize that write cycle time for the memory is dictated by the rise time of the node on which a logic high value is being written. There is a need in the art to improve that write cycle time and especially address the next read operation risk.
One solution known in the prior art to improve rise time of the node on which a logic high value is being written uses circuitry to pull the corresponding bitline (the complement bitline (BLC) in the example above) to a negative voltage level instead of to logic low. The effect of the negative voltage applied to the bitline is to increase the gate drive of the passgate transistor on the true side of the latch which will lead to a corresponding faster rise in the internal latch node (internal true node 16 in the example above).
This solution has drawbacks. The solution requires the extra circuit overhead of a negative power supply and circuits to support selective application to the proper bit line during write mode. If the solution is implemented internally using a capacitance, there is a limited performance gain because of the two-step process needed to generate a negative voltage. Additionally, there is a device reliability concern with respect to this solution at higher supply voltages because of the resulting overdrive on some transistors.
Another solution is to overdrive the wordline to a voltage higher than VH. This results in the bitline supporting a rise on the internal node through the passgate transistor to a higher voltage value than earlier.
This solution has drawbacks. The solution requires the extra circuit overhead of an overdrive voltage supply (either externally or internally generated using capacitance). There is also a concern that this will introduce an instability on unselected columns whose cells also receive the overdriven wordline voltage. Additionally, there is a device reliability concern with respect to this solution at higher supply voltages due to the application of the overdrive voltage on some transistors within the circuit.
Another solution implements a modification to the cell 10 to include additional pullup devices for the latch circuitry. These additional pullup devices provide additional drive support during the write operation to pull the internal latch node towards logic high. These extra devices are actuated only for the selected row and selected columns of the memory which are subject to the write.
The solution has drawbacks. There is only a limited improvement in performance because of extra load being applied on the wordlines and bitlines by the additional pullup devices. Also, these devices contribute to an increase in capacitance on the internal latch nodes. Additionally, the use of additional pullup devices is made at the expense of additional area overhead for the cell 10, and this can be a significant concern in medium to high capacity memory array configurations.
With reference once again to FIG. 4 and in particular to the fourth phase of operation 106 (next read operation risk), this waiting time after wordline turn off is required to ensure there is no degradation with respect to read current. However, this waiting time represents a significant portion of the overall write cycle time, and if this time could be reduced or eliminated there would be a significant improvement in write cycle time.
To address this issue, one proposed solution is to store the data being written to the memory in a parallel manner during the write cycle in a separate storage element distinct from the addressed cell (perhaps provided within the memory input/output circuit of the column circuitry as shown in FIG. 2). This parallel write is made using a current write cycle, and the write cycle time may be said to terminate at the end of the third phase of operation 104 (the read stability risk phase).
In order to alleviate the risks associated with lower read current, an address comparison is performed during the next read cycle which immediately follows the previous write cycle. If there is a match in the asserted write and read addresses, the read operation is made from the separate data storage element of the input/output circuit (instead of from the memory array itself). Because the read is being made from the separate storage element of the input/output circuit, the reduction in read current within the addressed memory cells is immaterial. Thus, in such a case, the subsequent read may be made immediately after the completion of the third phase of operation 104 (the read stability risk phase). In all other memory access scenarios, the fourth phase of operation 106 (next read operation risk) is permitted to complete before a next read occurs.
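A purely behavioral Python sketch of this single-comparison scheme is given below for illustration. It models only the address compare and the choice of data source, not the electrical behavior of the cells; the class name ForwardingMemory, its attributes, and the returned source labels are hypothetical.

```python
from typing import Dict, Optional, Tuple


class ForwardingMemory:
    """Behavioral model: a read right after a write to the same address is
    served from a separate storage element instead of from the array."""

    def __init__(self) -> None:
        self.array: Dict[int, int] = {}
        self.last_write: Optional[Tuple[int, int]] = None  # (address, data)
        self.prev_cycle_was_write = False

    def write(self, address: int, data: int) -> None:
        # Data is stored in the array and, in parallel, in the separate element.
        self.array[address] = data
        self.last_write = (address, data)
        self.prev_cycle_was_write = True

    def read(self, address: int) -> Tuple[Optional[int], str]:
        # Compare only against the immediately preceding write cycle.
        hit = (self.prev_cycle_was_write
               and self.last_write is not None
               and self.last_write[0] == address)
        self.prev_cycle_was_write = False
        if hit:
            return self.last_write[1], "separate storage element"
        return self.array.get(address), "array"
```

In this model, a write to address 3 followed at once by a read of address 3 is served from the separate storage element, whereas a second read of address 3 is served from the array, because only the immediately preceding cycle is compared.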
The solution has drawbacks, and in particular is subject to a flaw which results in failure. If the wordline is turned on before the end of the fourth phase of operation 106 (the next read operation risk), with both bitlines at the high logic level (as would happen in case of a read operation on the same row right after the write operation), then no further rise in the internal latch node will occur because of the weakening of the connected pullup transistor. This is a result of a bounce created on the logically opposite internal latch node.
It is accordingly insufficient for guaranteed operation to merely compare the address for the current read cycle to the address for the immediately preceding write cycle. Rather, the comparison is logically more complex in that it should test not only for an address match for a write operation in the immediately preceding cycle, but also for an address match with a previous write operation that was performed any number of cycles before the current cycle and for which the fourth phase of operation 106 (next read operation risk) was not allowed to complete.
This solution accordingly presents an additional drawback in that the required logic for testing the multiple comparisons is complex (for example, using a complex state machine) to take care of the situation in which, after a write operation on any row and column, multiple consecutive operations are performed on different columns of the same row prior to a read operation on the same row and same column.
A need accordingly exists in the art to address the foregoing and other problems associated with shortening the write cycle time of a self-timed static random access memory (SRAM) integrated circuit.