This invention is in the field of integrated circuits. Embodiments of this invention are more specifically directed to solid-state static random access memories (SRAMs), and power reduction in those SRAMs.
Many modern electronic devices and systems now include substantial computational capability for controlling and managing a wide range of functions and useful applications. Many of these electronic devices and systems are now handheld portable devices. For example, many mobile devices with significant computational capability are now available in the market, including modern mobile telephone handsets such as those commonly referred to as “smartphones”, personal digital assistants (PDAs), mobile Internet devices, tablet-based personal computers, handheld scanners and data collectors, personal navigation devices, and the like. Of course, these systems and devices are battery powered in order to be mobile or handheld. The power consumption of the electronic circuitry in those devices and systems is therefore of great concern, as battery life is often a significant factor in the buying decision as well as in the utility of the device or system.
The computational power of these modern devices and systems is typically provided by one or more processor “cores”, which operate as digital computers in carrying out their functions. As such, these processor cores generally retrieve executable instructions from memory, perform arithmetic and logical operations on digital data that are also retrieved from memory, and store the results of those operations in memory; other input and output functions for acquiring and outputting the data processed by the processor cores are of course also provided. Considering the large amount of digital data often involved in performing the complex functions of these modern devices, significant solid-state memory capacity is now commonly implemented in the electronic circuitry for these systems.
Static random access memory (SRAM) has become the memory technology of choice for much of the solid-state data storage requirements in these modern power-conscious electronic systems. As is fundamental in the art, SRAM memory cells store contents “statically”, in that the stored data state remains latched in each cell so long as power is applied to the memory; this is in contrast to “dynamic” RAM (“DRAM”), in which the data are stored as charge on solid-state capacitors, and must be periodically refreshed in order to be retained. However, SRAM cells draw DC current in order to retain their stored state. Especially as the memory sizes (in number of cells) become large, this DC current can become a substantial factor in battery-powered systems such as mobile telephones and the like.
Advances in semiconductor technology in recent years have enabled shrinking of minimum device feature sizes (e.g., MOS transistor gates) into the sub-micron range. This miniaturization is especially beneficial when applied to memory arrays, because of the large proportion of the overall chip area often devoted to on-chip memories. However, this physical scaling of device sizes does not necessarily correlate to similar scaling of device electrical characteristics. In the context of SRAM cells, the memory cell transistors at currently-available minimum feature sizes conduct substantial DC current due to sub-threshold leakage and other short channel effects. As such, the sub-micron devices now used to realize SRAM arrays have increased the DC data retention current drawn by those arrays.
Designers have recently adopted circuit-based approaches for reducing power consumed by integrated circuits including large memory arrays. One common approach is to reduce the power supply voltage applied to memory arrays, relative to the power supply voltage applied to logic circuitry and circuitry peripheral to the memory array (e.g., decoders, sense amplifiers, etc.). This approach not only reduces the power consumed by the memory array, but also helps to reduce sub-threshold leakage in the individual cells.
Another circuit-based approach to reducing power consumption involves placing the memory functions within the integrated circuit into a “retention” state when possible. In conventional memories, the power supply voltages applied to the memory array in the retention state are reduced to voltages below that necessary for access, but above the minimum required for data states to be retained in the memory cells (i.e., above the data-state retention voltage, or “DRV”); memory peripheral circuits are also powered down in this retention mode, saving additional power. Typically, both the “Vdd” power supply voltage applied to the loads of SRAM cells (e.g., the source nodes of the p-channel transistors in CMOS SRAM cells) and also well bias voltages are reduced in the retention mode. However, significant recovery time is typically involved in biasing the memory array to an operational state from the retention state.
Recently, an intermediate power-down mode has been implemented in integrated circuits with memory arrays of significant size. This intermediate mode is referred to in the art as “retain-till-accessed”, or “RTA”, and is most often used in those situations in which the memory arrays are split into multiple blocks. In the RTA mode, the peripheral memory circuitry remains fully powered and operational. However, only the block or blocks of the memory array that are being accessed are fully powered; other blocks of the memory that are not being accessed are biased to a reduced array power supply voltage (i.e., above the retention voltage) to reduce power consumption while idle. Well and junction biases (i.e., other than the bias of p-channel MOS source nodes that receive the reduced RTA bias) are typically maintained at the same voltages in RTA mode as in read/write operation, to reduce the recovery time from RTA mode. The power saving provided by the RTA mode can be substantial, especially if some of the larger memory blocks are accessed infrequently. Because of its ability to be applied to individual blocks within a larger-scale integrated circuit, as well as its fast recovery time, the RTA standby mode is now often used with embedded memories in modern mobile Internet devices and smartphones, considering that these devices remain powered-on but not fully active for much of their useful life.
From a circuit standpoint, integrated circuit memories having an RTA mode must include circuitry that establishes the reduced RTA array bias voltage, and that switchably controls entry into and exit from RTA mode during operation. FIG. 1a is a block diagram of a conventional integrated circuit 2 in which such RTA standby is provided. Integrated circuit 2 includes memory array 5, arranged into multiple memory array blocks 6₀ through 6₃ of different sizes relative to one another. Each memory array block 6 is associated with corresponding decode and read/write circuitry 11 that addresses, writes data to, and reads data from its associated memory array block 6. Integrated circuit 2 also includes functional and power management circuitry 4, which includes the logic functionality provided by integrated circuit 2, and also circuitry for regulating and distributing power supply voltages throughout integrated circuit 2. For purposes of this example of memory array 5, functional and power management circuitry 4 produces a voltage on power supply line VddHDR that is sufficient for memory read and write operations. Functional and power management circuitry 4 also produces a “periphery” power supply voltage on power supply line VddP, which is applied to decoder and read/write circuitry 11 and is typically at a different voltage from that of the power supply voltage on line VddHDR applied to memory array 5 during reads and writes, as known in the art. The actual array power supply voltage applied to each memory array block 6₀ through 6₃ is presented on power supply lines VddAR0 through VddAR3, respectively. The voltages on lines VddAR0 through VddAR3 are defined by way of bias/switch circuits 7₀ through 7₃, respectively, and based on the voltage at power supply line VddHDR, as will be described below.
Each memory array block 6 in this conventional integrated circuit 2 is constructed as an array of SRAM cells arranged in rows and columns. As shown in FIG. 1b by the example of six-transistor (6-T) memory cell 12j,k, which is in the jth row and kth column of one of memory array blocks 6, each SRAM memory cell 12 is biased between the voltage on power supply line VddAR and a reference voltage (e.g., at ground reference Vss). SRAM memory cell 12j,k in this case is constructed in the conventional manner as a pair of cross-coupled CMOS inverters, one inverter of series-connected p-channel transistor 13p and n-channel transistor 13n, and the other inverter of series-connected p-channel transistor 14p and n-channel transistor 14n; the gates of the transistors in each inverter are connected together and to the common drain node of the transistors in the other inverter, in the usual manner. N-channel pass transistors 15a, 15b have their source/drain paths connected between one of the cross-coupled nodes and a corresponding one of complementary bit lines BLk, BL*k, respectively; the gates of pass transistors 15a, 15b are driven by word line WLj for the row. Accordingly, as known in the art, DC current drawn by SRAM cell 12j,k amounts to the sum of the off-state source/drain leakage currents through one of p-channel transistors 13p, 14p and one of n-channel transistors 13n, 14n, plus any gate oxide leakage and junction leakage that may be present. As mentioned above, if transistors 13, 14 are extremely small sub-micron devices, these leakage currents can be significant (as much as 1 nA per memory cell), and can thus result in significant overall standby power consumption if the number of memory cells 12 in memory array blocks 6 is large.
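The scale of this standby current can be illustrated with a short calculation. The following Python sketch is only a back-of-the-envelope estimate, assuming that every cell leaks at the 1 nA upper figure mentioned above; the function names, the 1 Mbit block size, and the 1.0 V supply are hypothetical values chosen for illustration and do not come from any particular device.

```python
def array_standby_current(num_cells: int, leak_per_cell_a: float = 1e-9) -> float:
    """Return the total DC retention current (amperes) of an SRAM block.

    leak_per_cell_a defaults to 1 nA, the upper per-cell figure quoted in
    the text; actual leakage varies with process, voltage, and temperature.
    """
    if num_cells < 0:
        raise ValueError("num_cells must be non-negative")
    return num_cells * leak_per_cell_a


def standby_power(num_cells: int, vdd_v: float, leak_per_cell_a: float = 1e-9) -> float:
    """Return standby power (watts) as supply voltage times total leakage."""
    return vdd_v * array_standby_current(num_cells, leak_per_cell_a)


if __name__ == "__main__":
    cells = 1024 * 1024  # hypothetical 1 Mbit block
    print(f"standby current: {array_standby_current(cells) * 1e3:.3f} mA")
    print(f"standby power at 1.0 V: {standby_power(cells, vdd_v=1.0) * 1e3:.3f} mW")
```

Under these assumptions, a block of roughly one million cells draws on the order of 1 mA continuously, which is why the per-cell leakage matters in battery-powered systems.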
Referring back to FIG. 1a, memory array blocks 6₀ through 6₃ may be independently biased into RTA mode in this conventional integrated circuit 2, by operation of bias/switch circuits 7₀ through 7₃, respectively. The construction of bias/switch circuit 7₁ is illustrated in FIG. 1a by way of example. P-channel transistor 8 is connected in diode fashion, with its source at power supply line VddHDR and its drain and gate connected to node VddAR1; the voltage drop across transistor 8 from the voltage at line VddHDR thus establishes voltage on power supply line VddAR1. Shorting transistor 9 is a relatively large p-channel power transistor with its source/drain path connected between power supply line VddHDR and power supply line VddAR1, and its gate receiving control signal RTA*1 from functional and power management circuitry 4. If memory array block 6₁ is being accessed for a read or write operation, control signal RTA*1 is driven to a low logic level, which turns on transistor 9 in bias/switch circuit 7₁ and shorts out diode 8, setting the voltage at line VddAR1 at that of power supply line VddHDR. Conversely, if memory array block 6₁ is to be placed in RTA mode, functional and power management circuitry 4 will drive control signal RTA*1 to a high logic level. This turns off transistor 9 in bias/switch circuit 7₁, such that the voltage drop across diode 8 establishes the voltage at node VddAR1 at a lower voltage (by one diode drop) than the voltage at power supply line VddHDR. In this RTA mode, therefore, the power consumed by memory array block 6₁ will be reduced by an amount corresponding to at least the square of this voltage reduction. Meanwhile in this RTA mode, periphery power supply line VddP applied to peripheral memory circuitry, such as decoder and read/write circuitry 11 for each memory array block 6, carries its normal operating voltage, so that this peripheral circuitry is ready to perform an access of its associated memory array block.
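The switching behavior of one bias/switch circuit can be summarized in a few lines of Python. This is a minimal behavioral sketch, not a circuit simulation: it assumes a fixed diode drop (in practice the drop depends on the current drawn) and reads the "square of this voltage reduction" statement as a simple V² scaling of power. The function names and the 1.2 V and 0.4 V example values are hypothetical.

```python
def array_supply_voltage(vdd_hdr: float, rta_n: bool, diode_drop: float) -> float:
    """Model the array supply produced by one bias/switch circuit.

    rta_n is the active-low control signal (RTA* in the text):
      False (low)  -> shorting transistor 9 is on; the array sees VddHDR.
      True  (high) -> shorting transistor 9 is off; the array sees VddHDR
                      minus one diode drop (RTA mode).
    diode_drop is treated as a constant, which is an assumption.
    """
    if rta_n:
        return vdd_hdr - diode_drop
    return vdd_hdr


def relative_power_quadratic(vdd_hdr: float, vdd_ar: float) -> float:
    """Return RTA-mode power relative to full bias, assuming P scales as V^2."""
    if vdd_hdr <= 0:
        raise ValueError("vdd_hdr must be positive")
    return (vdd_ar / vdd_hdr) ** 2


if __name__ == "__main__":
    vdd_hdr = 1.2      # hypothetical header supply, volts
    diode_drop = 0.4   # hypothetical drop across diode 8, volts
    v_access = array_supply_voltage(vdd_hdr, rta_n=False, diode_drop=diode_drop)
    v_rta = array_supply_voltage(vdd_hdr, rta_n=True, diode_drop=diode_drop)
    print(f"access mode: {v_access:.2f} V, RTA mode: {v_rta:.2f} V")
    print(f"relative power in RTA mode: {relative_power_quadratic(vdd_hdr, v_rta):.2f}")
```

Because leakage typically falls faster than quadratically with supply voltage, the V² ratio here is best read as a conservative estimate of the saving, consistent with the "at least" wording above.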
It has been observed, in connection with this invention, that it is difficult to optimize the power savings in RTA mode for memory arrays constructed in the conventional fashion. As known in the art, stored data in the SRAM may be lost if the array voltage falls below a minimum data retention bias voltage; conversely, power savings is optimized by biasing the array blocks in RTA mode at a voltage close to that minimum data retention voltage. However, it is difficult to achieve this optimization because of variations in voltage, temperature, and manufacturing parameters; selection of the size and construction of diodes 8 in the example of FIG. 1a to maximize power savings is thus a difficult proposition. In addition, it is now common practice to use different size transistors in the memory cells 12 of memory array blocks 6 of different size; these differences in device sizes create additional difficulty in establishing an optimal RTA array block bias.
It has also been observed, in connection with this invention, that RTA bias optimization is made more difficult by the manner in which conventional integrated circuits with embedded memory arrays are constructed. This conventional construction is shown by way of integrated circuit 2 of FIG. 1a, in which diodes 8 in bias/switch circuits 7 are constructed as part of “core” region 3 including functional and power management circuitry 4. In this core region 3, transistors are constructed substantially differently than the transistors in memory array 5, for example constructed with different channel lengths, different source/drain impurity concentrations via different ion implantation parameters, different gate oxide thicknesses, and the like, relative to transistors in SRAM cells 12. For example, according to a conventional 28 nm CMOS manufacturing technology, memory array transistors receive such additional processing as a fluorine implant to increase the effective gate oxide thickness and reduce gate leakage, which the core transistors do not receive; other differences between core and array transistors include different “pocket” implants to implement different threshold voltages for the core and array transistors, and the use of strain engineering techniques to construct the core transistors (e.g., selectively depositing a tensile silicon nitride film over core NMOS transistors and a compressive silicon nitride film over core PMOS transistors) but not to construct the array devices. As described in U.S. patent application Publication US 2009/0258471 A1, published Oct. 15, 2009 and entitled “Application of Different Isolation Schemes for Logic and Embedded Memory”, commonly assigned with this application and incorporated herein by reference, the isolation structures and isolation doping profiles used in logic core regions of the integrated circuit may differ from those used in the memory arrays, so that tighter isolation spacing can be attained in the memory array.
In summary, conventional integrated circuits often include logic core (“core”) devices that are constructed to optimize switching performance, while the array devices are constructed for low leakage and low mismatch variation. These differences in construction between transistors in core region 3 and transistors 13, 14 in memory array 5 reduce the ability of diodes 8 to match transistors 13, 14 over variations in process parameters. Additional margin must therefore be provided in selecting the construction of diodes 8 and the resulting voltage drop, to ensure that the minimum data retention voltage is satisfied, but this additional margin necessarily leads to additional standby power consumption.
As mentioned above, it is known in the art to use different size transistors to realize memory cells 12 in memory array blocks 6 of different size. Typically, memory array blocks 6 are grouped according to the number of bits (i.e., number of columns, if a common number of rows per block is enforced), with common transistor sizes based on the group. For example, thirty-two row memory array blocks 6 may be grouped into “bins” of increasing transistor size (W/L): from 16 to 128 columns; from 129 to 256 columns; from 257 to 320 columns, and from 321 to 512 columns. By way of further background, it is also known in the art to provide different size core device diodes 8 for memory array blocks 6 realized by transistors of different sizes. For example, the W/L of p-channel MOS diodes 8 may be 1.0/0.75 (μm) for memory array blocks 6 of 16 to 128 columns, 1.5/0.065 for memory array blocks 6 of 129 to 256 columns, 2.5/0.055 for memory array blocks 6 of 257 to 320 columns, and 5.0/0.045 for memory array blocks 6 of 321 to 512 columns in size. Even according to this approach, however, it has been observed, in connection with this invention, that a large margin must still be provided for the RTA voltage, because of the wide variation in leakage with variations in power supply voltage, temperature, and process parameters, as well as the variation in leakage current drawn with the number of columns in memory array blocks 6 even within a given bin. As such, while this “binning” reduces somewhat the leakage current drawn in the RTA mode, the RTA bias voltage must still be maintained well above the data retention voltage (DRV), and is thus not optimized.
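The binning scheme in this example amounts to a lookup from the number of columns in a block to a diode size. The Python sketch below illustrates that lookup, with the column ranges and W/L values copied as given in the text; the type and function names are hypothetical.

```python
from typing import NamedTuple


class DiodeBin(NamedTuple):
    """One bin: an inclusive column range and the diode W/L in micrometers."""
    min_cols: int
    max_cols: int
    width_um: float
    length_um: float


# Column ranges and diode W/L values as listed in the example in the text.
DIODE_BINS = [
    DiodeBin(16, 128, 1.0, 0.75),
    DiodeBin(129, 256, 1.5, 0.065),
    DiodeBin(257, 320, 2.5, 0.055),
    DiodeBin(321, 512, 5.0, 0.045),
]


def select_diode_bin(num_columns: int) -> DiodeBin:
    """Return the bin whose column range contains num_columns."""
    for diode_bin in DIODE_BINS:
        if diode_bin.min_cols <= num_columns <= diode_bin.max_cols:
            return diode_bin
    raise ValueError(f"no bin defined for {num_columns} columns")


if __name__ == "__main__":
    for cols in (16, 200, 320, 512):
        b = select_diode_bin(cols)
        print(f"{cols} columns -> diode W/L = {b.width_um}/{b.length_um} um")
```

The lookup makes the limitation visible: a block of 129 columns and a block of 256 columns receive the same diode even though their total leakage differs by nearly a factor of two, which is one source of the margin discussed above.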
Even though conventional RTA mode circuitry has greatly reduced the recovery time from RTA mode to normal operation, as compared with the recovery time from a retention or a full power-down mode, the recovery time from RTA mode remains sufficiently long as to be unacceptable in certain high performance applications. As such, many very large scale integrated circuits, such as the well-known “system on a chip” (or “SoC”) integrated circuits, include both high density SRAM memory, in which RTA mode and other power savings techniques are realized, and also high performance SRAM memory. Logic functionality in the integrated circuit determines which type of data to store in these different types of SRAM memory.
The lack of RTA mode in high performance SRAM memory comes at a substantial power dissipation penalty, even if the high performance SRAM capacity is minimized. For example, in one conventional SoC implementation constructed with submicron feature size technology, the memory density realized in high performance SRAM is about ⅓ that realized in high density SRAM. However, it has been observed that the high performance SRAM consumes as much power, in its data retention mode without RTA bias, as that consumed by all of the high density memory in its RTA mode.
By way of further background, some conventional high performance SRAM memories are now realized by way of eight transistor (“8-T”) memory cells, constructed by way of a 6-T latch as shown in FIG. 1b, in combination with a two-transistor read buffer. An example of this 8-T construction is illustrated in FIG. 1c in connection with SRAM cell 12′j,k (in row j and column k, as before). Cell 12′j,k includes the 6-T latch of transistors 13p, 13n, 14p, 14n, 15a, 15b, as described above relative to FIG. 1b. However, in cell 12′j,k, write word line WR_WLj connected to the gates of pass transistors 15a, 15b is asserted only for the jth row in write cycles, to connect storage nodes S1, S2 to complementary write bit lines WR_BLk, WR_BL*k for the kth column. In a write to cell 12′j,k, write circuitry (not shown) pulls one of write bit lines WR_BLk, WR_BL*k to ground, depending on the data state being written into cell 12′j,k. Cell 12′j,k also includes n-channel transistors 16n, 18n, which have their source-drain paths connected in series between read bit line RD_BLk and ground. Read buffer pass transistor 18n has its drain connected to read bit line RD_BLk, and its gate receiving read word line RD_WLj for row j. Read buffer driver transistor 16n has its drain connected to the source of transistor 18n and its source at ground; the gate of transistor 16n is connected to storage node S2. In a read of cell 12′j,k, read word line RD_WLj is asserted active high, which turns on buffer pass transistor 18n. If the data state of storage node S2 is a “1”, buffer driver transistor 16n is also on, and read bit line RD_BLk is pulled to ground by buffer driver transistor 16n through buffer pass transistor 18n. A read of cell 12′j,k in the case in which storage node S2 is a “0” results in transistor 16n remaining off, in which case read bit line RD_BLk is not pulled down.
A sense amplifier (not shown) is capable of detecting whether read bit line RD_BLk is pulled to ground by the selected cell in column k, and in turn communicates that data state to I/O circuitry as appropriate.
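The read path of the 8-T cell described above reduces to a small truth table. The following Python sketch is a purely behavioral model under stated assumptions: the function names are hypothetical, and the read bit line is assumed to be precharged high before the read, which the description above implies but does not state.

```python
def read_8t(s2: int, rd_wl: int) -> int:
    """Return the logic level on read bit line RD_BL after a read attempt.

    s2    -- data state at storage node S2 (gate of driver transistor 16n)
    rd_wl -- level on read word line RD_WL (gate of pass transistor 18n)

    RD_BL is assumed to be precharged to 1. It is pulled to 0 only when
    both series transistors conduct, i.e. when rd_wl == 1 and s2 == 1.
    """
    if s2 not in (0, 1) or rd_wl not in (0, 1):
        raise ValueError("s2 and rd_wl must be 0 or 1")
    pass_on = rd_wl == 1    # transistor 18n
    driver_on = s2 == 1     # transistor 16n
    return 0 if (pass_on and driver_on) else 1


def sensed_s2(rd_bl: int) -> int:
    """Infer the state of S2 from the bit line, as a sense amplifier would."""
    return 1 if rd_bl == 0 else 0


if __name__ == "__main__":
    for s2 in (0, 1):
        for rd_wl in (0, 1):
            print(f"S2={s2} RD_WL={rd_wl} -> RD_BL={read_8t(s2, rd_wl)}")
```

Note that the storage nodes drive only the gate of transistor 16n, so in this model a read never changes the stored state; this isolation is the usual motivation for the separate read buffer.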
By way of still further background, the 8-T concept described in connection with FIG. 1c is further extended, in some conventional SRAM memories, to provide complementary read bit lines. An example of this extended structure is illustrated by way of cell 12″j,k shown in FIG. 1d. Cell 12″j,k includes the eight transistors of cell 12′j,k shown in FIG. 1c, but also includes transistors 16n′, 18n′ that forward the data state at storage node S1 to complementary read bit line RD_BL*k, in similar fashion as transistors 16n, 18n forward the state at storage node S2 to read bit line RD_BLk. In a read cycle, enabled by read word line RD_WLj driven active high, which turns on transistors 18n, 18n′, a differential signal is generated on read bit lines RD_BLk, RD_BL*k according to the states at storage nodes S2, S1. SRAM cells constructed as shown in FIG. 1d are referred to in the art as “10-T” cells.
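The differential read of the 10-T cell can be sketched in the same behavioral style. The Python model below is an illustration only: the function names are hypothetical, both read bit lines are assumed to be precharged high, and storage nodes S1 and S2 are assumed to hold complementary states, as they do in a stable latch.

```python
from typing import Tuple


def _stack_pulls_low(storage_node: int, rd_wl: int) -> bool:
    """True when a pass/driver transistor pair pulls its bit line to ground."""
    return rd_wl == 1 and storage_node == 1


def read_10t(s2: int, rd_wl: int) -> Tuple[int, int]:
    """Return the levels on (RD_BL, RD_BL*) for a 10-T cell.

    s2    -- data state at storage node S2; S1 is taken as its complement
    rd_wl -- level on read word line RD_WL, shared by both pass transistors

    Both bit lines are assumed to start precharged to 1.
    """
    if s2 not in (0, 1) or rd_wl not in (0, 1):
        raise ValueError("s2 and rd_wl must be 0 or 1")
    s1 = 1 - s2
    rd_bl = 0 if _stack_pulls_low(s2, rd_wl) else 1      # transistors 16n, 18n
    rd_bl_n = 0 if _stack_pulls_low(s1, rd_wl) else 1    # transistors 16n', 18n'
    return rd_bl, rd_bl_n


if __name__ == "__main__":
    for s2 in (0, 1):
        for rd_wl in (0, 1):
            print(f"S2={s2} RD_WL={rd_wl} -> (RD_BL, RD_BL*)={read_10t(s2, rd_wl)}")
```

With the read word line asserted, exactly one of the two bit lines is pulled low in this model, so the stored state is indicated by which line falls rather than by whether a single line falls.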