Semiconductors are used in integrated circuits for a wide range of applications, including personal computers, music and/or video devices, multimedia devices, digital assistants, communications devices, and so forth. In general, integrated circuits include various circuits for providing user desired functionality. Increasingly, memory arrays, once manufactured as stand alone integrated circuits such as commodity DRAM or SRAM integrated circuits, are also being integrated on the same device as other circuitry. Embedded memory is increasingly used in application specific integrated circuits (ASICs) or system on a chip (SOC) integrated circuits. These highly integrated devices may include, without limitation, processors, microprocessors, digital signal processors, and memories for temporary and permanent storage, such as embedded dynamic RAM (DRAM), static RAM (SRAM) and non-volatile storage such as EEPROM and FLASH memory.
FIG. 1 depicts a typical prior art SRAM cell 10. Cell 10 is shown having six transistors M1-M6. In FIG. 1, a word line WL, sometimes referred to as a row line, is shown in a horizontal orientation. A word line will be arranged across multiple cells in a memory array. A pair of complementary bit lines BL and BL_ is shown oriented in a column, or vertical, arrangement. Note that FIG. 1 is a circuit schematic, that the word and bit lines as drawn are exemplary illustrations, and that the word and bit lines may be arranged differently. The positions of the various devices and the orientations of the word and bit lines may be varied, as is well known in the art, without changing the operation of the cell.
Six transistors are coupled to form each SRAM cell 10. Transistors M2 and M4, which in this non-limiting example are P-type MOSFET pull up transistors, are coupled between two complementary storage nodes Q and Q_, respectively, and a positive supply voltage (VDD). VDD may be a typical positive supply voltage such as 5 volts or 3 volts, or more typically the memory core supply may be a stepped down or lowered positive supply voltage, such as 2 volts, 1 volt, or slightly more or less. In some arrangements, the positive supply for the memory cells may be lower than the positive supply voltages used in other circuits fabricated in the integrated circuit. As is known in the prior art, the VDD voltage level supplied to the memory is often lowered to conserve power and to speed the operation of the memory. In memory cell 10, transistors M1 and M3 are N-type MOSFET pull down transistors, and are coupled between the storage nodes Q_ and Q, respectively, and the ground reference voltage. The SRAM memory cell 10 is coupled as a cross-coupled latch, with one inverter formed by transistors M1 and M2 and another inverter formed by transistors M3 and M4. The gates of transistors M1 and M2 are coupled together and to the output of the inverter formed by transistors M3 and M4. Similarly, transistors M3 and M4 have their commonly coupled gates coupled to the output of the inverter formed by transistors M1 and M2.
In FIG. 1, access transistors (or transfer gates) M5 and M6 are coupled to transfer data from the storage nodes Q (M5) and Q_ (M6) to the complementary bit lines BL_ (M5) and BL (M6) when the word line WL is active.
During a “read” cycle, the bit lines (BL and BL_) may be pre-charged to a first voltage, and the word line WL may then become active. Assuming one of the storage nodes (Q and Q_) is at a low voltage, the other node is (since they are complementary) at a high voltage, and one of the bit lines BL and BL_ will be pulled low when the access transistors couple the bit lines to the cell. Typically, the remaining bit line will remain at the pre-charged level, although other arrangements are possible. Because the bit lines are arranged in complementary bit line pairs, a differential voltage sense amplifier may be used to receive the data from the memory cell by sensing a small differential voltage (ΔV) between the two bit lines of each bit line pair. The small signal differential sensing allows the sense amplifier to quickly determine the data value, without the need for either bit line of the pair to transition to a full “low” voltage level. The ΔV voltage may be, for example, 100 millivolts, 200 millivolts, or more or less. This voltage is placed on the bit line, in typical fashion, by lowering the bit line to a lower voltage when the cell is coupled to the bit line by the respective access transistor.
During a “write” cycle, the data to be stored in the SRAM cell will be placed on the bit line pair BL and BL_ prior to, or simultaneously with, the activation of the word line WL. This data will be a low level on one of the complementary pair, and thus one of the nodes Q and Q_ will be pulled down to a level low enough to override the stored data. The write voltages may be a Vdd voltage on one bit line, and the other bit line may be a lower voltage, typically around 0 volts or some similar low voltage.
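By way of non-limiting illustration only, the read and write behavior described above may be sketched as a simple behavioral model in Python. The class name, the 1 volt supply, and the 100 millivolt differential are assumptions chosen for the sketch, and the two bit lines are named after the storage node they reach rather than after BL and BL_, so that no particular pairing from FIG. 1 is implied. The sketch does not model transistor-level behavior.

```python
from dataclasses import dataclass
from typing import Tuple

VDD = 1.0       # assumed memory core supply in volts (illustrative only)
DELTA_V = 0.1   # assumed read differential in volts (illustrative only)


@dataclass
class SramCell:
    """Behavioral stand-in for the cross-coupled latch of a six transistor cell."""
    q: int = 0  # logic value held on storage node Q

    @property
    def q_bar(self) -> int:
        # The complementary node always holds the opposite value.
        return 1 - self.q

    def read(self, precharge: float = VDD) -> Tuple[float, float]:
        """Return (bl_q, bl_qbar) voltages after the word line goes active.

        The bit line reached from the node storing a 0 drops by DELTA_V;
        the other bit line stays at the pre-charge level.
        """
        bl_q = precharge - (DELTA_V if self.q == 0 else 0.0)
        bl_qbar = precharge - (DELTA_V if self.q_bar == 0 else 0.0)
        return bl_q, bl_qbar

    def write(self, bl_q: float, bl_qbar: float) -> None:
        """Override the stored data: the bit line driven low pulls its node low."""
        if bl_q == bl_qbar:
            raise ValueError("a write needs complementary bit line levels")
        self.q = 1 if bl_q > bl_qbar else 0


cell = SramCell(q=1)
print(cell.read())        # bit line voltages developed during a read
cell.write(0.0, VDD)      # drive the Q-side bit line low to store a 0
print(cell.q, cell.read())
```

In this sketch, comparing the two voltages returned by `read` recovers the stored bit, which is the role played by the differential sense amplifier in the description above.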
FIG. 1 depicts only one memory cell 10. Typical SRAM arrays contain many thousands of such cells. These are often arranged in rows and columns, with the word lines, or row lines, running in a first direction and coupled to the gates of the cell access transistors (for example, M5 and M6 in FIG. 1), and the bit line pairs running in columns between the cells and coupled to the source/drain terminals of the cell access transistors. Note that the terms “row” and “column” are used herein in the circuit schematic sense, and for convenience only, in describing the cells and the word and bit lines. Memory array layout arrangements known in the prior art include folded bit lines, and a variety of other layout arrangements where bit lines and word lines are orthogonal, are parallel, or are arranged in various other directions with respect to each other. In a simple case for explanatory purposes as described here, the layout will also have columnar bit lines and word lines arranged in another direction, typically horizontal and perpendicular to the columns of bit line pairs, but this is not a necessary element of a memory as meant herein and the terms “row” and “column” do not limit the various arrangements that are contemplated herein.
The memory arrays of the prior art typically include a local sense amplifier coupled to each of the bit line pairs of a segment of the array. The local sense amplifier may be a differential sensing amplifier that can sense a small voltage difference ΔV between the bit lines BL and BL_, and by amplification of the sensed small signal, form a larger voltage swing signal for transmission on a global bit line pair. Sensing of small differential voltages has several advantages. The time needed for the memory cell in a read operation to place a small differential voltage on one of the bit lines (with respect to the complementary bit line, typically set at a nominal pre-charge value such as Vdd or Vdd/2) is very short when compared to the time needed to pull the same bit line to a low voltage such as zero volts. The use of small swing differential voltages on local bit lines also enables the sensing operation to quickly sense the small voltage ΔV, and to start outputting amplified full level voltages on the global bit lines.
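The timing advantage of small swing sensing may be illustrated, in a rough and non-limiting way, by treating the cell as a constant current sink discharging the bit line capacitance, so that the time to develop a voltage change is t = C·ΔV/I. The constant current simplification and all numeric values below (bit line capacitance, cell current, supply, and ΔV) are assumptions made for the sketch and are not taken from any particular memory.

```python
def bitline_discharge_time(c_bitline_f: float, swing_v: float, i_cell_a: float) -> float:
    """Time for a constant cell current to move the bit line by swing_v: t = C * dV / I."""
    if i_cell_a <= 0:
        raise ValueError("cell current must be positive")
    return c_bitline_f * swing_v / i_cell_a


# Illustrative assumptions only:
C_BL = 100e-15    # local bit line capacitance, farads
I_CELL = 20e-6    # cell read current, amperes
VDD = 1.0         # pre-charge level, volts
DELTA_V = 0.1     # sensing differential, volts

t_small = bitline_discharge_time(C_BL, DELTA_V, I_CELL)
t_full = bitline_discharge_time(C_BL, VDD, I_CELL)
print(f"small swing: {t_small * 1e9:.2f} ns, full swing: {t_full * 1e9:.2f} ns")
```

Under these assumptions the ratio of the two times is simply VDD/ΔV, so a 100 millivolt differential on a 1 volt pre-charge is developed roughly ten times sooner than a full transition to zero volts.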
FIG. 2 depicts in a block diagram a typical memory arrangement 20 of the prior art. In FIG. 2, memory 20 is formed using, for example, a plurality of the SRAM cells 10 of FIG. 1 to form a memory array. In FIG. 2, each memory bank 21 of the N memory banks, Bank_0 to Bank_N−1, comprises an array 25 of many hundreds or thousands of memory cells 10. Each array 25 has memory cells (such as, for example, cell 10 in FIG. 1) disposed at intersections of the rows and columns of array 25. The word lines (row lines) are not shown in the diagram for simplicity. A plurality of local bit line pairs 24 are arranged in columns shown vertically in the diagram. A column multiplexer (mux) 23 is provided. Using selection circuitry, mux 23 chooses a subset of the bit line pairs 24 for a given memory access cycle to form a set of bit lines as wide as the data word for the access. For example, if the data word width is 16 bits (0-15) in an example memory, the global bit line pairs will form 16 columns. The memory array itself may have, for example, 256 columns (16 sets of 16) arranged across the array, and in this simple example the column mux 23 will select 16 out of 256 bit line columns at a given time for a memory access, which may be a write or a read cycle. Many memories have larger arrays, such as 1,024 columns, 2,048 columns, etc., and the embodiments herein are not limited by these examples.
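The 16 out of 256 selection in the example above may be sketched as follows. The text states only that the column mux selects one word width of columns per access; the particular mapping used here, in which each data bit owns a contiguous group of 16 columns and a column select value picks one column from every group, is an assumption made for illustration, as are the function and parameter names.

```python
from typing import List


def select_columns(column_select: int,
                   word_width: int = 16,
                   total_columns: int = 256) -> List[int]:
    """Return the local bit line pair indices routed to the global bit line pairs.

    Assumed mapping: data bit b owns columns [b * ways, (b + 1) * ways),
    and column_select picks the same offset inside every group.
    """
    if total_columns % word_width != 0:
        raise ValueError("total_columns must be a multiple of word_width")
    ways = total_columns // word_width  # 16 in the example of the text
    if not 0 <= column_select < ways:
        raise ValueError("column_select out of range")
    return [bit * ways + column_select for bit in range(word_width)]


selected = select_columns(3)
print(len(selected), selected[:4])
```

With this mapping, stepping `column_select` through all 16 values visits each of the 256 columns exactly once, which is one simple way to confirm that the selection is a partition of the array.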
The local bit line pairs 24 are also coupled through the column mux 23 to a write driver and local sense amplifier block 27. Block 27 provides several functions. The write driver and local sense amplifier 27 couples the global bit line pairs GBL/GBLB to the selected local bit lines. The local bit lines typically carry small swing signals, with one line at a full Vdd voltage and the second line at a small differential voltage ΔV below Vdd, say −0.3 Volts or −0.2 Volts or similar. The global bit lines carry full swing signals. For a memory read, the sense amplifier senses the differential voltage ΔV between the lines of the local bit line pair and, using a known sense amplifier circuit, amplifies that voltage to a full swing output voltage (Vdd for a “1” and zero volts for a “0”, or vice versa) on the global bit line pair GBL/GBLB. The block 27 may contain a local sense amplifier for each bit in the memory word width, so if there are 16 bits in the word width, there will be 16 global bit line pairs and 16 corresponding local sense amplifiers in block 27.
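The read path through block 27 may be illustrated, without limitation, by a behavioral sketch in which each local sense amplifier resolves the sign of the small local differential and drives full swing levels onto its global bit line pair. The function names are hypothetical, and the polarity chosen (the higher local line maps to Vdd on GBL) is an assumption, since the text allows either polarity.

```python
from typing import List, Tuple


def local_sense_amp(bl: float, bl_bar: float, vdd: float) -> Tuple[float, float]:
    """Amplify a small local differential into full swing (gbl, gblb) levels."""
    if bl == bl_bar:
        raise ValueError("no differential voltage to sense")
    # Assumed polarity: the higher local bit line drives GBL to Vdd.
    return (vdd, 0.0) if bl > bl_bar else (0.0, vdd)


def read_word(local_pairs: List[Tuple[float, float]], vdd: float) -> List[Tuple[float, float]]:
    """One local sense amplifier per bit of the word, as in block 27."""
    return [local_sense_amp(bl, bl_bar, vdd) for bl, bl_bar in local_pairs]


VDD = 1.0
# 16 selected local pairs, alternating which line sits 0.2 V below Vdd.
pairs = [(VDD, VDD - 0.2) if i % 2 == 0 else (VDD - 0.2, VDD) for i in range(16)]
global_pairs = read_word(pairs, VDD)
print(global_pairs[:2])
```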
Block 27 in FIG. 2 must also provide write data to the local bit lines 24 from the global bit line pairs during a write cycle. The column mux 23 then will place these signals onto the appropriate pair of local bit lines BL/BL_ and that write data will then override the data stored in any of the active cells selected by the word line. To speed memory access cycles, fast page mode or sequential accessing may be done where the address decoders include, for example, incrementing circuits for providing faster accesses to sequential or blocks of locations.
Each bank of the memory 20, banks Bank_0-Bank_N−1 in FIG. 2, contains identical circuitry arranged across a plurality of global bit line pair columns. The memory may be further subdivided into sectors so that as the loads on the global bit lines increase, additional current or drive capacity may be needed to speed the signal transitions. A sector Din/Dout buffer 29 may provide additional drive strength to compensate for the large capacitive loading on the global bit line pairs. The memory 20 needs to be coupled to a data bus for outputting data and receiving write data from other circuitry. An input/output data block 31 provides buffers for driving data out and receiving input data to the memory.
As the size of the memory embedded into or fabricated in integrated circuits increases, the length of and loading on the global bit lines GBL/GBLB also increase. The prior art global bit lines are full swing signals, and the need to transition these large signal lines from a low voltage level to a high voltage level on such a heavily loaded bus slows memory accesses for both read and write cycles. Driving the heavily loaded and increasingly long global bit lines requires additional drivers or buffers, or larger existing drivers or buffers, which increases the power consumed by the memory.
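As a rough, non-limiting illustration of why loading matters, the average power drawn from the supply to toggle a capacitive line through a full swing is commonly approximated as P = C·Vdd²·f, where f is the number of complete charge and discharge cycles per second. The capacitance per unit length, supply voltage, and toggle rate below are hypothetical values chosen only to show the scaling; they are not taken from the memory of FIG. 2.

```python
def full_swing_power(c_line_f: float, vdd_v: float, toggle_rate_hz: float) -> float:
    """Approximate supply power for full swing toggling of one line: P = C * Vdd^2 * f."""
    return c_line_f * vdd_v ** 2 * toggle_rate_hz


# Hypothetical values for illustration only:
C_PER_MM = 0.2e-12    # assumed wire plus load capacitance per millimeter, farads
VDD = 1.0             # assumed supply, volts
TOGGLE_RATE = 500e6   # assumed charge/discharge cycles per second

for length_mm in (1.0, 2.0, 4.0):
    p = full_swing_power(C_PER_MM * length_mm, VDD, TOGGLE_RATE)
    print(f"{length_mm:.0f} mm global bit line: {p * 1e6:.0f} uW per line")
```

In this approximation the power grows linearly with the line capacitance, so doubling the length of a global bit line doubles the power needed to drive it at the same rate, before any added buffers are counted.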
FIGS. 3a and 3b depict exemplary pie charts representing the power consumed by the memory arrangements of the prior art, for example as shown in FIG. 2. In FIG. 3a, the power consumed during a write cycle is shown for portions of a typical memory implemented in a current semiconductor process technology. The cell power consumption is represented by the current used during the write in the memory cells, iwe_cell. This represents about 19% of the power consumed. The decoder power consumption is represented by the current used during the write by the decoder functions, iwe_xdec, which represents about 9% of the power consumption. The remainder of the power used during the write cycle (labeled iwe_io) is attributed to the input and output circuitry, including the global bit lines, the buffers, and the write drivers. This represents about 72% of the power consumed. Similarly, FIG. 3b depicts the power consumed by the sections of a prior art memory device during a typical read cycle. As these pie charts illustrate, most of the power consumed is used in the input/output portion of the memory array, including the bit line wiring and connections. Because the power consumed by memory elements is one area of integrated circuit power consumption that needs reduction, particularly for battery powered applications such as cell phones, portable audio and video players, portable computers, PDAs and the like, a continuing need exists to provide memory devices that consume less power and provide faster access times.