This invention is in the field of nonvolatile semiconductor memory. Embodiments of this invention are more specifically directed to the programming of memory cells in an electrically erasable read-only memory of the flash type.
Non-volatile solid-state read/write memory devices are now commonplace in many electronic systems, particularly in portable electronic devices and systems. A common technology for realizing non-volatile solid-state memory devices, more specifically for realizing electrically erasable programmable “read-only” memory (EEPROM) devices, utilizes “floating-gate” transistors to store the data state. According to this conventional technology, the memory cell transistor is “programmed” by biasing it so that electrons tunnel through a thin dielectric film onto an electrically isolated transistor gate element. The trapped electrons on the floating gate raise the apparent threshold voltage of the memory cell transistor (for n-channel devices), as compared with the threshold voltage with no electrons trapped on the floating gate. The stored state can be read by sensing the presence or absence of source-drain conduction under bias.
Modern EEPROM devices are “erasable” in that the memory cell transistors can be biased to remove the electrons from the floating gate, by reversing the tunneling mechanism. Some EEPROM memory devices are of the “flash” type, in that a large number (a “block”) of memory cells can be simultaneously erased in a single operation. Conventional EEPROM memories can be arranged in a “NOR” fashion, which permits individual cells in each column to be separately and individually accessed. Flash EEPROM memories are also now commonly arranged as “NAND” memory, in which the source/drain paths of a group of memory cells in a column are connected in series. NAND memories can be constructed with higher density, but require all of the cells in a group to be biased to access any one of the cells in that group.
Because of the convenience and efficiency of modern flash EEPROM memories, it is now desirable and commonplace to embed EEPROM memory within larger scale integrated circuits, such as those including modern complex microprocessors, microcontrollers, digital signal processors, and other large-scale logic circuitry. Such embedded EEPROM can be used as non-volatile program memory storing software routines executable by the processor, and also as non-volatile data storage.
According to one approach, floating-gate EEPROM cells are realized by metal-oxide semiconductor (MOS) transistors having two polysilicon gate electrodes. A control gate electrode in one polysilicon level is electrically connected to decode and other circuitry in the EEPROM integrated circuit, and a floating gate in another polysilicon level is disposed between the control gate electrode and the channel region of the memory transistor. In this conventional construction, the application of a high programming voltage to the control gate capacitively couples to the floating gate, and attracts electrons from the source and drain regions of the transistor to an extent that some tunnel to, and remain trapped on, the floating gate. FIG. 1a illustrates the electrical arrangement of conventional EEPROM memory cell 2 constructed according to this double-polysilicon construction. Memory cell 2 consists essentially of a single transistor with its drain connected to bit line BL, its source at ground, and its control gate connected to word line WL. A floating gate electrode is physically disposed between the control gate and the channel region of the transistor of memory cell 2, and is thus electrically isolated from the control gate, source, and drain of the transistor. The specific physical arrangement of the floating gate relative to the other elements of memory cell 2 can vary, depending on the particular design, as known in the art.
FIG. 1b illustrates a conventional arrangement of a non-volatile memory including array 5. Array 5 includes floating-gate EEPROM memory cells 2, arranged in rows and columns. While the number of memory cells 2 in array 5 shown in FIG. 1b is very small (sixteen cells 2, in four rows and four columns), for purposes of this description, typical conventional EEPROM arrays include many more cells. Indeed, some modern non-volatile memories include thousands of memory cells 2 on a common word line. In array 5 of FIG. 1b, each row of memory cells 2 shares a common one of word lines WL0, WL1, WL2, WL3, each driven by one of word line drivers 6. Each column of memory cells 2 shares a common one of bit lines BL0, BL1, BL2, BL3, each driven by one of bit line drivers 4 and coupled to one of sense amplifiers 8.
In both read and write operations, one of word lines WL0, WL1, WL2, WL3 is selected according to a row portion of an address value, and driven active by the corresponding one of word line drivers 6. As will be described in further detail below, the voltages applied in read and write operations differ. In a write operation, one or more of bit lines BL0, BL1, BL2, BL3 is selected, according to a column portion of an address value, and is driven by its corresponding one of bit line drivers 4 with the appropriate programming voltage corresponding to the data state to be written as indicated on input data lines DATA IN (i.e., whether the cell is to be programmed or not). In a read operation, bit line drivers 4 bias one or more of bit lines BL0, BL1, BL2, BL3, and sense amplifiers 8 sense the state of one or more of bit lines BL0, BL1, BL2, BL3. The particular columns from which data are to be read can be selected, in response to a column portion of the address value, by bit line drivers 4, by sense amplifiers 8, or by circuitry downstream from sense amplifiers 8. The states of the selected memory cells 2 are output from sense amplifiers 8 on lines DATA OUT.
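The selection of one word line and one bit line from the row and column portions of an address can be sketched as follows. This is only an illustration for the four-by-four array of FIG. 1b: the text does not say how the address bits are divided, so the split used here (upper two bits for the row, lower two bits for the column) and the name decode_address are assumptions.

```python
# Hypothetical address split for the 4x4 array of FIG. 1b (an assumption):
# the upper two bits select the word line, the lower two the bit line.
ROW_BITS = 2
COL_BITS = 2


def decode_address(address: int) -> tuple:
    """Split a 4-bit address into (row, column) indices."""
    if not 0 <= address < (1 << (ROW_BITS + COL_BITS)):
        raise ValueError("address out of range for a 4x4 array")
    row = address >> COL_BITS               # row portion: selects WL0..WL3
    col = address & ((1 << COL_BITS) - 1)   # column portion: selects BL0..BL3
    return row, col


row, col = decode_address(0b1001)
print(f"WL{row}, BL{col}")  # prints "WL2, BL1"
```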
In conventional floating-gate EEPROMs, as mentioned above, an absence of trapped electrons is the “erased” state of the memory cell, and is evident by the (n-channel) floating-gate transistor having a low threshold voltage. This state is typically considered to be a logical “1”, as drain-to-source current is conducted in response to a read voltage applied at the control gate. The “programmed” state in which electrons are trapped on the floating gate results in the floating-gate transistor having a high threshold voltage, in which source/drain current does not conduct with a read voltage applied to the control gate; this state is typically considered to be a logical “0”.
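The relationship between threshold voltage and the logical state that is read can be illustrated with a minimal sketch. The voltage values below are placeholders chosen only for illustration and do not come from this description; the function name read_cell is likewise hypothetical.

```python
# Illustrative values only (assumptions, not taken from the description).
V_READ = 5.0         # read voltage applied to the control gate, in volts
VT_ERASED = 2.0      # low threshold: no electrons trapped on the floating gate
VT_PROGRAMMED = 7.0  # high threshold: electrons trapped on the floating gate


def read_cell(threshold_voltage: float, v_read: float = V_READ) -> int:
    """Return 1 if the n-channel cell conducts at the read voltage, else 0."""
    conducts = v_read > threshold_voltage
    return 1 if conducts else 0


print(read_cell(VT_ERASED))      # erased cell conducts: prints 1
print(read_cell(VT_PROGRAMMED))  # programmed cell does not conduct: prints 0
```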
The programming of a “0” state into memory cell 2, constructed in this double-gate manner, is typically performed by the application of a high voltage at the control gate along with a relatively strong drive (voltage and current) at the drain of the floating-gate transistor of memory cell 2, with the source of the transistor at ground. For example, in one conventional technology, a programming voltage of about 9.2 volts is applied to the control gate of memory cell 2 being programmed, in combination with a voltage of about 4.2 volts to the drain of the floating-gate transistor of memory cell 2, both voltages relative to the ground level at the source of that transistor. The physical mechanism involved in the programming operation is the injection of “hot” electrons from the transistor channel region through the gate dielectric and into the floating gate electrode, to which the high control gate voltage is capacitively coupled. The high voltages and relatively high currents (e.g., on the order of 150 μA/bit) required by the programming mechanism are commonly generated by on-chip charge pump circuits. Typical programming cycle times are relatively long (e.g., on the order of microseconds), and include not only the duration of the programming pulse but also significant rise and fall times for the high voltage levels. These long programming times are in sharp contrast with the relatively fast read access cycle times (e.g., below 100 nsec), and as such various memory management techniques are used to reduce the system impact of the programming cycles.
A conventional approach to reducing the system impact of the long programming times is to program multiple bits simultaneously, in a parallel programming operation. Some conventional flash EEPROM memories are capable of programming as many as 128 bits at once. However, widening the parallelism of the programming operation beyond this practical limit is not believed to be practical, because of the size of the driver transistors required to drive the large programming currents, as well as the required size for the charge pumps and other support circuitry.
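The current demand that constrains parallel programming can be estimated with simple arithmetic. The sketch below uses the programming current given above (on the order of 150 μA per bit) and assumes, as a worst case, that every bit in the parallel group is being programmed. The function name and the widths other than 128 are illustrative only.

```python
I_PROG_PER_BIT_UA = 150  # from the text: on the order of 150 uA per bit


def total_program_current_ma(bits_in_parallel: int,
                             per_bit_ua: float = I_PROG_PER_BIT_UA) -> float:
    """Worst-case current, in mA, if every bit in the group is programmed."""
    return bits_in_parallel * per_bit_ua / 1000.0


for width in (1, 32, 128, 256):  # 128 is the figure cited in the text
    print(width, "bits:", total_program_current_ma(width), "mA")
```

At 128 bits this estimate comes to 19.2 mA, all of which must be supplied at the elevated programming voltage by the charge pumps and bit line drivers.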
Another conventional approach to reducing the system effect of EEPROM programming is known in the art as “EEPROM emulation mode”. In this programming mode, the EEPROM array is paired with a static random access memory (SRAM) array. Upon power-up or on demand, the previous contents of an EEPROM block are written into the SRAM array, and the processor or other memory host modifies those contents by writing to the SRAM array (rather than directly programming cells in the EEPROM array). As a background operation or upon power-down of the integrated circuit, the EEPROM block is flash-erased, and is then programmed with the now-modified contents contained within the SRAM array. This programming operation can be performed in a “column-fast” manner, to reduce the programming overhead time, as will now be described relative to FIG. 1c in combination with FIG. 1b. 
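The data flow of EEPROM emulation mode can be sketched as follows. This is a behavioral illustration only, not a description of any particular device: the class name EmulatedEeprom, the 8-bit word width, and the convention that an erased word reads as all ones are assumptions made for the example.

```python
ERASED_WORD = 0xFF  # assumption: 8-bit words, erased cells read as all 1s


class EmulatedEeprom:
    """Behavioral sketch: an EEPROM block paired with an SRAM working copy."""

    def __init__(self, size: int):
        self.flash = [ERASED_WORD] * size  # non-volatile EEPROM block
        self.sram = [ERASED_WORD] * size   # volatile SRAM array
        self.program_ops = 0               # programming operations issued

    def load(self) -> None:
        """On power-up or on demand: copy the EEPROM block into SRAM."""
        self.sram = list(self.flash)

    def write(self, addr: int, value: int) -> None:
        """Host writes modify SRAM only; no programming pulse is issued."""
        self.sram[addr] = value & 0xFF

    def flush(self) -> None:
        """In the background or on power-down: flash-erase, then program."""
        self.flash = [ERASED_WORD] * len(self.flash)  # block erase
        for addr, value in enumerate(self.sram):
            if value != ERASED_WORD:
                self.flash[addr] &= value  # programming only moves bits 1 -> 0
                self.program_ops += 1


mem = EmulatedEeprom(4)
mem.load()
mem.write(1, 0x3C)
mem.write(1, 0xA5)  # a second host write costs no extra programming
mem.flush()
print(hex(mem.flash[1]), mem.program_ops)  # prints "0xa5 1"
```

In this sketch, repeated host writes to the same location result in a single programming operation at flush time, which is the source of the reduced system impact.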
In the example shown in FIG. 1c, row address n is applied to row address decode circuitry associated with word line drivers 6. This row address n indicates the word line to be activated during the programming operation, and thus the row in which memory cells 2 are to be programmed in this operation. During this row address time, the input data (lines DATA IN) are in a “don't care” state, word line drivers 6 are driving all word lines low (shown on line VWL in FIG. 1c), and bit line drivers 4 are driving all bit lines low (shown on line VBL in FIG. 1c). These polarities correspond to the floating-gate transistors of memory cells 2 being n-channel devices. Upon the decoding and latching of row address n, one of word line drivers 6 drives the selected word line to a high voltage VHV, which as mentioned above can be on the order of 9.2 volts. This high word line voltage VWL is applied to the control gates of all memory cells 2 controlled by the selected word line, and is capacitively coupled to the corresponding floating gates, so as to attract electrons from the channel region of any of those memory cells 2 that are biased to conduct drain-to-source current. As shown in FIG. 1c, because of the significant load presented by the selected word line and the memory cell gates connected to the word line (which, as mentioned above, can number into the thousands), the rise time of voltage VWL from a low level to the desired high voltage VHV is significant, as shown by time tVWLS in FIG. 1c.
Column address m is presented at a point in time following the presentation and latching of row address n, as is the input data value corresponding to the desired state to be programmed into the addressed memory cell 2 (i.e., corresponding to row n, column m). After allowing for a rise time of the voltage VWL, specified as time tVWLS in FIG. 1c, and if the input data value indicates that the addressed memory cell 2 is to be programmed, one of bit line drivers 4 drives the selected bit line corresponding to column address m with a high voltage level VPPFL, for example on the order of 4.2 volts, with sufficient current drive (e.g., 150 μA) that a sufficient number of electrons tunnel into and are trapped on the floating gate of the addressed memory cell 2. This programming pulse on the selected bit line is shown in FIG. 1c as bit line voltage VBL. A minimum programming pulse width tPRG is typically specified for the duration of this high voltage VPPFL on the selected bit line. As shown in FIG. 1c, the rise time of bit line voltage VBL is generally much shorter than that of word line voltage VWL, because a diffused bit line presents a much smaller load than the long word line and its floating gate electrodes. After at least the programming pulse time tPRG elapses, the appropriate bit line driver 4 deactivates the bit line voltage VBL on the selected bit line.
In this conventional column-fast approach in which the same row address is retained for multiple programming operations, the word line voltage VWL is deactivated between column addresses. As such, following the fall time tVWLD of the word line voltage VWL, commencing with the bit line voltage VBL reaching its inactive level, a new column address m+1 (which need not be the next sequential column in the array) can then be presented and decoded, along with the input data value corresponding to that address. The same word line corresponding to row address n is then driven to its high voltage VHV, allowing a rise time tVWLS, following which the bit line corresponding to the new column address m+1 is then driven to its high level VPPFL, if the input data indicates that the cell is to be programmed.
The column-fast programming approach for EEPROM emulation mode as shown in FIG. 1c is known to greatly improve the efficiency of programming flash memory relative to a “random access” mode, in which both the row and column address must be presented with each programming pulse. The programming “overhead” time is reduced by this column-fast mode to on the order of 8 μsec per address location, as opposed to 20 μsec per address location in the random access mode. However, as evident from FIG. 1c, each additional programming pulse in a row requires a minimum of the word line rise time tVWLS, the programming pulse tPRG, and the word line fall time tVWLD. As such, the overall programming process in conventional EEPROMs is limited by the sum of these specified time periods, and by the limited number of bits that can be programmed in parallel.
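The timing relationships just described can be put into a short calculation. The sketch below takes the per-location overhead figures from the text (on the order of 8 μsec and 20 μsec); the count of 1024 address locations is a hypothetical chosen only to make the comparison concrete, and the function names are illustrative. No values are assumed for tVWLS, tPRG, or tVWLD, which are left as parameters.

```python
COLUMN_FAST_US = 8     # from the text: overhead per location, column-fast mode
RANDOM_ACCESS_US = 20  # from the text: overhead per location, random access


def min_time_per_pulse_us(t_vwls: float, t_prg: float, t_vwld: float) -> float:
    """Lower bound per programming pulse: word line rise + pulse + fall."""
    return t_vwls + t_prg + t_vwld


def overhead_us(locations: int, per_location_us: float) -> float:
    """Total programming overhead for a number of address locations."""
    return locations * per_location_us


n = 1024  # hypothetical number of address locations
print("column-fast  :", overhead_us(n, COLUMN_FAST_US), "usec")
print("random access:", overhead_us(n, RANDOM_ACCESS_US), "usec")
```

For 1024 locations these figures give 8192 μsec against 20480 μsec. The column-fast total still grows linearly with the number of pulses, because each pulse pays the word line rise and fall times again.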
By way of further background, access modes such as “page mode” and “extended data out (EDO)” in dynamic random access memories (DRAM) are known in the art. In these DRAM access modes, a row in the memory array is selected and remains active while multiple column addresses are decoded in sequence, to access multiple cells within the same addressed row within the single row address cycle.