Nowadays, most Flash memories use either Channel Hot Electron Injection (CHEI) at the drain side of the memory cell or Fowler-Nordheim Tunneling (FNT) for programming. The CHEI mechanism provides a relatively high programming speed (~10 μs) at the expense of a high power consumption (~1 mA/bit), which limits the number of cells that can be programmed simultaneously (so-called page-mode programming) to a maximum of 8 bytes (Y. Miyawaki et al., IEEE J. Solid-State Circuits, vol. 27, p. 583, 1992). Furthermore, to allow further scaling of the transistor dimensions towards 0.35 μm and below, supply voltage scaling from 5V towards 3.3V and below also becomes mandatory. This supply voltage scaling is known to degrade the CHEI efficiency, and hence the corresponding programming speed, considerably, because the high power needed to trigger CHEI cannot easily be supplied on-chip from a high voltage generator or charge pumping circuit.
Fowler-Nordheim tunneling, on the other hand, provides slower programming (~100 μs) at a low power consumption, which allows larger pages (~4 kbit) and therefore reduces the effective programming time to 1 μs/byte (T. Tanaka et al., IEEE J. Solid-State Circuits, vol. 29, p. 1366, 1994). Further improvement is, however, limited by tunnel-oxide scaling limits and by the very high voltages (~18V) needed on chip for FNT, both of which compromise device reliability and process scalability.
The recent success of Source Side Injection (SSI) as a viable alternative to FNT and CHEI for Flash programming is mainly due to its unique combination of moderate-to-low power consumption with very high programming speed at moderate voltages. A typical example of a device relying on SSI for programming is the Applicants' High Injection Metal-Oxide-Semiconductor (HIMOS) memory cell (J. Van Houdt et al., 11th IEEE Nonvolatile Semiconductor Memory Workshop, Feb. 1991; J. Van Houdt et al., IEEE Trans. Electron Devices, vol. ED-40, p. 2255, 1993). As also described in U.S. Pat. Nos. 5,583,810 and 5,583,811, both of which are incorporated by reference, a speed-optimized implementation of the HIMOS cell in a 0.7 μm CMOS technology exhibits a programming time of 400 nanoseconds while consuming only a moderate current (~35 μA/cell) from a 5V supply. This result is obtained when biasing the device at the maximum gate current, i.e. at a control-gate (CG) voltage ($V_{cg}$) of 1.5V. The corresponding cell area is on the order of 15 μm² for a 0.7-μm embedded Flash memory technology when implemented in a contactless virtual ground array as described in U.S. patent application Ser. No. 08/694,812 filed on Aug. 9, 1996, which is hereby incorporated by reference. In terms of the feature size F (i.e. the smallest dimension on chip for a given technology), this corresponds to ~30F² for a 0.7-μm technology.
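The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the energy estimate is not stated in the text. It assumes that the quoted cell current flows from the supply at a constant level for the whole programming pulse, so it should be read as a rough upper bound.

```python
# Figures quoted in the text for the speed-optimized HIMOS cell.
CELL_AREA_UM2 = 15.0       # cell area, in square micrometres
FEATURE_SIZE_UM = 0.7      # minimum feature size F, in micrometres
PROGRAM_TIME_S = 400e-9    # programming time, in seconds
CELL_CURRENT_A = 35e-6     # supply current per cell, in amperes
SUPPLY_V = 5.0             # supply voltage, in volts


def area_in_f2(area_um2, feature_um):
    """Express a cell area as a multiple of the squared feature size F."""
    return area_um2 / feature_um ** 2


def programming_energy_j(current_a, supply_v, time_s):
    """Rough energy drawn from the supply per programmed cell.

    Assumption (not from the text): the current is constant during the pulse.
    """
    return current_a * supply_v * time_s


if __name__ == "__main__":
    f2 = area_in_f2(CELL_AREA_UM2, FEATURE_SIZE_UM)
    energy = programming_energy_j(CELL_CURRENT_A, SUPPLY_V, PROGRAM_TIME_S)
    print(f"cell area: {f2:.1f} F^2")
    print(f"energy per programmed cell: {energy * 1e12:.0f} pJ")
```

The area ratio comes out slightly above 30, consistent with the ~30F² quoted for the 0.7-μm technology.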
However, due to the growing demand for higher densities, also in embedded memory applications such as smart-cards and embedded microcontrollers, a continuous increase in array density and scaling of the supply voltage become mandatory. This evolution calls for more aggressive cell area scaling and for low-voltage and low-power operation. In U.S. patent application Ser. No. 08/694,812 filed on Aug. 9, 1996, a novel programming scheme is described which reduces the power consumption during the write operation considerably. Also, the write voltages used are expected to scale with the supply voltage $V_{supply}$, since the SSI mechanism only requires the floating gate channel to stay in the linear regime for fast programming (see e.g. J. Van Houdt et al., IEEE Trans. Electron Devices, vol. ED-40, p. 2255, 1993). Therefore, the necessary Program-Gate voltage $V_{pg}$ for fast programming is given by:

$$V_{pg} \approx \frac{V_{supply} + V_{th}}{p} \qquad (1)$$
where $V_{th}$ is the intrinsic threshold voltage of the floating gate transistor (~0.5V) and p is the coupling ratio from Program Gate to Floating Gate (typically ~50%). According to Eq. (1), $V_{pg}$ is thus expected to scale twice as fast as the supply voltage to first order. It can be concluded that the high programming voltage scales very well with the supply voltage, which offers enough latitude for the high voltage circuitry to follow the minimum design rule.
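As a worked illustration of Eq. (1), the sketch below evaluates the Program-Gate voltage for the two supply voltages mentioned in this section, using the typical values quoted above (a threshold voltage of 0.5V and a coupling ratio of 50%). The function name is hypothetical, and the result is only the first-order estimate that Eq. (1) gives.

```python
def program_gate_voltage(v_supply, v_th=0.5, p=0.5):
    """First-order Program-Gate voltage from Eq. (1): (V_supply + V_th) / p.

    v_supply: supply voltage in volts
    v_th:     intrinsic threshold voltage of the floating gate transistor
    p:        coupling ratio from Program Gate to Floating Gate (0 < p <= 1)
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("coupling ratio p must lie in (0, 1]")
    return (v_supply + v_th) / p


if __name__ == "__main__":
    for v_supply in (5.0, 3.3):
        v_pg = program_gate_voltage(v_supply)
        print(f"V_supply = {v_supply:.1f} V -> V_pg ~ {v_pg:.1f} V")
```

With these values, lowering the supply from 5V to 3.3V (a 1.7V step) lowers the estimated Program-Gate voltage from about 11V to about 7.6V (a 3.4V step), which is the factor 1/p = 2 referred to in the text.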
However, another problem in further Flash memory scaling is related to the erase operation. In most cases, Fowler-Nordheim tunneling through a triangular barrier is used as the erase mechanism, and this requires a high oxide field (~10 MV/cm). To establish this field, the tunnel oxide has to be scaled down further if the erase voltage is to decrease with the supply voltage (as required by normal CMOS scaling rules, cf. the programming operation in the previous paragraph). Otherwise, it becomes impossible to generate and switch these high voltages on-chip. However, the tunnel oxide cannot be scaled below ~6 nm because of retention limits and, even more importantly, Stress-Induced Leakage Current (SILC). The latter mechanism reduces the disturb margins of the memory cells after write/erase cycling. In U.S. patent application Ser. No. 08/694,812 filed on Aug. 9, 1996, a novel erase scheme has been presented that allows the negative erase voltage to be reduced from -12V to well below -8V, down to -7V, for the same erase speed by exploiting the triple gate structure of the HIMOS cell. Although this novelty offers significant advantages (a 5V reduction in the voltage to be switched on-chip is considerable), it may, in the long run, lead to two problems:
(1) Due to SILC and retention limits, the tunnel oxide may need to stay thicker than the gate oxide under the control gate at a certain point along the scaling curve. Although the HIMOS cell is less sensitive to such a situation (the read-out current is dominated by the control-gate channel), it can give rise to scaling problems due to the decreasing control over the floating gate transistor. Ideally, the oxide under the floating gate should at least follow the scaling of the gate oxide under the control gate to keep sufficient symmetry in the cell concept. Moreover, this is also beneficial for the endurance of the device: the thinner the oxide, the more cycles can be applied (apart from the SILC issue, as discussed subsequently).
(2) The larger problem, however, is that the built-in select device has to endure the still relatively high negative voltage. In a 0.7-μm technology, this corresponds to -7V across a 15-17 nm oxide, which is still tolerable (the associated stress is only present for a limited time equal to the erase time multiplied by the number of cycles, i.e. typically 1,000 seconds). In a 0.35-μm technology, the gate oxide is only 7 nm thick and the negative voltage should go down to approximately -4V. At the same time, however, the bitline voltage has decreased from 5V to 3.3V due to supply voltage scaling, which means that the tunnel oxide cannot be scaled sufficiently to compensate for this reduction in erase voltages without entering the SILC regime. Increasing the bitline voltage internally on-chip is a possibility, but it requires charge pumps in the column decoder (with the associated design complexity) and it compromises the scalability of the cell's channel length.
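To put the voltages in problem (2) in perspective, the sketch below converts them into first-order oxide fields. It is a simplification and not part of the referenced schemes: it assumes that the full negative voltage drops across the gate oxide of the select device, ignoring flatband voltage, depletion, and any bias on the other terminals. The function name is hypothetical.

```python
def oxide_field_mv_per_cm(voltage_v, thickness_nm):
    """First-order oxide field |V| / t_ox, returned in MV/cm.

    Assumption: the whole voltage drops across the oxide.
    1 nm = 1e-7 cm, and 1 MV/cm = 1e6 V/cm.
    """
    if thickness_nm <= 0:
        raise ValueError("oxide thickness must be positive")
    field_v_per_cm = abs(voltage_v) / (thickness_nm * 1e-7)
    return field_v_per_cm / 1e6


if __name__ == "__main__":
    cases = [
        ("0.7 um, 15 nm oxide, -7 V", -7.0, 15.0),
        ("0.7 um, 17 nm oxide, -7 V", -7.0, 17.0),
        ("0.35 um, 7 nm oxide, -4 V", -4.0, 7.0),
    ]
    for label, voltage, thickness in cases:
        field = oxide_field_mv_per_cm(voltage, thickness)
        print(f"{label}: {field:.2f} MV/cm")
```

Under this assumption the field in the select-device oxide is roughly 4 to 5 MV/cm in the 0.7-μm case and roughly 6 MV/cm in the 0.35-μm case, i.e. of the same order, which illustrates why the negative voltage has to come down as the gate oxide is thinned.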
There is therefore clearly a need for a new erase scheme that makes it possible to scale down the erase voltages without running into the SILC problem. Several solutions to this problem are given in this specification.
Other references to SSI devices that are relevant with respect to the present invention are listed below:
(1) U.S. Pat. No. 5,280,446, issued Jan. 18, 1994, to Y. Y. Ma et al.
(2) U.S. Pat. No. 5,029,130, issued Jul. 2, 1991, to B. Yeh
(3) "An 18 Mb Serial Flash EEPROM for Solid-State Disk Applications", by D. J. Lee et al., paper presented at the 1994 Symposium on VLSI Circuits, tech. digest p. 59
(4) "A 5 Volt high density poly-poly erase Flash EPROM cell", by R. Kazerounian, paper presented at the 1988 International Electron Devices Meeting, tech. digest p. 436.
In contrast with the invention described below, these references all suffer from a higher processing complexity and/or the need for higher erase voltages.
Ma et al. (1) use a triple poly cell in which the first and second poly layers are etched in a stacked way. It is well known to anyone skilled in the art that such a processing scheme introduces considerable complexity, which makes it impossible to use in, among others, an embedded memory application. Moreover, the erase voltage used is still -12V, provided that the bitline is biased at 5V. In future generations, aggressive tunnel oxide scaling will be required to avoid an increase in this negative voltage.
Yeh (2) shows a split gate cell with a very complicated interpoly formation scheme which, again, makes this concept unsuited for embedded memory. The erase voltage used is still 15V, although special processing features have been introduced specifically to enhance the interpoly conduction for efficient erasure.
The papers by Lee et al. (3) and by Kazerounian (4) give fewer details on processing issues, but it is clear from the disclosures that the erase voltages are on the order of 20V in order to tunnel through a polyoxide.