This invention is in the field of semiconductor memory. Embodiments of this invention are more specifically directed to the sensing of stored data states in an electrically erasable read-only memory of the flash type.
Semiconductor or solid-state memory is now commonplace in many electronic systems, ranging from large-scale computers to portable electronic devices and systems. Various types of semiconductor memory are available in the marketplace, each with its own benefits rendering it useful in particular applications. For example, dynamic random access memory (DRAM) provides high capacity data storage at a low cost per bit, with each memory location individually addressable, but DRAM contents require periodic refresh and are volatile upon power down. Static RAM (SRAM) is also randomly addressable and volatile on power-down, but provides high-speed data access at a cost of reduced density relative to DRAM. Mask-programmable read-only memory (ROM) provides dense non-volatile data storage, but cannot be altered.
In recent years, non-volatile read/write solid-state memory devices have become popular, particularly in portable electronic devices and systems in which they can replace magnetic disk drive storage. A common technology for realizing non-volatile solid-state memory devices is referred to as electrically erasable programmable read-only memory (EEPROM), and utilizes floating-gate transistors to store the data state. According to this technology, a memory cell transistor is programmed (i.e., written) by biasing it so that hot carrier injection causes electrons to become trapped on an electrically isolated transistor gate element. These trapped electrons on the floating gate raise the apparent threshold voltage of the memory cell transistor (for n-channel devices), as compared with its threshold voltage with no electrons trapped on the floating gate. Typically, an erased cell (data state “1”) conducts current when the transistor is biased to a read state, while a programmed cell (data state “0”) does not conduct current at that read state because of the electrons trapped on the floating gate element. The stored state can be read by sensing the presence or absence of source-drain conduction under bias, and erased by biasing the transistor so that the floating-gate electrons tunnel to the source or drain. Some EEPROM memory devices are of the “flash” type, in that a large number (a “block”) of memory cells can be simultaneously erased in a single operation.
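By way of illustration only, the program, read, and block-erase behavior described above can be sketched as a simple behavioral model. The threshold voltages, the class name FloatingGateCell, and the helper flash_erase are hypothetical and chosen only for this sketch; the only voltage taken from this description is the 4.5 volt word line example given below for a read.

```python
# Behavioral sketch of a floating-gate cell; not a device-level model.
V_READ = 4.5          # read bias on the control gate (example value from the text)
VT_ERASED = 2.0       # ASSUMED threshold with no electrons on the floating gate
VT_PROGRAMMED = 6.0   # ASSUMED raised threshold with trapped electrons


class FloatingGateCell:
    """Hypothetical model: one n-channel floating-gate memory cell."""

    def __init__(self):
        self.programmed = False  # a fresh cell is treated as erased

    def program(self):
        # Trapped electrons raise the apparent threshold voltage.
        self.programmed = True

    def erase(self):
        # Electrons tunnel off the floating gate; threshold returns to low.
        self.programmed = False

    def threshold(self):
        return VT_PROGRAMMED if self.programmed else VT_ERASED

    def conducts(self, v_gate=V_READ):
        # The cell conducts only if the gate bias exceeds its threshold.
        return v_gate > self.threshold()

    def read(self):
        # Erased cell conducts -> data state 1; programmed cell -> data state 0.
        return 1 if self.conducts() else 0


def flash_erase(block):
    """Erase every cell of a block in a single operation."""
    for cell in block:
        cell.erase()
```

In this sketch, programming one cell of a four-cell block changes only that cell's read value to 0, and a call to flash_erase returns every cell in the block to data state 1.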
In any type of solid-state semiconductor memory, the sensing of data stored in a selected memory cell is a critical operation. Accurate sensing of the stored memory cell state must be maintained over varying voltage and temperature conditions, variations in manufacturing parameters, and in the presence of system noise. As a result, precision sense circuitry plays a role in determining the memory density in bits per unit “chip” area (and thus in the cost-per-bit of manufacturing the memory), because the noise margin of the sense circuitry in large part determines the minimum memory cell size. As such, many types and arrangements of sense circuitry have been developed for and implemented in solid-state memory over the years.
One particularly useful design for sense circuitry in flash memory is the balanced sense amplifier circuit described in U.S. Pat. Nos. 5,528,543 and 5,773,997, both commonly assigned with this application and both incorporated herein by this reference. According to this approach, the sense amplifier circuitry includes a differential amplifier that compares a level at one input that is defined by the state of the selected memory cell with a reference level established by a reference circuitry present at its other input. In the architecture described in these U.S. Pat. Nos. 5,528,543 and 5,773,997, sense circuitry is positioned between upper and lower blocks of a memory array, so that the sense amplifier input can be coupled to a selected memory cell (and the other input coupled to reference circuitry) in either one of the upper and lower array blocks.
FIGS. 1a and 1b illustrate a conventional flash memory sense architecture, for example utilizing the balanced sense amplifier approach described in U.S. Pat. Nos. 5,528,543 and 5,773,997. In this example, floating-gate memory cells 5 are arranged into upper and lower sectors 2U, 2L. While only one row of memory cells 5 is shown in FIG. 1a as within each of sectors 2U, 2L, it will of course be understood that each of these sectors 2U, 2L will include many such rows. Each memory cell 5 in the illustrated row of upper sector 2U has a control gate driven by a word line WLUj, and each memory cell 5 in the row of lower sector 2L has its control gate driven by word line WLLj. The drain of the floating-gate transistor of each memory cell 5 in the row is connected to a separate bit line BL from the other memory cells 5 in that row; each bit line is also connected to those memory cells 5 in the same column but in other rows in the same sector (not shown in FIG. 1a). The source nodes of the floating-gate transistors in memory cells 5 are biased to ground, as shown in FIG. 1a. 
The arrangement of FIG. 1a illustrates the sensing of a single selected data bit in this conventional memory. In this example, upper 8:1 column multiplexer 4U receives each of the eight bit lines BL coupled to the drains of memory cells 5 in the rows of upper sector 2U; similarly, lower 8:1 column multiplexer 4L receives the eight bit lines BL for memory cells 5 in lower sector 2L. Each of column multiplexers 4U, 4L receives three bits of a column address COL[2:0], the value of which selects one of the eight bit lines BL (i.e., one column) for communication to sense amplifier 8; because only one row in one of upper and lower sectors 2U, 2L is selected by a corresponding row address value, the state of a single selected memory cell 5 will be sensed by sense amplifier 8 in this example.
According to the balanced sense amplifier approach described in U.S. Pat. Nos. 5,528,543 and 5,773,997, sense amplifier 8 is a differential amplifier with a positive input receiving a level corresponding to current conducted by the selected memory cell 5, and a negative input receiving a reference level. In this example, switching network 9 connects the bit line selected by one of multiplexers 4U, 4L and a reference line to the appropriate inputs of amplifier 8. In the example of FIG. 1a, switching network 9 includes switches 3U, 3L that connect one of corresponding sense output lines SENS_U, SENS_L from multiplexers 4U, 4L respectively, to the positive input of amplifier 8, and switches 7U, 7L that connect one of corresponding reference current lines BAL_U, BAL_L, respectively, to the negative input of amplifier 8. Switches 3, 7 are controlled by the state of the row address bit that indicates whether a row in upper sector 2U or in lower sector 2L is selected (e.g., address bit ROW[m] in this example).
Other sense arrangements can also be used in this architecture. For example, the bit line BL from an unselected sector 2U, 2L may itself present a load to amplifier 8 that serves as the sensing reference, perhaps with a “dummy” cell coupled to that bit line BL (e.g., conducting half of the full-state current of memory cell 5), to establish the sensing reference. These and other reference or dummy arrangements are well-known in the art. In any case, switches 7U, 7L serve to connect the negative input of amplifier 8 to a sensing reference from the corresponding sector 2U, 2L not containing the selected row.
In the example of the read operation shown in FIG. 1a, the selected row resides in upper sector 2U. Word line WLUj for this selected row is driven by a row decoder (not shown) to a high voltage (e.g., 4.5 volts) and this high voltage is applied to the control gates of all memory cells 5 in that row. The word lines for unselected rows, including the corresponding word line WLLj in lower sector 2L that has the same row address as selected word line WLUj except for row address bit ROW[m], are driven with a low voltage (e.g., ground) by the row decoder. This low word line voltage ensures that memory cells 5 in unselected rows do not conduct any current relative to their bit lines BL. A bias voltage is applied to all bit lines BL (by circuitry not shown in FIG. 1a), for example as described in U.S. Pat. Nos. 5,528,543 and 5,773,997. The current drawn by the memory cell 5 in the selected column is reflected in the current forwarded to the positive input of sense amplifier 8 via sense output line SENS_U and switch 3U. Meanwhile, because of the state of row address bit ROW[m], switches 3L and 7U are open and switch 7L is closed, coupling the reference current established by circuitry within column multiplexer 4L (see U.S. Pat. Nos. 5,528,543 and 5,773,997) to the negative input of sense amplifier 8. Sense amplifier 8 thus issues a logic level on line DATA corresponding to the state of the selected memory cell 5 in row j and column k of upper sector 2U.
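For purposes of illustration, the read operation of FIG. 1a can be approximated by the following behavioral sketch. It is not the circuit of the referenced patents. The current values, the function names, and the convention that ROW[m] = 0 selects upper sector 2U are all assumptions made for this sketch; the reference is taken as half of the full-state cell current, which is one of the reference options noted above.

```python
# Behavioral sketch of balanced sensing between an upper and a lower sector.
I_CELL_ERASED = 20e-6              # ASSUMED current of a conducting cell, in amps
I_REFERENCE = I_CELL_ERASED / 2    # ASSUMED reference: half the full-state current


def switch_states(row_m):
    """Closed/open state of switches 3U, 3L, 7U, 7L.

    ASSUMPTION: ROW[m] == 0 means the selected row is in upper sector 2U.
    """
    upper = (row_m == 0)
    return {"3U": upper, "3L": not upper, "7U": not upper, "7L": upper}


def cell_current(programmed, word_line_high):
    """Current drawn by one cell: only an erased cell on a high word line conducts."""
    if word_line_high and not programmed:
        return I_CELL_ERASED
    return 0.0


def read_bit(upper_row, lower_row, row_m, col):
    """Sense one bit. Each row is a list of 8 booleans, True = programmed."""
    if not 0 <= col <= 7:
        raise ValueError("COL[2:0] must select one of eight bit lines")
    sw = switch_states(row_m)
    # Only the word line of the selected sector is driven high.
    sens_u = cell_current(upper_row[col], word_line_high=sw["3U"])
    sens_l = cell_current(lower_row[col], word_line_high=sw["3L"])
    bal_u = bal_l = I_REFERENCE
    # Switches 3U/3L feed the positive input; 7U/7L feed the negative input.
    positive = sens_u if sw["3U"] else sens_l
    negative = bal_l if sw["7L"] else bal_u
    return 1 if positive > negative else 0
```

With ROW[m] = 0 in this sketch, switches 3U and 7L are closed and switches 3L and 7U are open, matching the read example above; the selected upper-sector cell is compared against the reference taken from the lower side.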
FIG. 1b illustrates the construction of flash memory 10 using the sense architecture described above relative to FIG. 1a. In this example, four upper sectors 20U through 23U and four lower sectors 20L through 23L are provided, separated by column decoders 4U, 4L, and balanced sense circuitry 8 in the manner shown in FIG. 1a. Upper row decoder 12U drives word lines across upper sectors 20U through 23U, and lower row decoder 12L drives word lines across lower sectors 20L through 23L. Balanced sense circuitry 8 (specifically switching circuitry 9 therein) receives row address bit ROW[m] so that it connects the selected column bit lines BL to the corresponding differential amplifiers, as described above. The number of data bits output (in a read cycle) on output data lines DATA depends on the number of columns selected by a single value of the column address.
The arrangement of FIG. 1b is somewhat constraining, however. First of all, an even number of sectors 2 is required, considering that each upper sector 2U must have a counterpart lower sector 2L, and vice versa, for balanced sensing. In addition, each sector 2 must have the same number of rows as any other sector. Typically, the number of columns in each sector 2 is also uniform across memory 10, to facilitate column decoding. Indeed, this uniformity in sector size is typically also reflected in the architecture of the flash erase circuitry (not shown), in that each sector 2 corresponds to the smallest unit of erase in flash memory 10 (i.e., all memory cells 5 in a given sector are erased in a single erase operation). In some applications, such as flash memory devices utilized for mass storage (e.g., flash memory cards for cameras, music players, mobile telephones, or disk drive replacement, and the like), these limitations are acceptable.
However, in many applications, it has been discovered that the limitations presented by the architecture of flash memory 10 of FIGS. 1a and 1b can be too constraining for efficient flash memory implementation and utilization. Those limitations are especially constraining in embedded flash memory within a larger-scale integrated circuit, such as a microprocessor, digital signal processor, or other large scale logic device. In such embedded applications, it is often useful to have sectors (i.e., smallest erase blocks) of varying sizes, so that individual sectors can be optimized for a particular function that is called upon by the logic circuitry in the device within which the flash memory is embedded. It is not cost efficient to dedicate an overly-large flash memory block to a particular logic circuit function, because of the chip area unnecessarily consumed by the flash memory cells that will seldom if ever be written. In addition, the requirement of an even number of sectors may waste the chip area of an additional sector that will be under-utilized in practice.
FIGS. 2a and 2b illustrate flash memory 20 according to another conventional architecture. In this approach, as shown generally in FIG. 2a, three sectors 220 through 222 are realized, all served by row decoder 23. Sectors 220 through 222 have different numbers of rows, as evident from their different sizes as shown in FIG. 2a, and each is split into left and right sector halves (e.g., sector 220 is realized as left sector half 220L and right sector half 220R), with row decoder 23 disposed between those halves. Each sector 22j, including each of its halves 22jL, 22jR, has its bit lines forwarded to final stage column decodes 24, which are themselves split into left and right halves 24L, 24R. Sense circuitry portions 180, 181 have inputs receiving output lines from final stage column decodes 24, and provide an output word of multiple bit width (e.g., ranging to as many as 64 or 128 bits wide or wider, depending on the organization) in response to each address.
In operation, each row corresponding to a particular row address value extends across both halves 22jL, 22jR of the sector 22j; in this arrangement, only one row in all the sectors is selected. While the least significant bits of the applied column address (in this case, the two least significant bits CA[1:0]) are applied to final stage column decoders 24L, 24R to define the final selection of bit lines to be forwarded to sense circuitry 18, another column address bit (in this example, CA[2]) indicates whether the selected bit lines reside in the left sector half 220L, 221L, 222L containing the addressed row, or in the right sector half 220R, 221R, 222R. That column address bit is communicated both to row decoder 23, and also to column decodes 24L, 24R. In this arrangement, this column address bit controls switching circuitry (e.g., contained within final stage column decodes 24L, 24R) so that the selected bit lines are forwarded from the selected sector half to one input of each of the differential amplifiers in sense circuitry 180, 181, and so that bit lines of unselected sector halves (constituting “dummy” bit lines serving as a capacitive load for establishing a reference level) are forwarded to the other inputs of those differential amplifiers. Each column decoder 24L, 24R presents an output line to an input of its differential amplifier 18.
FIG. 2b illustrates the connection and operation of sense circuitry relative to a selected row in the arrangement of FIG. 2a, in further detail. In the example shown in FIG. 2a, the selected columns reside in left sector half 22jL and not in right sector half 22jR. As such, left final stage decoder 24L will couple selected bit lines to sense circuitry 180, 181, and right final stage decoder 24R will couple dummy bit lines to sense circuitry 180, 181. Left final stage decoder 24L is constructed as a bank of 4:1 multiplexers 250L through 25127L in this example, each receiving four bit lines from associated columns in sector 22jL; similarly, right final stage decoder 24R is constructed from 4:1 multiplexers 250R through 25127R, each receiving four bit lines from associated columns in sector 22jR. For example, 4:1 multiplexer 250L receives bit lines from columns c0 through c3 in sector 22j, multiplexer 251L receives bit lines from columns c4 through c7, multiplexer 250R receives bit lines from columns c512 through c515, and so on until multiplexer 25127R receives bit lines from columns c1020 through c1023 (there being 1024 columns in each sector of this example). Each of these multiplexers 25 selects one of its four bit lines in response to decoded signals based on the least significant bits CA[1:0] of the column address.
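The column-to-multiplexer assignment in this example can be expressed arithmetically, as in the following sketch. The function names are hypothetical. Two conventions are assumed rather than taken from the description: that CA[2] = 0 designates the left sector half, and that the value of CA[1:0] directly equals the index of the multiplexer input that is selected.

```python
# Sketch of the final stage column decode for a 1024-column sector.
COLUMNS_PER_SECTOR = 1024
COLUMNS_PER_HALF = 512
MUX_WIDTH = 4          # each final stage multiplexer is 4:1
MUXES_PER_HALF = 128   # 512 columns / 4 bit lines per multiplexer


def locate_column(c):
    """Map physical column index c to (half, multiplexer index k, mux input)."""
    if not 0 <= c < COLUMNS_PER_SECTOR:
        raise ValueError("column index out of range")
    half = "L" if c < COLUMNS_PER_HALF else "R"
    offset = c % COLUMNS_PER_HALF
    return half, offset // MUX_WIDTH, offset % MUX_WIDTH


def selected_columns(ca):
    """Columns selected by the 3-bit value CA[2:0], one per multiplexer.

    ASSUMPTIONS: CA[2] == 0 selects the left half, and CA[1:0] is used
    directly as the multiplexer input index.
    """
    if not 0 <= ca <= 7:
        raise ValueError("CA[2:0] is a 3-bit value")
    half_base = ((ca >> 2) & 1) * COLUMNS_PER_HALF
    mux_input = ca & 0b11
    return [half_base + k * MUX_WIDTH + mux_input for k in range(MUXES_PER_HALF)]
```

Under these assumptions, locate_column reproduces the assignments stated above: column c0 falls on input 0 of left multiplexer 0, column c512 on input 0 of right multiplexer 0, and column c1023 on input 3 of right multiplexer 127. Each value of CA[2:0] selects 128 columns, one per multiplexer in the selected half.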
Each multiplexer 25k presents an output line that is coupled to an input of a corresponding differential amplifier 28k. This output line corresponds to either a selected bit line, or to a dummy (or reference) output line, depending on the state of column address bit CA[2] in this example; logic (not shown) is provided within sense circuitry 180, 181 to resolve the proper logic output level, also depending on the state of column address bit CA[2]. Flash memory 20 presents a 128-bit output word in each read operation, and as such sense circuitry 18 includes 128 differential amplifiers 280 through 28127. In this example, in which the selected columns reside in left sector half 22jL, the outputs of multiplexers 250L through 25127L present levels corresponding to the contents of selected memory cells to one input of respective differential amplifiers 280 through 28127; conversely, the outputs of multiplexers 250R through 25127R present a reference level to the other input of each of differential amplifiers 280 through 28127, respectively. Differential amplifiers 280 through 28127 present output levels corresponding to the sensed contents of the selected memory cells, with the correct polarity of those contents resolved by logic within sense circuitry 180, 181 depending on column address bit CA[2], in this example.
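One way to picture the polarity resolution described above is the following sketch, in which each amplifier's raw comparison is conditionally inverted according to column address bit CA[2]. This is an illustrative guess at the function of the logic that is not shown, not a description of it. The sketch assumes that left multiplexer outputs drive the positive amplifier inputs, that right multiplexer outputs drive the negative inputs, that CA[2] = 0 means the selected columns are in the left half, and that levels are normalized so that a conducting cell presents 1.0, a non-conducting cell 0.0, and the reference 0.5.

```python
# Sketch: resolving output polarity when the selected bit line may arrive
# on either input of a differential amplifier.


def amplifier_raw(positive_level, negative_level):
    """Raw differential comparison: 1 if the positive input is higher."""
    return 1 if positive_level > negative_level else 0


def sense_word(left_levels, right_levels, ca2):
    """Return the sensed word, one bit per amplifier.

    ASSUMPTIONS: left outputs go to the positive inputs, right outputs to
    the negative inputs, and ca2 == 0 means the left half is selected.
    """
    if len(left_levels) != len(right_levels):
        raise ValueError("one left and one right level per amplifier")
    if ca2 not in (0, 1):
        raise ValueError("CA[2] is a single bit")
    word = []
    for left, right in zip(left_levels, right_levels):
        raw = amplifier_raw(left, right)
        # When the selected cell sits on the negative input, the raw
        # comparison is inverted, so it is flipped back here.
        word.append(raw ^ ca2)
    return word
```

In this sketch, the same stored data yields the same output word whether the selected cells are presented on the left outputs with ca2 = 0 or on the right outputs with ca2 = 1.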
The arrangement of FIG. 2b overcomes some of the limitations of the conventional architecture of FIGS. 1a and 1b, in that the number of sectors 22 is not constrained to an even number, and in that the sizes of the sectors 22 can vary from one another. However, the chip area required to route the large number of interconnects between multiplexers 25 in final stage column decodes 24L, 24R and the inputs of differential amplifiers 28 is substantial. In particular, virtually every one of these interconnects must travel in both the x and y directions, with some of these interconnects traveling horizontally (in the view of FIG. 2b) a substantial distance. This change in direction typically requires two physical levels of conductors in order for these lines to cross over one another. Indeed, this routing is even more problematic than appears from FIG. 2b, considering that a dummy bit line from each of multiplexers 25kL in left final stage column decoder 24L is also coupled to the negative inputs of amplifiers 280 through 28127, and that a sense output line from each of multiplexers 25kR in right final stage column decoder 24R is also coupled to the positive inputs of amplifiers 280 through 28127.
Not only do these interconnections occupy substantial chip area and possibly result in increased process complexity, but the electrical performance of a memory constructed in this manner is degraded as a result. The long interconnections that must be routed according to this implementation necessarily insert significant parasitic impedances (resistance, inductance, capacitance) into the critical sense path, especially as the cross-sectional area of the conductive elements is reduced as much as possible in order to save chip area. In addition, as evident from FIG. 2b, direct interconnection paths to the inputs of each differential amplifier are of substantially different length. For example, in FIG. 2b, the distance to be traveled by the sense signal from multiplexer 250L to differential amplifier 280 is relatively short, but the distance traveled by the reference signal from multiplexer 250R is much longer. In practice, to avoid sense amplifier imbalance because of these differences in interconnection path length, each of the interconnections to the differential amplifier inputs is extended across the full length of the array so that the parasitic loads are somewhat equivalent from line to line. Of course, this requires additional chip area for this extended interconnect routing.