1. Field of the Invention
The invention pertains to a memory system having an array of memory cells (e.g., a flash memory system which includes an array of flash memory cells and emulates a magnetic disk drive). More specifically, the invention pertains to a method and system for simultaneously selecting two or more blocks of cells of a memory cell array, so that data can be written to (or read from) the selected blocks simultaneously.
2. Description of Related Art
It is conventional to implement a memory system as an integrated circuit which includes an array of flash memory cells (or other non-volatile memory cells) and circuitry for independently erasing selected blocks of the cells, programming selected ones of the cells (i.e., writing data to selected ones of the cells), and reading data from selected ones of the cells. FIG. 1 is a simplified block diagram of a flash memory system (flash memory system 3) which is designed to emulate a magnetic disk drive system. Although system 3 can be implemented as a single integrated circuit, it is not necessarily implemented as a single integrated circuit, and the following description of system 3 will not assume that it is an integrated circuit.
As shown in FIG. 1, system 3 includes memory cell array 16 which comprises rows and columns of flash memory cells (each row of cells connected along a different wordline, and each column of cells connected along a different bitline), predecoding circuit or predecoder 49, row decoder circuit (X address decoder) 12, and Y-decoder circuit 13. Row decoder circuit 12 includes two physically separated sets of wordline drivers: a first set of wordline drivers 12A (positioned physically nearest to the bitline on the left side of array 16), and a second set of wordline drivers 12B (positioned physically nearest to the bitline on the right side of array 16).
The wordlines of array 16 will be referred to as being numbered consecutively from top to bottom of array 16, so that the wordlines are: wordline 0 (or “WL0”), wordline 1 (or “WL1”), wordline 2, . . . , wordline X−1, and wordline X (where X is an integer).
Typically, each memory cell is implemented by a floating-gate N-channel transistor. All the cells in a particular column have their drain regions connected to a common bitline (one of bitlines BL0 through BLN) and all the cells in a particular row have their control gates connected to a common wordline (one of wordlines WL0 through WLX). All of the cells have their sources connected to a common source line SL. Alternatively, it is possible to arrange the cells into array segments having separate source lines that can be sequentially accessed during an erase cycle (e.g., to reduce the maximum erase current).
The cells of array 16 are typically arranged in column pairs, with the cells of each pair sharing a common source region. The drain region of each cell is connected to the bitline (one of BL0 through BLN) associated with the column in which the cell is located.
The wordlines of array 16 are driven by two physically separated sets of wordline drivers: a first set of wordline drivers 12A (positioned physically nearest to bitline BL0 on the left side of the array), and a second set of wordline drivers 12B (positioned physically nearest to bitline BLN on the right side of the array). Each of the control gates of each of the cells connected along the even-numbered wordlines (wordlines WL0, WL2, etc.) is driven by a driver circuit within set 12A (i.e., each driver circuit within set 12A asserts an appropriate control voltage to each such control gate). Each of the control gates of each of the cells connected along the odd-numbered wordlines (wordlines WL1, WL3, etc.) is driven by a driver circuit within set 12B.
The drivers comprising set 12A are positioned along the left side of array 16 and are connected to the control gates of each of the flash memory cells of array 16 that are connected along the even-numbered wordlines of array 16, and the drivers comprising set 12B are positioned along the right side of array 16 and connected to the control gates of each of the cells connected along the odd-numbered wordlines of array 16. This arrangement of drivers 12A and 12B provides most efficient use of the area of system 3, allowing system 3 to be implemented with a smaller overall size than if all of drivers 12A and 12B were positioned on the same side of array 16.
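As a minimal illustration of the interleaved driver arrangement described above, the following Python sketch maps a wordline number to the driver set that drives it. The function name `driver_set_for_wordline` is hypothetical and is used here only for illustration; it is not part of system 3.

```python
def driver_set_for_wordline(wordline: int) -> str:
    """Return the wordline driver set ("12A" or "12B") for a wordline number.

    Even-numbered wordlines (WL0, WL2, ...) are driven by set 12A on the left
    side of the array; odd-numbered wordlines (WL1, WL3, ...) are driven by
    set 12B on the right side.
    """
    if wordline < 0:
        raise ValueError("wordline number must be non-negative")
    return "12A" if wordline % 2 == 0 else "12B"
```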
In variations on system 3, array 16 is implemented so that each of two or more integrated circuits contains a different portion of array 16.
To enable a conventional flash memory system such as system 3 to implement the present invention, its predecoder circuit would need to be modified to become capable of asserting multiblock selection bits, so that in response to each set of multiblock selection bits, the system is capable of simultaneously selecting two or more selected blocks of cells of array 16 (in a manner to be explained below).
For convenience throughout this disclosure, we use the following notation to describe address bits. “A(Y:Z)” denotes a set of (Y−(Z−1)) address bits, consisting of binary bits AY, AY−1, . . . , AZ+1, and AZ. For example, A(8:0) denotes the following nine address bits: A8, A7, A6, A5, A4, A3, A2, A1, and A0.
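The notation can be illustrated with a short Python sketch. It assumes, for illustration only, that a full address is held as a non-negative integer whose least significant bit is A0; the helper names `field_width` and `address_field` are hypothetical.

```python
def field_width(high: int, low: int) -> int:
    """Number of bits in A(high:low), i.e. high - (low - 1)."""
    if high < low or low < 0:
        raise ValueError("expected high >= low >= 0")
    return high - (low - 1)


def address_field(address: int, high: int, low: int) -> int:
    """Extract bits A(high:low) from an address whose bit 0 is A0."""
    if address < 0:
        raise ValueError("address must be non-negative")
    mask = (1 << field_width(high, low)) - 1
    # Shift A`low` down to bit position 0, then keep only the field's bits.
    return (address >> low) & mask
```

For example, `field_width(8, 0)` gives the nine bits of A(8:0), and `address_field(address, 18, 16)` returns the three bits A18, A17, and A16 as an integer from 0 to 7.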
With reference again to FIG. 1, memory system 3 also includes control engine (or “controller”) 29, output buffer 10, input buffer 11, and host interface 4. Host interface 4 asserts data from output buffer 10 (e.g., data read from array 16) to an external device (not shown), and asserts input data from the external device to input buffer 11 (so that such input data can be written to array 16). Alternatively, where host interface 4 includes input and output data buffers, buffers 10 and 11 can be eliminated and the data buffers within interface 4 used in place of them.
Host interface 4 also includes an address buffer for receiving external address bits from the external device, and is configured to send buffered address bits (including bits identifying cylinder, head, and sector addresses) to controller 29 in response to receiving external address bits from the external device. Host interface 4 also generates control signals in response to external control signals received from the external device and asserts the control signals to controller 29.
Where the external device is a host processor having a standard disk operating system (DOS) with a Personal Computer Memory Card International Association (PCMCIA)—AT Attachment (ATA) interface for communicating with a magnetic disk drive system, interface 4 should also comply with the PCMCIA-ATA standard so that it can communicate with the standard PCMCIA-ATA interface of the external device.
The column multiplexer (Y multiplexer) circuitry of system 3 comprises above-mentioned Y-decoder circuit 13, and one subset of Y Multiplexer circuitry for each main block of array 16 (e.g., circuit YMuxA for main block 16A, circuit YMuxB for main block 16B, and circuit YMuxJ for main block 16J).
In response to receiving the above-mentioned address bits (including bits identifying cylinder, head, and sector addresses) from interface 4, control engine 29 generates translated address bits A(22:0) and asserts the translated address bits to predecoding circuit (“predecoder”) 49. In response to the translated address bits (and to control signals from control engine 29), predecoder 49 asserts wordline and bitline selection bits to row decoder 12 and Y decoder circuit 13. In response to the selection bits (and to below-discussed address bit AX and control signals from control engine 29), circuits 12 and 13 select cells of array 16 to which data is to be written or from which data is to be read.
For example, where address bits A18, A17, and A16 determine the erase block of the target cells (and where array 16 includes eight erase blocks per main block), predecoder 49 generates an 8-bit set of selection bits XC(7:0) (sometimes referred to as “erase block enable” bits) as follows, in response to each set of address bits A(18:16):
A18  A17  A16    XC(7:0)
 0    0    0     00000001
 0    0    1     00000010
 0    1    0     00000100
 0    1    1     00001000
 1    0    0     00010000
 1    0    1     00100000
 1    1    0     01000000
 1    1    1     10000000
The single bit having value “one” in each set of selection bits XC(7:0) selects a different erase block (within a single selected main block). Bits XC(7:0) consist of XC0 which selects the first erase block, XC1 which selects the second erase block, XC2 which selects the third erase block, XC3 which selects the fourth erase block, XC4 which selects the fifth erase block, XC5 which selects the sixth erase block, XC6 which selects the seventh erase block, and XC7 which selects the eighth erase block.
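The mapping from A(18:16) to XC(7:0) described above is a one-hot (3-to-8) decode. The following Python sketch shows one way to express it; the function names are hypothetical, and the sketch models only the logical mapping, not the circuitry of predecoder 49.

```python
def erase_block_enable(a18_16: int) -> int:
    """Return XC(7:0) as an integer for a 3-bit erase block address A(18:16).

    Exactly one bit of the result is set: bit XCn for erase block address n.
    """
    if not 0 <= a18_16 <= 7:
        raise ValueError("A(18:16) must be in the range 0..7")
    return 1 << a18_16


def format_xc(xc: int) -> str:
    """Format XC(7:0) as an 8-character binary string, XC7 first."""
    return format(xc, "08b")
```

For example, `format_xc(erase_block_enable(0b011))` yields `"00001000"`, in which only XC3 (selecting the fourth erase block) has the value one.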
Each of the cells (storage locations) of memory array circuit 16 is indexed by a row index (an “X” index determined by decoder circuit 12) and a column index (a “Y” index determined by Y decoder circuit 13). Each column of cells of array 16 comprises “X” memory cells (where X is an integer), with each cell implemented by a single floating-gate N-channel transistor.
In one embodiment in which array 16 includes ten main blocks (16A through 16J), each main block has 1024 rows of cells, each row has 4352 cells (and thus there are 4352 columns of cells), and array 16 includes a total of 4352×10,240 cells. Each column of cells is connected along a single bitline, each column comprises 10,240 cells, and circuit 33 includes a set of eight sense amplifiers provided for reading eight cells in parallel (each cell connected along a different bitline). Each bitline extends through all ten main blocks.
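The array dimensions given for this embodiment can be checked with simple arithmetic. The Python sketch below is only a worked calculation; the constant names are hypothetical, and the bytes-per-row figure assumes one bit per cell and eight bits per byte, as in the preferred implementation described elsewhere in this section.

```python
MAIN_BLOCKS = 10             # main blocks 16A through 16J
ROWS_PER_MAIN_BLOCK = 1024   # rows (wordlines) per main block
CELLS_PER_ROW = 4352         # cells per row, hence 4352 columns
BITS_PER_BYTE = 8            # assumption: one bit per cell, eight bits per byte

# Each bitline runs through all ten main blocks, so a column holds one cell per row.
total_rows = MAIN_BLOCKS * ROWS_PER_MAIN_BLOCK
cells_per_column = total_rows
total_cells = CELLS_PER_ROW * total_rows
bytes_per_row = CELLS_PER_ROW // BITS_PER_BYTE
```

With these values, `total_rows` is 10,240, `total_cells` is 4352×10,240 = 44,564,480, and `bytes_per_row` is 544.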
In variations on the embodiment described in the previous paragraph, each column of cells consists of several groups of cells (with the cells in each group being connected along a different bitline) and each bitline is entirely within a main block (no bitline extends through more than one main block). In one such variation, for example, array 16 comprises 10,240 wordlines and 10×4352=43,520 bitlines (with 1024 cells connected along each bitline, 1024 rows per main block, and 4352 cells per row). Circuit 33 can include a separate set of sense amplifiers for reading each main block of cells (e.g., eighty sense amplifiers are provided within circuit 33, of which eight sense amplifiers are used to read eight cells of each main block in parallel, each of these cells being connected along a different bitline). Alternatively, circuit 33 could include one set of sense amplifiers (e.g., eight sense amplifiers for reading eight cells in parallel, each of these cells being connected along a different bitline), and multiplexing circuitry for coupling this set of sense amplifiers to bitlines in any selected one of the main blocks.
The drains of all transistors of a column are connected to a bitline, the control gate of each of the transistors is connected to a different wordline, and the sources of the transistors are held at a source potential (which is usually ground potential for the system during a read or programming operation). Each memory cell is a nonvolatile memory cell since the transistor of each cell has a floating gate capable of semipermanent charge storage. The current drawn by each cell (i.e., by each of the N-channel transistors) depends on the amount of charge stored on the cell's floating gate. Thus, the charge stored on each floating gate determines a data value that is stored “semipermanently” in the corresponding cell. Where each of the N-channel transistors is a flash memory device, the charge stored on the floating gate of each is erasable (and thus the data value stored by each cell is erasable) by appropriately changing the voltage applied to the gate and source (in a well known manner). In memory systems comprising an array of non-volatile memory cells other than flash memory cells, such nonvolatile cells are erased using other techniques which are well known.
As noted, system 3 emulates a conventional magnetic disk drive system. Accordingly, the cells of array 16 are addressed in a manner emulating the manner in which conventional magnetic disk storage locations are addressed. System 3 can be mounted on a card for insertion into a computer system. Alternatively, variations on system 3 (which lack array 16 and instead include a flash memory interface for interfacing with one or more separate memory array circuits) can be implemented as part of a card (for insertion into a computer system), where the card has a chip set mounted thereon, and the chip set includes a controller chip and several memory chips controlled by the controller chip. Each memory chip implements an array of flash memory cells.
The dominant computer operating system known as “DOS” (Disk Operating System) is essentially a software package used to manage a disk system. DOS has been developed by IBM Corporation, Microsoft Corporation, and Novell as the heart of widely used computer software. The first generation of the “Windows”® (trademark of Microsoft Corp.) operating system software was essentially a continuation of the original DOS software with a user friendly shell added for ease of use.
The DOS software was developed to support the physical characteristics of hard drive structures, supporting file structures based on heads, cylinders and sectors. The DOS software stores and retrieves data based on these physical attributes. Magnetic hard disk drives operate by storing polarities on magnetic material. This material is able to be rewritten quickly and as often as desired. These characteristics have allowed DOS to develop a file structure that stores files at a given location which is updated by a rewrite of that location as information is changed. Essentially all locations in DOS are viewed as fixed and do not change over the life of the disk drive being used therewith, and are easily updated by rewrites of the smallest supported block of this structure. A sector (of a magnetic disk drive) is the smallest unit of storage that the DOS operating system will support. In particular, a sector has come to mean 512 bytes of information for DOS and most other operating systems in existence. DOS also uses clusters as a storage unit. Clusters, however, are nothing more than the logical grouping of sectors to form a more efficient way of storing files and tracking them with less overhead.
The development of flash memory integrated circuits has enabled a new technology to offer competition to magnetic hard drives and offer advantages and capabilities that are hard to support by disk drive characteristics and features. The low power, high ruggedness, and small sizes offered by a solid state flash memory system make such a flash memory system attractive and able to compete with a magnetic hard disk drive system. Although a memory implemented with flash memory technology may be more costly than a hard disk drive system, computers and other processing systems are being developed that require (or benefit greatly from) use of flash memory features.
Thus, flash memory systems have been developed that emulate the storage characteristics of hard disk drives. Such a flash memory system is preferably structured to support storage in 512 byte blocks along with additional storage for overhead bits associated with mass storage, such as ECC (error correction code) bits. A key to this development is to make the flash memory array respond to a host processor in a manner that looks like a disk so the operating system can store and retrieve data in a known manner and be easily integrated into a computer system including the host processor.
In some flash memory systems that emulate the storage characteristics of hard disk drives, the interface to the flash memory is identical to a conventional interface to a conventional magnetic hard disk drive. This approach has been adopted by the PCMCIA standardization committee, which has promulgated a standard for supporting flash memory systems with a hard disk drive protocol. A flash memory card (including one or more flash memory array chips) whose interface meets this standard can be plugged into a host system having a standard DOS operating system with a PCMCIA-ATA (or standard ATA) interface. Such a flash memory card is designed to match the latter standard interface, but must include an onboard controller which manages each flash memory array independent of the host system.
Since system 3 of FIG. 1 emulates a magnetic disk drive, above-mentioned address bits A(22:0) determine cylinder, sector, and packet addresses of the type conventionally used in magnetic disk drive systems. In a preferred implementation, array 16 of FIG. 1 has 544 bytes per row of flash memory cells (each byte consisting of eight bits, and each memory cell being capable of storing one bit). Each row of cells is equivalent to a magnetic disk “sector” (512 bytes of data plus 32 bytes of “overhead”).
In such an implementation, array 16 is partitioned into ten large “decode” blocks (sometimes referred to as “main” blocks) of cells (schematically indicated in FIG. 1). The decode blocks are physically isolated from one another. This partitioning of blocks allows defects in one decode block to be isolated from the other decode blocks in the array, allows defective decode blocks to be bypassed by a controller, and allows for high usage of die and enhances overall yield of silicon produced (driving down the cost of flash mass storage systems).
Array 16 of FIG. 1 includes ten decode blocks (blocks 16A, 16B, 16C, 16D, 16E, 16F, 16G, 16H, 16I, and 16J, which are also referred to herein as “main blocks,” and of which only blocks 16A, 16B, and 16J are shown in FIG. 1). Y-select gate circuitry is provided for each decode block of array 16. Specifically, Y-select gate circuitry YMuxA is provided for selecting columns of decode block 16A in response to indices received from circuit 13, Y-select gate circuitry YMuxB is provided for selecting columns of decode block 16B in response to indices received from circuit 13, Y-select gate circuitry YMuxJ is provided for selecting columns of decode block 16J in response to indices received from circuit 13, and seven other subsets of Y-select gate circuitry (not separately shown) are provided for selecting columns of the other decode blocks (blocks 16C, 16D, 16E, 16F, 16G, 16H, and 16I) in response to indices received from circuit 13.
Each decode block is subdivided into a number (e.g., eight) of independently erasable blocks, sometimes referred to herein as “erase blocks.” In a preferred implementation of the FIG. 1 system, each erase block consists of rows of flash memory cells, each row being capable of storing seventeen “packets” of binary bits, each packet consisting of 32 bytes (each byte consisting of eight binary bits). Thus, each row (capable of storing 544 bytes) corresponds to one conventional disk sector (comprising 544 bytes), and each row can store 512 bytes of data of interest as well as 32 ECC bytes for use in error detection and correction (or 32 “overhead” bytes of some type other than ECC bytes, or a combination of ECC bytes and other overhead bytes).
Each erase block is divided into two blocks of cells known as “cylinders” of cells (in the sense that this expression is used in a conventional magnetic disk drive), with each cylinder consisting of 256K bits of data organized into 64 sectors (i.e. 64 rows of cells). Thus, each erase block in the preferred implementation of the FIG. 1 system consists of 128 sectors (i.e., 128 rows of cells).
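The block organization of the preferred implementation can be summarized as a worked calculation. The Python sketch below uses hypothetical constant names. It suggests that the 256K figure counts only the 512 data bytes of each sector and excludes the 32 overhead bytes; this is an inference from the arithmetic rather than something the description states.

```python
BYTES_PER_PACKET = 32
PACKETS_PER_SECTOR = 17            # sixteen data packets plus one overhead packet
DATA_BYTES_PER_SECTOR = 512
SECTORS_PER_CYLINDER = 64
CYLINDERS_PER_ERASE_BLOCK = 2
ERASE_BLOCKS_PER_MAIN_BLOCK = 8    # the "e.g., eight" case

bytes_per_row = PACKETS_PER_SECTOR * BYTES_PER_PACKET
overhead_bytes_per_sector = bytes_per_row - DATA_BYTES_PER_SECTOR
# Data bits only; the overhead bytes are not included in this count.
data_bits_per_cylinder = SECTORS_PER_CYLINDER * DATA_BYTES_PER_SECTOR * 8
sectors_per_erase_block = CYLINDERS_PER_ERASE_BLOCK * SECTORS_PER_CYLINDER
rows_per_main_block = ERASE_BLOCKS_PER_MAIN_BLOCK * sectors_per_erase_block
```

These values give 544 bytes per row, 32 overhead bytes per sector, 262,144 (256K) data bits per cylinder, 128 sectors per erase block, and 1024 rows per main block.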
Each erase block can be independently erased in response to control signals supplied from controller 29 to circuits 12 and 13. All flash memory cells in each erase block are erased at the same (or substantially the same) time, so that erasure of an erase block amounts to erasure of a large portion of array 16 at a single time.
The individual cells of array 16 of FIG. 1 are addressed by address bits A(22:0) and AX, with the four highest order address bits (A22, A21, A20, and A19) determining the main block, the three next highest order address bits (A18, A17, and A16) determining the erase block, the next address bit (A15) determining the cylinder, the next six address bits (A(14:9)) determining the sector, the next four address bits (A(8:5)) and bit AX determining the packet (within the sector), and the five lowest order address bits (A(4:0)) determining the byte within the packet. Address bits A(22:9) are used by predecoder 49 to generate selection bits which are processed by circuit 12 to select the row (sector) of array 16 in which the target byte is located, and the remaining nine address bits A(8:0) and bit AX are used by predecoder 49 to generate selection bits which are processed by Y decoder circuit 13 to select the appropriate columns of array 16 in which the target byte is located. In the preferred implementation, address bit AX is asserted (by controller 29) to predecoder 49 and is used by circuit 49 for selecting a packet consisting of overhead bits (such as ECC check bits and redundancy bits). More specifically, seventeen packets are stored per sector, including sixteen packets of ordinary data (any one of which can be selected by address bits A(8:5)) and one packet of overhead bits (which can be selected by address bit AX).
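The address field assignments described above can be sketched in Python as follows. This is an illustrative model only, not the logic of predecoder 49: the names `DecodedAddress`, `decode_address`, and `OVERHEAD_PACKET` are hypothetical, and representing the overhead packet as packet index 16 is an assumed convention.

```python
from dataclasses import dataclass

OVERHEAD_PACKET = 16  # hypothetical index for the packet selected by bit AX


@dataclass(frozen=True)
class DecodedAddress:
    main_block: int   # A(22:19)
    erase_block: int  # A(18:16)
    cylinder: int     # A15
    sector: int       # A(14:9)
    packet: int       # A(8:5), or OVERHEAD_PACKET when AX is asserted
    byte: int         # A(4:0)


def _bits(value: int, high: int, low: int) -> int:
    """Extract bits A(high:low) from an integer whose bit 0 is A0."""
    return (value >> low) & ((1 << (high - low + 1)) - 1)


def decode_address(a22_0: int, ax: bool = False) -> DecodedAddress:
    """Split address bits A(22:0) and bit AX into the fields named in the text."""
    if not 0 <= a22_0 < (1 << 23):
        raise ValueError("A(22:0) must fit in 23 bits")
    return DecodedAddress(
        main_block=_bits(a22_0, 22, 19),
        erase_block=_bits(a22_0, 18, 16),
        cylinder=_bits(a22_0, 15, 15),
        sector=_bits(a22_0, 14, 9),
        packet=OVERHEAD_PACKET if ax else _bits(a22_0, 8, 5),
        byte=_bits(a22_0, 4, 0),
    )
```

In this model the row (sector) selection depends only on A(22:9), and the column selection depends only on A(8:0) and AX, matching the division of labor between circuits 12 and 13.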
System 3 executes a write operation as follows. Control engine 29 asserts appropriate ones of address bits A(22:0) and AX to predecoder 49, and the selection bits output by predecoder 49 are asserted to decoder circuits 12 and 13. Control engine 29 also asserts appropriate control signals to other components of the system, including buffer 11 and circuits 12 and 13. In response to the selection bits, circuit 12 selects one sector (row) of cells and circuit 13 selects eight of the columns of memory cells of array 16. Address bits A(22:0) and AX thus together select a total of eight target cells in one selected row (for storing one byte of data). In response to a write command (a control signal) supplied from controller 29, a signal (indicative of an eight-bit byte of data) present at the output of input buffer 11 is asserted through the relevant Y multiplexer circuitry (e.g., through circuit YMuxJ, where the data is to be written to target cells in block 16J) to the eight target cells of array 16 determined by the row and column address (e.g., to the drain of each such cell). Depending on the value of each of the eight data bits, the corresponding target cell is either programmed or it remains in an erased state.
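The final step of the write operation can be sketched as follows. The sketch relies on the correspondence given in this section for read operations (a programmed cell represents binary 0 and an erased cell represents binary 1) and assumes the eight target cells start in the erased state. The function name `cells_to_program` and the mapping of bit position n to the n-th target cell are hypothetical.

```python
from typing import List


def cells_to_program(data_byte: int) -> List[int]:
    """Return the bit positions (0 = least significant) whose cells are programmed.

    A data bit of 0 requires its target cell to be programmed; a data bit of 1
    leaves the cell in the erased state.
    """
    if not 0 <= data_byte <= 0xFF:
        raise ValueError("data byte must be in the range 0..255")
    return [bit for bit in range(8) if not (data_byte >> bit) & 1]
```

For example, writing the byte 0xFF programs no cells, whereas writing 0x00 programs all eight target cells.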
System 3 executes a read operation as follows. Control engine 29 asserts address bits A(22:0) and AX to predecoder 49, and the selection bits output by predecoder 49 are asserted to circuits 12 and 13. Control engine 29 also asserts appropriate control signals to other components of the system, including circuits 12 and 13. In response to the selection bits, circuit 12 selects one row (sector) of cells, and circuit 13 selects eight of the columns of memory cells of array 16. Address bits A(22:0) and AX thus together determine a total of eight target cells in one selected row (for reading one byte of data). In response to a read command (a control signal) supplied from control unit 29, a current signal (a “data signal”) indicative of a data value stored in one of the eight target cells of array 16 is supplied from the drain of each of the target cells through the bitline of the target cell and then through the relevant Y multiplexer circuitry (e.g., through circuit YMuxJ, where the data is stored in cells within block 16J) to sense amplifier circuitry 33. Each data signal is processed in sense amplifier circuitry 33, buffered in output buffer 10, and finally asserted through host interface 4 to an external device.
Circuits 12, 13, 33, and the described Y multiplexer circuitry (including the YMuxA, YMuxB, and YMuxJ circuitry) are sometimes referred to herein collectively as “array interface circuitry.”
System 3 also includes a pad (not shown) which receives a high voltage Vpp from an external device, and a switch connected to this pad. During some steps of a typical erase or program sequence (in which cells of array 16 are erased or programmed), control unit 29 sends a control signal to the switch to cause the switch to close and thereby assert the high voltage Vpp to various components of the system including wordline drivers within X decoder 12 (or the source line within array circuit 16).
When reading a selected cell of array 16, if the cell is in an erased state, the cell will conduct a first current which is converted to a first voltage in sense amplifier circuitry 33. If the cell is in a programmed state, it will conduct a second current which is converted to a second voltage in sense amplifier circuitry 33. Sense amplifier circuitry 33 determines the state of the cell (i.e., whether it is programmed or erased, corresponding to a binary value of 0 or 1, respectively) by comparing the voltage indicative of the cell state to a reference voltage. The outcome of this comparison is an output which is either high or low (corresponding to a digital value of one or zero), which sense amplifier circuitry 33 sends to output buffer 10.
It is important during a write operation to provide the wordline of each selected cell with the proper voltage and the drain of each selected cell with the appropriate voltage level (the voltage determined by the output of input buffer 11), in order to successfully write data to the cell without damaging the cell.
Controller 29 of system 3 controls detailed operations of system 3 such as the various individual steps necessary for carrying out programming, reading, and erasing operations. Controller 29 thus functions to reduce the overhead required of the external processor (not depicted) typically used in association with system 3.
It would be desirable to improve existing memory system technology to allow simultaneous selection of two or more blocks of cells (e.g., erase blocks or main blocks) of a memory cell array, in an efficient and controllable manner. This would allow manipulation of data in several blocks simultaneously (i.e., writing of data to, reading of data from, or erasing of several blocks simultaneously). This capability would be particularly useful during test mode operation of a memory system (e.g., a flash memory system) in order to reduce the time required to execute typical tests of memory cells of the system.