1. Field of the Invention
The invention pertains to a memory system having an array of memory cells connected along wordlines and bitlines, and two physically separated sets of wordline drivers for driving the wordlines (for example, a flash memory system of this type which includes an array of flash memory cells and emulates a magnetic disk drive). More specifically, the invention pertains to a method and system for determining memory cells to which data is to be written (where the cells are connected along bitlines, and wordlines driven by two physically separated sets of wordline drivers), so as to reduce the time needed to read the data from the cells.
2. Description of Related Art
It is conventional to implement a memory system as an integrated circuit which includes an array of flash memory cells (or other non-volatile memory cells) and circuitry for independently erasing selected blocks of the cells, programming selected ones of the cells (i.e., writing data to selected ones of the cells), and reading data from selected ones of the cells. FIG. 1 is a simplified block diagram of a flash memory system (flash memory system 3) which is designed to emulate a magnetic disk drive system. Although system 3 can be implemented as a single integrated circuit, it is not necessarily implemented as a single integrated circuit, and the following description of system 3 will not assume that it is an integrated circuit. Flash memory system 3 of FIG. 1 includes memory cell array 16, which comprises rows and columns of flash memory cells (each row of cells connected along a different wordline, and each column of cells connected along a different bitline). Flash memory system 3 also includes row decoder circuit (X address decoder) 12, which includes two physically separated sets of wordline drivers: a first set of wordline drivers 12A (positioned physically nearest to the bitline on the left side of array 16, which will be denoted the "first" bitline), and a second set of wordline drivers 12B (positioned physically nearest to the bitline on the right side of array 16, which will be denoted the "last" bitline).
We will refer to the wordlines of array 16 as being numbered consecutively from top to bottom of array 16, so that the wordlines are: wordline 0 (or "WL0"), wordline 1 (or "WL1"), wordline 2, . . . , wordline X-1, and wordline X (where X is an odd integer as shown in FIG. 2). Thus the wordlines include even-numbered wordlines (e.g., wordline 2) and odd-numbered wordlines (e.g., wordline 1). Of course, array 16 could alternatively be implemented with an odd number of wordlines.
FIG. 2 is a schematic diagram of an implementation of memory array 16 of FIG. 1 (which is a memory array suitable for reading and writing data in accordance with the present invention). Array 16 includes flash memory cells 14A through 14S (collectively referred to as cells 14) arranged in rows and columns. Each cell 14 is implemented by a floating-gate N-channel transistor, as shown schematically. All the cells 14 in a particular column have their drain regions connected to a common bitline (one of bitlines BL0 through BLN) and all the cells in a particular row have their control gates connected to a common wordline (one of wordlines WL0 through WLX). All of cells 14 have their sources connected to a common source line SL. Alternatively, it is possible to arrange the cells into array segments having separate source lines that can be sequentially accessed during an erase cycle (e.g., to reduce the maximum erase current).
Cells 14 of array 16 are arranged in column pairs, with the cells of each pair sharing a common source region. By way of example, the pair consisting of cells 14J and 14K has a common source region connected to the source line SL. The drain region of each cell 14 is connected to the bitline (one of BL0 through BLN) associated with the column in which the cell is located. By way of example, each of cells 14H, 14I, 14J, 14K, 14L, and 14M has its drain region connected to bitline BL1.
The wordlines of array 16 of FIG. 2 are driven by two physically separated sets of wordline drivers: a first set of wordline drivers 12A (positioned physically nearest to bitline BL0), and a second set of wordline drivers 12B (positioned physically nearest to bitline BLN). Each of the control gates of each of the cells connected along the even-numbered wordlines (wordlines WL0, WL2, . . . WLX-1) is driven by a driver circuit D within set 12A (i.e., each driver circuit D within set 12A asserts an appropriate control voltage to each such control gate). Each of the control gates of each of the cells connected along the odd-numbered wordlines (wordlines WL1, WL3, . . . WLX) is driven by a driver circuit D within set 12B (i.e., each driver circuit D within set 12B asserts an appropriate control voltage to each such control gate).
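The even/odd assignment of wordlines to the two driver sets can be summarized in a short sketch. This is an illustration only: the function name `driver_set_for_wordline` and the returned string labels are hypothetical and are not part of the described system.

```python
def driver_set_for_wordline(wordline: int) -> str:
    """Return the label of the driver set that drives the given wordline.

    Follows the FIG. 2 arrangement: even-numbered wordlines are driven by
    set 12A (nearest bitline BL0), odd-numbered wordlines by set 12B
    (nearest bitline BLN). The labels are illustrative strings only.
    """
    if wordline < 0:
        raise ValueError("wordline index must be non-negative")
    return "12A" if wordline % 2 == 0 else "12B"
```

For example, `driver_set_for_wordline(2)` yields the label for set 12A, and `driver_set_for_wordline(1)` the label for set 12B.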
With reference again to FIG. 1, the drivers comprising set 12A are positioned along the left side of array 16 and are connected to the control gates of each of the flash memory cells of array 16 that are connected along the even-numbered wordlines of array 16, and the drivers comprising set 12B are positioned along the right side of array 16 and connected to the control gates of each of the cells connected along the odd-numbered wordlines of array 16. This arrangement of drivers 12A and 12B provides most efficient use of the area of system 3, allowing system 3 to be implemented with a smaller overall size than if all of drivers 12A and 12B were positioned on the same side of array 16.
To enable a conventional flash memory system such as system 3 to implement the present invention, it would need to be modified to include circuitry for modifying bitline addresses of a selected subset (e.g., one or more selected packets) of a set of data to be written to any selected row of the cells of array 16, in accordance with the invention (in a manner to be explained below).
For convenience throughout this disclosure, we use the following notation to describe address bits. "A(Y:Z)" denotes a set of (Y-(Z-1)) address bits, consisting of binary bits A.sub.Y, A.sub.Y-1, . . . A.sub.Z+1, and A.sub.Z. For example, A(8:0) denotes the following nine address bits: A.sub.8, A.sub.7, A.sub.6, A.sub.5, A.sub.4, A.sub.3, A.sub.2, A.sub.1, and A.sub.0.
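The A(Y:Z) notation corresponds to an ordinary bit-field extraction. The following sketch assumes that an address is held as a non-negative integer whose least significant bit is A.sub.0; the helper name `address_bits` is hypothetical.

```python
def address_bits(address: int, y: int, z: int) -> int:
    """Extract the field A(y:z) from an integer address (bit 0 is A0)."""
    if address < 0:
        raise ValueError("address must be non-negative")
    if y < z or z < 0:
        raise ValueError("expected y >= z >= 0")
    width = y - (z - 1)  # number of bits in A(y:z)
    return (address >> z) & ((1 << width) - 1)
```

With this helper, `address_bits(a, 8, 0)` returns the nine bits A(8:0) of `a` as an integer in the range 0 to 511.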
With reference again to FIG. 1, memory system 3 also includes control engine (or "controller") 29, output buffer 10, input buffer 11, and host interface 4. Host interface 4 asserts data from output buffer 10 (e.g., data read from array 16) to an external device (not shown), and asserts input data from the external device to input buffer 11 (so that such input data can be written to array 16). Alternatively, where host interface 4 includes input and output data buffers, buffers 10 and 11 can be eliminated and the data buffers within interface 4 used in place of them.
Host interface 4 also includes an address buffer for receiving external address bits from the external device, and is configured to send buffered address bits (including bits identifying cylinder, head, and sector addresses) to controller 29 in response to receiving external address bits from the external device. Host interface 4 also generates control signals in response to external control signals received from the external device and asserts the control signals to controller 29.
Where the external device is a host processor having a standard DOS operating system with a PCMCIA-ATA interface (for communicating with a magnetic disk drive system), interface 4 should also comply with the PCMCIA-ATA standard so that it can communicate with the standard PCMCIA-ATA interface of the external device.
In addition to row decoder circuit 12, system 3 also includes column multiplexer (Y multiplexer) circuitry, comprising: Y-decoder circuit 13A; and one subset of Y Multiplexer circuitry for each main block of array 16 (e.g., circuit YMuxA for main block 16A, circuit YMuxB for main block 16B, and circuit YMuxH for main block 16H).
In response to receiving the above-mentioned address bits (including bits identifying cylinder, head, and sector addresses) from interface 4, control engine 29 generates translated address bits A(21:0) and address bit AX and asserts the translated address bits (and bit AX) to row decoder 12 and Y decoder circuit 13A.
Each of the cells (storage locations) of memory array circuit 16 is indexed by a row index (an "X" index determined by decoder circuit 12) and a column index (a "Y" index determined by Y decoder circuit 13A). As described with reference to FIG. 2, each column of cells of array 16 comprises "X" memory cells (where X is an integer), with each cell implemented by a single floating-gate N-channel transistor. The drains of all transistors of a column are connected to a bitline, the control gate of each of the transistors is connected to a different wordline, and the sources of the transistors are held at a source potential (which is usually ground potential for the system during a read or programming operation). Each memory cell is a nonvolatile memory cell since the transistor of each cell has a floating gate capable of semipermanent charge storage. The current drawn by each cell (i.e., by each of the N-channel transistors) depends on the amount of charge stored on the cell's floating gate. Thus, the charge stored on each floating gate determines a data value that is stored "semipermanently" in the corresponding cell. Where each of the N-channel transistors is a flash memory device, the charge stored on the floating gate of each is erasable (and thus the data value stored by each cell is erasable) by appropriately changing the voltage applied to the gate and source (in a well known manner). In memory systems comprising an array of non-volatile memory cells other than flash memory cells, such non-volatile cells are erased using other techniques which are well known.
As noted, system 3 emulates a conventional magnetic disk drive system. Accordingly, the cells of array 16 are addressed in a manner emulating the manner in which conventional magnetic disk storage locations are addressed. System 3 can be mounted on a card for insertion into a computer system. Alternatively, variations on system 3 (which lack array 16 and instead include a flash memory interface for interfacing with one or more separate memory array circuits) can be implemented as part of a card (for insertion into a computer system), where the card has a chip set mounted thereon, and the chip set includes a controller chip and several memory chips controlled by the controller chip. Each memory chip implements an array of flash memory cells.
The dominant computer operating system known as "DOS" (Disk Operating System) is essentially a software package used to manage a disk system. DOS has been developed by IBM Corporation, Microsoft Corporation, and Novell as the heart of widely used computer software. The first generation of Microsoft Corporation's "Windows" operating system software was essentially a continuation of the original DOS software with a user friendly shell added for ease of use.
The DOS software was developed to support the physical characteristics of hard drive structures, supporting file structures based on heads, cylinders and sectors. The DOS software stores and retrieves data based on these physical attributes. Magnetic hard disk drives operate by storing polarities on magnetic material. This material is able to be rewritten quickly and as often as desired. These characteristics have allowed DOS to develop a file structure that stores files at a given location which is updated by a rewrite of that location as information is changed. Essentially all locations in DOS are viewed as fixed and do not change over the life of the disk drive being used therewith, and are easily updated by rewrites of the smallest supported block of this structure. A sector (of a magnetic disk drive) is the smallest unit of storage that the DOS operating system will support. In particular, a sector has come to mean 512 bytes of information for DOS and most other operating systems in existence. DOS also uses clusters as a storage unit. Clusters, however, are nothing more than the logical grouping of sectors to form a more efficient way of storing files and tracking them with less overhead.
The development of flash memory integrated circuits has enabled a new technology to offer competition to magnetic hard drives and offer advantages and capabilities that are hard to support by disk drive characteristics and features. The low power, high ruggedness, and small sizes offered by a solid state flash memory system make such a flash memory system attractive and able to compete with a magnetic hard disk drive system. Although a memory implemented with flash memory technology may be more costly than a hard disk drive system, computers and other processing systems are being developed that require (or benefit greatly from) use of flash memory features.
Thus, flash memory systems have been developed that emulate the storage characteristics of hard disk drives. Such a flash memory system is preferably structured to support storage in 512 byte blocks along with additional storage for overhead bits associated with mass storage, such as ECC (error correction code) bits. A key to this development is to make the flash memory array respond to a host processor in a manner that looks like a disk so the operating system can store and retrieve data in a known manner and be easily integrated into a computer system including the host processor.
In some flash memory systems that emulate the storage characteristics of hard disk drives, the interface to the flash memory is identical to a conventional interface to a conventional magnetic hard disk drive. This approach has been adopted by the PCMCIA standardization committee, which has promulgated a standard for supporting flash memory systems with a hard disk drive protocol. A flash memory card (including one or more flash memory array chips) whose interface meets this standard can be plugged into a host system having a standard DOS operating system with a PCMCIA-ATA (or standard ATA) interface. Such a flash memory card is designed to match the latter standard interface, but must include an onboard controller which manages each flash memory array independent of the host system.
Since system 3 of FIG. 1 emulates a magnetic disk drive, above-mentioned address bits A(21:0) and AX determine cylinder, sector, and packet addresses of the type conventionally used in magnetic disk drive systems. In one implementation, array 16 of FIG. 1 has 544 bytes per row of flash memory cells (each byte consisting of eight bits, and each memory cell is capable of storing one bit). Each row of cells is equivalent to a magnetic disk "sector" (512 bytes of data plus 32 bytes of "overhead").
In such an implementation, array 16 is partitioned into eight large "decode" blocks (sometimes referred to as "main" blocks) of cells (schematically indicated in FIG. 1). The decode blocks are physically isolated from one another. This partitioning of blocks allows defects in one decode block to be isolated from the other decode blocks in the array, allows defective decode blocks to be bypassed by a controller, and allows for high usage of die and enhances overall yield of silicon produced (driving down the cost of flash mass storage systems).
Array 16 of FIG. 1 includes eight decode blocks (blocks 16A, 16B, 16C, 16D, 16E, 16F, 16G, and 16H, which are also referred to herein as "main blocks," and of which only blocks 16A, 16B, and 16H are shown in FIG. 1). Y-select gate circuitry is provided for each decode block of array 16. Specifically, Y-select gate circuitry YMuxA is provided for selecting columns of decode block 16A in response to indices received from circuit 13A, Y-select gate circuitry YMuxB is provided for selecting columns of decode block 16B in response to indices received from circuit 13A, Y-select gate circuitry YMuxH is provided for selecting columns of decode block 16H in response to indices received from circuit 13A, and five other subsets of Y-select gate circuitry (not separately shown) are provided for selecting columns of the other decode blocks (blocks 16C, 16D, 16E, 16F, and 16G) in response to indices received from circuit 13A.
Each decode block is subdivided into a number (e.g., eight) of independently erasable blocks, sometimes referred to herein as "erase blocks." Each erase block consists of rows of flash memory cells, each row being capable of storing seventeen "packets" of binary bits, each packet consisting of 32 bytes (each byte consisting of eight binary bits). Thus, each row (capable of storing 544 bytes) corresponds to one conventional disk sector (comprising 544 bytes), and each row can store 512 bytes of data of interest as well as 32 ECC bytes for use in error detection and correction (or 32 "overhead" bytes of some type other than ECC bytes, or a combination of ECC bytes and other overhead bytes).
Each erase block is divided into two blocks of cells known as "cylinders" of cells (in the sense that this expression is used in a conventional magnetic disk drive), with each cylinder consisting of 256K bits of data organized into 64 sectors (i.e. 64 rows of cells). Thus, each erase block in the FIG. 2 example consists of 128 sectors (i.e., 128 rows of cells).
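The row, cylinder, and erase block sizes given above follow from simple arithmetic, sketched below. The constant names are hypothetical. Note that the 256K-bit cylinder figure is reproduced when only the 512 data bytes of each sector are counted, which is an inference from the stated numbers rather than something the text spells out.

```python
PACKETS_PER_ROW = 17            # sixteen data packets plus one overhead packet
BYTES_PER_PACKET = 32
BITS_PER_BYTE = 8
DATA_BYTES_PER_SECTOR = 512
SECTORS_PER_CYLINDER = 64
CYLINDERS_PER_ERASE_BLOCK = 2

# One row of cells holds one sector, including its overhead bytes.
bytes_per_row = PACKETS_PER_ROW * BYTES_PER_PACKET
overhead_bytes_per_row = bytes_per_row - DATA_BYTES_PER_SECTOR

# Data bits per cylinder, counting only the 512 data bytes of each sector.
data_bits_per_cylinder = (
    SECTORS_PER_CYLINDER * DATA_BYTES_PER_SECTOR * BITS_PER_BYTE
)

# Rows (sectors) per erase block.
sectors_per_erase_block = SECTORS_PER_CYLINDER * CYLINDERS_PER_ERASE_BLOCK
```

Evaluating these expressions gives 544 bytes per row (of which 32 are overhead), 262,144 data bits (256K bits) per cylinder, and 128 sectors per erase block.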
Each erase block can be independently erased in response to signals from controller 29. All flash memory cells in each erase block are erased at the same (or substantially the same) time, so that erasure of an erase block amounts to erasure of a large portion of array 16 at a single time.
The individual cells of array 16 of FIG. 1 are addressed by address bits A(21:0) and AX, with the three highest order address bits (A21, A20, and A19) determining the main block, the three next highest order address bits (A18, A17, and A16) determining the erase block, the next address bit (A15) determining the cylinder, the next six address bits A(14:9) determining the sector, the next four address bits A(8:5) and address bit AX determining the packet (within the sector), and the five lowest order address bits A(4:0) determining the byte within the packet. Address bits A(21:9) are used by X decoder 12 to select the row (sector) of array 16 in which the target byte is located and the remaining nine address bits A(8:0) and address bit AX are used by Y decoder circuit 13A to select the appropriate columns of array 16 in which the target byte is located. Additional address bit AX is used for selecting a packet consisting of overhead bits (such as ECC check bits and redundancy bits). More specifically, seventeen packets are stored per sector, including sixteen packets of ordinary data (any one of which can be selected by address bits A(8:5)) and one packet of overhead bits (which can be selected by address bit AX).
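The field layout just described can be illustrated by a small decoding sketch. The names `DecodedAddress`, `decode_address`, `row_index`, and `OVERHEAD_PACKET` are hypothetical. The sketch also assumes that AX = 1 selects the overhead packet regardless of A(8:5) and that the overhead packet is given index 16; the text states only that AX selects the overhead packet, so the polarity and numbering are assumptions.

```python
from typing import NamedTuple

OVERHEAD_PACKET = 16  # hypothetical index for the seventeenth (overhead) packet


class DecodedAddress(NamedTuple):
    main_block: int   # A21..A19
    erase_block: int  # A18..A16
    cylinder: int     # A15
    sector: int       # A14..A9
    packet: int       # A8..A5, or OVERHEAD_PACKET when AX is set
    byte: int         # A4..A0


def decode_address(a: int, ax: int = 0) -> DecodedAddress:
    """Split translated address bits A(21:0) and AX into their fields."""
    if not 0 <= a < (1 << 22):
        raise ValueError("A(21:0) must fit in 22 bits")
    if ax not in (0, 1):
        raise ValueError("AX is a single bit")

    def field(hi: int, lo: int) -> int:
        # Bits A(hi:lo), with A0 as the least significant bit.
        return (a >> lo) & ((1 << (hi - lo + 1)) - 1)

    # Assumption: AX = 1 selects the overhead packet and overrides A(8:5).
    packet = OVERHEAD_PACKET if ax else field(8, 5)
    return DecodedAddress(
        field(21, 19), field(18, 16), field(15, 15),
        field(14, 9), packet, field(4, 0),
    )


def row_index(a: int) -> int:
    """Row (sector) number used by the X decoder: bits A(21:9)."""
    return (a >> 9) & ((1 << 13) - 1)
```

The thirteen bits A(21:9) returned by `row_index` distinguish 8,192 rows, and the remaining bits A(8:0) together with AX identify the byte position within the selected row.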
In a normal operating mode, system 3 executes a write operation as follows. Interface 4 asserts appropriate ones of address bits A(21:0) and AX to decoder circuits 12 and 13A. In response to these address bits, circuit 12 determines a row address which selects one sector (row) of cells and circuit 13A determines a column address (which selects eight of the columns of memory cells of array 16). The row and column address thus together select a total of eight target cells in one selected row (for storing one byte of data). In response to a write command supplied from controller 29, a signal (indicative of an eight-bit byte of data) present at the output of input buffer 11 is asserted through the relevant Y multiplexer circuitry (e.g., through circuit YMuxH, where the data is to be written to target cells in block 16H) to the eight target cells of array 16 determined by the row and column address (e.g., to the drain of each such cell). Depending on the value of each of the eight data bits, the corresponding target cell is either programmed or it remains in an erased state.
In the normal operating mode, system 3 executes a read operation as follows. Interface 4 asserts appropriate ones of address bits A(21:0) and AX to circuits 12 and 13A. In response to these address bits, circuit 12 determines a row address which selects one row (sector) of cells, and circuit 13A determines a column address (which selects eight of the columns of memory cells of array 16). The row and column address thus together select a total of eight target cells in one selected row (which together store one byte of data). In response to a read command supplied from control unit 29, a current signal (a "data signal") indicative of a data value stored in one of the eight target cells of array 16 is supplied from the drain of each of the target cells through the bitline of the target cell and then through the relevant Y multiplexer circuitry (e.g., through circuit YMuxH, where the data is stored in cells within block 16H) to sense amplifier circuitry 33. Each data signal is processed in sense amplifier circuitry 33, buffered in output buffer 10, and finally asserted through host interface 4 to an external device.
System 3 also includes a pad (not shown) which receives a high voltage V.sub.pp from an external device, and a switch connected to this pad. During some steps of a typical erase or program sequence (in which cells of array 16 are erased or programmed), control unit 29 sends a control signal to the switch to cause the switch to close and thereby assert the high voltage V.sub.pp to various components of the system including wordline drivers within X decoder 12 (or the source line within array circuit 16).
When reading a selected cell of array 16, if the cell is in an erased state, the cell will conduct a first current which is converted to a first voltage in sense amplifier circuitry 33. If the cell is in a programmed state, it will conduct a second current which is converted to a second voltage in sense amplifier circuitry 33. Sense amplifier circuitry 33 determines the state of the cell (i.e., whether it is programmed or erased corresponding to a binary value of 0 or 1, respectively) by comparing the voltage indicative of the cell state to a reference voltage. The outcome of this comparison is an output which is either high or low (corresponding to a digital value of one or zero) which sense amplifier circuitry 33 sends to output buffer 10.
It is important during a write operation to provide the wordline of each selected cell with the proper voltage and the drain of each selected cell with the appropriate voltage level (the voltage determined by the output of input buffer 11), in order to successfully write data to the cell without damaging the cell.
Controller 29 of system 3 controls detailed operations of system 3 such as the various individual steps necessary for carrying out programming, reading, and erasing operations. Controller 29 thus functions to reduce the overhead required of the external processor (not depicted) typically used in association with system 3.
A conventional memory system having an array of cells connected along bitlines and wordlines (e.g., a flash memory system of the type described above) has the following limitation. When one of the wordlines is selected by address bits A(21:9), the wordline driver for the selected wordline charges the wordline (by asserting a voltage to one end of the wordline). The operation of charging up the selected wordline is inherently time consuming, and (for a typical wordline) a significant amount of time is needed to charge the entire wordline. An amount of time determined by the RC (resistance and capacitance) characteristic of the wordline is inherently required. Thus, where the wordline driver is at the opposite side of the array from a selected packet to be read, a significant amount of time must elapse before that packet can be read. Where the array's wordlines are very long (e.g., where the array is designed with very long rows of cells to emulate a magnetic disk device) and the array does not include metal strapping, the wait time needed to charge the selected wordline is an especially large (and significant) portion of the overall access time for reading the selected packet. For example, where the "first" packet (whose cells are connected along bitlines BL0, BL1, etc. near the left side of array 16) is selected, and the wordline driver for the selected wordline is one of the drivers in set 12B and is thus positioned at the right side of array 16, a significant amount of time must elapse before any of the cells in the first packet can be read. In the example set forth in the previous sentence, cells in the "last" packet of the selected wordline (the cells of the wordline which are connected along bitlines BLN, BLN-1, etc. near the right side of array 16) could be read with much less delay, since the gates of the cells in the last packet would be charged much sooner than the gates of the cells in the first packet.
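The dependence of the charging delay on a packet's distance from the wordline driver can be illustrated with a first-order estimate. The sketch below uses the Elmore approximation for a uniform distributed RC line driven from one end. That model is introduced here purely for illustration and is not stated in this disclosure. The function names are hypothetical, as is the assumption that packets 0 through 16 are laid out in order from the BL0 side to the BLN side of the row.

```python
PACKETS_PER_ROW = 17


def elmore_delay(distance_fraction: float, r_total: float, c_total: float) -> float:
    """Elmore delay (seconds) at a point on a uniform RC line driven at one end.

    distance_fraction is measured from the driver: 0.0 at the driver,
    1.0 at the far end. r_total and c_total are the total resistance (ohms)
    and capacitance (farads) of the whole wordline.
    """
    if not 0.0 <= distance_fraction <= 1.0:
        raise ValueError("distance_fraction must lie in [0, 1]")
    x = distance_fraction
    return r_total * c_total * (x - 0.5 * x * x)


def packet_delay(packet: int, driver_side: str, r_total: float, c_total: float) -> float:
    """Estimated delay at the centre of a packet.

    Assumes packet 0 is nearest BL0 (left) and packet 16 nearest BLN (right).
    driver_side is "left" (a set 12A driver) or "right" (a set 12B driver).
    """
    if not 0 <= packet < PACKETS_PER_ROW:
        raise ValueError("packet index out of range")
    centre = (packet + 0.5) / PACKETS_PER_ROW  # measured from the left edge
    if driver_side == "left":
        distance = centre
    elif driver_side == "right":
        distance = 1.0 - centre
    else:
        raise ValueError("driver_side must be 'left' or 'right'")
    return elmore_delay(distance, r_total, c_total)
```

Under this model, for a wordline driven from the right side (a driver in set 12B), the estimated delay for packet 0 is many times that for packet 16, which matches the qualitative behavior described above. Actual delays depend on the wordline's real RC characteristic and on the driver.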
It would be desirable to improve existing technology to overcome the above-described limitation, and thereby enable data (stored in selected cells of each sector) to be read more rapidly than is possible using existing technology.