This invention pertains to the field of semiconductor non-volatile data storage system architectures and their methods of operation, and has application to data storage systems based on flash electrically erasable and programmable read-only memories (EEPROMs) and other types of memory system.
A common application of flash EEPROM devices is as a mass data storage subsystem for electronic devices. Such subsystems are commonly implemented as either removable memory cards that can be inserted into multiple host systems or as non-removable embedded storage within the host system. In both implementations, the subsystem includes one or more flash devices and often a subsystem controller.
Flash EEPROM devices are composed of one or more arrays of transistor cells, each cell capable of non-volatile storage of one or more bits of data. Thus flash memory does not require power to retain the data programmed therein. Once programmed however, a cell must be erased before it can be reprogrammed with a new data value. These arrays of cells are partitioned into groups to provide for efficient implementation of read, program and erase functions. A typical flash memory architecture for mass storage arranges large groups of cells into erasable blocks, wherein a block contains the smallest number of cells (unit of erase) that are erasable at one time.
In one commercial form, each block contains enough cells to store one sector of user data plus some overhead data related to the user data and/or to the block in which it is stored. The amount of user data included in a sector is the standard 512 bytes in one class of such memory systems but can be of some other size. Because the isolation of individual blocks of cells from one another that is required to make them individually erasable takes space on the integrated circuit chip, another class of flash memories makes the blocks significantly larger so there is less space required for such isolation. But since it is also desired to handle user data in much smaller sectors, each large block is often further partitioned into individually addressable pages that are the basic unit for reading and programming user data; although the size of a write page need not be the same as the size of a read page, in the following they are treated as being the same in order to simplify the discussion. Each page usually stores one sector of user data, but a page may store a partial sector or multiple sectors. A “sector” is used herein to refer to an amount of user data that is transferred to and from the host as a unit.
The subsystem controller in a large block system performs a number of functions including the translation between logical addresses (LBAs) received by the memory sub-system from a host, and physical block numbers (PBNs) and page addresses within the memory cell array. This translation often involves use of intermediate terms for a logical block number (LBN) and logical page. The controller also manages the low level flash circuit operation through a series of commands that it issues to the flash memory devices via an interface bus. Another function the controller performs is to maintain the integrity of data stored to the subsystem through various means, such as by using an error correction code (ECC).
FIG. 1 shows a typical internal architecture for a flash memory device 131. The primary features include an input/output (I/O) bus 411 and control signals 412 to interface to an external controller, a memory control circuit 450 to control internal memory operations with registers for command, address and status signals. One or more arrays 400 of flash EEPROM cells are included, each array having its own row decoder (XDEC) 401 and column decoder (YDEC) 402, a group of sense amplifiers and program control circuitry (SA/PROG) 454 and a data register 404. Presently, the memory cells usually include one or more conductive floating gates as storage elements but other long term electron charge storage elements may be used instead. The memory cell array may be operated with two levels of charge defined for each storage element to therefore store one bit of data with each element. Alternatively, more than two storage states may be defined for each storage element, in which case more than one bit of data is stored in each element.
If desired, a plurality of arrays 400, together with related X decoders, Y decoders, program/verified circuitry, data registers, and the like are provided, for example as taught by U.S. Pat. No. 5,890,192, issued Mar. 30, 1999, and assigned to SanDisk Corporation, the assignee of this application, which is hereby incorporated by this reference. Related memory system features are described in co-pending patent application Ser. No. 09/505,555, filed Feb. 17, 2000 by Kevin Conley et al., which application is expressly incorporated herein by this reference.
The external interface I/O bus 411 and control signals 412 can include the following:
CS—Chip Select.Used to activate flash memory interface.RS—Read Strobe.Used to indicate the I/O bus is being used totransfer data from the memory array.WS—Write Strobe.Used to indicate the I/O bus is being used totransfer data to the memory array.AS—Address Strobe.Indicates that the I/O bus is being used totransfer address information.AD[7:0]—Address/This I/O bus is used to transfer data betweenData Buscontroller and the flash memory command,address and data registers of the memory control450.
In addition to these signals, it is also typical that the memory have a means by which the storage subsystem controller may determine that the memory is busy performing some task. Such means could include a dedicated signal or a status bit in an internal memory register that is accessible while the memory is busy.
This interface is given only as an example as other signal configurations can be used to give the same functionality. FIG. 1 shows only one flash memory array 400 with its related components, but a multiplicity of such arrays can exist on a single flash memory chip that share a common interface and memory control circuitry but have separate XDEC 401, YDEC 402, SA/PROG 454 and DATA REG 404 circuitry in order to allow parallel read and program operations. More generally, there may be one or two additional such data registers typically arranged into the sort of master slave arrangements developed further in U.S. Pat. No. 6,560,143, which is hereby incorporated by reference. Another arrangement for a flash memory architecture using multiple data buffers is described in U.S. Pat. No. 5,822,245.
Data is transferred from the memory array through the data register 404 to an external controller via the data registers' coupling to the I/O bus AD[7:0] 411. The data register 404 is also coupled with/to the sense amplifier/programming circuit 454. The data registers 404 can similarly be connected/coupled to the same sense amplifier/programming circuit 454. The number of elements of the data register coupled to each sense amplifier/programming circuit element may depend on the number of bits stored in each storage element of the memory cells, flash EEPROM cells each containing one or more floating gates as the storage elements. Each storage element may store a plurality of bits, such as 2 or 4, if the memory cells are operated in a multi-state mode. Alternatively, the memory cells may be operated in a binary mode to store one bit of data per storage element.
The row decoder 401 decodes row addresses for the array 400 in order to select the physical page to be accessed. The row decoder 401 receives row addresses via internal row address lines 419 from the memory control logic 450. A column decoder 402 receives column addresses via internal column address lines 429 from the memory control logic 450.
FIG. 2 shows an architecture of a typical non-volatile data storage system, in this case employing flash memory cells as the storage media. In one form, this system is encapsulated within a removable card having an electrical connector extending along one side to provide the host interface when inserted into a receptacle of a host. Alternatively, the system of FIG. 2 may be embedded into a host system in the form of a permanently installed embedded circuit or otherwise. The system utilizes a single controller 101 that performs high-level host and memory control functions. The flash memory media is composed of one or more flash memory devices, each such device often formed on its own integrated circuit chip. The system controller and the flash memory are connected by a bus 121 that allows the controller 101 to load command, address, and transfer data to and from the flash memory array. (The bus 121 includes 412 and 411 of FIG. 1.) The controller 101 interfaces with a host system (not shown) with which user data is transferred to and from the flash memory array. In the case where the system of FIG. 2 is included in a card, the host interface includes a mating plug and socket assembly (not shown) on the card and host equipment. Alternatively, there are removable cards, such as in the xD, SmartMedia, or MemoryStick formats, that lack a controller and contain only Flash Memory devices, so that the host system includes the controller 301, which interfaces the card via Flash Media Interface 302.
The controller 101 receives a command from the host to read or write one or more sectors of user data starting at a particular logical address. This address may or may not align with the first physical page in a block of memory cells.
In some prior art systems having large capacity memory cell blocks that are divided into multiple pages, the data from a block that is not being updated needs to be copied from the original block to a new block that also contains the new, updated data being written by the host. In other prior art systems, flags are recorded with the user data in pages and are used to indicate that pages of data in the original block that are being superseded by the newly written data are invalid. A mechanism by which data that partially supersedes data stored in an existing block can be written without either copying unchanged data from the existing block or programming flags to pages that have been previously programmed is described in co-pending patent application “Partial Block Data Programming and Reading Operations in a Non-Volatile Memory”, Ser. No. 09/766,436, filed Jan. 19, 2001 by Kevin Conley, which application is expressly incorporated herein by this reference.
Non-volatile memory systems of this type are being applied to a number of applications, particularly when packaged in an enclosed card that is removable connected with a host system. Current commercial memory card formats include that of the Personal Computer Memory Card International Association (PCMCIA), CompactFlash (CF), MultiMediaCard (MMC), MemoryStick-Pro, xD-Picture Card, SmartMedia and Secure Digital (SD). One supplier of these cards is SanDisk Corporation, assignee of this application. Host systems with which such cards are used include personal computers, notebook computers, hand held computing devices, cameras, audio reproducing devices, and the like. Flash EEPROM systems are also utilized as bulk mass storage embedded in host systems.
Such non-volatile memory systems include one or more arrays of floating-gate memory cells and a system controller. The controller manages communication with the host system and operation of the memory cell array to store and retrieve user data. The memory cells are grouped together into blocks of cells, a block of cells being the smallest grouping of cells that are simultaneously erasable. Prior to writing data into one or more blocks of cells, those blocks of cells are erased. User data are typically transferred between the host and memory array in sectors. A sector of user data can be any amount that is convenient to handle, preferably less than the capacity of the memory block, often being equal to the standard disk drive sector size, 512 bytes. In one commercial architecture, the memory system block is sized to store one sector of user data plus overhead data, the overhead data including information such as an error correction code (ECC) for the user data stored in the block, a history of use of the block, defects and other physical information of the memory cell block. Various implementations of this type of non-volatile memory system are described in the following United States patents and pending applications assigned to SanDisk Corporation, each of which is incorporated herein in its entirety by this reference: U.S. Pat. Nos. 5,172,338, 5,602,987, 5,315,541, 5,200,959, 5,270,979, 5,428,621, 5,663,901, 5,532,962, 5,430,859 and 5,712,180, and application Ser. No. 08/910,947, filed Aug. 7, 1997, and Ser. No. 09/343,328, filed Jun. 30, 1999. Another type of non-volatile memory system utilizes a larger memory cell block size that stores multiple sectors of user data.
One architecture of the memory cell array conveniently forms a block from one or two rows of memory cells that are within a sub-array or other unit of cells and which share a common erase gate. U.S. Pat. Nos. 5,677,872 and 5,712,179 of SanDisk Corporation, which are incorporated herein in their entirety, give examples of this architecture. Although it is currently most common to store one bit of data in each floating gate cell by defining only two programmed threshold levels, the trend is to store more than one bit of data in each cell by establishing more than two floating-gate transistor threshold ranges. A memory system that stores two bits of data per floating gate (four threshold level ranges or states) is currently available, with three bits per cell (eight threshold level ranges or states) and four bits per cell (sixteen threshold level ranges) being contemplated for future systems. Of course, the number of memory cells required to store a sector of data goes down as the number of bits stored in each cell goes up. This trend, combined with a scaling of the array resulting from improvements in cell structure and general semiconductor processing, makes it practical to form a memory cell block in a segmented portion of a row of cells. The block structure can also be formed to enable selection of operation of each of the memory cells in two states (one data bit per cell) or in some multiple such as four states (two data bits per cell), as described in SanDisk Corporation U.S. Pat. No. 5,930,167, which is incorporated herein in its entirety by this reference.
In addition to increasing the capacity of such non-volatile memories, there is a search to also improve such memories by increasing their performance and decreasing their susceptibility to error. Memories such as those described above that utilize large block management techniques perform a number of data management of techniques on the memory's file system, including garbage collection, in order to use the memory area more effectively. Such garbage collection schemes involve a data relocation process including reading data from one (or more) locations in the memory and re-writing it into another memory location. (In addition to many of the above incorporated references, garbage collection is discussed further in, for example, “A 125-mm2 1-Gb NAND Flash Memory With 10-MByte/s Program Speed”, by K. Imamiya, et al., IEEE Journal of Solid-State Circuits, Vol. 37, No. 11, November 2002, pp. 1493-1501, which is hereby incorporated in its entirety by this reference.) This data relocation time is a main contributor to all garbage collection routines. Prior art methods describe the data relocation operation as a consecutive data read, then data integrity check and error correction, if necessary, before writing the data to a new location, so that there is a high constant performance penalty of data transfer and verification. In the case of data error, additional time must be spent to correct the data before write.
Other prior art methods exploit an on-chip copy feature, writing the data from one location to another without a pre-check of the data integrity. Such a method is described, for example, in “High Performance 1-Gb NAND Flash Memory With 0.12 μm Technology”, by J. Lee, et al., IEEE Journal of Solid-State Circuits, Vol. 37, No. 11, November 2002, pp. 1502-1509, which is hereby incorporated in its entirety by this reference. The integrity check is done concurrently with the data write so that, in the case of error, there is a high probability of the need to rewrite the entire block with a high penalty in performance and time-out/latency.
An example of a simple copy sequence in the prior art, where the data is checked/corrected before being reprogrammed, is shown in FIG. 3. This shows a first set of data (DATA 1) sequentially being read from memory 400 into data register 404 (R), then the read of the buffer by the controller (RB), the data being checked and any errors corrected (EC) in the controller, the writing the checked/corrected data from the buffer (WB) back to the register 404, from where it is programmed (Program) back into the memory array 400. After the entire process is complete for DATA 1, the same steps are sequentially repeated for the next data set DATA 2, followed by DATA 3 and so on. For each set of data, the entire process is completed before it begins for the subsequent data set so that the all of the error correction times accumulate.
An example of the timing for data relocation where the data is read from the memory array 400 into the register 404, and then read to the buffer in the controller and concurrently programmed directly back into the memory is described in U.S. Pat. No. 6,266,273, which is hereby incorporated by reference. This simple copy sequence, but now with the data checked after the start of programming, is shown in FIG. 4. As shown there, after reading the data set to the register (R), it is then both read into the controller's buffer (RB) and written back to the memory array (Program). Once the data set is buffered in the controller, it can then be checked/corrected for error (E); however, even though there will now be a corrected set of data in the controller that can be supplied to the host, if there are errors to correct, these errors are written back to the memory as programming has begun before the data set has been checked and corrected. As with the process of FIG. 3, the entire process of FIG. 4 has to be completed each set of data before it can begin for the subsequent data set.
Prior art system flash/EEPROM architectures do not allow independent access to the data in one on-chip buffer to while another buffer is used for concurrent read or program operation. Thus, operations that include mixture of reads and writes, like garbage collection, cannot be pipelined in prior art systems.