This invention pertains to the field of semiconductor non-volatile data storage system architectures and their methods of operation, and has application to data storage systems based on flash electrically erasable and programmable read-only memories (EEPROMs) and other types of memory systems.
A common application of flash EEPROM devices is as a mass data storage subsystem for electronic devices. Such subsystems are commonly implemented as either removable memory cards that can be inserted into multiple host systems or as non-removable embedded storage within the host system. In both implementations, the subsystem includes one or more flash devices and often a subsystem controller.
Flash EEPROM devices are composed of one or more arrays of transistor cells, each cell capable of non-volatile storage of one or more bits of data. Thus flash memory does not require power to retain the data programmed therein. Once programmed, however, a cell must be erased before it can be reprogrammed with a new data value. These arrays of cells are partitioned into groups to provide for efficient implementation of read, program and erase functions. A typical flash memory architecture for mass storage arranges large groups of cells into erasable blocks, wherein a block contains the smallest number of cells (unit of erase) that are erasable at one time.
In one commercial form, each block contains enough cells to store one sector of user data plus some overhead data related to the user data and/or to the block in which it is stored. The amount of user data included in a sector is the standard 512 bytes in one class of such memory systems but can be of some other size. Because the isolation of individual blocks of cells from one another that is required to make them individually erasable takes space on the integrated circuit chip, another class of flash memories makes the blocks significantly larger so there is less space required for such isolation. But since it is also desired to handle user data in much smaller sectors, each large block is often further partitioned into individually addressable pages that are the basic unit for reading and programming user data; although the size of a write page need not be the same as the size of a read page, in the following they are treated as being the same in order to simplify the discussion. Each page usually stores one sector of user data, but a page may store a partial sector or multiple sectors. A “sector” is used herein to refer to an amount of user data that is transferred to and from the host as a unit.
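The erase-before-program rule and the partitioning of a block into pages described above can be illustrated with a minimal sketch. The class name `Block`, its methods, and the choice of four pages per block are assumptions made only for illustration; they are not taken from any actual device.

```python
class Block:
    """Toy model of one erasable block made of individually programmable pages."""

    def __init__(self, pages_per_block=4):
        # None marks an erased page that is ready to be programmed.
        self.pages = [None] * pages_per_block

    def program(self, page, data):
        # A page that already holds data cannot be overwritten in place.
        if self.pages[page] is not None:
            raise ValueError("page must be erased before it is reprogrammed")
        self.pages[page] = data

    def read(self, page):
        return self.pages[page]

    def erase(self):
        # The block is the unit of erase: all of its pages are cleared together.
        self.pages = [None] * len(self.pages)


block = Block()
block.program(0, b"old")
try:
    block.program(0, b"new")   # attempted in-place update
    rejected = False
except ValueError:
    rejected = True            # the update is refused until the block is erased
block.erase()
block.program(0, b"new")
```

In this sketch, updating a single page requires erasing the whole block, which is why the data management schemes discussed later move still-valid pages elsewhere before an erase.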
The subsystem controller in a large block system performs a number of functions including the translation between logical addresses (LBAs) received by the memory sub-system from a host, and physical block numbers (PBNs) and page addresses within the memory cell array. This translation often involves use of intermediate terms for a logical block number (LBN) and logical page. The controller also manages the low level flash circuit operation through a series of commands that it issues to the flash memory devices via an interface bus. Another function the controller performs is to maintain the integrity of data stored to the subsystem through various means, such as by using an error correction code (ECC).
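The translation described above can be sketched as follows, under stated assumptions: each page holds one sector, a block holds 32 pages, the logical page maps to the same page offset in the physical block, and the controller keeps a simple table from LBN to PBN. The names `PAGES_PER_BLOCK`, `split_lba`, `translate` and `lbn_to_pbn` are hypothetical and serve only to illustrate the intermediate terms.

```python
PAGES_PER_BLOCK = 32  # assumed value for illustration; one sector per page


def split_lba(lba):
    """Split a host logical address into (logical block number, logical page)."""
    return divmod(lba, PAGES_PER_BLOCK)


def translate(lba, lbn_to_pbn):
    """Return (physical block number, page address) for a host logical address."""
    lbn, logical_page = split_lba(lba)
    pbn = lbn_to_pbn[lbn]  # hypothetical controller table: LBN -> PBN
    # Assumption: the logical page is stored at the same offset in the physical block.
    return pbn, logical_page


lbn_to_pbn = {0: 17, 1: 5, 2: 40}
location = translate(70, lbn_to_pbn)  # LBA 70 -> LBN 2, logical page 6 -> PBN 40, page 6
```

Actual controllers may use more elaborate mappings, for example when pages within a block are written out of logical order.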
FIG. 1 shows a typical internal architecture for a flash memory device 131. The primary features include an input/output (I/O) bus 411 and control signals 412 to interface to an external controller, and a memory control circuit 450 to control internal memory operations, with registers for command, address and status signals. One or more arrays 400 of flash EEPROM cells are included, each array having its own row decoder (XDEC) 401 and column decoder (YDEC) 402, a group of sense amplifiers and program control circuitry (SA/PROG) 454 and a data register 404. Presently, the memory cells usually include one or more conductive floating gates as storage elements but other long-term electron charge storage elements may be used instead. The memory cell array may be operated with two levels of charge defined for each storage element to therefore store one bit of data with each element. Alternatively, more than two storage states may be defined for each storage element, in which case more than one bit of data is stored in each element.
If desired, a plurality of arrays 400, together with related X decoders, Y decoders, program/verify circuitry, data registers, and the like are provided, for example as taught by U.S. Pat. No. 5,890,192, issued Mar. 30, 1999, and assigned to SanDisk Corporation, the assignee of this application, which is hereby incorporated by this reference. Related memory system features are described in U.S. Pat. No. 6,426,893, issued Jul. 30, 2002, and assigned to SanDisk Corporation, the assignee of this application, which patent is also expressly incorporated herein by this reference. These patents describe having multiple semi-autonomous arrays, referred to as planes or “quads”, on a single memory chip.
The external interface I/O bus 411 and control signals 412 can include the following:
CE: Chip Enable. Used to activate flash memory interface.
RE: Read Enable. Used to indicate the I/O bus is being used to transfer data from the memory array.
WE: Write Enable. Used to indicate the I/O bus is being used to transfer data to the memory array.
ALE: Address Latch Enable. Indicates that the I/O bus is being used to transfer address information.
CLE: Command Latch Enable. Indicates that the I/O bus is being used to transfer command information.
IO[7:0]: Address/Data Bus. This I/O bus is used to transfer data between controller and the flash memory command, address and data registers of the memory control 450.
In addition to these signals, it is also typical that the memory have a means by which the storage subsystem controller may determine that the memory is busy performing some task. Such means could include a dedicated signal or a status bit in an internal memory register that is accessible while the memory is busy.
This interface is given only as an example, as other signal configurations can be used to give the same functionality. FIG. 1 shows only one flash memory array 400 with its related components, but a multiplicity of such arrays can exist on a single flash memory chip that share a common interface and memory control circuitry but have separate XDEC 401, YDEC 402, SA/PROG 454 and DATA REG 404 circuitry in order to allow parallel read and program operations. More generally, there may be one or two additional such data registers, typically arranged into the sort of master-slave arrangements developed further in U.S. Pat. No. 6,560,143, which is hereby incorporated by reference.
Data is transferred from the memory array through the data register 404 to an external controller via the data register's coupling to the I/O bus IO[7:0] 411. The data register 404 is also coupled to the sense amplifier/programming circuit 454. The number of elements of the data register coupled to each sense amplifier/programming circuit element may depend on the number of bits stored in each storage element of the memory cells, which are flash EEPROM cells each containing one or more floating gates as the storage elements. Each storage element may store a plurality of bits, such as 2 or 4, if the memory cells are operated in a multi-state mode. Alternatively, the memory cells may be operated in a binary mode to store one bit of data per storage element.
The row decoder 401 decodes row addresses for the array 400 in order to select the physical page to be accessed. The row decoder 401 receives row addresses via internal row address lines 419 from the memory control logic 450. A column decoder 402 receives column addresses via internal column address lines 429 from the memory control logic 450.
FIG. 2 shows an architecture of a typical non-volatile data storage system, in this case employing flash memory cells as the storage media. In one form, this system is encapsulated within a removable card having an electrical connector extending along one side to provide the host interface when inserted into a receptacle of a host. Alternatively, the system of FIG. 2 may be embedded into a host system in the form of a permanently installed embedded circuit or otherwise. The system utilizes a single controller 301 that performs high-level host and memory control functions. The flash memory media is composed of one or more flash memory devices, each such device often formed on its own integrated circuit chip. The system controller and the flash memory are connected by a bus 302 that allows the controller 301 to load commands and addresses and to transfer data to and from the flash memory array. (The bus 302 includes 412 and 411 of FIG. 1.) The controller 301 interfaces with a host system (not shown) with which user data is transferred to and from the flash memory array. In the case where the system of FIG. 2 is included in a card, the host interface includes a mating plug and socket assembly (not shown) on the card and host equipment.
The controller 301 receives a command from the host to read or write one or more sectors of user data starting at a particular logical address. This address may or may not align with the first physical page in a block of memory cells.
In some prior art systems having large capacity memory cell blocks that are divided into multiple pages, the data from a block that is not being updated needs to be copied from the original block to a new block that also contains the new, updated data being written by the host. In other prior art systems, flags are recorded with the user data in pages and are used to indicate that pages of data in the original block that are being superseded by the newly written data are invalid. A mechanism by which data that partially supersedes data stored in an existing block can be written without either copying unchanged data from the existing block or programming flags to pages that have been previously programmed is described in U.S. Pat. No. 6,763,424, which patent is expressly incorporated herein by this reference.
Non-volatile memory systems of this type are being applied to a number of applications, particularly when packaged in an enclosed card that is removably connected with a host system. Current commercial memory card formats include that of the Personal Computer Memory Card International Association (PCMCIA), CompactFlash (CF), MultiMediaCard (MMC) and Secure Digital (SD). Other systems include USB devices, such as memory cards including cards with two sets of contacts, such as those described in U.S. patent application Ser. No. 10/826,801 and U.S. Ser. No. 10/826,796, both filed Apr. 16, 2004, and hereby incorporated by reference. One supplier of these cards is SanDisk Corporation, assignee of this application. Host systems with which such cards are used include personal computers, notebook computers, hand held computing devices, cameras, audio reproducing devices, and the like. Flash EEPROM systems are also utilized as bulk mass storage embedded in host systems.
Such non-volatile memory systems include one or more arrays of floating-gate memory cells and a system controller. The controller manages communication with the host system and operation of the memory cell array to store and retrieve user data. The memory cells are grouped together into blocks of cells, a block of cells being the smallest grouping of cells that are simultaneously erasable. Prior to writing data into one or more blocks of cells, those blocks of cells are erased. User data are typically transferred between the host and memory array in sectors. A sector of user data can be any amount that is convenient to handle, preferably less than the capacity of the memory block, often being equal to the standard disk drive sector size, 512 bytes. In one commercial architecture, the memory system block is sized to store one sector of user data plus overhead data, the overhead data including information such as an error correction code (ECC) for the user data stored in the block, a history of use of the block, defects and other physical information of the memory cell block. Various implementations of this type of non-volatile memory system are described in the following United States patents and pending applications assigned to SanDisk Corporation, each of which is incorporated herein in its entirety by this reference: U.S. Pat. Nos. 5,172,338, 5,602,987, 5,315,541, 5,200,959, 5,270,979, 5,428,621, 5,663,901, 5,532,962, 5,430,859 and 5,712,180, and application Ser. No. 08/910,947, filed Aug. 7, 1997, and Ser. No. 09/343,328, filed Jun. 30, 1999. Another type of non-volatile memory system utilizes a larger memory cell block size that stores multiple sectors of user data.
One architecture of the memory cell array conveniently forms a block from one or two rows of memory cells that are within a sub-array or other unit of cells and which share a common erase gate. U.S. Pat. Nos. 5,677,872 and 5,712,179 of SanDisk Corporation, which are incorporated herein in their entirety, give examples of this architecture. Although it is currently most common to store one bit of data in each floating gate cell by defining only two programmed threshold levels, the trend is to store more than one bit of data in each cell by establishing more than two floating-gate transistor threshold ranges. A memory system that stores two bits of data per floating gate (four threshold level ranges or states) is currently available, with three bits per cell (eight threshold level ranges or states) and four bits per cell (sixteen threshold level ranges) being contemplated for future systems. Of course, the number of memory cells required to store a sector of data goes down as the number of bits stored in each cell goes up. This trend, combined with a scaling of the array resulting from improvements in cell structure and general semiconductor processing, makes it practical to form a memory cell block in a segmented portion of a row of cells. The block structure can also be formed to enable selection of operation of each of the memory cells in two states (one data bit per cell) or in some multiple such as four states (two data bits per cell), as described in SanDisk Corporation U.S. Pat. No. 5,930,167, which is incorporated herein in its entirety by this reference.
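The relationship stated above between bits per cell, threshold states, and the number of cells needed for a sector can be worked through in a short sketch. It assumes a 512-byte sector and ignores overhead data such as ECC; the function names are hypothetical.

```python
SECTOR_BYTES = 512  # standard sector size mentioned in the text; overhead data is ignored


def threshold_states(bits_per_cell):
    """Number of threshold level ranges needed to store the given bits per cell."""
    return 2 ** bits_per_cell


def cells_per_sector(bits_per_cell, sector_bytes=SECTOR_BYTES):
    """Cells needed to hold one sector of user data, rounded up to a whole cell."""
    total_bits = sector_bytes * 8
    return -(-total_bits // bits_per_cell)  # ceiling division


# bits per cell -> (threshold states, cells per 512-byte sector)
table = {b: (threshold_states(b), cells_per_sector(b)) for b in (1, 2, 3, 4)}
```

Under these assumptions, going from one bit to two bits per cell halves the cell count per sector from 4096 to 2048, while the number of threshold ranges to be distinguished doubles from two to four.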
In addition to increasing the capacity of such non-volatile memories, there is a search to also improve such memories by increasing their performance and decreasing their susceptibility to error. Memories such as those described above that utilize large block management techniques perform a number of data management techniques on the memory's file system, including garbage collection, in order to use the memory area more effectively. Such garbage collection schemes involve a data relocation process including reading data from one (or more) locations in the memory and re-writing it into another memory location. (In addition to many of the above incorporated references, garbage collection is discussed further in, for example, “A 125-mm² 1-Gb NAND Flash Memory With 10-MByte/s Program Speed”, by K. Imamiya, et al., IEEE Journal of Solid-State Circuits, Vol. 37, No. 11, November 2002, pp. 1493-1501, which is hereby incorporated in its entirety by this reference.) This data relocation time is a main contributor to the duration of all garbage collection routines. Prior art methods describe the data relocation operation as a consecutive data read, then data integrity check and error correction, if necessary, before writing the data to a new location, so that there is a high constant performance penalty of data transfer and verification. In the case of data error, additional time must be spent to correct the data before write.
Other prior art methods exploit an on-chip copy feature, writing the data from one location to another without a pre-check of the data integrity. Such a method is described, for example, in “High Performance 1-Gb NAND Flash Memory With 0.12 μm Technology”, by J. Lee, et al., IEEE Journal of Solid-State Circuits, Vol. 37, No. 11, November 2002, pp. 1502-1509, which is hereby incorporated in its entirety by this reference. The integrity check is done concurrently with the data write so that, in the case of error, there is a high probability of the need to rewrite the entire block with a high penalty in performance and time-out/latency.
A particular on-chip copy mechanism is shown in FIG. 3 and is presented in more detail in U.S. Pat. No. 6,266,273, which is hereby incorporated by reference. As indicated by step (1) in FIG. 3, a data set, such as a page, is read from a source location to a read/program slave data register. The architecture shown in FIG. 3 uses a master-slave arrangement for its data registers and the read copy of the data set is transferred in step (2) to the master register. In step (3), the copied data set is then relocated to the destination location in parallel with transferring it from the master data register to the controller. This technique allows for on-chip relocation while also transferring a copy of the data to the controller where it can be checked.
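The three steps of FIG. 3 can be sketched as a simple software model. The class `Plane`, its attribute names, and the function `on_chip_copy` are assumptions made for illustration only; in particular, the sketch assumes that the destination is programmed from the slave register, and it runs the two halves of step (3) one after the other, whereas the hardware is described as performing them in parallel.

```python
class Plane:
    """Toy model of one array with a master-slave pair of data registers."""

    def __init__(self, pages):
        self.array = dict(pages)  # page address -> stored data
        self.slave = None         # read/program slave data register
        self.master = None        # master data register


def on_chip_copy(plane, source, destination):
    """Relocate one page inside a plane and return the copy sent to the controller."""
    # Step (1): read the data set from the source location into the slave register.
    plane.slave = plane.array[source]
    # Step (2): transfer the read copy to the master register.
    plane.master = plane.slave
    # Step (3): program the destination (assumed here to use the slave register)
    # while the master register's copy is transferred to the controller.
    plane.array[destination] = plane.slave
    return plane.master


plane = Plane({0: b"page data"})
copy_for_controller = on_chip_copy(plane, source=0, destination=5)
```

The returned value stands for the copy that the controller can check, for example against its ECC, while the relocation itself stays on the chip.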
According to the prior art, when, from time to time, flash memory media management algorithms need to copy data from one location to another in the flash memory array, there are two basic methods used to achieve this. The first method is to read data from the array to a buffer, transfer the data to the controller and then transfer back from the controller to the new location in flash before programming. The second method is to read the data from the array to the buffer and then program directly back into a new array location. The second method is referred to as on-chip copy.
The second method gives a shorter copy time because there is no transfer from controller to flash. With high levels of read and programming parallelism, the differences can be significant. However, the performance comes at a penalty of flexibility. On-chip copy mechanisms currently restrict operation to copying within a plane, so that it is not possible to transfer data between two different chips or between two planes on the same chip. This means that either data must be organized such that it will always be copied between two locations in the same plane of the same chip or separate reads and writes must be used. The latter approach results in performance that varies according to the location of the source and target for the data. A potentially large amount of buffering is required in the controller to allow parallel operation in the flash chips.
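The same-plane restriction and its effect on bus traffic can be illustrated with a minimal sketch. The `Location` tuple and both function names are hypothetical; the transfer counts follow the two methods as described above, with the first method moving a page to the controller and back and the second method moving nothing over the controller bus.

```python
from collections import namedtuple

# Hypothetical physical address of a page within the memory system.
Location = namedtuple("Location", ["chip", "plane", "block", "page"])


def can_use_on_chip_copy(src, dst):
    """On-chip copy is assumed usable only within one plane of one chip."""
    return (src.chip, src.plane) == (dst.chip, dst.plane)


def bus_transfers(src, dst):
    """Page transfers over the controller bus needed to copy src to dst."""
    # Second method (on-chip copy): no transfer to or from the controller.
    # First method: one transfer to the controller and one back to flash.
    return 0 if can_use_on_chip_copy(src, dst) else 2


a = Location(chip=0, plane=0, block=3, page=1)
same_plane = Location(chip=0, plane=0, block=9, page=0)
other_plane = Location(chip=0, plane=1, block=2, page=0)
other_chip = Location(chip=1, plane=0, block=2, page=0)

costs = [bus_transfers(a, d) for d in (same_plane, other_plane, other_chip)]
```

The differing costs across destinations correspond to the location-dependent performance noted above when separate reads and writes must be used.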
Consequently, the operation of such memory systems could be greatly improved if data relocation operations could be extended to allow relocations between different planes or chips without the need to buffer the data in the controller. This is particularly true for memory systems relying upon large block data structures, where such garbage collection operations place large demands on the management of the memory.