Mass storage non-volatile memory (NVM) devices enable read and/or write access to data containing many bytes. Mass storage devices are typically, but not exclusively, used in applications such as hard disks or digital video storage devices, for example in digital cameras. Throughout the specification and claims, the term “mass storage device” refers not only to memory devices that are capable of the storage functionality of hard disks or video storage devices and the like, but also to memory devices capable of storing and providing access to at least 512 megabytes (MB) of data or to memory devices requiring very fast programming and read access rates. The data accessible with the mass storage device may include “blocks” of data. A “block” is defined as a basic unit of data containing a certain number of bytes, e.g., 256 bytes (256B), 512B, 528B or any other number of bytes.
Chip architectures in mass storage devices must support fast data transfer rates. Prior art mass storage devices typically include one or more buffer memory devices (buffer memory). The buffer memory is used to receive and temporarily store data at the high data transfer rate supported by the particular communication link being used. After the data is received, it may then be read from the buffer memory and processed.
In order to efficiently transfer data from a sending device to a receiving device, such as between a mass storage device (e.g., operating as a hard disk) and a buffer memory, burst data transfers may be used. A burst data transfer is a series of data transfers that occurs without an interrupt between one device and another device. A receiving device that is able to receive burst transfers may typically include both a buffer memory and a data management system for managing the burst transfers. The data management system may be used to perform a number of functions. For example, the data management system may determine whether to enable the next transfer of a burst from a sending device. This determination is largely based on whether there is enough space available in the buffer memory of the receiving device to receive the burst without corrupting previously stored data. The data management system may also be used to coordinate the re-transmission and rewriting of a burst into the buffer memory if an originally transmitted burst was determined to be invalid.
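By way of a non-limiting illustration, the two functions described above (admitting a burst only when the buffer has room for it, and requesting re-transmission of an invalid burst) may be sketched as follows. The class name `ReceiveBuffer`, its methods, and the returned status strings are hypothetical and are introduced here only for illustration; they do not describe any particular prior art implementation.

```python
class ReceiveBuffer:
    """Hypothetical fixed-capacity buffer that admits a burst only if it fits entirely."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.data = bytearray()

    def free_space(self):
        # Bytes still available without overwriting previously stored data.
        return self.capacity - len(self.data)

    def receive_burst(self, burst, is_valid=True):
        """Return 'wait' (no room), 'retry' (invalid burst, nothing kept) or 'stored'."""
        if len(burst) > self.free_space():
            return "wait"      # do not enable the transfer: it would corrupt stored data
        if not is_valid:
            return "retry"     # ask the sender to re-transmit; the buffer is unchanged
        self.data.extend(burst)
        return "stored"

    def drain(self, n):
        # Read n bytes out of the buffer for processing, freeing their space.
        out = bytes(self.data[:n])
        del self.data[:n]
        return out


buf = ReceiveBuffer(capacity_bytes=8)
first = buf.receive_burst(b"abcd")                     # fits in the empty buffer
second = buf.receive_burst(b"efghi")                   # 5 bytes, only 4 free
third = buf.receive_burst(b"efgh", is_valid=False)     # fits, but flagged invalid
```

In this sketch the space check is made before the burst is accepted, so a burst is never partially written; an actual data management system may make this determination differently.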
Prior art data management systems for accomplishing these functions may include a microprocessor and a software routine, or alternatively a relatively complex state machine. However, such implementations may incur significant system overhead, thereby reducing the performance level of the receiving device, or alternatively may require high-cost control circuitry to achieve the desired performance level.
Cache memory is typically used to bridge the gap between fast processor cycle times and slow memory access times. A cache is a small amount of very fast, expensive, preferably zero wait state memory that is used to store a copy of frequently accessed code and data from system memory. The microprocessor can operate out of this very fast memory and thereby reduce the number of wait states that must be interposed during memory accesses. Static random access memories (SRAMs) are typically used as cache memories.
System RAM speed may be controlled by bus width and bus speed. Bus width refers to the number of bits that may be sent to the processor simultaneously, and bus speed refers to the number of times a group of bits may be sent each second. A bus cycle occurs every time data travels from memory to the processor. Bit latency refers to the number of clock cycles needed to read a bit of information. For example, RAM rated at 100 MHz is capable of sending a bit in 1×10⁻⁸ seconds, but may take 5×10⁻⁸ seconds to start the read process for the first bit.
To compensate for latency, processors typically use a technique called burst mode. Burst mode depends on the expectation that data requested by the processor will be stored in sequential memory cells. The memory controller anticipates that whatever the processor is working on will continue to come from this same series of memory addresses, so it reads several consecutive bits of data together. This means that only the first bit is subject to the full effect of latency; reading successive bits takes significantly less time.
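As a worked illustration of the benefit of burst mode, the figures given above for 100 MHz RAM (10 ns to send a bit, 50 ns to start a read) may be applied to a read of eight consecutive bits. The sketch below assumes, purely for illustration, that the start-up time is incurred once per burst and is added to the transfer time of the first bit; the function names are hypothetical.

```python
STARTUP_NS = 50   # time to start the read process (5×10⁻⁸ s)
PER_BIT_NS = 10   # time to send one bit at 100 MHz (1×10⁻⁸ s)


def burst_read_time_ns(n_bits):
    # Assumption: the start-up latency is paid once, then bits stream out back to back.
    return STARTUP_NS + n_bits * PER_BIT_NS


def separate_read_time_ns(n_bits):
    # Without burst mode, every bit is a separate read that pays the start-up latency.
    return n_bits * (STARTUP_NS + PER_BIT_NS)


burst_ns = burst_read_time_ns(8)        # 50 + 8 * 10 = 130 ns
separate_ns = separate_read_time_ns(8)  # 8 * (50 + 10) = 480 ns
```

Under these assumptions the burst read takes 130 ns rather than 480 ns, and the first bit latency of 50 ns remains the dominant cost of a short burst.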
Accordingly, it is desirable to provide a chip architecture in mass storage devices that reduces first bit latency while maintaining fast read throughput.
Writing data into an NVM mass storage device usually comprises programming bits in the NVM array according to the input data. Programming NVM bits typically comprises application of one or more programming pulses followed by a verification phase, in which the bits are read to determine their programming state. Typically, multiple program pulse/program verify cycles may be required to complete programming of all the bits.
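The program pulse/program verify cycle described above may be illustrated by the following simplified sketch. The cell model is an assumption made only for illustration: each cell is taken to read as programmed after a given number of pulses (`pulses_needed`, at least one per cell), whereas real NVM cells are verified against threshold levels. The function name and parameters are hypothetical.

```python
def program_with_verify(target, pulses_needed, max_cycles=10):
    """Return the number of pulse/verify cycles needed to program the target pattern.

    target[i] is True if cell i is to be programmed.
    pulses_needed[i] is a hypothetical per-cell pulse count (>= 1) after which
    cell i reads back as programmed.
    """
    received = [0] * len(target)  # pulses applied to each cell so far
    for cycle in range(1, max_cycles + 1):
        # Program phase: pulse only the target cells that have not yet verified.
        for i, want in enumerate(target):
            if want and received[i] < pulses_needed[i]:
                received[i] += 1
        # Verify phase: read every cell back and compare with the target pattern.
        read_back = [received[i] >= pulses_needed[i] for i in range(len(target))]
        if read_back == list(target):
            return cycle
    raise RuntimeError("programming did not verify within max_cycles")


cycles = program_with_verify(
    target=[True, False, True],
    pulses_needed=[1, 1, 3],
)
```

In this example the first cell verifies after one cycle, but the third requires three pulses, so three full pulse/verify cycles are needed before all the bits are programmed.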
In mass storage devices, a fast write rate is usually required. A fast programming rate may be achieved in one or more ways, such as, but not limited to, programming a large number of bits in parallel, reducing the number of program pulse/program verify cycles, shortening each phase in these cycles, and shortening the overhead times within the programming procedure. The data to be programmed to the NVM array is usually loaded into the device in advance and temporarily stored in a volatile data buffer (e.g., an SRAM array). After applying a programming pulse, the data read out from the NVM array is usually stored in a second volatile buffer, and program verification is carried out by comparing the data of the two buffers (the original data to be programmed and the temporarily read out data). Such a method requires two separate buffers and therefore increases the die size.
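The two-buffer verification described above may be sketched as follows. The function name, buffer contents and sizes are hypothetical and chosen only to illustrate the comparison and the storage it requires.

```python
def verify_with_two_buffers(data_buffer, readback_buffer):
    """Return (byte index, mask of differing bits) for every byte that fails verification."""
    return [
        (i, want ^ got)
        for i, (want, got) in enumerate(zip(data_buffer, readback_buffer))
        if want != got
    ]


# First volatile buffer: the original data to be programmed.
data_buffer = bytes([0x0F, 0xA5, 0x00, 0xFF])
# Second volatile buffer: data read out from the NVM array after a programming pulse.
readback_buffer = bytes([0x0F, 0xA7, 0x00, 0xFF])

failing = verify_with_two_buffers(data_buffer, readback_buffer)
buffer_bytes = len(data_buffer) + len(readback_buffer)  # storage held during verify
```

Here one bit of the second byte has not yet reached its target state, and the comparison requires twice the storage of the data being programmed.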
It is therefore desirable to provide a chip architecture in mass storage devices that both supports overhead time reduction within program pulse/program verify cycles and enables program verify operations without the need for a second buffer.