Flash memory is a non-volatile memory component, widely used in modern electronic systems. The growing demand for high-performance computing along with high-capacity data storage stimulates flash technology development. Flash enhancement includes aggressive scaling, which makes NAND flash currently the densest semiconductor circuit (half pitch as low as 20 nm in 2011), and drives innovations for faster read and program (write) operations. New product markets, such as e-books, tablets, smartphones, and portable media players, have no practical alternative to flash because of their thin form factor, and they require fast memory response.
Fast access time is a fundamental requirement when employing flash for system storage. As flash storage capacity doubles every two years, improvements in read speed are required to cope with the growing desired data transfer rates. However, as technology shrinks and multi-level cell (MLC) techniques are adopted, achieving low read latency becomes harder. Technology scaling demands accurate read voltages, which directly affects timing. MLC architectures, aimed at reducing cost, allow the storage of multiple bits per memory cell, but doing so prolongs the read and write operations because of the accurate, tightly spaced sensing voltages and the need for multiple comparisons to determine the cell's content.
The requirement for high read speed is addressed in several ways. NOR array architecture enables random-access read but reduces the memory density nearly twofold relative to a NAND array, due to additional contacts between memory cells; it also entails a more complex manufacturing process. Circuit optimizations include cache read and charge pump optimization. Architecture directions include multi-page programming, multiple planes on chip, as well as multi-channel and multi-chip architectures for parallel read (RAID-like structures).
A flash memory cell is physically a MOSFET transistor with a floating-gate layer, surrounded by a dielectric stack. The information is expressed as the amount of electrical charge stored in the floating gate. As a result, an erased cell (no electrons in the floating gate) has a lower threshold voltage than a programmed cell. The information is read by checking the conductivity of the memory cell at one or more gate voltages. An erased flash cell has a threshold voltage below 0 V; a written cell has a positive threshold voltage smaller than 4 V. (The actual numbers vary with technology.) When the gate of the selected cell is biased at 0 V, the cell conducts current if it is erased, and does not conduct current if it has been written to. For gate voltages higher than 4 V, the cell always conducts current. This characteristic is exploited during read when a set of cells is connected in series and all those cells, except the one being addressed, must operate as pass transistors, i.e., conduct.
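The read decision described above can be sketched as follows. This is a minimal illustrative model, not part of any actual flash controller; the 0 V and 4 V boundaries follow the example numbers in the text, and the function names are hypothetical.

```python
# Illustrative sketch of the single-cell read decision described above.
# Voltage boundaries (erased Vt below 0 V, written Vt below 4 V) follow the
# text; actual values vary with technology.

def cell_conducts(vt: float, gate_voltage: float) -> bool:
    """A floating-gate cell conducts when the gate voltage exceeds its Vt."""
    return gate_voltage > vt

def read_cell(vt: float) -> int:
    """Bias the selected gate at 0 V: conduction means the cell is erased."""
    return 1 if cell_conducts(vt, 0.0) else 0

# An erased cell (e.g. Vt = -1.5 V) conducts at 0 V; a written cell
# (e.g. Vt = 2.5 V) does not.
assert read_cell(-1.5) == 1
assert read_cell(2.5) == 0

# At a pass voltage above 4 V, both conduct, which is why all unselected
# cells in a series string can act as pass transistors during read.
assert cell_conducts(-1.5, 4.5) and cell_conducts(2.5, 4.5)
```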
A flash memory cell array can be organized in several architectures. The two common architectures are NOR and NAND. In the NOR architecture, cells are arranged in parallel, similarly to the NMOS transistors in a NOR logic gate; in the NAND architecture, they are arranged in series. Read is performed faster (on the order of nanoseconds vs. microseconds) in the NOR architecture, whereas program and erase are performed faster in the NAND architecture.
NAND Flash cells are connected in series (resembling NMOS transistors in CMOS logic) and are organized in strings, each representing a bitline. (This is a drain-to-source connection.) Each bitline is connected to the drain or source of the first NMOS transistor of a string. All NMOS transistors that occupy the same position in their respective strings, referred to herein as residing in the same row, have their gates connected to the same wordline.
FIG. 1A shows a prior art NAND Flash memory cell array 100 with multiple transistors 102 that are arranged in an array. The memory array matrix consists of N wordlines (WL) 120(1)-120(N) crossing separate strings, where each string typically holds up to 64 cells, and M bitlines (BL) 110(1)-110(M), where M is typically up to 64 thousand.
Each NAND string is accessed through a drain-select line (DSL) such as DSL 134. All strings are connected to the source line through a source-select line (SSL) 132. When a cell is read, its wordline is fixed at 0 V, while the other cells in its string (bitline) are biased (through their wordlines) with high voltages (usually 4-5 V), so that they operate as pass transistors regardless of their Vt value.
The amount of charge stored in the floating gate can be quantized into multiple, non-overlapping, contiguous ranges of charge level, each of which is conveniently referred to herein as a level. A single-level cell (SLC) stores one program level and the erase level, hence one bit per cell. Multi-level cells (MLC) have more than one program level and can store multiple bits per cell. Switching to MLC architectures results in additional read latency, e.g. an additional 50 μs for the time during which cell contents are transferred to a temporary internal memory buffer (usually SRAM) during read. NAND read performance is determined by two components: the NAND array access time and the data transfer rate across the bus.
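The level-based reading of an MLC cell can be sketched as below. The reference voltages and the Gray-coded bit mapping are illustrative assumptions, not values from any specific device; the point is that locating the level requires comparisons against several tightly spaced reference voltages, which is why MLC read is slower than SLC read.

```python
# Hypothetical 2-bit MLC read: four Vt levels separated by three reference
# voltages (values are illustrative, not from a real device).
import bisect

REFERENCE_VOLTAGES = [1.0, 2.0, 3.0]
# Gray-coded bit pattern per level -- a common choice, assumed here.
LEVEL_TO_BITS = {0: (1, 1), 1: (1, 0), 2: (0, 0), 3: (0, 1)}

def read_mlc(vt: float) -> tuple:
    """Locate the cell's level by comparing Vt against each reference
    voltage. Each comparison corresponds to one sensing operation, so more
    levels mean more sensing steps and longer read latency."""
    level = bisect.bisect_left(REFERENCE_VOLTAGES, vt)
    return LEVEL_TO_BITS[level]

assert read_mlc(0.5) == (1, 1)   # below all references: erase level
assert read_mlc(2.5) == (0, 0)   # between the second and third references
```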
As NAND page size increases, latency grows, especially for small reads. The NAND array read time tR is at most 20-25 μs for SLC and typically at most 50 μs for MLC. The push for density results in slower NAND array access.
FIG. 1B illustrates a prior art NOR flash memory array and FIG. 1C illustrates a prior art NOR flash read circuit. The NOR Flash memory array 101 consists of N wordlines (WL) and M bitlines (BL). Each memory cell is connected both to a bitline and to the source line.
The read operation in NOR flash is commonly achieved by either of two methods: current sensing and voltage sensing. With current sensing the drain current is kept fixed, while with voltage sensing the gate voltage remains fixed. FIG. 1C describes current sensing. A flash memory cell 141 is placed on a specific bitline (BL), which has a parasitic capacitance Cbl 151. The bitline is connected via a column decoder 143 and an additional transistor 145 to a node that is connected to a load resistor 147 and to the positive input of comparator 149. A reference flash memory cell 142, whose bitline has capacitance Cbl 152, is connected via a column decoder 144 and an additional transistor 146 to a node that is connected to a load resistor 148 and to the negative input of comparator 149. During a precharge phase, CBL is charged. In the evaluation phase, CBL is discharged if VWL is above the threshold voltage of the flash memory cell, and retains the charge otherwise. If CBL was discharged, then TBIAS is open and the current through R1 is high; otherwise, the current is low. The logic level is determined by comparing the currents on the memory and reference sides. If the cell is programmed (for a multi-level cell: has a higher threshold voltage than VWL), the sensed information does not depend on the evaluation time, since CBL is not discharged. The errors resulting from a shortened evaluation time are therefore uni-directional in the case of NOR flash as well, and the speculative early sensing concept can be applied.
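The uni-directional nature of short-evaluation errors described above can be illustrated with a toy exponential-discharge model of the precharge/evaluate cycle. The precharge voltage, RC constant, and comparator trip point below are hypothetical numbers chosen only to make the behavior visible, not parameters of the circuit in FIG. 1C.

```python
# Toy model of the precharge/evaluate current-sensing read described above.
# All constants are hypothetical; only the uni-directional character of
# short-evaluation errors matters here.
import math

V_PRE, TAU, V_TRIP = 1.0, 10.0, 0.5  # precharge voltage, RC constant, trip point

def bitline_voltage(t_eval: float, cell_conducts: bool) -> float:
    """A conducting (erased) cell discharges CBL exponentially during the
    evaluation phase; a programmed cell leaves CBL at the precharge voltage
    regardless of how long the evaluation lasts."""
    if not cell_conducts:
        return V_PRE
    return V_PRE * math.exp(-t_eval / TAU)

def sense(cell_conducts: bool, t_eval: float) -> int:
    """Comparator decision: 1 if CBL fell below the trip point (erased)."""
    return 1 if bitline_voltage(t_eval, cell_conducts) < V_TRIP else 0

# With a long enough evaluation, both states read correctly.
assert sense(True, 30.0) == 1 and sense(False, 30.0) == 0
# With a too-short evaluation, only the erased (conducting) cell can be
# misread: a '1' may be sensed as '0', but never the reverse.
assert sense(True, 2.0) == 0 and sense(False, 2.0) == 0
```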
Dynamic random-access memory (DRAM) is a volatile memory component. Due to its performance, it is the technology of choice for computer memory today. A DRAM cell is a capacitor whose charged or discharged states represent the state of the stored binary digit. A DRAM memory array is organized in matrix form, consisting of rows (wordlines) and columns (bitlines). FIG. 12 illustrates a DRAM cell 1200 that is represented by a capacitor 1201, DRAM transistor 1202, bit line 1204, word line 1203 and a sense terminal 1205. FIG. 13 illustrates an array 1300 that includes twelve DRAM cells arranged in three rows (coupled to word lines WL1-WL3) and four columns (coupled to bit lines BL1-BL4). In a single row, each DRAM cell is connected to a unique column through an access device (typically a MOSFET transistor). Access devices can be switched on or off. If an access device is on, and all other access devices on the same column are off, the DRAM cell's content is read; otherwise it is ignored. All access devices on the same row are controlled by a row (wordline) signal.
The DRAM read process is as follows: the access device is turned on (while all other access devices on the same column are off), and the capacitor charge (if any) is discharged through the access device to the column associated with the DRAM cell. Each column is connected to a sense amplifier that translates the charge or current on the bitline into a logic signal of ‘0’ or ‘1’. The output signal is latched in a temporary buffer and may be transferred out of the memory chip according to the chip's control signals.
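The read sequence above can be sketched schematically. The class and method names here are illustrative, not an actual DRAM controller interface; the model only mirrors the select/sense/latch steps described in the text.

```python
# Schematic sketch of the DRAM read sequence described above; the class and
# method names are illustrative, not a real DRAM controller interface.

class DramColumn:
    """One bitline with its sense amplifier and output latch."""
    def __init__(self):
        self.charge = {}     # row index -> capacitor charged (True) or not
        self.latched = None  # temporary output buffer

    def write(self, row: int, bit: int) -> None:
        self.charge[row] = bool(bit)

    def read(self, row: int) -> int:
        # 1. The selected row's access device turns on; all other access
        #    devices on this column stay off.
        # 2. Any stored charge discharges onto the bitline, and the sense
        #    amplifier translates it into logic '0' or '1'.
        bit = 1 if self.charge.get(row, False) else 0
        # 3. The output is latched in a temporary buffer, ready to be
        #    transferred out under the chip's control signals.
        self.latched = bit
        return bit

col = DramColumn()
col.write(2, 1)
assert col.read(2) == 1 and col.read(0) == 0
assert col.latched == 0  # latch holds the most recently sensed bit
```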
Prior to read, the capacitor state can be either charged or discharged. If discharged, the read time (which consists of discharging the capacitor and translating the charge or current into a logic signal by the sense amplifier) can be shortened to zero. Otherwise, the read time depends on the capacitor's electrical parameters (area, capacitance, resistance, etc.) and on the electrical parameters of the bitline and sense amplifier.
Due to the manufacturing process, the electrical parameters may vary from one cell to another, and from one bitline to another, causing the required read time to vary from cell to cell. Typically, the manufacturer specifies the “worst-case” read time in order to guarantee valid data in every case.
If the time from turning on the access device to latching the output logic signal of the sense amplifier is shortened, the latched data might contain errors (depending on the electrical parameters of the cells).
The data errors resulting from shortened sensing (as described above) are always uni-directional. Only cells that store enough charge to represent logic ‘1’ are error-prone, since they might not be fully discharged during the shortened sensing.
Using an error detection code such as a Berger code, any number of errors resulting from shortened sensing can be detected. If another error detection code is used, the errors can be detected only with some probability.
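A Berger code appends, as a check symbol, the count of 0 bits in the data word; because every shortened-sensing error flips a ‘1’ to a ‘0’, any number of such flips raises the zero count and is guaranteed to be caught. The sketch below illustrates the principle on a data word only (check-bit storage and widths are simplified assumptions).

```python
# Sketch of Berger-code detection of the uni-directional (1 -> 0) errors
# produced by shortened sensing. Bit widths and storage of the check
# symbol are simplified for illustration.

def berger_encode(data_bits: list) -> tuple:
    """The Berger check symbol is the count of 0 bits in the data word."""
    return (list(data_bits), data_bits.count(0))

def berger_check(data_bits: list, check: int) -> bool:
    """Data is valid iff the recomputed zero count matches the check."""
    return data_bits.count(0) == check

word, check = berger_encode([1, 0, 1, 1, 0, 1])
assert berger_check(word, check)

# Shortened sensing can only turn 1s into 0s; any number of such flips
# increases the zero count, so the mismatch is always detected.
corrupted = [1, 0, 0, 1, 0, 0]   # two 1 -> 0 flips
assert not berger_check(corrupted, check)
```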
Since the process of error detection is short with respect to a DRAM data read, it can be used beneficially to increase read speed, namely, to output the data from a shortened read attempt whenever the data is valid.
If the data is not valid, a subsequent bitline sensing is performed at a later time to increase the probability of sensing valid data.
In the process of performing shortened read attempts, statistical information about row or cell sensing time can be gathered and used in future read attempts.
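One way such statistics might be gathered and reused is sketched below. The class, its policy (remember the shortest sensing time that succeeded per row, and lengthen the time after a failure), and all constants are hypothetical illustrations of the idea, not a disclosed implementation.

```python
# Hypothetical sketch of gathering per-row sensing-time statistics during
# shortened read attempts and reusing them for future attempts.

class AdaptiveReadTimer:
    def __init__(self, initial_time: float, step: float):
        self.initial_time = initial_time  # first shortened attempt duration
        self.step = step                  # how much to lengthen on failure
        self.best_time = {}               # row -> shortest time that succeeded

    def first_attempt(self, row: int) -> float:
        # Start from the shortest time previously seen to succeed on this row.
        return self.best_time.get(row, self.initial_time)

    def record(self, row: int, t: float, valid: bool) -> float:
        """Record the outcome of an attempt of duration t; return the
        duration to use for the next attempt."""
        if valid:
            self.best_time[row] = min(t, self.best_time.get(row, t))
            return t
        return t + self.step  # data invalid: lengthen the next sensing

timer = AdaptiveReadTimer(initial_time=5.0, step=5.0)
t = timer.first_attempt(0)            # shortened first attempt: 5.0
t = timer.record(0, t, valid=False)   # invalid -> retry with 10.0
timer.record(0, t, valid=True)        # 10.0 succeeded and is remembered
assert timer.first_attempt(0) == 10.0
```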
It is therefore the purpose of the current invention to expedite access to memory, thereby increasing overall performance, and to do so without compromising the correctness of the written or read data.