Mobile electronic devices, such as digital cameras, portable digital assistants, portable audio/video players and mobile terminals, continue to require mass storage memory, preferably non-volatile memory with ever increasing capacities and speed capabilities. For example, presently available audio players can have between 256 Mbytes and 40 Gigabytes of memory for storing audio/video data. Non-volatile memory, such as Flash memory and hard-disk drives, is preferred since data is retained in the absence of power, thus extending battery life.
Presently, hard disk drives have high densities and can store 40 to 160 Gigabytes of data, but are relatively bulky. However, Flash memory, also known as a solid-state drive, is popular because of its high density, non-volatility, and small size relative to hard disk drives. The advent of multi-level cells (MLC) further increases the Flash memory density for a given area relative to single level cells. Those of skill in the art will understand that Flash memory can be configured as NOR Flash, NAND Flash or any other type of Flash memory configuration. NAND Flash has higher density per given area due to its more compact memory array structure. For the purposes of further discussion, references to Flash memory should be understood as referring to any type of Flash device, such as, for example, NOR and NAND type Flash memory.
While existing Flash memory modules operate at speeds sufficient for many current consumer electronic devices, such memory modules likely will not be adequate for use in future devices where high data rates are desired. For example, a mobile multimedia device that records high definition moving pictures is likely to require a memory module with a programming throughput of at least 10 MB/s, which is not obtainable with current Flash memory technology with typical programming data rates of 7 MB/s. Multi-level cell Flash has a much slower rate of 1.5 MB/s due to the multi-step programming sequence required to program the cells.
The problem with many standard memory devices lies in their use of a parallel data interface for receiving and providing data. For example, some memory devices provide 8, 16 or 32 bits of data in parallel at an operating frequency of up to 30 MHz. Standard parallel data interfaces providing multiple bits of data in parallel are known to suffer from communication degrading effects, such as cross-talk, signal skew and signal attenuation, which degrade signal quality when the interfaces are operated beyond their rated operating frequency. In order to increase data throughput, a memory device having a serial data interface has been disclosed in commonly owned U.S. Patent Publication No. 20070076479, which receives and provides data serially at a frequency of, for example, 200 MHz. The memory device described in U.S. Patent Publication No. 20070076479 can be used in a system of memory devices that are serially connected to each other, as described in commonly owned U.S. Provisional Patent Application No. 60/902,003 filed Feb. 16, 2007, the content of which is incorporated herein by reference in its entirety.
FIG. 1A shows a system of a plurality of memory devices that are serially connected to each other, as described in U.S. Patent Publication No. 20070076479. Referring to FIG. 1A, a serial interconnection 5 includes a plurality of memory devices that are connected in series with a memory controller. The memory controller includes a system interface for receiving system commands and data from the system in which the serial interconnection is integrated, and provides read data to the system. In particular, Device 0 comprises a plurality of data input ports (SIP0, SIP1), a plurality of data output ports (SOP0, SOP1), a plurality of control input ports (IPE0, IPE1), and a plurality of control output ports (OPE0, OPE1). These data and control signals are sent to the memory device 5 from the memory controller. A second memory device (Device 1) comprises the same types of ports as Device 0. Device 1 is interconnected to Device 0. For example, Device 1 can receive data and control signals from Device 0. One or more additional devices may also be interconnected alongside Device 0 and Device 1 in a similar manner. A last device (e.g., Device 3) in the series-connection provides data and control signals back to the memory controller after a predetermined latency. Each memory device (e.g., Device 0, 1, 2, 3) outputs an echo (IPEQ0, IPEQ1, OPEQ0, OPEQ1) of IPE0, IPE1, OPE0, and OPE1 (i.e., control output ports) to the subsequent device. The signals can be passed from one device to a subsequent series-connected device. A single clock signal is provided to each of the plurality of series-connected memory devices.
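By way of illustration only, the pass-through behaviour of the series-connection of FIG. 1A can be sketched as follows. This is a minimal model under stated assumptions, not a description of the actual devices: only link 0 is modelled, every device is assumed to forward its inputs unchanged, and the one-clock-per-device latency (CLOCKS_PER_DEVICE) is a hypothetical figure that does not come from the description above. The function names are likewise hypothetical.

```python
# Hypothetical figure: clocks of delay added by each device in the chain.
CLOCKS_PER_DEVICE = 1


def echo_outputs(inputs):
    """Model one device: inputs SIP0/IPE0/OPE0 appear on SOP0/IPEQ0/OPEQ0."""
    return {
        "SOP0": inputs["SIP0"],
        "IPEQ0": inputs["IPE0"],
        "OPEQ0": inputs["OPE0"],
    }


def wire_to_next(outputs):
    """Model the board wiring: a device's outputs feed the next device's inputs."""
    return {
        "SIP0": outputs["SOP0"],
        "IPE0": outputs["IPEQ0"],
        "OPE0": outputs["OPEQ0"],
    }


def send_around_chain(controller_signals, num_devices):
    """Pass signals through every device and return what the controller sees."""
    signals = dict(controller_signals)
    for _ in range(num_devices):
        signals = wire_to_next(echo_outputs(signals))
    latency_clocks = num_devices * CLOCKS_PER_DEVICE
    return signals, latency_clocks


returned, latency = send_around_chain({"SIP0": 1, "IPE0": 1, "OPE0": 0}, 4)
```

In this sketch the signals returned to the controller are identical to those sent, delayed by a latency that grows linearly with the number of series-connected devices.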
FIG. 1B is a block diagram illustrating the core architecture of one of the memory devices shown in FIG. 1A. Memory device 10 includes a multiplicity of identical memory banks with their respective data, control and addressing circuits, such as memory bank A 12 and memory bank B 14, an address and data path switch circuit 16 connected to both memory banks 12 and 14, and identical interface circuits 18 and 20, associated with each memory bank for providing data to and for receiving data from the switch circuit 16. Memory banks 12 and 14 are preferably non-volatile memory, such as Flash memory, for example. Logically, the signals received and provided by memory bank 12 are designated with the letter “A”, while the signals received and provided by memory bank 14 are designated with the letter “B”. Similarly, the signals received and provided by interface circuit 18 are designated with the number “0”, while the signals received and provided by interface circuit 20 are designated with the number “1”. Each of the interface circuits 18 and 20 receives access data in a serial data stream, where the access data can include a command, address information and input data for programming operations, for example. In a read operation, each of the interface circuits provides output data as a serial data stream in response to a read command and address data. The memory device 10 further includes global circuits, such as a control interface 22 and status/ID register circuit 24, which provide global signals such as clock signal sclki and reset to the circuits of both memory banks 12 and 14 and the respective interface circuits 18 and 20. A further discussion of the aforementioned circuits now follows.
Memory bank 12 includes well known memory peripheral circuits such as sense amplifier and page buffer circuit block 26 for providing output data DOUT_A and for receiving input program data DIN_A, and row decoder block 28. Those of skill in the art will understand that block 26 also includes column decoder circuits. A control and predecoder circuit block 30 receives address signals and control signals via signal line ADDR_A, and provides predecoded address signals to the row decoders 28 and the sense amplifier and page buffer circuit block 26.
The peripheral circuits for memory bank 14 are identical to those previously described for memory bank 12. The circuits of memory bank B include a sense amplifier and page buffer circuit block 32 for providing output data DOUT_B and for receiving input program data DIN_B, a row decoder block 34, and a control and predecoder circuit block 36. Control and predecoder circuit block 36 receives address signals and control signals via signal line ADDR_B, and provides predecoded address signals to the row decoders 34 and the sense amplifier and page buffer circuit block 32. Each memory bank and its corresponding peripheral circuits can be configured with well known architectures.
In general operation, each memory bank is responsive to a specific command and address, and if necessary, input data. For example, memory bank 12 provides output data DOUT_A in response to a read command and a read address, and can program input data in response to a program command and a program address. Each memory bank can be responsive to other commands such as an erase command, for example.
In the example shown in FIG. 1B, path switch 16 is a dual port circuit which can operate in one of two modes for passing signals between the memory banks 12 and 14, and the interface circuits 18 and 20. First is a direct transfer mode where the signals of memory bank 12 and interface circuit 18 are passed to each other. Concurrently, the signals of memory bank 14 and interface circuit 20 are passed to each other in the direct transfer mode. Second is a cross-transfer mode where the signals of memory bank 12 and interface circuit 20 are passed to each other. At the same time, the signals of memory bank 14 and interface circuit 18 are passed to each other. A single port configuration of path switch 16 will be discussed later.
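The two modes of the dual port path switch 16 can be summarized, purely as a sketch, by the routing table below. The enumeration and the string labels are hypothetical names chosen for illustration; only the pairing of banks and interface circuits is taken from the description above.

```python
from enum import Enum


class TransferMode(Enum):
    DIRECT = "direct"
    CROSS = "cross"


def path_switch_routing(mode):
    """Return which memory bank each interface circuit is coupled to."""
    if mode is TransferMode.DIRECT:
        # Bank 12 <-> interface 18 and, concurrently, bank 14 <-> interface 20.
        return {"interface_18": "bank_12", "interface_20": "bank_14"}
    if mode is TransferMode.CROSS:
        # Bank 12 <-> interface 20 and, at the same time, bank 14 <-> interface 18.
        return {"interface_18": "bank_14", "interface_20": "bank_12"}
    raise ValueError(f"unknown transfer mode: {mode!r}")
```

In either mode both interface circuits remain connected, each to a different memory bank, so the two serial links can be active concurrently.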
As previously mentioned, interface circuits 18 and 20 receive and provide data as serial data streams. This reduces the pin-out requirements of the chip and increases the overall signal throughput at high operating frequencies. Since the circuits of memory banks 12 and 14 are typically configured for parallel address and data, converting circuits are required.
Interface circuit 18 includes a serial data link 40, input serial-to-parallel register 42, and output parallel-to-serial register 44. Serial data link 40 receives serial input data SIP0, an input enable signal IPE0 and an output enable signal OPE0, and provides serial output data SOP0, input enable echo signal IPEQ0 and output enable echo signal OPEQ0. Signals SIP0 and SIP1 are each a serial data stream which can include address, command and input data. Serial data link 40 provides buffered serial input data SER_IN0 corresponding to SIP0 and receives serial output data SER_OUT0 from output parallel-to-serial register 44. The input serial-to-parallel register 42 receives SER_IN0 and converts it into a parallel set of signals PAR_IN0. The output parallel-to-serial register 44 receives a parallel set of output data PAR_OUT0 and converts it into the serial output data SER_OUT0, which is subsequently provided as data stream SOP0. Output parallel-to-serial register 44 can also receive data from status/ID register 24 for outputting the data stored therein instead of the PAR_OUT0 data. Further details of this particular feature will be discussed later. Furthermore, serial data link 40 is configured to accommodate daisy chain cascading of the control signals and data signals with another memory device 10.
Serial interface circuit 20 is identically configured to interface circuit 18, and includes a serial data link 46, input serial-to-parallel register 48, and output parallel-to-serial register 50. Serial data link 46 receives serial input data SIP1, an input enable signal IPE1 and an output enable signal OPE1, and provides serial output data SOP1, input enable echo signal IPEQ1 and output enable echo signal OPEQ1. Serial data link 46 provides buffered serial input data SER_IN1 corresponding to SIP1 and receives serial output data SER_OUT1 from output parallel-to-serial register 50. The input serial-to-parallel register 48 receives SER_IN1 and converts it into a parallel set of signals PAR_IN1. The output parallel-to-serial register 50 receives a parallel set of output data PAR_OUT1 and converts it into the serial output data SER_OUT1, which is subsequently provided as data stream SOP1. Output parallel-to-serial register 50 can also receive data from status/ID register 24 for outputting the data stored therein instead of the PAR_OUT1 data. As with serial data link 40, serial data link 46 is configured to accommodate daisy chain cascading of the control signals and data signals with another memory device 10.
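The conversions performed by the input serial-to-parallel registers and the output parallel-to-serial registers can be illustrated with the following sketch. The 8-bit word width and the most-significant-bit-first ordering are assumptions made only for this example; the description above does not specify either, and the function names are hypothetical.

```python
def serial_to_parallel(bits, width=8):
    """Group a serial bit stream (e.g. SER_IN0) into parallel words (e.g. PAR_IN0).

    Assumes MSB-first ordering and a stream length that is a multiple of width.
    """
    if len(bits) % width != 0:
        raise ValueError("bit stream length must be a multiple of the word width")
    words = []
    for start in range(0, len(bits), width):
        word = 0
        for bit in bits[start:start + width]:
            word = (word << 1) | (bit & 1)  # shift in one bit per clock
        words.append(word)
    return words


def parallel_to_serial(words, width=8):
    """Flatten parallel words (e.g. PAR_OUT0) into a serial stream (e.g. SER_OUT0)."""
    bits = []
    for word in words:
        for shift in range(width - 1, -1, -1):  # MSB first
            bits.append((word >> shift) & 1)
    return bits


stream = [1, 0, 1, 0, 0, 1, 0, 1]
words = serial_to_parallel(stream)        # [0xA5]
round_trip = parallel_to_serial(words)    # same bits as stream
```

The two functions are inverses of each other under the stated assumptions, which mirrors the way data received on SIP0 can be presented to the memory bank in parallel and later returned on SOP0 as a serial stream.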
Control interface 22 includes standard input buffer circuits, and generates internal chip select signal chip_sel, internal clock signal sclki, and internal reset signal reset, corresponding to chip select (CS#), serial clock (SCLK) and reset (RST#), respectively. While signal chip_sel is used primarily by serial data links 40 and 46, reset and sclki are used by many of the circuits throughout memory device 10.
While the serial data interface provides performance advantages over parallel data interface architectures, these advantages can be offset by performance degradations in memory banks 12 and 14. More specifically, the push for increased memory density will adversely affect how quickly data can be sensed from the memory cells, especially NAND configured Flash memory cells. To illustrate this problem, a portion of a NAND configured Flash memory array of FIG. 1B is shown in FIG. 2.
Referring to FIGS. 1B and 2, memory bank 12 includes i sets of bitlines, where i is an integer number greater than 0, and each set includes an even bitline and an odd bitline. For example, bitline set 1 includes even bitline BL1_e and odd bitline BL1_o. Each bitline is connected to at least one NAND cell string, where each NAND cell string includes a plurality of non-volatile memory cells and access transistors connected in series between the respective bitline and a common source line CSL. The access transistors include a source select transistor for receiving a source select line signal SSL, and a ground select transistor for receiving a ground select line signal GSL. Connected serially between these two access transistors are a plurality of non-volatile memory cells, such as Flash memory cells. In the present example, there are 32 serially connected Flash memory cells, having gate terminals coupled to respective wordlines WL1 to WL32.
Sense amplifier and page buffer circuit block 26 includes i page buffer units 60, or one for each bitline set. Because the bitline pitch is narrow, a page buffer unit 60 is shared between the even and odd bitlines of a bitline set. Therefore, selection transistors receiving even and odd selection signals BSLe and BSLo are required for selecting one bitline of the set to be coupled to the page buffer unit 60. Each page buffer unit 60 senses and latches data from the bitlines, and those skilled in the art will understand that the page buffer also latches write data to be programmed. The NAND cell strings sharing common wordlines WL1-WL32 and the SSL and GSL lines are collectively referred to as a memory block, while the memory cells connected to one common wordline are referred to as a page. Those skilled in the art should understand how Flash read, program and erase operations are executed.
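The sharing of one page buffer unit 60 between the even and odd bitlines of a set can be sketched as below. The sketch assumes that the even and odd selection signals are mutually exclusive and are applied to all bitline sets at once; the function name and the single-letter selector argument are hypothetical.

```python
def coupled_bitlines(num_sets, select):
    """List the bitline coupled to each of the page buffer units.

    select is "e" when the even selection signal is active and "o" when the
    odd selection signal is active (assumed mutually exclusive).
    """
    if select not in ("e", "o"):
        raise ValueError("select must be 'e' (even) or 'o' (odd)")
    # Page buffer unit k serves bitline set k, i.e. BLk_e or BLk_o.
    return [f"BL{k}_{select}" for k in range(1, num_sets + 1)]


even_half = coupled_bitlines(4, "e")  # ['BL1_e', 'BL2_e', 'BL3_e', 'BL4_e']
odd_half = coupled_bitlines(4, "o")   # ['BL1_o', 'BL2_o', 'BL3_o', 'BL4_o']
```

Under these assumptions, the i page buffer units reach only half of the 2i bitlines at any one time, so the even and odd bitlines are accessed in separate operations.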
FIG. 3 is a circuit schematic of column select circuits of the sense amplifier and page buffer circuit block 26 for coupling data in the page buffer units 60 of FIG. 2 to data lines. The present example of FIG. 3 illustrates one possible logical decoding scheme, where a preset number of page buffers are associated with each of 16 datalines DL1 to DL16. In the present example, there are 16 identically configured dataline decoder circuits 70, one being coupled to each of datalines DL1 to DL16. The following description refers to the dataline decoder circuit 70 coupled to DL1. Dataline decoder circuit 70 includes 16 groupings of 32 page buffer units 60. In each grouping, the input/output terminal of each page buffer unit is coupled to a respective first stage n-channel pass transistor 72. All the first stage n-channel pass transistors are connected in parallel and controlled by first stage selection signals YA1 to YA32 to selectively couple one page buffer unit 60 to one second stage n-channel pass transistor 74. Since there is one second stage n-channel pass transistor 74 per grouping, there are a total of 16 second stage n-channel pass transistors 74 connected in parallel to DL1, each controlled by respective second stage selection signals YB1 to YB16. Because signals YA1 to YA32 and YB1 to YB16 are shared across all the dataline decoder circuits 70, the activation of one first stage selection signal and one second stage selection signal couples one page buffer unit 60 from each dataline decoder circuit 70 to a corresponding dataline.
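The two-stage decoding of FIG. 3 can be illustrated with the following sketch, which returns the page buffer unit coupled to each dataline for a given pair of active selection signals. The counts (32 first stage signals, 16 second stage signals, 16 datalines) come from the description above; the zero-based linear numbering of page buffer units within and across the dataline decoder circuits is an assumption made only for this example, as is the function name.

```python
YA_COUNT = 32   # first stage selection signals YA1..YA32
YB_COUNT = 16   # second stage selection signals YB1..YB16
DL_COUNT = 16   # datalines DL1..DL16

# Page buffer units served by one dataline decoder circuit: 16 groupings of 32.
BUFFERS_PER_DECODER = YA_COUNT * YB_COUNT  # 512


def selected_page_buffers(ya, yb):
    """Map active signals YA<ya> and YB<yb> to one page buffer index per dataline.

    Indices are zero-based and assume page buffers are numbered consecutively,
    grouping by grouping and decoder by decoder (an illustrative convention).
    """
    if not 1 <= ya <= YA_COUNT:
        raise ValueError("ya must be in 1..32")
    if not 1 <= yb <= YB_COUNT:
        raise ValueError("yb must be in 1..16")
    local_index = (yb - 1) * YA_COUNT + (ya - 1)
    return {
        f"DL{dl}": (dl - 1) * BUFFERS_PER_DECODER + local_index
        for dl in range(1, DL_COUNT + 1)
    }


first = selected_page_buffers(1, 1)    # DL1 -> 0, DL2 -> 512, ...
last = selected_page_buffers(32, 16)   # DL16 -> 8191
```

With these counts, one combination of YA and YB signals couples 16 page buffer units to the 16 datalines at a time, and 512 such combinations are needed to reach all 8192 page buffer units.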
In read, program verify and erase verify operations, the cell data in the selected page should be sensed and latched in their corresponding page buffer units 60. Column decoding then selects which page buffer units to couple to the datalines. Sensing is dependent on the cell current generated by a selected memory cell, and the cell current is dependent on the number of cells in the NAND cell string. In the example of FIG. 2, the cell current is typically less than 1 μA for a 32 cell NAND string manufactured with a 90 nm process technology. Unfortunately, the push to increase memory array density to lower device cost results in the addition of more memory cells per NAND cell string. As a result, this cell current will further decrease, thereby requiring more sensitive sensing circuits and/or longer sensing time. Further compounding this problem are the bitline RC delay due to the physical length of the bitline, and the junction capacitance of the NAND cell string as the number of cells per NAND cell string is increased. These physical changes, in combination with advanced manufacturing processes for reducing feature sizes, further exacerbate the cell current problem. This problem with cell current is well known, as demonstrated by June Lee et al., “A 90-nm CMOS 1.8-V 2-Gb NAND Flash Memory for Mass Storage Applications,” IEEE J. Solid-State Circuits, vol. 38, pp. 1934-1942, November 2003. A further problem related to using advanced manufacturing processes is yield, where long bitlines introduce process uniformity issues across process steps, thereby reducing the yield per wafer as the potential for defects increases.
One possible solution to this problem may be to limit the number of memory cells per NAND cell string, and divide large memory arrays into multiple memory banks. An advantage of having multiple memory banks is the capability of transferring data directly between the memory banks without having to transfer data out from the memory device. The disadvantage of using multiple memory banks is that each bank requires its own sense amplifier and page buffer circuit block 26, thereby increasing circuit overhead and chip area. The complex circuitry required for implementing direct bank to bank data transfer also consumes additional chip area.