Flash memory is a type of non-volatile memory in widespread use as mass storage for consumer electronics, such as digital cameras and portable digital music players. The density of a presently available Flash memory component, consisting of 2 stacked dies, can be up to 32 Gbits (4 GB), which is suitable for use in popular USB Flash drives, since the size of one Flash component is typically small.
The advent of 8-megapixel digital cameras and portable digital entertainment devices with music and video capabilities has spurred demand for ultra-high capacities to store large amounts of data, a demand which may not be met by a single Flash memory device. Therefore, multiple Flash memory devices are combined into a system to effectively increase the available storage capacity. For example, Flash storage densities of 20 GB may be required for such applications.
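The capacity figures above can be checked with a short calculation. The following is a minimal sketch only; the function names are hypothetical, and it assumes the simple conversion of 8 Gbits per GB implied by the 32 Gbit (4 GB) figure.

```python
import math

BITS_PER_BYTE = 8


def gbits_to_gbytes(gbits):
    """Convert a density in Gbits to GB (8 Gbits per GB)."""
    return gbits / BITS_PER_BYTE


def devices_needed(target_gbytes, device_gbits):
    """Smallest number of identical devices whose combined capacity
    reaches the target (hypothetical helper, ignores any overhead)."""
    return math.ceil(target_gbytes / gbits_to_gbytes(device_gbits))


# A 32 Gbit component holds 4 GB, so a 20 GB target needs 5 of them.
print(gbits_to_gbytes(32))     # 4.0
print(devices_needed(20, 32))  # 5
```

Under these assumptions, a 20 GB requirement exceeds the four memory devices shown on the single channel of FIG. 1.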
FIG. 1 is a block diagram of a prior art system 10 integrated with a host system 12. The prior art system 10 includes a memory controller 14 in communication with host system 12, and multiple non-volatile memory devices 16. The host system 12 includes a processing device such as a microcontroller, microprocessor, or a computer system. The prior art system 10 of FIG. 1 is organized to include one channel 18, with the memory devices 16 being connected in parallel to channel 18. Those skilled in the art should understand that the prior art system 10 can have more or fewer than four memory devices connected to it.
Channel 18 includes a set of common buses, which include data and control lines that are connected to all of its corresponding memory devices. Each memory device is enabled or disabled by a respective one of the chip select (enable) signals CE1#, CE2#, CE3# and CE4#, provided by memory controller 14. The “#” indicates that the signal is an active low logic level signal. Typically, at most one of the chip select signals is asserted at any one time. The memory controller 14 is responsible for issuing commands and data, via the channel 18, to a selected memory device in response to the operation of the host system 12. Read data output from the memory devices is transferred via the channel 18 back to the memory controller 14 and host system 12. Operation of the prior art system 10 can be asynchronous or synchronous. FIG. 1 illustrates an example of a synchronous system that uses a clock (CK), which is provided in parallel to each memory device 16 to synchronize data transfer on the channel 18. The prior art system 10 is generally said to include a multi-drop bus, in which the memory devices 16 are connected in parallel with respect to channel 18.
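The chip select behaviour described above can be illustrated with a small software model. This is a sketch under stated assumptions, not a description of any actual controller: the class and method names are hypothetical, and each CE# line is modelled as a boolean in which False (logic low) means the device is enabled.

```python
class MultiDropChannel:
    """Toy model of one channel with one active-low CE# line per device."""

    def __init__(self, num_devices=4):
        # True = logic high = device disabled, because CE# is active low.
        self.ce_n = [True] * num_devices

    def select(self, index):
        """Drive exactly one CE# line low and all the others high."""
        if not 0 <= index < len(self.ce_n):
            raise IndexError("no such memory device on this channel")
        self.ce_n = [i != index for i in range(len(self.ce_n))]

    def deselect_all(self):
        """Drive every CE# line high so that no device is enabled."""
        self.ce_n = [True] * len(self.ce_n)

    def selected(self):
        """Return the index of the enabled device, or None if none is."""
        active = [i for i, level in enumerate(self.ce_n) if not level]
        return active[0] if active else None


channel = MultiDropChannel()
channel.select(2)          # corresponds to driving CE3# low
print(channel.ce_n)        # [True, True, False, True]
print(channel.selected())  # 2
```

Because select() rewrites every line at once, the model never has more than one device enabled, which reflects the shared data and control lines of the multi-drop bus.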
In the prior art system 10, non-volatile memory devices 16 may be, but are not necessarily, substantially identical to each other, and are typically NAND flash memory devices. Those skilled in the art should understand that flash memory may be organized into banks, and that each bank may be organized into blocks to facilitate block erasure. Some commercially available NAND flash memory devices have two banks of memory.
There are specific issues that can adversely impact performance of the system. The structure of the prior art system 10 imposes physical performance limitations. A large number of parallel signal lines extend across the system, and the integrity of the signals they carry may be degraded by crosstalk, signal skew, and simultaneous switching noise (SSN). Input/output power consumption in such a system becomes an issue because each signal track between the flash controller and the flash memory devices is frequently charged and discharged for signaling. As the system clock frequency increases, this power consumption increases as well.
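The dependence of input/output power on clock frequency can be sketched with the standard first-order model for switching power, P = a * C * V^2 * f per signal line. This model and all of the numeric values below are assumptions chosen for illustration; they are not taken from the prior art system 10.

```python
def io_switching_power(num_lines, capacitance_f, voltage_v, freq_hz,
                       activity=0.5):
    """First-order switching power, in watts, of a set of signal tracks.

    num_lines     -- number of parallel signal tracks
    capacitance_f -- load capacitance of one track, in farads
    voltage_v     -- signaling voltage swing, in volts
    freq_hz       -- clock frequency, in hertz
    activity      -- assumed fraction of cycles in which a track toggles
    """
    return num_lines * activity * capacitance_f * voltage_v ** 2 * freq_hz


# Hypothetical values: 8 data lines, 10 pF per track, 3.3 V swing.
slow = io_switching_power(8, 10e-12, 3.3, 50e6)
fast = io_switching_power(8, 10e-12, 3.3, 100e6)
print(fast / slow)  # power scales linearly with clock frequency
```

In this model, power also grows with the number of tracks and with the capacitance of each track, so wider buses and longer tracks both add to the input/output power.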
There is also a practical limit to the number of memory devices which can be connected in parallel to the channel, since the drive capability of a single memory device is small relative to the loading of the long signal tracks. Furthermore, as the number of memory devices increases, more chip enable signals (CE#) are required, and CK may need to be routed to the additional memory devices, all of which require longer signal tracks to reach the memory devices. Clock performance issues due to extensive clock distribution, which are well known in the art, become an issue in large prior art systems with many memory devices 16. Therefore, for a prior art memory system to include a large number of memory devices, either the memory devices are spread across multiple channels or the operating frequency of the memory system is limited; either option involves compromises. A controller having multiple channels and additional chip enable signals increases the cost of the system. Otherwise, the system is limited to a small number of memory devices.
In the multi-drop prior art system 10 of FIG. 1, the data width of each memory device 16 must be the same. For example, if the data channel width is 32 bits, then each memory device 16 must be a x32 device. If an alternate multi-drop system has an 8 bit data channel width, then the x32 memory devices cannot be used, and different x8 memory devices need to be used instead. Accordingly, a memory device manufacturer will produce versions of the same memory device with different data widths in order to accommodate the possible system structures.
As consumer demand for smaller form factor products increases, manufacturers need to find ways to minimize the area or space occupied by semiconductor chips, such as those of the prior art system 10 of FIG. 1. Although each memory device chip can be small, the package encapsulating the chip may have a size largely determined by the number of package pins for coupling signals between the chip input/output pads and the printed circuit board (PCB) traces. Unfortunately, the prior art system 10 of FIG. 1 is not suited for applications requiring a minimized PCB area. Each memory device and the memory controller will occupy a larger PCB area with x8, x16 or even x32 data channel widths because the package size increases as the data width increases. If the data width is reduced to minimize the package size, then performance is adversely impacted since the aggregate memory system peak bandwidth is reduced.
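The trade-off between data width and peak bandwidth can be made concrete with a short calculation. This is a minimal sketch; the function name, the 40 MHz clock, and the assumption of one transfer per clock cycle are hypothetical and serve only to illustrate the relationship.

```python
def peak_bandwidth_mb_per_s(data_width_bits, clock_mhz,
                            transfers_per_clock=1):
    """Peak channel bandwidth in MB/s for a given data width and clock."""
    return data_width_bits * clock_mhz * transfers_per_clock / 8


# Same hypothetical 40 MHz clock, different data channel widths.
for width in (8, 16, 32):
    print(width, peak_bandwidth_mb_per_s(width, 40))
# 8 40.0
# 16 80.0
# 32 160.0
```

At a fixed clock frequency, narrowing the channel from x32 to x8 cuts the peak bandwidth to one quarter, which is the performance cost of the smaller package.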
It is, therefore, desirable to provide a high performance system which consumes a minimal amount of board area.