High-speed mass storage devices built with non-volatile memory chips are suitable for recording and playing back concurrent data/video streams under real-time conditions, e.g. for video productions in film and broadcast studio environments. Higher spatial resolutions, higher frame rates and uncompressed multi-stream recording for 3D productions are increasing the requirements on storage media bandwidth and processing power. For some years, non-volatile memories such as NAND flash devices have been used for recording digital video. To fulfil the special storage requirements of digital video production with off-the-shelf NAND flash devices in mobile embedded storage media, special handling of the data and of the NAND flash memories is required.
Writing data to state-of-the-art NAND flash memories requires special processing of the data flow because of the internal architecture of these memories. Such non-volatile memory devices are organised as an array of programmable and readable memory pages, each comprising a data block of some kilobytes in size. If data are to be written into a NAND flash memory device, a full page of some kilobytes (PAGESIZE) must be programmed. In addition, the flash device needs a non-negligible processing time for writing such a page into its internal memory array. During this programming time no other read/write commands can be executed on that flash device. Therefore the memory bus resources connecting the flash device with its controller remain unused most of the time. To optimise the utilisation of memory bus resources, it is known to connect multiple NAND flash devices to the same address/memory bus and to use them in an interleaved manner: the flash devices are processed one after the other, and while the first device is busy following a programming command, the memory bus resources are used to handle the other devices sharing the same memory bus. Manufacturers of NAND flash devices support this kind of processing by integrating multiple flash dies in one integrated circuit (IC), sharing a common external memory bus. Such interleaved processing is therefore feasible on a single IC. Depending on the time required for the programming/reading operations, the number of interleaved devices can be chosen such that the bandwidth of the controlling memory bus is used in an optimum manner. For example, current NAND flash devices may have an 8-bit memory bus as external interface that can be driven at 40 MHz. The memory bus resource then has a full bandwidth of approximately 40 MB/s, but the NAND flash device is written by programming operations on 2 KB pages, each of which may last up to 600 μs. For a single device this results in a sustained bandwidth of approximately 3.2 MB/s. In order to use the full 40 MB/s bandwidth of the memory bus resource, it would be necessary to connect 12 NAND flash devices to the memory bus and to use them in an interleaved manner.
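The arithmetic behind these figures can be sketched as follows. This is a minimal model, not taken from any device datasheet: it assumes that a 2 KB page is 2048 bytes, that 40 MB/s means 40·10^6 bytes per second, and that each device alternates between receiving one page over the bus and programming it for the worst-case 600 μs. The function names are illustrative only. Under these assumptions a single device sustains a little over 3.1 MB/s, close to the approximate figure given above, and 12 interleaved devices reach roughly 94% of the bus bandwidth.

```python
PAGE_SIZE = 2048      # bytes per page (assumption: "2 KB" = 2048 bytes)
BUS_RATE = 40e6       # bytes/s on an 8-bit bus driven at 40 MHz
T_PROG = 600e-6       # seconds, worst-case page programming time


def transfer_time(page_size=PAGE_SIZE, bus_rate=BUS_RATE):
    """Time the bus is occupied while one page is sent to a device."""
    return page_size / bus_rate


def sustained_bandwidth(n_devices, page_size=PAGE_SIZE,
                        bus_rate=BUS_RATE, t_prog=T_PROG):
    """Sustained write bandwidth in bytes/s for n interleaved devices.

    Each device accepts one page per (transfer + program) cycle, and
    the shared bus can carry at most one page per transfer time.
    """
    t_xfer = transfer_time(page_size, bus_rate)
    cycle = t_xfer + t_prog
    pages_per_second = min(n_devices / cycle, 1.0 / t_xfer)
    return pages_per_second * page_size


if __name__ == "__main__":
    for n in (1, 4, 8, 12, 13):
        mb_per_s = sustained_bandwidth(n) / 1e6
        print(f"{n:2d} device(s): {mb_per_s:5.2f} MB/s")
```

The `min` term captures the saturation point: adding devices helps only until the bus itself becomes the bottleneck.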
To provide bandwidths higher than a single memory bus can handle, data are written in parallel to multiple memory buses, whereby multiple flash pages on different flash devices are programmed concurrently. A corresponding structure of flash memories is shown in FIG. 1. A known controller 15 passes input data 10 to, or output data 10 from, Y parallel memory buses MB1 to MBY, to each of which X flash memory devices MDy.1 to MDy.X are connected. The internal structure shown within controller 15, however, is not part of the prior art and is explained below in connection with the invention.
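One possible way to distribute sequential pages over such an arrangement is sketched below. The striping order is an assumption made for illustration and is not stated in the text: consecutive pages are spread across the Y buses first, so that they can be programmed concurrently, and only then does the scheme advance to the next device on each bus. The function name `locate_page` is hypothetical.

```python
def locate_page(page_index, num_buses, devices_per_bus):
    """Map a sequential page index to a 1-based (y, x) pair.

    The pair identifies device MDy.x on memory bus MBy. Assumed
    striping order: buses first, then devices on each bus.
    """
    if page_index < 0:
        raise ValueError("page_index must be non-negative")
    if num_buses < 1 or devices_per_bus < 1:
        raise ValueError("at least one bus and one device are required")
    bus = page_index % num_buses
    device = (page_index // num_buses) % devices_per_bus
    return bus + 1, device + 1


if __name__ == "__main__":
    Y, X = 4, 3  # example values only
    for i in range(Y * X):
        y, x = locate_page(i, Y, X)
        print(f"page {i:2d} -> bus MB{y}, device MD{y}.{x}")
```

With this order, the first Y pages land on the first device of every bus, and the pattern repeats after X*Y pages.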
This kind of flash memory arrangement has the advantage that almost unlimited bandwidths can be provided for flash storage media. However, increasing the bandwidth also increases the amount of data that needs to be written coherently. Only once a block of Y*PAGESIZE of data is available can these data be programmed to the corresponding NAND flash memory devices on all Y parallel memory buses. To guarantee a specific minimum read or write bandwidth for input/output data 10, it is necessary to read/write the data in an interleaved manner as mentioned above. This implies the need to read/write sequential blocks of size X*Y*PAGESIZE.
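The two block sizes can be worked out with a short sketch. The helper names and the example values of X and Y are illustrative assumptions; only the products Y*PAGESIZE and X*Y*PAGESIZE come from the text.

```python
def block_sizes(page_size, num_buses, devices_per_bus):
    """Return (parallel_block, interleaved_block) in bytes.

    parallel_block    = Y * PAGESIZE      (one page on every bus)
    interleaved_block = X * Y * PAGESIZE  (one page on every device)
    """
    parallel_block = num_buses * page_size
    interleaved_block = devices_per_bus * parallel_block
    return parallel_block, interleaved_block


def padding_needed(data_length, block_size):
    """Bytes required to fill data_length up to a whole number of blocks."""
    return (-data_length) % block_size


if __name__ == "__main__":
    PAGESIZE, Y, X = 2048, 8, 12  # example values only
    parallel, interleaved = block_sizes(PAGESIZE, Y, X)
    print("Y*PAGESIZE   =", parallel, "bytes")
    print("X*Y*PAGESIZE =", interleaved, "bytes")
    print("padding for 1,000,000 bytes:",
          padding_needed(1_000_000, interleaved), "bytes")
```

For a 2048-byte page with Y = 8 and X = 12, the coherent block is 16384 bytes and the interleaved block is 196608 bytes, which shows how quickly the minimum sequential transfer unit grows with the bandwidth.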