The present invention generally relates to mass storage systems that comprise non-volatile memory devices and their use in personal computers and servers. In particular, this invention relates to the use of an array of multiple solid-state storage devices with a PCIe or similar system I/O interface, wherein an additional cross-link interface independent of the system bus can be used to synchronize the activity of all devices in the array. Additional functions of the cross-link interface may encompass sharing of parity data across the array.
Solid-state drives (SSDs) are in the process of replacing rotational magnetic hard disk drives, at least at the level of system drives in personal computers and in server environments. The currently prevailing non-volatile memory technology employed in SSDs is NAND flash memory, primarily because of its low cost per bit. NAND flash memory, however, has several functional disadvantages, such as limited endurance and data retention and relatively high initial access latency. Another drawback of NAND flash memory is that it cannot scale with future process technologies, since proximity effects in the form of electrical fields can easily alter the programming charge of the floating gates, and thereby alter the bit contents of the NAND flash memory cell. It is therefore conceivable that alternative memory technologies, for example, magnetoresistive random access memory (MRAM), phase change memory (PCM) or resistive memory (RRAM), may become the next non-volatile memory media of choice.
Current SSDs have been developed primarily as drop-in replacements for existing hard disk drives. This development was driven primarily by the need to facilitate market acceptance and easy migration from mechanical hard disk drives to the new solid-state media. Accordingly, the same interfaces and protocols are employed, using advanced technology attachment (ATA) protocol and command sets and adopting existing file systems, for example, Microsoft® Windows NT File System (NTFS), for NAND flash-based media. At the same time, compared to other memory technologies, NAND flash memory is limited by initial access latencies on the order of several hundred microseconds. Even though those access latencies are substantially shorter than those of conventional hard disk drives, they still allow use of a relatively slow flash controller. For example, in the case of a serial ATA (SATA) interface, the preferred practice is to use a SATA controller with a flash translation layer and several back-end channels for parallel access of multiple flash memory devices, or to combine several such arrangements with a PCIe-based RAID controller. Because of the inherent properties of NAND flash memory, there is currently little incentive for developing alternative access technologies.
It is conceivable that future non-volatile memory technologies will allow faster access of an array of solid-state memory media than current NAND flash memory. By extension, this means that system latencies will account for a larger share of the overall latency, and therefore alternative system interface technologies will become more attractive. For example, instead of translating logical block addresses into NAND flash memory pages with a flash translation layer (FTL) of an SSD controller, it may become standard to use a memory-based file system using a virtual address space similar to that already employed in system dynamic random access memory (DRAM). Depending on the performance characteristics of the memory technology used, it is assumed that stand-alone devices may function very well on existing bus interface technologies. However, if several devices are used in parallel, it is foreseeable that bus and protocol overheads such as handshaking and device arbitration may cause problems in the form of bus contention.
A similar development has occurred in the case of graphics adapters in order to split the workload over two or more different graphics processing units (GPUs). This technology was first introduced as Scan-Line Interleave (SLI) technology by 3Dfx Interactive, Inc., and is currently available in the market as Scalable Link Interface (SLI) through nVidia Corporation and AMD CrossFireX through Advanced Micro Devices, Inc. Briefly, in both cases, the data transfer to the graphics cards is still done exclusively via the system bus interface. However, up to four devices communicate with each other through a dedicated ancillary bus. For example, in the AMD CrossFireX product, the dedicated ancillary bus is connected to a CrossFireX compositor logic. In that particular situation it is critical that the individual graphics processing units are aware of each other's presence and detailed timing in order to avoid dropping frames, scrambling displayed images through out-of-order signaling to the monitor, or generating partially overlapping images that could lead to tearing and distortion of the displayed images. In general, all of the internal communication could theoretically be done via the system bus. However, it is generally accepted that doing so would result in bus congestion and unnecessary latencies in the cross-talk between the cards. Consequently, a valid signal between two or more compositors is necessary in order to enable AMD CrossFireX on the driver level.
Aside from signaling each card's presence to the others, the main benefit of the ancillary bus is to allow direct communication and synchronization of the different devices, thus not only shortening latencies but also relieving the system bus of timing signals that would tie up bandwidth. In a dual card configuration, the benefit of integrating both devices into a single logical unit may be merely ease of integration, but the improvement becomes more tangible with every additional device.
Advanced non-volatile memory technologies that are capable of faster load and store operations than NAND flash memory are currently emerging. Therefore, it appears that a solution is needed that is similar to that employed in graphics subsystems. Specifically, in the case of multiple storage cards used in an array similar to that currently known as RAID, it would be advantageous for different cards to communicate directly with each other, for example, to confirm filling of a cache line or commitment of a set of data to the memory array, or to distribute parity data across the different members of the array without congesting the system interface bus. Of particular interest would be to use locally generated error checking and correction (ECC) algorithms (also known as error correction codes), such as Reed-Solomon (R-S), Bose-Ray-Chaudhuri-Hocquenghem (BCH) or low density parity check (LDPC), to generate a checksum of the data during writes and then cross-reference the checksum on subsequent reads with a recalculated checksum of the data. In the interest of fail-over mechanisms, the checksum could be written, using a dedicated bus, to a different card than the one storing the data.
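By way of non-limiting illustration only, the write-time checksum and read-time cross-reference described above may be sketched in software as follows. The names StorageCard and CardArray, the choice of the next card in the array as the checksum holder, and the use of a CRC-32 value are assumptions made for this sketch and are not taken from the foregoing description. In particular, CRC-32 only detects errors and is used here merely as a stand-in for an R-S, BCH or LDPC code, and a plain in-memory assignment stands in for a transfer over the dedicated bus.

```python
import zlib


class StorageCard:
    """Hypothetical model of one card in the array (illustrative only)."""

    def __init__(self):
        self.data = {}       # block address -> payload bytes stored on this card
        self.checksums = {}  # (source card, address) -> checksum held for a peer


class CardArray:
    """Hypothetical array that keeps each checksum on a different card."""

    def __init__(self, num_cards):
        if num_cards < 2:
            raise ValueError("at least two cards are needed")
        self.cards = [StorageCard() for _ in range(num_cards)]

    def _peer(self, card_index):
        # Assumption: the checksum goes to the next card, wrapping around.
        return (card_index + 1) % len(self.cards)

    def write(self, card_index, address, payload):
        payload = bytes(payload)
        self.cards[card_index].data[address] = payload
        # Stand-in for sending the checksum to the peer over the dedicated bus.
        peer = self.cards[self._peer(card_index)]
        peer.checksums[(card_index, address)] = zlib.crc32(payload)

    def read(self, card_index, address):
        payload = self.cards[card_index].data[address]
        peer = self.cards[self._peer(card_index)]
        stored = peer.checksums[(card_index, address)]
        # Recalculate the checksum and cross-reference it with the stored one.
        if zlib.crc32(payload) != stored:
            raise IOError("checksum mismatch for card %d, block %d"
                          % (card_index, address))
        return payload


if __name__ == "__main__":
    array = CardArray(3)
    array.write(0, 7, b"example block")
    print(array.read(0, 7))
```

In this sketch, a block written to the first card has its checksum held by the second card, so that a later read from the first card raises an error if the stored payload no longer matches the checksum kept by its peer. An actual embodiment using an error correcting code could additionally attempt to repair the data rather than merely report the mismatch.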