Solid-state drives (SSDs) are rapidly becoming the main storage elements of modern datacenter infrastructure, replacing traditional storage devices such as hard disk drives (HDDs). SSDs offer low latency, high data read/write throughput, and reliable persistent storage of user data. Non-volatile memory express (NVMe) over fabrics (NVMe-oF) is an emerging technology that allows hundreds or thousands of SSDs to be connected over a fabric network such as Ethernet, Fibre Channel, or InfiniBand.
The NVMe-oF protocol enables remote direct-attached storage (rDAS), allowing a large number of NVMe SSDs to be connected to a remote host over the established fabric network. The NVMe-oF protocol also supports remote direct memory access (RDMA) to provide a reliable transport service that carries NVMe commands, data, and responses over the network. iWARP, RoCE v1, and RoCE v2 are examples of transport protocols that provide an RDMA service.
A data storage system using disaggregated data storage devices (e.g., NVMe-oF-compatible SSDs, herein also referred to as NVMe-oF SSDs, or eSSDs for short) can provide a large storage capacity to an application running on a host computer. The application can collect a large amount of data (big data) from the disaggregated data storage devices and analyze it.
Since the scale of big data processing is very large, the infrastructure needed to perform meaningful big data mining can be cost-prohibitive, requiring heavy computing resources, large system memories, a high-bandwidth network, and large, high-performance data storage devices for storing the big data. It would be desirable to offload some data processing and mining tasks from the host computer to the data storage devices and thereby minimize data movement from the data storage devices to the host computer.
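As a rough illustration of why such offloading reduces data movement, the following minimal sketch compares the bytes transferred when a filter runs on the host with the bytes transferred when the same filter runs on the storage device. It is a simulation under stated assumptions, not an NVMe-oF implementation: the `Record` type, the two filter functions, the 4096-byte record size, and the selection predicate are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical stored record; not part of any NVMe-oF interface."""
    key: int
    payload: bytes


def host_side_filter(records, predicate):
    """Move every record to the host, then filter on the host."""
    bytes_moved = sum(len(r.payload) for r in records)
    matches = [r for r in records if predicate(r)]
    return matches, bytes_moved


def device_side_filter(records, predicate):
    """Filter on the (simulated) storage device and move only the matches."""
    matches = [r for r in records if predicate(r)]
    bytes_moved = sum(len(r.payload) for r in matches)
    return matches, bytes_moved


# Assumed workload: 1000 records of 4096 bytes, of which 1 in 100 is selected.
records = [Record(key=i, payload=bytes(4096)) for i in range(1000)]


def wanted(record):
    return record.key % 100 == 0


host_matches, host_bytes = host_side_filter(records, wanted)
device_matches, device_bytes = device_side_filter(records, wanted)

print(f"host-side filtering moved   {host_bytes} bytes")
print(f"device-side filtering moved {device_bytes} bytes")
```

Both approaches return the same matching records; only the volume of data crossing the fabric differs, and the reduction scales with the selectivity of the filter.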