The end-to-end performance of IP networks for bulk data transfer may be limited by data copying overhead in end systems. Even when end systems are able to sustain the bandwidth of high-speed networks, copying overhead may limit their ability to carry out other processing tasks. Remote direct memory access (RDMA) is a protocol that may be utilized to avoid copying in network communication. The RDMA protocol may be useful for systems and/or protocols that transmit bulk data mixed with control information, such as network file system (NFS), common Internet file system (CIFS), hypertext transfer protocol (HTTP), or encapsulated device protocols such as Internet small computer systems interface (iSCSI). The Internet and IP networks may be utilized for buffer-to-buffer transfers, for example, in the form of file or block transfers, by utilizing a variety of protocols, for example, HTTP, file transfer protocol (FTP), NFS, and CIFS. These upper-layer protocols (ULPs) may be enabled to transmit data that are uninterrupted by the protocol or the network. Each ULP may have a plurality of ways of requesting and initiating data transfers. For example, one use of HTTP may be to transfer JPEG format graphic images from a web server to a web browser's address space.
The data may be placed in the correct memory buffer directly as it arrives from the network, avoiding the need to store the data and subsequently recopy it into the correct buffer after it has arrived. If the network interface card (NIC) can place data correctly in memory, memory bandwidth may be freed and the CPU cycles consumed by copying may be reduced. A number of mechanisms already exist to reduce copying overhead in the IP stack. Some of these mechanisms depend on fragile assumptions about the hardware and application buffers, while others involve ad hoc support for specific protocols and communication scenarios.
The RDMA protocol offers a solution that is simple, general, complete, and robust. The RDMA protocol introduces new control information into the communication stream that directs data movement to facilitate buffer-to-buffer transfers. Incorporating support for RDMA into network protocols may significantly reduce the cost of network buffer-to-buffer transfers. The RDMA protocol accomplishes exact data placement via a generalized abstraction at the boundary between the ULP and its transport, allowing an RDMA-capable NIC to recognize and steer payloads independently of the specific ULP. By using RDMA, the ULPs may gain efficient data placement without the need to program ULP-specific details into the NIC. The RDMA protocol speeds deployment of new protocols by not requiring the firmware or hardware on the NIC to be rewritten to accelerate each new protocol. To be effective, the receiving NIC may recognize the RDMA control information, and ULP implementations or applications may be modified to generate the RDMA control information. In addition, support for framing in the transport protocols may allow an RDMA-capable NIC to locate RDMA control information in the stream when packets arrive out of order.
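The role of the control information in data placement may be illustrated with a minimal sketch. The sketch assumes that each segment carries a buffer tag and a byte offset; the names Segment, buffer_tag, offset, and place are hypothetical and are not taken from any RDMA specification. Because every segment names its own destination, the payload may be written directly into the registered buffer even when segments arrive out of order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One tagged segment; all field names are hypothetical."""
    buffer_tag: int   # identifies a registered destination buffer
    offset: int       # byte offset of the payload within that buffer
    payload: bytes


def place(registered, seg):
    """Write seg.payload directly into its destination buffer, with no staging copy."""
    buf = registered[seg.buffer_tag]
    end = seg.offset + len(seg.payload)
    if seg.offset < 0 or end > len(buf):
        raise ValueError("segment falls outside the registered buffer")
    buf[seg.offset:end] = seg.payload


def demo_out_of_order():
    """Place three segments that arrive in a scrambled order."""
    registered = {7: bytearray(12)}   # tag 7 names a 12-byte buffer
    arrivals = [
        Segment(7, 8, b"IJKL"),
        Segment(7, 0, b"ABCD"),
        Segment(7, 4, b"EFGH"),
    ]
    for seg in arrivals:
        place(registered, seg)
    return bytes(registered[7])
```

In this sketch the placement logic never inspects the payload, which mirrors the point that the NIC need not be programmed with ULP-specific details.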
Direct memory access (DMA) is a technique that is widely used in high-performance I/O systems. DMA allows a device to directly read or write host memory across an I/O interconnect by sending DMA commands to the memory controller. No CPU intervention or copying is required. For example, when a host requests an I/O read operation from a DMA-capable storage device, the device may use a DMA write to place the incoming data directly into memory buffers that the host provides for that specific operation. Similarly, when the host requests an I/O write operation, the device may use a DMA read to fetch outgoing data from host memory buffers specified by the host for that operation.
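The pairing of host I/O operations with device DMA operations described above may be modeled in a few lines. The following is a simulation only, assuming host memory is a flat byte array; the classes HostMemory and StorageDevice and their methods are hypothetical names introduced for illustration and do not correspond to any real driver interface.

```python
class HostMemory:
    """Host memory modeled as a flat byte array (an assumption of this sketch)."""

    def __init__(self, size):
        self.cells = bytearray(size)

    def _check(self, addr, length):
        if addr < 0 or length < 0 or addr + length > len(self.cells):
            raise ValueError("DMA access outside host memory")

    def dma_write(self, addr, data):
        """Device-initiated write into host memory."""
        self._check(addr, len(data))
        self.cells[addr:addr + len(data)] = data

    def dma_read(self, addr, length):
        """Device-initiated read from host memory."""
        self._check(addr, length)
        return bytes(self.cells[addr:addr + length])


class StorageDevice:
    """Hypothetical DMA-capable storage device holding numbered blocks."""

    def __init__(self, memory):
        self.memory = memory
        self.blocks = {}

    def io_read(self, block_no, host_addr):
        # A host I/O read is served by a DMA write into the host's buffer.
        self.memory.dma_write(host_addr, self.blocks[block_no])

    def io_write(self, block_no, host_addr, length):
        # A host I/O write is served by a DMA read from the host's buffer.
        self.blocks[block_no] = self.memory.dma_read(host_addr, length)


def demo_dma():
    memory = HostMemory(64)
    device = StorageDevice(memory)
    memory.cells[0:5] = b"hello"                        # host fills an outgoing buffer
    device.io_write(block_no=3, host_addr=0, length=5)  # device fetches it
    device.io_read(block_no=3, host_addr=32)            # device places it at address 32
    return bytes(memory.cells[32:37])
```

The host only supplies buffer addresses; all data movement in the sketch is performed by the device side, which is the property that removes the CPU copy.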
The RDMA protocol specifies the division of long RDMA messages into multiple segments. These segments may reference virtually contiguous memory at their destination. Operating systems may prefer to allocate physically contiguous pages for virtually contiguous address ranges. With a conventional page buffer list (PBL) organization, an RDMA-enabled NIC (RNIC) may make multiple successive lookups to the PBL for pages that may be physically contiguous.
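The lookup cost described above may be made concrete with a small count. The sketch below assumes a 4096-byte page, a PBL represented as a list of physical page addresses, and one PBL lookup per page touched by each segment; the page size, the segment size, and the function names are all illustrative assumptions rather than properties of any particular RNIC.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def split_message(length, segment_size):
    """Return (offset, size) pairs for the segments of one message."""
    return [(off, min(segment_size, length - off))
            for off in range(0, length, segment_size)]


def pages_touched(offset, size, page_size=PAGE_SIZE):
    """PBL indices a segment touches; each index costs one lookup here."""
    first = offset // page_size
    last = (offset + size - 1) // page_size
    return list(range(first, last + 1))


def total_lookups(length, segment_size, page_size=PAGE_SIZE):
    """Count per-page PBL lookups over every segment of a message."""
    return sum(len(pages_touched(off, size, page_size))
               for off, size in split_message(length, segment_size))


def physically_contiguous(pbl, page_size=PAGE_SIZE):
    """True if each page in the PBL directly follows the previous one."""
    return all(b - a == page_size for a, b in zip(pbl, pbl[1:]))


def demo_lookups():
    pbl = [0x10000, 0x11000]                  # two adjacent physical pages
    length = len(pbl) * PAGE_SIZE             # an 8192-byte message
    return physically_contiguous(pbl), total_lookups(length, 1024)
```

Under these assumptions, an 8192-byte message sent in 1024-byte segments costs eight lookups even though the destination is a single physically contiguous range of two pages.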
Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of skill in the art, through comparison of such systems with some aspects of the present invention as set forth in the remainder of the present application with reference to the drawings.