Included by reference herein, in their entirety and for all purposes, are the following microfiche appendices:
Appendix A Information Technology - SCSI Architecture Model - 2 (SAM-2) (2 Sheets Microfiche, 99 Frames)
Appendix B SCSI/TCP (SCSI Over TCP) (1 Sheet Microfiche, 48 Frames)
Appendix C A Common Internet File System (CIFS/1.0) Protocol - Preliminary Draft (2 Sheets Microfiche, 123 Frames)
1. Field of the Invention
The present invention relates to data transfer techniques, in particular DMA techniques for use in internetworking.
2. Description of the Related Art
Direct memory access (DMA) is a well-known method of moving data between a disk or other storage system and memory by direct transfer without first copying it into processor memory.
Various types of input/output (I/O) access have been provided over computer networks for many years. These systems, which typically use technologies such as disk file or tape systems, have suffered from the overhead of the network protocol processing needed to read and copy the data from the source system, re-format the copy, and transmit the reformatted data to the receiving system. At a minimum, prior data transfers across networks have typically required copying the data in order to move it to another location after reception.
As networks move to ever-higher data rates, from megabits per second (Mbps) to gigabits per second (Gbps) and beyond, network speed has made the centralization of storage at remote sites more feasible. However, such storage centralization and its data transfer requirements have exposed the extra memory copies required by conventional network communication protocol implementations as a significant and unacceptable cost.
Networked storage data transfers are highly desired by users of storage systems. Utilizing current networking protocols in these data transfers, however, incurs high overhead costs because the endpoint in the network transfer is forced to make an extra copy of some or all of the data. As the number of blocks received per second increases, the copying delay, and thus the overhead required to handle each block, increases dramatically because each copy in a chain of copies grows in size.
To date, the response to the problem of unacceptable overhead in network remote DMA (RDMA) has been to invent entirely new protocol architectures. The logic behind these new architectures, which include Fibre Channel, NGIO, Future I/O, System I/O, and InfiniBand, has been to re-engineer the entire communications protocol to focus specifically on the RDMA task. These new architectures have also been justified by citing unspecified "performance issues" with existing protocol suites and, in particular, the TCP/IP protocol suite.
What is needed is a remote direct memory access technique that leverages existing protocol architectures in a way that greatly reduces the amount of data copying needed to transfer large blocks of data across the network. Such an RDMA technique must also avoid (or at least minimize) modifications to the installed network hardware and software base.
The present invention is a shim protocol laid atop an existing network data transfer protocol, in particular TCP, but logically underneath the higher level disk and file access protocols. The shim protocol specifies the portion of the data packet to be transferred to a separate area of memory, such as an application layer buffer. The protocol also identifies the area of memory into which the data should be delivered, together with a data ID, a data start, a data length or end, and flag bits. While this invention can be embodied in an adaptation of the well-known TCP protocol, it is not necessarily limited to implementation within the TCP protocol, but may be used in conjunction with other protocols and variations on conventional protocols.
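By way of illustration only, the shim fields named above can be sketched as a fixed-size header. The field widths, their ordering, the network byte order, and the names `ShimHeader`, `buffer_id`, `SHIM_FORMAT`, and `SHIM_SIZE` are assumptions made for this sketch; the description above names the fields but does not fix their encoding.

```python
import struct
from dataclasses import dataclass

# Hypothetical wire layout (not specified by the text above):
# buffer_id, data_id, data_start, data_length as 32-bit unsigned integers,
# followed by 16 flag bits, all in network byte order with no padding.
SHIM_FORMAT = "!IIIIH"
SHIM_SIZE = struct.calcsize(SHIM_FORMAT)


@dataclass(frozen=True)
class ShimHeader:
    buffer_id: int    # identifies the memory area that should receive the data
    data_id: int      # identifies the data being transferred
    data_start: int   # where the data begins (interpretation assumed)
    data_length: int  # number of data bytes described by this header
    flags: int        # flag bits; their meanings are not defined here

    def pack(self) -> bytes:
        """Serialize the header into its assumed wire form."""
        return struct.pack(
            SHIM_FORMAT,
            self.buffer_id,
            self.data_id,
            self.data_start,
            self.data_length,
            self.flags,
        )

    @classmethod
    def unpack(cls, raw: bytes) -> "ShimHeader":
        """Parse a header from the first SHIM_SIZE bytes of raw."""
        if len(raw) < SHIM_SIZE:
            raise ValueError("buffer too short for a shim header")
        return cls(*struct.unpack(SHIM_FORMAT, raw[:SHIM_SIZE]))
```

A sender would prepend `ShimHeader(...).pack()` to the data it describes, and a receiver would call `ShimHeader.unpack` on the incoming bytes before deciding where the payload belongs.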
In one embodiment of the present invention, a network interface device implements a transport protocol including the RDMA shim protocol. As will be made apparent below, the shim protocol of the present invention can be implemented using option fields added to (or already present in) an existing transport protocol. Drivers within the device transmit packets containing an RDMA description according to the high level or overlying protocol described at the shim layer. On reception of a packet specifying RDMA, the receiving device is able to deliver the data directly into the correct memory area or buffer as is commonly performed by conventional, local DMA operations.
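The receive-side behavior described in this embodiment can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the `buffers` registry keyed by a buffer identifier and the function name `deliver` are hypothetical, and a real network interface device would perform the placement in hardware or in a driver.

```python
def deliver(buffers: dict, buffer_id: int, data_start: int, payload: bytes) -> int:
    """Place payload directly into the registered buffer named by buffer_id.

    buffers is a hypothetical registry mapping buffer IDs to bytearray
    objects that stand in for application memory areas. Returns the number
    of bytes written.
    """
    target = buffers[buffer_id]
    end = data_start + len(payload)
    # Refuse writes that would fall outside the registered memory area.
    if data_start < 0 or end > len(target):
        raise ValueError("payload does not fit in the registered buffer")
    # Write through a memoryview so the bytes land in the final buffer
    # without first being assembled in an intermediate staging buffer.
    memoryview(target)[data_start:end] = payload
    return len(payload)
```

For example, with `buffers = {3: bytearray(8)}`, calling `deliver(buffers, 3, 2, b"abcd")` writes four bytes starting at offset 2 of buffer 3 and leaves the rest of that buffer untouched.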
In some embodiments of the present invention, the RDMA shim protocol is implemented with TCP options specifically introduced to enable RDMA and thus reduce the overhead of transferring and receiving data with a TCP-based protocol such as NFS or HTTP. Use of a TCP option technique enables the construction of simple hardware accelerators to copy data directly from the incoming packet into application memory buffers thus avoiding expensive copies within the protocol stack. Alternatively, software techniques may be used to perform direct copying into the application memory space, for instance a copy into an application layer buffer.
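One way to picture the TCP option approach is a kind/length/value option carried in the TCP header, which a receiver scans for before placing the payload. The sketch below is illustrative only. The option kind 253 is one of the values reserved for experimental TCP options and is used here as a placeholder, since the text above assigns no option number; the body layout (`buffer_id`, `data_start`, `data_length`) and the function names are likewise assumptions.

```python
import struct

RDMA_OPTION_KIND = 253       # placeholder: experimental TCP option kind
RDMA_OPTION_BODY = "!IIH"    # hypothetical body: buffer_id, data_start, data_length
RDMA_OPTION_LEN = 2 + struct.calcsize(RDMA_OPTION_BODY)  # kind + length + body


def build_rdma_option(buffer_id: int, data_start: int, data_length: int) -> bytes:
    """Encode the hypothetical RDMA option as kind, length, value."""
    body = struct.pack(RDMA_OPTION_BODY, buffer_id, data_start, data_length)
    return bytes([RDMA_OPTION_KIND, RDMA_OPTION_LEN]) + body


def find_rdma_option(options: bytes):
    """Scan a TCP options field; return the decoded RDMA fields or None."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:            # End of Option List
            break
        if kind == 1:            # No-Operation: a single padding byte
            i += 1
            continue
        if i + 1 >= len(options):
            break                # truncated option, stop parsing
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                # malformed length, stop parsing
        if kind == RDMA_OPTION_KIND and length == RDMA_OPTION_LEN:
            return struct.unpack(RDMA_OPTION_BODY, options[i + 2:i + length])
        i += length              # skip options this sketch does not handle
    return None
```

A segment without the option yields `None`, in which case the receiver would fall back to ordinary TCP delivery; a segment with the option tells the receiver which buffer and offset the payload should be copied into.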