1. Technical Field
The present invention relates generally to communication protocols between a host computer and an input/output (I/O) device. More specifically, the present invention provides a hardware implementation for offloading management of a receive queue. In particular, the present invention provides a mechanism by which work requests are turned into work queue entries (WQEs) and are passed from Upper Layer Protocol (ULP) software (e.g., sockets) to an Internet Protocol (IP) Suite Offload Engine (IPSOE). The present invention also provides a mechanism by which completed WQEs are passed back to the ULP software. The present invention also provides a mechanism for supporting Selective Acknowledgements. Finally, the present invention provides a mechanism by which an IPSOE can be shared between virtual hosts of a single physical host.
2. Description of Related Art
In an Internet Protocol (IP) network, the software provides a message passing mechanism that can be used to communicate with input/output devices, general purpose computers (hosts), and special purpose computers. The message passing mechanism consists of a transport protocol, an upper layer protocol, and an application programming interface. The key standard transport protocols used on IP networks today are the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP). TCP provides a reliable service and UDP provides an unreliable service. In the future the Stream Control Transmission Protocol (SCTP) will also be used to provide a reliable service. Processes executing on devices or computers access the IP network through upper layer protocols, such as Sockets, iSCSI, and Direct Access File System (DAFS).
Unfortunately, the TCP/IP software consumes a considerable amount of processor and memory resources. This problem has been covered extensively in the literature (see J. Kay, J. Pasquale, “Profiling and reducing processing overheads in TCP/IP”, IEEE/ACM Transactions on Networking, Vol. 4, No. 6, pp. 817-828, December 1996; and D. D. Clark, V. Jacobson, J. Romkey, H. Salwen, “An analysis of TCP processing overhead”, IEEE Communications Magazine, Vol. 27, No. 6, pp. 23-29, June 1989). In the future the network stack will continue to consume excessive resources for several reasons, including: increased use of networking by applications; use of network security protocols; and underlying fabric bandwidths that are increasing at a higher rate than microprocessor and memory bandwidths. To address this problem, the industry is offloading the network stack processing to an IP Suite Offload Engine (IPSOE).
There are two offload approaches being taken in the industry. The first approach uses the existing TCP/IP network stack, without adding any protocols. This approach can offload TCP/IP to hardware, but unfortunately does not remove the need for receive side copies. As noted in the papers above, copies are one of the largest contributors to central processing unit (CPU) and memory bandwidth utilization. To remove the need for copies, the industry is pursuing the second approach, which consists of adding Framing, Direct Data Placement (DDP), and Remote Direct Memory Access (RDMA) over the TCP and SCTP protocols. The IP Suite Offload Engine (IPSOE) required to support these two approaches is similar; the key difference is that in the second approach the hardware must support the additional protocols.
The IPSOE provides a message passing mechanism that can be used by sockets, Internet Small Computer System Interface (iSCSI), Direct Access File System (DAFS), and other Upper Layer Protocols (ULPs) to communicate between nodes. Processes executing on host computers, or devices, access the IP network by posting send/receive messages to send/receive work queues on an IPSOE. These processes are also referred to as “consumers”.
The send/receive work queues (WQ) are assigned to a consumer as a queue pair (QP). The messages can be sent over four different transport types: traditional TCP, RDMA TCP, UDP, and SCTP. Consumers retrieve the results of these messages from a completion queue (CQ) through IPSOE send and receive work completion (WC) queues. The source IPSOE takes care of segmenting outbound messages and sending them to the destination. The destination IPSOE takes care of reassembling inbound messages and placing the inbound messages in the memory space designated by the destination's consumer. These consumers use IPSOE verbs to access the functions supported by the IPSOE. The software that interprets verbs and directly accesses the IPSOE is known as the IPSO interface (IPSOI).
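By way of illustration only, the queue pair model described above can be sketched in software as follows. This is a minimal sketch under stated assumptions, not the claimed hardware implementation: the names QueuePair, WorkQueueEntry, WorkCompletion, post_receive, poll_completion, segment, and deliver are hypothetical and do not correspond to any actual IPSOE verb. The sketch shows a consumer posting a receive work request, a source splitting an outbound message into segments, and a destination placing those segments into the consumer's designated buffer before reporting a work completion on the completion queue.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WorkQueueEntry:
    """Hypothetical receive WQE: names the consumer buffer for inbound data."""
    wr_id: int
    buffer: bytearray


@dataclass
class WorkCompletion:
    """Hypothetical work completion reported back to the consumer."""
    wr_id: int
    byte_len: int


@dataclass
class QueuePair:
    """Send and receive work queues assigned to one consumer, plus its CQ."""
    send_queue: deque = field(default_factory=deque)
    receive_queue: deque = field(default_factory=deque)
    completion_queue: deque = field(default_factory=deque)

    def post_receive(self, wr_id: int, length: int) -> WorkQueueEntry:
        # Turn a work request into a WQE and place it on the receive queue.
        wqe = WorkQueueEntry(wr_id, bytearray(length))
        self.receive_queue.append(wqe)
        return wqe

    def poll_completion(self):
        # Return the oldest work completion, or None if the CQ is empty.
        return self.completion_queue.popleft() if self.completion_queue else None


def segment(message: bytes, mss: int):
    """Source side: split an outbound message into (offset, payload) segments."""
    return [(off, message[off:off + mss]) for off in range(0, len(message), mss)]


def deliver(qp: QueuePair, segments) -> WorkQueueEntry:
    """Destination side: reassemble into the oldest receive WQE and complete it."""
    wqe = qp.receive_queue.popleft()
    total = 0
    for offset, payload in segments:
        end = offset + len(payload)
        if end > len(wqe.buffer):
            raise ValueError("segment exceeds the posted receive buffer")
        wqe.buffer[offset:end] = payload  # place data at its offset
        total += len(payload)
    qp.completion_queue.append(WorkCompletion(wqe.wr_id, total))
    return wqe


# Usage: post a 16-byte receive, then deliver a 15-byte message in 4-byte segments.
qp = QueuePair()
qp.post_receive(wr_id=7, length=16)
wqe = deliver(qp, segment(b"hello, offload!", mss=4))
wc = qp.poll_completion()
```

In this sketch the work completion carries only the work request identifier and the number of bytes placed; an actual completion would be expected to carry additional status, which is omitted here.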
Today the host CPU performs most IP suite processing. IP Suite Offload Engines offer a higher performance interface for communicating with other general purpose computers and I/O devices. Data sends or receives through the IPSOE require that the CPU either copy data from one memory location to another or register the memory so that the IPSOE can directly access the memory region. Each of these options requires significant CPU resources, with the memory registration option being preferred for large memory transfers. However, as network speeds increase, the amount of CPU resources required will increase. A simple mechanism is needed to implement a Receive Queue in the IPSOE and perform RDMA, DDP, framing, and TCP/IP processing in the IPSOE. The mechanism needs to maintain all RDMA, DDP, framing, TCP, IP, and Ethernet state in the IPSOE. It must also provide the necessary protection to support out of user space Receive Queue operations. The present invention also provides a mechanism for supporting Selective Acknowledgements. Finally, the present invention provides a mechanism by which an IPSOE can be shared between virtual hosts of a single physical host.