The present invention is directed to the communication of data within a network, especially a network of computing devices.
An important factor that affects the speed of processing within a computer network is the speed at which data is communicated between computers. Data communication speed is especially important in more complex networks in which clusters of smaller networks of computers are linked to each other and share certain common resources. The speed at which data is communicated in such networks impacts the availability of common resources, in turn affecting processing speed and potentially the integrity of the data available to the computers in the network.
Conventionally, data is transferred between the nodes of a network in whatever quantity is requested by a requesting agent, such as an upper layer protocol. In some networks, when the requested quantity is a unit size quantity of data, the data is transferred to the network relatively quickly because predetermined procedures exist for handling transfers in unit size quantities. However, when the requested quantity is not a unit size, the available procedures for handling such transfers may cause the data to be transferred to the network more slowly. Non-unit size data transfers thus contribute to latency, which can be considered idle or wasted time for the system while processing at a receiving node awaits receipt of data from another node.
In a network, latency is the amount of time it takes to transmit and receive a user message having zero length. A zero length message still carries a protocol header, so the actual number of bytes transferred is greater than zero, but the header has fewer bytes than a cache line, which is typically 128 bytes. Bandwidth, on the other hand, is the amount of data that can be transmitted and received per unit of time. Bandwidth is particularly important for the transfer of data between devices, e.g., nodes of a network.
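As a rough illustration of how latency and bandwidth interact, the following Python sketch uses a common first-order model in which the time to transfer a message is the zero-length-message latency plus the payload size divided by the bandwidth. The model, the function names `transfer_time` and `effective_bandwidth`, and the sample figures (5 microseconds of latency, 1 GB per second of bandwidth) are assumptions made for illustration only; none of them is taken from the network described here.

```python
def transfer_time(payload_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimate the time to move a payload under an assumed linear model.

    latency_s is the time for a zero-length user message; the payload then
    adds payload_bytes / bandwidth_bytes_per_s on top of it.
    """
    if payload_bytes < 0:
        raise ValueError("payload size cannot be negative")
    return latency_s + payload_bytes / bandwidth_bytes_per_s


def effective_bandwidth(payload_bytes, latency_s, bandwidth_bytes_per_s):
    """Payload bytes delivered per second once latency is accounted for."""
    total = transfer_time(payload_bytes, latency_s, bandwidth_bytes_per_s)
    return payload_bytes / total


# Hypothetical figures: 5 microseconds of latency, 1 GB/s of raw bandwidth.
LATENCY_S = 5e-6
BANDWIDTH = 1e9

for size in (0, 109, 128, 4096):
    t = transfer_time(size, LATENCY_S, BANDWIDTH)
    bw = effective_bandwidth(size, LATENCY_S, BANDWIDTH)
    print(size, t, bw)
```

Under this assumed model, small messages are dominated by latency, so the bandwidth they observe is far below the raw figure. Any extra delay added to a small transfer therefore shows up directly as lower apparent bandwidth.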
Data transfers in a network can be categorized as either memory transfers or input/output (I/O) transfers. In either case, a common system or network bus may be used to route the data. Since a bus is a shared resource, latency on the bus affects the performance of every device that shares it.
FIG. 1 illustrates a data transfer method provided by a prior art computer network. For simplicity, only two nodes 210 and 220 of the network are illustrated. Nodes 210 and 220 have associated network adapters 231 and 232, respectively, and user buffers 241 and 242, respectively. An upper level protocol on the first node 210 directs data transfers from a user buffer 241 on the first node 210 to a user buffer 242 on the second node 220. This is accomplished using a data staging buffer 251 managed by network adapter 231 on the first node 210 to transfer the data to another like data staging buffer 252 managed by network adapter 232 on the second node 220. Similarly, an upper level protocol on the second node 220 directs data transfers from a user buffer 242 on the second node 220 to a user buffer 241 on the first node 210. In like manner to the above-described transfer, this is also accomplished using the data staging buffer 252 managed by network adapter 232 on the second node 220, and using the data staging buffer 251 managed by network adapter 231 on the first node 210. In the particular arrangement shown, data staging buffer 251 includes one sending buffer 261 having a first-in-first-out (FIFO) organization (hereinafter “SEND FIFO”), and one receiving buffer 271 having a FIFO organization (hereinafter “RECEIVE FIFO”). Likewise, the data staging buffer 252 of node 220 includes a SEND FIFO buffer 262 and a RECEIVE FIFO buffer 272.
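The arrangement of FIG. 1 can be modeled loosely in a few lines of Python. This is only a sketch under stated assumptions: the class names `StagingBuffer` and `Node` and the function `transfer` are hypothetical, the adapter hardware is reduced to moving one item between queues, and the receive-side path (RECEIVE FIFO to user buffer) is assumed to mirror the send side, which the description above does not spell out.

```python
from collections import deque


class StagingBuffer:
    """Data staging buffer with one SEND FIFO and one RECEIVE FIFO."""

    def __init__(self):
        self.send_fifo = deque()
        self.receive_fifo = deque()


class Node:
    """A network node with a user buffer and an adapter-managed staging buffer."""

    def __init__(self, name):
        self.name = name
        self.user_buffer = bytearray()
        self.staging = StagingBuffer()


def transfer(src, dst, nbytes):
    """Copy nbytes from src's user buffer to dst's user buffer via the FIFOs."""
    data = bytes(src.user_buffer[:nbytes])
    src.staging.send_fifo.append(data)            # user buffer -> SEND FIFO
    packet = src.staging.send_fifo.popleft()      # adapter drains SEND FIFO
    dst.staging.receive_fifo.append(packet)       # arrives in peer RECEIVE FIFO
    received = dst.staging.receive_fifo.popleft()
    dst.user_buffer.extend(received)              # RECEIVE FIFO -> user buffer
    return len(received)


node_210 = Node("210")
node_220 = Node("220")
node_210.user_buffer.extend(b"x" * 109)
moved = transfer(node_210, node_220, 109)
print(moved, len(node_220.user_buffer))
```

The same `transfer` function serves both directions, reflecting the symmetric description of nodes 210 and 220.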
When a particular amount of data smaller than a standard size is to be transferred from node 210 to node 220, for example, when the data is smaller than the size of a cache line, the upper level protocol (ULP) operating on node 210 requests that the particular amount of data be transferred to node 220. For example, assume that the particular amount of data to be transferred is 109 bytes, while the cache-line size is 128 bytes. As part of the transfer process, the requested amount (109 bytes) of data is copied from the user buffer 241 to the SEND FIFO buffer 261. The network adapter 231 then copies the requested amount of data from the SEND FIFO buffer 261 into a memory 265 of its own. The network adapter 231 may begin copying the data to its memory 265 before all of the data has been copied from the user buffer 241 to the SEND FIFO buffer 261. Once some of the data is available in the adapter memory 265, the adapter 231 transfers the data to the adapter at the receiving end in any of several available ways for sending data having a length smaller than a cache line. Unfortunately, such methods can actually take longer to transmit the data than when the data is transferred in an integral number of units. This is especially so if the data transfer operation is interrupted in progress.
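The arithmetic behind the 109-byte example can be checked with a short Python sketch. The helper names `is_integral_units` and `shortfall` are hypothetical and introduced only for illustration; the 128-byte unit is the cache-line size assumed in the example above.

```python
CACHE_LINE_SIZE = 128  # bytes; the cache-line size assumed in the example


def is_integral_units(nbytes, unit=CACHE_LINE_SIZE):
    """True when nbytes is a positive whole number of units."""
    return nbytes > 0 and nbytes % unit == 0


def shortfall(nbytes, unit=CACHE_LINE_SIZE):
    """Bytes by which nbytes falls short of the next whole unit (0 if aligned)."""
    return (-nbytes) % unit


for size in (109, 128, 256, 300):
    print(size, is_integral_units(size), shortfall(size))
```

For the 109-byte request, the check reports that the transfer is not an integral number of cache lines and falls 19 bytes short of one, which is the kind of request that takes the slower path described above.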
Latency within the network increases when the time required for transferring data between nodes is increased, as here. For the node that awaits the transferred data, this latency causes unnecessary delays in processing, since that node can neither begin nor continue processing until the transferred data arrives. In addition, the bandwidth for transferring data across the network appears lower when the total time to transfer the data is higher than it would be for transferring an integral number of units of the data. Consequently, a need exists for a system and method having improved efficiency for transferring data of non-standard size, e.g., non-cache line aligned data, between nodes of a network, to permit a reduction in latency and an increase in bandwidth for such transfers.