1. Field of the Invention
This invention relates generally to methods and apparatus for transferring data between network devices. In particular, the present invention relates to methods and apparatus for reading a remote memory across a network.
2. Description of the Related Art
Conventional servers typically have multiple input/output (I/O) controllers, each supporting at least one I/O device, and a memory mapped load/store configuration. In the example of FIG. 1, there are a plurality of central processing units CPU1–CPUn, a host system bus and a system memory SM connected to the host system bus via a memory controller MC. An I/O bridge connects the memory controller MC to an I/O bus such as, for example, a Peripheral Component Interconnect (PCI) synchronous bus as described in the latest version of “PCI Local Bus Specification, Revision 2.1” set forth by the PCI Special Interest Group (SIG) on Jun. 1, 1995. Connected to the I/O bus are an I/O controller A (usually implemented as a slot-based adaptor card) for a hard disk (not shown), an I/O controller B (usually implemented as a slot-based adaptor card) for a CD-ROM drive (not shown) and a network interface controller (NIC).
Suppose, for example, that CPU1 wishes to transfer data to the hard disk via I/O controller A as shown in FIG. 2. CPU1 first stores the write command and its associated data in a block of the system memory SM. CPU1 then transfers a command to the I/O controller A via a path over the system bus, I/O bridge, and I/O bus. This tells the I/O controller A that a new command has been issued. I/O controller A must then read the data from system memory SM using a pointer, which is the value representing an address within the system memory SM where the data associated with the command can be found. (The pointer may be virtual or physical and the location of the data is not necessarily contiguous with the location of the command. Indeed, the data may be split, requiring a Scatter/Gather List (SGL) to describe the locations of the data.) Getting the block of data from the system memory SM to I/O controller A may require several separate fetches. The data is then written from the I/O controller A to the hard disk HD. The CPU must always load the data, and the I/O controller must always separately read the write command to know where the data is located and perform the fetches to obtain the data. A similar load/store procedure occurs when a CPU reads a block of data from the hard disk, i.e., the I/O controller A would store the block of data in a block within the system memory SM, then pass an indication to the CPU that the read process from the hard disk HD has been finished, whereupon the CPU must separately access the block within the system memory SM to obtain the data.
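By way of illustration only, the write sequence described above may be sketched in Python as follows. The sketch is a toy model under stated assumptions, not an implementation of any actual controller: the names WriteCommand, SystemMemory, IOController, notify and cpu_write are hypothetical, the command layout is assumed, and the scatter/gather list is modeled as simple (address, length) pairs.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class WriteCommand:
    # Hypothetical command layout: an opcode plus a scatter/gather list (SGL)
    # of (address, length) pairs locating the data in system memory.
    opcode: str
    sgl: List[Tuple[int, int]]


class SystemMemory:
    """Toy stand-in for the system memory SM."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)
        self.commands: Dict[int, WriteCommand] = {}


class IOController:
    """Toy stand-in for I/O controller A together with its hard disk."""

    def __init__(self, memory: SystemMemory) -> None:
        self.memory = memory
        self.disk = bytearray()
        self.fetches = 0  # number of separate reads of system memory

    def notify(self, command_addr: int) -> None:
        # The controller separately reads the command to learn where the data is.
        command = self.memory.commands[command_addr]
        self.fetches += 1
        # One fetch per SGL entry, since the data need not be contiguous.
        for addr, length in command.sgl:
            self.disk += self.memory.data[addr:addr + length]
            self.fetches += 1


def cpu_write(
    memory: SystemMemory,
    controller: IOController,
    command_addr: int,
    fragments: List[Tuple[int, bytes]],
) -> None:
    """CPU side: store data and command in memory, then notify the controller."""
    sgl = []
    for addr, payload in fragments:
        memory.data[addr:addr + len(payload)] = payload
        sgl.append((addr, len(payload)))
    memory.commands[command_addr] = WriteCommand("WRITE", sgl)
    controller.notify(command_addr)
```

In this model, a write whose data is split into two fragments costs the controller three separate fetches (one for the command and one per fragment) before anything reaches the disk, which is the overhead the passage above describes.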
This conventional load/store procedure (illustrated generally in FIG. 3) of sending a command with pointer (step 1), waiting for and receiving a request for data (step 2) and subsequently sending the data in response to the request (step 3) has substantial inherent latencies and delays. Even though the CPUs perform optimally, the performance of the server can still be less than optimum because the procedure is very inefficient. The data transfers slow down the entire system and many CPU cycles will pass before they are completed. Although the PCI bus architecture provides the most commonly accepted method used to extend computer systems for add-on arrangements (e.g., expansion cards) with new disk memory storage capabilities, it has performance limitations and scales poorly in server architectures. Furthermore, a server may have a significant number of I/O devices which are of radically different types, store different kinds of data and/or vary from each other in the addressing sequence by which the data blocks containing the data are written and read out.
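The cost of the three-step exchange can be made concrete with a simple arithmetic model, sketched below in Python. The model is an assumption introduced for illustration: it charges one fixed one-way latency per crossing, counts one crossing for the command (step 1) plus a request and a response (steps 2 and 3) for each fetch, and adds the time to move the payload at a fixed bandwidth. The function name load_store_transfer_time and all of its parameters are hypothetical, and the numbers in the usage note are not measurements.

```python
def load_store_transfer_time(
    one_way_latency_us: float,
    payload_bytes: int,
    bandwidth_bytes_per_us: float,
    fetches: int = 1,
) -> float:
    """Estimate the time, in microseconds, of the three-step procedure.

    Assumed model: step 1 (command with pointer) is one crossing, and each
    fetch adds a request (step 2) and a data response (step 3).
    """
    if fetches < 1:
        raise ValueError("at least one fetch is needed to move the data")
    crossings = 1 + 2 * fetches
    return crossings * one_way_latency_us + payload_bytes / bandwidth_bytes_per_us
```

Under these assumptions, moving 4096 bytes at 1024 bytes per microsecond with a 1 microsecond one-way latency takes 7 microseconds with a single fetch and 13 microseconds when the data must be gathered in four fetches, so the fixed per-crossing latency, not the payload itself, dominates as the number of fetches grows.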
A data transfer from another device across a network is similarly made without direct reference to the system memory. A network interface controller (NIC) acts as the communications intermediary between the device and the network and passes data blocks to and from the network at the speed and in the manner required by the network. The data transfer between the devices over the network is virtualized into a pair of starting and ending points corresponding to the NIC for each of the devices. Other parts of the devices, such as the I/O controllers and the memory controller which controls the writing and reading of the transferred data blocks to and from the device memory, are not involved when the data is transferred between the NICs across the network. Furthermore, although not shown in FIGS. 1 and 2, transport and other protocols (e.g., TCP, IP) are implemented at various levels of firmware and software in the device to control, distinguish or review the transferred data in order to render the transfer of data over the network more reliable. The multiplexing and demultiplexing processes are computationally expensive, and a CPU must control the movement of the transferred data blocks into and out of the memory controller or I/O controller during the transfer of each data block. Also, an intermediate copy of the data must be made in the hardware of the memory controller or I/O controller, and again at other levels or layers, mode switches and context switches of the device.
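The effect of the intermediate copies mentioned above may be illustrated with the following minimal Python sketch. It is a counting exercise only: the function name receive_through_layers and the layer names used in the usage note are hypothetical, and the assumption that every layer makes exactly one full copy of the block is a simplification.

```python
from typing import List, Tuple


def receive_through_layers(block: bytes, layers: List[str]) -> Tuple[bytes, int]:
    """Pass one data block through each layer, copying it at every hand-off.

    Returns the delivered block and the total number of bytes copied.
    """
    bytes_copied = 0
    current = block
    for _layer in layers:
        # Going through bytearray forces a real copy rather than a shared reference.
        current = bytes(bytearray(current))
        bytes_copied += len(current)
    return current, bytes_copied
```

For example, passing a 1500-byte block through three assumed layers such as ["controller_hardware", "protocol_stack", "application_buffer"] delivers the same 1500 bytes but copies 4500 bytes in total, so the copying work grows with the number of layers the block must traverse.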