1. Field of the Invention
The present invention relates to the transfer of data between a Central Processing Unit (CPU) and an Input/Output (I/O) device configured for communication with the CPU based on a system memory and a data bus.
2. Background Art
Existing microprocessor-based computer systems typically utilize an internal clock signal that runs substantially faster than an external system bus clock. Such systems typically include a central processing unit (CPU), also referred to as a processor, a system memory, an input/output (I/O) device, for example a network interface, and a peripheral bus enabling the I/O device to access the system memory. The peripheral bus, for example a Peripheral Component Interconnect (PCI) bus, is substantially slower than the local buses utilized by the CPU for accessing the system memory. Hence, the CPU can execute instructions from its internal cache memory faster than it can access peripheral devices via the peripheral bus. Consequently, an I/O read operation is particularly expensive in terms of execution time (i.e., CPU execution clock cycles), since the CPU may need to wait for the I/O data to be retrieved before the CPU can execute the next instruction. In contrast, an I/O write operation typically is not as expensive as a read operation, since the CPU can start the write operation and then continue executing the next instruction without waiting for the I/O transfer to be completed.
Hence, there is a desire to reduce the number of I/O read accesses needed to manage a peripheral device in order to improve the efficiency of the system and the performance of the device.
Direct Memory Access (DMA) has been used to transfer large amounts of data to and from system memory to reduce the number of I/O instructions that the CPU must execute to manage the data transfer. For example, a network packet transmission begins by the CPU creating a transmit descriptor in system memory that includes the location and length of the block of transmit data to be transferred. The CPU then writes a single command to the I/O device (in this case the network interface device) to notify the I/O device to start the transfer. A DMA controller in the I/O device, acting as a system bus master, retrieves the transmit descriptor from system memory to determine the location of the transmit data. The DMA controller then copies the transmit data from the system memory to a transmit buffer in the I/O device via the PCI bus. Once the data transfer is completed, the I/O device generates an interrupt to notify the CPU that the data transfer is complete.
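By way of illustration only, the transmit sequence described above may be sketched in C as a simple software model. The structure names (`tx_descriptor`, `nic_model`), the function names, and the 2048-byte buffer size are hypothetical and are not taken from any actual device or driver; the sketch merely models the order of operations, with ordinary memory standing in for the PCI bus.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor: location and length of the block of transmit data. */
struct tx_descriptor {
    const uint8_t *data;   /* location of the transmit data in system memory */
    size_t         length; /* number of bytes to transfer */
};

#define TX_BUFFER_SIZE 2048u /* assumed size of the device transmit buffer */

/* Hypothetical software model of the I/O device (network interface). */
struct nic_model {
    const struct tx_descriptor *pending; /* set by the CPU's single command write */
    uint8_t tx_buffer[TX_BUFFER_SIZE];   /* transmit buffer inside the device */
    size_t  tx_length;                   /* bytes currently held in tx_buffer */
    int     irq_asserted;                /* 1 once the device has raised an interrupt */
};

/* CPU side: a single command write notifies the device of the descriptor. */
static void cpu_start_transmit(struct nic_model *nic,
                               const struct tx_descriptor *desc)
{
    nic->pending = desc;
}

/* Device side: the DMA controller retrieves the descriptor, copies the
 * transmit data into the device buffer, and then raises an interrupt. */
static void nic_run_dma(struct nic_model *nic)
{
    assert(nic->pending != NULL);
    const struct tx_descriptor *desc = nic->pending; /* descriptor fetch */
    assert(desc->length <= TX_BUFFER_SIZE);

    memcpy(nic->tx_buffer, desc->data, desc->length); /* data transfer */
    nic->tx_length = desc->length;
    nic->pending = NULL;
    nic->irq_asserted = 1; /* notify the CPU that the transfer is complete */
}
```

In this model the CPU performs only one write toward the device per packet; all remaining traffic is generated by the device acting as bus master.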
The CPU typically responds to the interrupt by performing an I/O read access on the peripheral bus to read the interrupt status register in the peripheral device. The interrupt status register typically contains an array of bits that indicates which of the several types of events caused the interrupt. After reading the interrupt status register, the CPU writes to the I/O device via the peripheral bus to clear the interrupt condition, enabling the I/O device to assert another interrupt once another interrupt condition has occurred and interrupts have been enabled by the CPU.
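The conventional servicing sequence described above may be sketched as follows. This is a minimal model under stated assumptions: the event bit names, the `device_regs` structure, and the write-one-to-clear behavior are hypothetical, since the text states only that the CPU writes to the device to clear the interrupt condition. The counters exist solely to make the cost visible: each serviced interrupt incurs one read over the peripheral bus.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event bits; an actual device defines its own assignments. */
#define INT_TX_DONE (1u << 0)
#define INT_RX_DONE (1u << 1)
#define INT_ERROR   (1u << 2)

/* Hypothetical model of the device registers reached via the peripheral bus. */
struct device_regs {
    uint32_t int_status; /* interrupt status register within the device */
    unsigned io_reads;   /* number of (slow) I/O reads performed by the CPU */
    unsigned io_writes;  /* number of I/O writes performed by the CPU */
};

/* Models an I/O read: on real hardware the CPU waits here for the bus. */
static uint32_t io_read_status(struct device_regs *dev)
{
    dev->io_reads++;
    return dev->int_status;
}

/* Models an I/O write that clears the given bits (assumed write-one-to-clear). */
static void io_write_clear(struct device_regs *dev, uint32_t bits)
{
    dev->io_writes++;
    dev->int_status &= ~bits;
}

/* Conventional interrupt service: read the status register, then clear it. */
static uint32_t conventional_isr(struct device_regs *dev)
{
    assert(dev != NULL);
    uint32_t status = io_read_status(dev); /* expensive read via peripheral bus */
    io_write_clear(dev, status);           /* allows the next interrupt */
    return status;                         /* bits identify the causing events */
}
```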
A bulk read operation, for example a network data packet reception, is executed based on the CPU creating one or more receive descriptors in system memory before the data packet arrives, and writing to the I/O device to indicate that the receive descriptor or list of receive descriptors is available. The DMA controller in the I/O device retrieves the receive descriptor, and waits for a packet to arrive. Upon reception of a data packet, the DMA controller in the I/O device transfers the packet data to the system memory location specified by the receive descriptor, and generates an interrupt for the CPU. The CPU services the interrupt by first performing an I/O read access via the peripheral bus to read the interrupt status register; the CPU then performs an access to clear the interrupt condition.
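The receive sequence described above may be modeled in the same illustrative manner. The names below (`rx_descriptor`, `rx_nic_model`, and the functions) are hypothetical, and the `received` field is an assumption added so that the example can report the packet length; the text does not specify how the length is conveyed to the CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical receive descriptor created by the CPU before a packet arrives. */
struct rx_descriptor {
    uint8_t *buffer;   /* system memory location for the packet data */
    size_t   capacity; /* size of that buffer in bytes */
    size_t   received; /* assumed field: bytes written by the device */
};

/* Hypothetical software model of the receiving I/O device. */
struct rx_nic_model {
    struct rx_descriptor *posted; /* descriptor made available by the CPU */
    int irq_asserted;             /* 1 once the device has raised an interrupt */
};

/* CPU side: write to the device to indicate that a descriptor is available. */
static void cpu_post_rx_descriptor(struct rx_nic_model *nic,
                                   struct rx_descriptor *desc)
{
    assert(nic != NULL && desc != NULL);
    desc->received = 0;
    nic->posted = desc;
}

/* Device side: on packet arrival, transfer the data to the location given by
 * the descriptor and raise an interrupt. Returns 0 on success, or -1 if no
 * descriptor is posted or the packet does not fit (the packet is dropped). */
static int nic_receive_packet(struct rx_nic_model *nic,
                              const uint8_t *packet, size_t length)
{
    struct rx_descriptor *desc = nic->posted;
    if (desc == NULL || length > desc->capacity)
        return -1;

    memcpy(desc->buffer, packet, length); /* DMA transfer to system memory */
    desc->received = length;
    nic->posted = NULL;                   /* descriptor has been consumed */
    nic->irq_asserted = 1;                /* interrupt for the CPU */
    return 0;
}
```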
The necessity of a read access by the CPU via the peripheral bus for interrupt servicing substantially reduces the efficiency of the CPU, since the CPU needs to wait for the I/O data to be retrieved via the peripheral bus before the CPU can execute the next instruction. For example, peripheral buses such as existing PCI-X buses have a clock speed of 133 MHz on a 64-bit (or 32-bit) bus, whereas processor clock speeds exceed 1 GHz. Moreover, the minimum PCI transaction is four clock cycles at 133 MHz. Hence, a single I/O read by the CPU via the peripheral bus requires the CPU to wait a large number of CPU clock cycles that otherwise could be used for execution of instructions. This delay becomes even more substantial in multiprocessor systems, where one processor may attempt to access a device that is configured for communication with another processor via an associated peripheral bus. Hence, the delay encountered by a CPU during an I/O read operation can vary, for example, from 100 nanoseconds to 3 or 4 microseconds.
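The cost described above can be expressed in CPU clock cycles with a short calculation. The helper functions below are hypothetical and use integer arithmetic that rounds down; the 1 GHz processor clock used in the usage note is the lower bound mentioned in the text, so faster processors lose proportionally more cycles.

```c
#include <assert.h>
#include <stdint.h>

/* CPU clock cycles that elapse during a delay given in nanoseconds,
 * for a CPU clock given in MHz (result rounded down). */
static uint64_t cpu_cycles_for_delay(uint64_t delay_ns, uint64_t cpu_mhz)
{
    return delay_ns * cpu_mhz / 1000u;
}

/* CPU clock cycles that elapse during a given number of bus clock cycles,
 * for bus and CPU clocks given in MHz (result rounded down). */
static uint64_t cpu_cycles_for_bus_cycles(uint64_t bus_cycles,
                                          uint64_t bus_mhz,
                                          uint64_t cpu_mhz)
{
    assert(bus_mhz != 0);
    return bus_cycles * cpu_mhz / bus_mhz;
}
```

Assuming a 1 GHz processor, `cpu_cycles_for_delay(100, 1000)` gives 100 cycles and `cpu_cycles_for_delay(4000, 1000)` gives 4000 cycles for the two ends of the stated delay range, while `cpu_cycles_for_bus_cycles(4, 133, 1000)` gives 30 cycles for the minimum four-cycle transaction at 133 MHz.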
There is a need for an arrangement that enables a CPU to minimize the necessity of I/O read operations required to manage an input/output device via a peripheral bus.
There also is a need for an arrangement that enables a CPU to service an interrupt generated by a peripheral device, without requiring a CPU read operation of an interrupt status register via a peripheral bus.
These and other needs are attained by the present invention, where an I/O device configured for accessing a system memory via a peripheral bus minimizes I/O read accesses required by a CPU, by copying an interrupt status value from its interrupt register to a prescribed location in the system memory. Once the interrupt status value is copied into system memory, the I/O device generates an interrupt to notify the CPU of an interrupt condition requiring servicing. Hence, the interrupt status value stored in system memory enables the CPU to service the interrupt based on reading the interrupt status value from system memory, eliminating the necessity of performing an I/O read operation of the interrupt register within the I/O device via a peripheral bus.
Hence, I/O read operations can be minimized, and even eliminated for interrupt servicing, substantially improving the CPU utilization during interrupt servicing.
One aspect of the present invention provides a method in a computing system having a central processing unit (CPU), a system memory, and an Input/Output (I/O) device configured for accessing the system memory via a peripheral bus. The method includes updating by the I/O device an interrupt status value of an interrupt register within the I/O device based on the I/O device detecting at least one interrupt condition, and copying the interrupt status value by the I/O device to a prescribed location in the system memory. The method also includes generating by the I/O device an interrupt for notifying the CPU of the at least one interrupt condition, and servicing the interrupt by the CPU reading the interrupt status value from the prescribed location in the system memory.
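The steps of the method may be illustrated by the following minimal sketch, which models the device and the system memory in software. All names (`io_device`, `status_mirror`, the event bits, and the functions) are hypothetical. The manner in which the CPU clears the interrupt condition is not specified in this summary; the sketch assumes a single I/O write with write-one-to-clear behavior, and it leaves the copy in system memory unchanged until the device next updates it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event bits; an actual device defines its own assignments. */
#define INT_TX_DONE (1u << 0)
#define INT_RX_DONE (1u << 1)

/* Hypothetical software model of the I/O device. */
struct io_device {
    uint32_t           int_status;    /* interrupt register within the device */
    volatile uint32_t *status_mirror; /* prescribed location in system memory */
    int                irq_asserted;  /* 1 while an interrupt is pending */
    unsigned           io_writes;     /* I/O writes issued by the CPU */
};

/* Device side: update the register, copy it to system memory, and only
 * then generate the interrupt, so the copy is valid when the CPU reads it. */
static void device_signal_event(struct io_device *dev, uint32_t event_bits)
{
    assert(dev->status_mirror != NULL);
    dev->int_status |= event_bits;         /* update interrupt status value */
    *dev->status_mirror = dev->int_status; /* copy to the prescribed location */
    dev->irq_asserted = 1;                 /* notify the CPU */
}

/* CPU side: service the interrupt by reading the status value from system
 * memory; no read of the device register via the peripheral bus occurs. */
static uint32_t cpu_service_interrupt(struct io_device *dev)
{
    uint32_t status = *dev->status_mirror; /* ordinary system memory read */

    /* Assumed clearing step: one I/O write, which does not stall the CPU. */
    dev->io_writes++;
    dev->int_status &= ~status;
    dev->irq_asserted = 0;
    return status;
}
```

In this model the only CPU access that crosses the peripheral bus during servicing is the write that clears the interrupt condition, which the CPU can issue without waiting for completion.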
Another aspect of the present invention provides an Input/Output (I/O) device controllable by a Central Processing Unit (CPU) and configured for accessing a system memory via a peripheral bus. The I/O device includes an interrupt register configured for storing an interrupt status value, and interrupt logic. The interrupt logic is configured for updating the interrupt status value in the interrupt register and copying the interrupt status value to a prescribed location in the system memory based on detecting at least one interrupt condition. The interrupt logic also is configured for generating an interrupt upon copying the interrupt status value to the prescribed location, enabling the CPU to service the interrupt based on reading the interrupt status value from the prescribed location.
Additional advantages and novel features of the invention will be set forth in part in the description which follows and in part will become apparent to those skilled in the art upon examination of the following or may be learned by practice of the invention. The advantages of the present invention may be realized and attained by means of instrumentalities and combinations particularly pointed out in the appended claims.