A direct memory access (DMA) controller is an apparatus which transfers a specified amount of data between a peripheral device, such as a hard disk or video card, and the system memory of a computer system in response to a "transmit" command. Referring to FIG. 1, a computer system comprises host central processing unit (CPU) 10, host to bus/memory bridge 20, system memory 30, system bus 40, and peripheral device 50. System bus 40 couples peripheral device 50 to bridge 20, and bridge 20 couples host CPU 10 to system memory 30 and system bus 40.
To initiate a DMA operation, a device driver, executing on host CPU 10, sets up a control/data list in system memory 30, specifying the data to be transferred and optional control information. After setting up the control/data list, the device driver writes the memory location of the control/data list into registers of the DMA controller and sends a transmit command as a control signal through bridge 20 and system bus 40 to the DMA controller of peripheral device 50. In response to the control signal, the DMA controller reads the control/data list in system memory 30 and begins the DMA transfer.
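The setup sequence described above can be sketched in C as a small simulation. The entry layout, the register block, the bit values, and the names dma_entry, dma_regs, driver_start_dma and controller_run are all hypothetical, chosen only for illustration. On real hardware the registers would be memory-mapped and reached through bridge 20 and system bus 40 rather than held in an ordinary struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of one control/data list entry in system memory. */
struct dma_entry {
    uint64_t buf_addr;  /* address of the data to be transferred */
    uint32_t length;    /* number of bytes in this buffer */
    uint32_t control;   /* optional control information */
};

#define DMA_CTRL_LAST    0x1u  /* assumed flag: marks the final list entry */
#define DMA_CMD_TRANSMIT 0x1u  /* assumed encoding of the transmit command */

/* Hypothetical DMA controller registers (a plain struct in this sketch). */
struct dma_regs {
    volatile uint64_t list_addr;  /* location of the control/data list */
    volatile uint32_t command;    /* written last, acts as the control signal */
};

/* Driver side: build the list, then hand its location to the controller. */
static void driver_start_dma(struct dma_regs *regs, struct dma_entry *list,
                             size_t n, const void *const bufs[],
                             const uint32_t lens[])
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        list[i].buf_addr = (uint64_t)(uintptr_t)bufs[i];
        list[i].length   = lens[i];
        list[i].control  = (i == n - 1) ? DMA_CTRL_LAST : 0u;
    }
    regs->list_addr = (uint64_t)(uintptr_t)list;
    regs->command   = DMA_CMD_TRANSMIT;
}

/* Simulated controller side: on a transmit command, walk the list and
 * return the total number of bytes it would transfer. */
static uint64_t controller_run(struct dma_regs *regs)
{
    if (regs->command != DMA_CMD_TRANSMIT)
        return 0;
    const struct dma_entry *e =
        (const struct dma_entry *)(uintptr_t)regs->list_addr;
    uint64_t total = 0;
    for (;;) {
        total += e->length;
        if (e->control & DMA_CTRL_LAST)
            break;
        e++;
    }
    regs->command = 0;  /* command consumed */
    return total;
}
```

In this sketch the command register is written after the list address, so the simulated controller never sees a transmit command that refers to a stale list location.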
If the peripheral device is a graphics processing subsystem, an associated graphics device driver commonly initiates a series of DMA transfers. In order to start the next DMA transfer, the graphics device driver must be informed when the current DMA transfer has completed. As a result, a high-performance graphics device needs to reduce the latency between the end of one DMA transfer and the beginning of the next DMA transfer as much as possible.
A conventional method of informing a device driver that a DMA transfer is completed is the interrupt method. After the DMA transfer is complete, the DMA controller sends a special, asynchronous signal over the system bus to an interrupt controller residing on system bus 40 (not shown). In response, the interrupt controller signals host CPU 10, which causes host CPU 10 to perform a context swap, save state information, and execute reads to the interrupt controller using special cycles to determine an interrupt vector.
Host CPU 10 uses the interrupt vector to jump to an interrupt handler. The interrupt handler typically informs the device driver by setting a flag in system memory that the device driver repeatedly checks. After the flag is set, the interrupt handler terminates and control of the host CPU is restored to the device driver through another context swap. A major disadvantage of the interrupt method, however, is that executing all of its operations is expensive, taking many microseconds. In a computer system with high-performance graphics processing, such a performance penalty is a substantial drawback.
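The flag handoff between the interrupt handler and the device driver can be illustrated with a user-space analogy in standard C, in which a signal stands in for the hardware interrupt. This is only a sketch under that assumption: the names dma_done_flag, dma_interrupt_handler and run_transfer_with_interrupt are hypothetical, and the context swaps and interrupt-vector reads that make the real method expensive are not modeled.

```c
#include <assert.h>
#include <signal.h>

/* Flag in memory that the handler sets and the driver checks. */
static volatile sig_atomic_t dma_done_flag = 0;

/* Stand-in for the interrupt handler: it only sets the flag and returns. */
static void dma_interrupt_handler(int signo)
{
    (void)signo;
    dma_done_flag = 1;
}

/* Driver side: install the handler, let the simulated DMA controller
 * signal completion, then check the flag before continuing.
 * Returns 0 on success, -1 if the handler could not be installed. */
static int run_transfer_with_interrupt(void)
{
    dma_done_flag = 0;
    if (signal(SIGTERM, dma_interrupt_handler) == SIG_ERR)
        return -1;

    /* Assumption: raise() plays the role of the asynchronous completion
     * signal. The handler runs before raise() returns. */
    raise(SIGTERM);

    while (!dma_done_flag) {
        /* the driver repeatedly checks the flag */
    }

    signal(SIGTERM, SIG_DFL);  /* restore default handling */
    return 0;
}
```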
In another conventional method, the polling method, the device driver repeatedly reads a status register on the peripheral device to determine if the DMA operation has completed. Although the polling method theoretically promises a quicker determination that the DMA transfer is completed, the repeated reads of the status register of the peripheral device interfere with efficient operation of many standard system buses, such as PCI or AGP buses.
Such standard system bus architectures are specifically designed with pipelines and buffers to process a large amount of data within a very short amount of time. When the device driver reads the status register through the system bus, the system bus must flush its write buffers to the destination and invalidate its read buffers, impairing the performance of the ongoing DMA transfer.
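The polling method can likewise be sketched as a simulation that counts how many status reads cross the bus before completion is observed. The sim_device structure, the DMA_STATUS_DONE bit, and the function names are hypothetical. The buffer flushes and invalidations themselves are not modeled; each counted read marks a point where a real system bus would incur them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_STATUS_DONE 0x1u  /* assumed "transfer complete" status bit */

/* Simulated peripheral device with a status register. */
struct sim_device {
    uint32_t status;           /* status register contents */
    unsigned done_after_reads; /* transfer completes on this read */
    unsigned bus_reads;        /* status reads issued across the bus */
};

/* One read of the status register over the system bus. On a real bus,
 * each such read would force write buffers to be flushed and read
 * buffers to be invalidated, as described above. */
static uint32_t read_status(struct sim_device *dev)
{
    assert(dev != NULL);
    dev->bus_reads++;
    if (dev->bus_reads >= dev->done_after_reads)
        dev->status |= DMA_STATUS_DONE;
    return dev->status;
}

/* Driver side: poll until the done bit is seen; return the number of
 * bus reads that were needed. */
static unsigned poll_until_done(struct sim_device *dev)
{
    while (!(read_status(dev) & DMA_STATUS_DONE)) {
        /* keep polling */
    }
    return dev->bus_reads;
}
```

The returned count gives a rough measure of how often the ongoing DMA transfer would have been disturbed while the driver waited.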
Other conventional methods employ specialized hardware which bypasses the standard system bus to achieve superior performance. However, such specialized hardware increases the cost of computer systems.
Therefore, there is a need for a high-performance DMA communications protocol using a standard system bus.