1. Field Of The Invention
This invention relates to computer systems, and more particularly, to methods and apparatus for accelerating the transfer of data to be utilized by a computer input/output (I/O) device.
2. History Of The Prior Art
In computers running modern multitasking operating systems, it has typically been necessary to call the operating system to write any data to memory-mapped input/output devices. This has been required to assure that operations conducted by the application programs are safe and do not write over the assets of the system or other application programs. Consequently, in order to display graphics data on a computer output display, the operating system has typically conducted the transfer. This is a very slow process because it is complicated and not conducted in hardware. With the emergence of multimedia programming, the process has become too slow.
Recently, a new I/O architecture has been devised which allows direct writes by an application program to an I/O control unit which resides with and controls data transfers to I/O devices in a multitasking operating system. The I/O control unit assures that only operations which are safe are sent to I/O devices thereby allowing an application program to bypass the security furnished by the operating system without endangering the operation of the system or the assets of other applications. The architecture utilizes hardware to accomplish its operations and thus makes writing to I/O devices very much faster than prior art architectures by eliminating the very long times required to write utilizing the operating system.
When writing to an I/O device using the new architecture, an application program executing on a computer central processor causes commands including an address and data to be sent to the I/O control unit for transfer to the I/O device joined to the control unit. Since an application can know only virtual addresses without operating system assistance, the I/O control unit must furnish the physical address for the I/O device and assure that the operation is safe. Once the physical address has been determined, it is held in a register on the I/O control unit so that all subsequent commands including the same virtual address are sent directly to the selected I/O device.
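The translate-once, reuse-thereafter behavior described above may be illustrated by a minimal Python sketch. It is a software model only, not the circuitry of the I/O control unit. The class name `IOControlUnit`, the translation table, the set of permitted devices, and the `lookups` counter are all hypothetical names introduced for illustration.

```python
class IOControlUnit:
    """Toy model: translate a virtual address once, then reuse the result."""

    def __init__(self, translation_table, allowed_devices):
        # Hypothetical structures: virtual address -> physical device address,
        # plus the set of physical addresses this application may write to.
        self._table = dict(translation_table)
        self._allowed = set(allowed_devices)
        # These two fields model the register that holds the current mapping.
        self._cached_virtual = None
        self._cached_physical = None
        self.lookups = 0      # number of slow-path translations performed
        self.delivered = []   # (physical address, data) pairs sent to devices

    def write(self, virtual_addr, data):
        if virtual_addr != self._cached_virtual:
            # Slow path: find the physical address and check that it is safe.
            self.lookups += 1
            physical = self._table.get(virtual_addr)
            if physical is None or physical not in self._allowed:
                raise PermissionError(f"unsafe write to {virtual_addr:#x}")
            self._cached_virtual = virtual_addr
            self._cached_physical = physical
        # Fast path: the cached physical address routes the command directly.
        self.delivered.append((self._cached_physical, data))
```

In this model, two successive writes to the same virtual address cause only one translation, while a write to an unmapped or unpermitted address is rejected before it reaches any device.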
Data transfers from an application program to I/O devices in computer systems utilizing memory-mapped I/O are typically handled by the bus control unit once generated by the central processor. The application indicates to the processor where the desired data resides in memory, the extent of the data, and the address to which it is to be transferred. The bus control unit receives the data, acquires the system bus, and transfers the data over the system bus to the I/O control unit. This allows the central processor to attend to other operations while the data is being transferred. When a significant amount of data is involved, the bus control unit transfers small increments of the data at a time over the bus to the I/O control unit and repeats the process until all of the data has been transferred to the I/O control unit.
In order to assure that data will be available to an I/O device without delay, the new architecture includes a relatively large input buffer on the I/O control unit which controls the writes to the graphics accelerator or other I/O device. This first-in first-out (FIFO) buffer allows large amounts of data to accumulate from a myriad of small transfers from the bus control unit so that the accelerator does not have to wait for each new transfer before it can proceed. Such a solution accelerates the transfer of data from the processor to the graphics accelerator significantly by reducing the need for either the central processor or the graphics accelerator to wait for the other in order to continue with operations. The new architecture including such a FIFO input buffer is described in U.S. Pat. No. 5,696,990, entitled Method and Apparatus for Providing Improved Flow Control For Input/Output Operations in a Computer System Having a FIFO Circuit And An Overflow Storage Area, issued Dec. 9, 1997, to Rosenthal et al.
A hardware buffer is expensive and must be finite in size; consequently, an input buffer of 128 bytes has been selected as a useful compromise for typical uses. However, where large amounts of data are being transferred as in graphics operations, it is necessary to monitor the condition of the FIFO input buffer in order to guard against overflow. If the FIFO input buffer overflows in a system such as described in which the central processor is decoupled from I/O devices, the data being transferred will be lost. For this reason it has been necessary to provide a means to indicate to the central processor when the FIFO is able to receive additional data. To accomplish this, the I/O control unit includes circuitry which keeps track of the FIFO buffer space available and furnishes this information in a local register on the I/O control unit. The central processor reads the register for the condition of the FIFO buffer before sending any new sequence of data to an I/O device. The need for the central processor to read the amount of space available in the FIFO buffer before sending any additional data slows the transfer of the graphics data to I/O devices significantly.
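The cost of reading the free-space register before each sequence of data may be illustrated by a minimal Python sketch. The 128-byte capacity is taken from the text above; the burst size, the drain rate of the I/O device, and the names `FifoInputBuffer` and `send` are assumptions made for illustration and do not describe the actual hardware.

```python
from collections import deque

FIFO_CAPACITY = 128  # bytes, the buffer size named in the text


class FifoInputBuffer:
    """Toy model of the FIFO input buffer and its free-space register."""

    def __init__(self, capacity=FIFO_CAPACITY):
        self.capacity = capacity
        self._data = deque()

    def free_space(self):
        # Models the local register that reports the space available.
        return self.capacity - len(self._data)

    def push(self, payload):
        if len(payload) > self.free_space():
            raise OverflowError("FIFO overflow: data would be lost")
        self._data.extend(payload)

    def drain(self, count):
        # Models the I/O device consuming up to `count` bytes in order.
        n = min(count, len(self._data))
        return bytes(self._data.popleft() for _ in range(n))


def send(fifo, data, burst=32, drain_per_poll=16):
    """Send `data` through `fifo`; return (bytes received, register reads)."""
    received = bytearray()
    register_reads = 0
    offset = 0
    while offset < len(data):
        free = fifo.free_space()      # the processor reads the register first
        register_reads += 1
        chunk = data[offset:offset + min(burst, free)]
        if chunk:
            fifo.push(chunk)          # never larger than the space reported
            offset += len(chunk)
        received += fifo.drain(drain_per_poll)   # device runs concurrently
    received += fifo.drain(fifo.capacity)        # device empties the remainder
    return bytes(received), register_reads
```

Under these assumptions no data is lost, but every burst is preceded by a register read, and the number of such reads grows with the amount of data transferred.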
This arrangement has recently been improved to allow more rapid writing of data to I/O devices. The improved arrangement utilizes a first data structure to establish a very large variable-sized buffer in main memory to store data being transferred to I/O devices and a second data structure to establish a second buffer in main memory in which a notification may be placed to indicate the completion of a write operation. The arrangement utilizes a direct memory access (DMA) engine having a series of registers which an application uses to indicate a portion of the buffer which contains data to be moved to the I/O device and the extent of the data to be moved. The DMA engine keeps a reference value to find the first data structure indicating the memory buffer from which the stored data is to be transferred. When the transfer is complete, the DMA engine uses another reference value referring to the second data structure to place a notification that the operation is complete in the notification memory area and signals the processor to review the status of the transfer.
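The relationship between the two data structures and the DMA engine registers may be illustrated by a minimal Python sketch. It is a software model under stated assumptions: the class name `DmaEngine`, the register names `offset_reg` and `length_reg`, and the layout of the notification record are hypothetical, and the transfer is modeled as a simple copy.

```python
from dataclasses import dataclass, field


@dataclass
class DmaEngine:
    """Toy model of a DMA engine that references two memory structures."""

    source_buffer: bytearray   # first data structure: the buffer in main memory
    notification: dict         # second data structure: the notification area
    device: bytearray = field(default_factory=bytearray)  # stands in for the I/O device
    offset_reg: int = 0        # register: start of the portion to be moved
    length_reg: int = 0        # register: extent of the data to be moved

    def start(self, offset, length):
        # Reject any request that falls outside the established buffer.
        if offset < 0 or length < 0 or offset + length > len(self.source_buffer):
            raise ValueError("transfer lies outside the memory buffer")
        # The application writes the registers to describe the transfer.
        self.offset_reg = offset
        self.length_reg = length
        # The engine moves the data without further processor involvement.
        self.device += self.source_buffer[offset:offset + length]
        # On completion the engine places a notification in the second buffer.
        self.notification["done"] = True
        self.notification["bytes"] = length
```

In this model the application touches only the two registers for each transfer, and learns of completion by examining the notification record rather than by polling the device.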
The arrangement allows very large increments of data to be rapidly transferred to I/O devices safely without involving the central processor to any significant extent. However, in some modern interfaces, a very large number of smaller increments of data must be transferred in order to make up the total amount of data being transferred. In such situations, the time required to set up the DMA engine for each transfer becomes a very significant portion of the total transfer time and significantly slows the transfer operation.
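The effect of a fixed per-transfer set-up cost may be illustrated by a short Python calculation. The function name `setup_fraction` and its parameters are hypothetical, and the cost model (a constant set-up time per transfer plus a constant time per byte) is an assumption made for illustration rather than a measurement of any actual system.

```python
import math


def setup_fraction(total_bytes, increment_bytes, setup_us, us_per_byte):
    """Fraction of the total transfer time spent setting up the DMA engine.

    Assumes each increment incurs one fixed set-up cost (setup_us) and that
    moving data costs a constant us_per_byte; both values are illustrative.
    """
    transfers = math.ceil(total_bytes / increment_bytes)
    setup_total = transfers * setup_us
    move_total = total_bytes * us_per_byte
    return setup_total / (setup_total + move_total)
```

With illustrative values of 2 microseconds per set-up and 0.01 microsecond per byte, moving one megabyte as a single increment spends a negligible fraction of the time on set-up, whereas moving the same megabyte in 64-byte increments spends roughly three quarters of the time on set-up.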
It is desirable to increase the speed at which graphics data may be transferred from memory to a graphics accelerator while freeing the central processor for other activities.