1. Field of the Invention
This invention relates in general to direct memory access (DMA), and more particularly to a method, apparatus and program storage device for enabling multiple asynchronous direct memory access task executions.
2. Description of Related Art
In digital computer systems, it is common to use direct memory access (DMA) to transfer data between a system memory attached to a main system bus and input/output (I/O) devices. The direction of data transfer can be from the I/O device to memory, or vice versa. A DMA controller is generally used to transfer blocks of data between an I/O device and consecutive locations in the system memory. A DMA engine is a hardware facility to perform data transfer without using the CPU's processing power.
In order to perform a block transfer, the DMA device needs a source address, a destination address, control flags indicating, for example, the direction of the transfer, and a count of the number of data items, which may be bytes, words, or other units of information that can be transmitted in parallel on the computer system bus. One simple method by which a DMA controller operates involves a host processor writing directly into the DMA controller to request a block transfer. The host processor must then continuously monitor the DMA engine to determine when the transfer completes before requesting a new transfer, which is an inefficient use of processor time.
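By way of illustration only, the simple method described above may be sketched in C as follows. The register layout (dma_regs), the flag names, and the software stand-in for the hardware (dma_hw_step) are hypothetical and are not taken from any particular controller; the sketch merely shows the host processor programming the transfer parameters and then polling until the transfer completes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical control/status flags of a simple DMA controller. */
enum { DMA_FLAG_START = 1u << 0, DMA_FLAG_DONE = 1u << 1 };

/* Hypothetical register file: the parameters a block transfer needs. */
typedef struct {
    const uint8_t *src;   /* source address */
    uint8_t       *dst;   /* destination address */
    size_t         count; /* number of data items (bytes in this sketch) */
    uint32_t       flags; /* control and status flags */
} dma_regs;

/* Software stand-in for the hardware: moves one byte per call while a
 * transfer is active and sets DMA_FLAG_DONE when the count reaches zero. */
static void dma_hw_step(dma_regs *r)
{
    if (!(r->flags & DMA_FLAG_START) || (r->flags & DMA_FLAG_DONE))
        return;
    if (r->count > 0) {
        *r->dst++ = *r->src++;
        r->count--;
    }
    if (r->count == 0)
        r->flags |= DMA_FLAG_DONE;
}

/* Host side: write the parameters directly into the controller, then
 * monitor it continuously. Returns the number of polls spent waiting. */
static unsigned long dma_copy_blocking(dma_regs *r, void *dst,
                                       const void *src, size_t n)
{
    unsigned long polls = 0;

    assert(r != NULL);
    r->src   = src;
    r->dst   = dst;
    r->count = n;
    r->flags = DMA_FLAG_START;

    while (!(r->flags & DMA_FLAG_DONE)) {
        dma_hw_step(r); /* in real hardware this progresses on its own */
        polls++;        /* processor time spent only on waiting */
    }
    return polls;
}
```

In this sketch the value returned by dma_copy_blocking grows with the transfer length, which reflects the processor time consumed by monitoring the controller rather than by useful work.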
Sophisticated DMA controllers typically use a linked list of control blocks in a memory to chain a sequence of DMA operations together. The control blocks, each of which conveys data-transfer parameters between a host processor and DMA controller, are data structures created by the host processor and accessed by the DMA controller for effecting a particular DMA operation.
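A linked list of control blocks of the kind described above may be sketched, under stated assumptions, as follows. The structure name dma_cb, its fields, and the function dma_run_chain are hypothetical; memcpy stands in for the data movement that an actual controller would perform in hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical control block: one set of data-transfer parameters,
 * created by the host processor and read by the DMA controller. */
typedef struct dma_cb {
    const void    *src;   /* source address */
    void          *dst;   /* destination address */
    size_t         len;   /* length of the transfer in bytes */
    uint32_t       flags; /* control flags, e.g. transfer direction */
    struct dma_cb *next;  /* next control block, NULL ends the chain */
} dma_cb;

/* Controller side (simulated): follow the links and carry out each
 * transfer in order. Returns the number of control blocks processed. */
static size_t dma_run_chain(const dma_cb *head)
{
    size_t done = 0;

    for (const dma_cb *cb = head; cb != NULL; cb = cb->next) {
        assert(cb->src != NULL && cb->dst != NULL);
        memcpy(cb->dst, cb->src, cb->len);
        done++;
    }
    return done;
}
```

Because each control block carries a pointer to its successor, the host processor can describe a whole sequence of transfers in memory and hand the controller only the address of the first block.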
Often, while the DMA controller is performing a data transfer specified by a particular control block, the host processor specifies additional data transfers by creating additional control blocks. When additional control blocks are created, it is desirable to append the new control blocks to the existing linked list of control blocks to allow the DMA controller to process all the control blocks in one uninterrupted sequence of data transfer operations.
The appending of control block(s) to an existing linked list before completion of a corresponding DMA operation is referred to as dynamic chaining of DMA operations. The transfer of high-speed streaming data (such as multimedia data in storage and network technologies) requires frequent dynamic DMA chaining.
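Dynamic chaining as described above may be illustrated by the following single-threaded sketch. The names dma_engine, dma_engine_step, and dma_append are hypothetical. The sketch deliberately omits the synchronization that a real controller would need between the host processor updating a link and the controller fetching it; it shows only the list manipulation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical control block, as in a linked list of DMA operations. */
typedef struct dma_cb {
    const void    *src;
    void          *dst;
    size_t         len;
    struct dma_cb *next; /* NULL ends the chain */
} dma_cb;

/* Hypothetical engine state shared by host and (simulated) controller. */
typedef struct {
    dma_cb *current;   /* block the controller processes next, NULL if idle */
    dma_cb *tail;      /* last block appended by the host */
    size_t  completed; /* number of blocks processed so far */
} dma_engine;

/* Controller side (simulated): process exactly one control block. */
static void dma_engine_step(dma_engine *e)
{
    assert(e != NULL);
    if (e->current == NULL)
        return; /* nothing pending */
    memcpy(e->current->dst, e->current->src, e->current->len);
    e->completed++;
    e->current = e->current->next;
}

/* Host side: append a control block to the existing list. If the
 * controller is still working through the list, it reaches the new
 * block by following the updated link; if it has already run off the
 * end, it is pointed at the new block directly. */
static void dma_append(dma_engine *e, dma_cb *cb)
{
    assert(e != NULL && cb != NULL);
    cb->next = NULL;
    if (e->tail != NULL)
        e->tail->next = cb;
    e->tail = cb;
    if (e->current == NULL)
        e->current = cb;
}
```

In this sketch a block appended while earlier blocks are still pending is processed as part of the same uninterrupted sequence, without the host processor waiting for the earlier transfers to finish.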
In a DMA engine, microcode builds DMA descriptor chains that provide the linked list of control blocks specifying the source, destination and length of the data to be transferred. The DMA hardware has N queues and works on one queue at a time. The microcode puts chains on the queues and ensures that a queue is available before submitting a chain, thereby preventing a DMA queue overflow error. The hardware sets a completion bit when a chain completes, and the microcode must reset the bit before the hardware can complete the next DMA queue.
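The interaction between the microcode and the completion bit described above may be modeled, purely for illustration, as follows. The value chosen for N, the structure dma_hw, and the function names are hypothetical, and each chain is reduced to a single busy flag per queue; the sketch is not a description of any actual hardware interface.

```c
#include <assert.h>
#include <stddef.h>

#define DMA_NUM_QUEUES 4 /* "N" in the text; the value 4 is arbitrary */

/* Hypothetical model of the DMA hardware state. */
typedef struct {
    int      queue_busy[DMA_NUM_QUEUES]; /* 1 if a chain is pending */
    int      next_queue;                 /* queue the hardware checks first */
    int      completion_bit;             /* set by hardware, reset by microcode */
    unsigned chains_completed;
} dma_hw;

/* Microcode: submit a chain only if the queue is available.
 * Returns 0 on success, -1 if submitting would overflow the queue. */
static int dma_submit(dma_hw *hw, int q)
{
    assert(hw != NULL && q >= 0 && q < DMA_NUM_QUEUES);
    if (hw->queue_busy[q])
        return -1;
    hw->queue_busy[q] = 1;
    return 0;
}

/* Hardware (simulated): complete one pending chain and set the
 * completion bit. Returns 1 if a chain completed, 0 if the hardware
 * is stalled on an unreset completion bit or has nothing pending. */
static int dma_hw_complete_one(dma_hw *hw)
{
    assert(hw != NULL);
    if (hw->completion_bit)
        return 0; /* stalled until the microcode resets the bit */
    for (int i = 0; i < DMA_NUM_QUEUES; i++) {
        int q = (hw->next_queue + i) % DMA_NUM_QUEUES;
        if (hw->queue_busy[q]) {
            hw->queue_busy[q] = 0;
            hw->completion_bit = 1;
            hw->chains_completed++;
            hw->next_queue = (q + 1) % DMA_NUM_QUEUES;
            return 1;
        }
    }
    return 0;
}

/* Microcode: reset the completion bit so the hardware can proceed. */
static void dma_ack_completion(dma_hw *hw)
{
    assert(hw != NULL);
    hw->completion_bit = 0;
}
```

In this model a second pending chain cannot complete until dma_ack_completion has been called, which corresponds to the dependence of the hardware on the microcode resetting the completion bit.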
Current designs rely on microcode to reset the DMA chain completion indicator in order for the hardware to complete the processing of new chains. This indicator can be processed using either a poll or an interrupt approach. However, both approaches have problems. For example, the use of the interrupt mechanism has a serious performance impact because the CPU has to save and restore all internal registers and stacks. In the poll method, time is wasted because a poll either occurs before the DMA chain completion indicator has been set or occurs long after the indicator was set. In addition, reading hardware registers to check the indicator costs more than reading DRAM. With either approach, the hardware may temporarily stop DMA operation until the indicator is reset.
It can be seen then that there is a need for a method, apparatus and program storage device for enabling multiple asynchronous direct memory access task executions.