1. Field of the Invention
This invention generally relates to Direct Access Memory (DMA) processing and, more particularly, to a channel-less multithreaded DMA controller.
2. Description of the Related Art
A general purpose programmable DMA controller is a software-managed programmable peripheral block charged with moving or copying data from one memory address to another memory address. The DMA controller provides a more efficient mechanism to perform large data block transfers, as compared to a conventional general purpose microprocessor. The employment of DMA controllers frees up the processor and software to perform other operations in parallel. Instruction sequences for the DMA, often referred to as control descriptors (CDs), are set up by software and usually include a source address, destination address, and other relevant transaction information. A DMA controller may perform other functions such as data manipulations or calculations.
Control descriptors are often assembled in groups called descriptor sequences or rings. Typically, the software control of a DMA controller is enabled through a device specific driver. The device driver is responsible for low level handshaking between upper layer software and the hardware. This device driver manages the descriptor rings, communicates with the DMA controller when work is pending, and communicates with upper layer software when work is complete.
It is possible that a DMA controller may be shared by many concurrent software threads running on one or more processors. Conventionally, a DMA controller maintains the logical concept of “channels”, whereby a channel provides the interface between a single software thread and the DMA controller. In other words, each software thread is associated with a channel. More concurrent driver threads require more DMA controller channels.
It has often been practice to provide multiple channels to address two concerns: thread sharing and quality of service. Oftentimes, multiple concurrent threads are used in systems for the parallel processing different aspects of data flow. Where multiple independent threads are deployed, sharing a common DMA controller can be cumbersome to manage. The software device driver in this case must not only provide the DMA controller with communications, but must also manage an arbitration scheme with upper layer software threads to determine which work gets done next. If this work is being carried out by multiple microprocessors, the overhead of coordination between threads is very complicated. The overhead coordination requires a certain level of software handshaking to determine which thread gets access to the controller at any particular time.
From a quality of service perspective, it is common to have higher and lower priority activities. For example, a software thread may queue a low priority transfer for a DMA controller. At some later time a different thread may be queued, which needs to run a higher priority task on the DMA controller. The ability to pre-empt low priority activities with higher priority tasks is a highly desired feature. Without such capability, a high priority operation must wait until a low priority operation is completed.
A multi-channel DMA controller addresses these issues where different software threads can be bound to specific channels and the underlying DMA controller hardware sorts out the access profile to memory based upon channel priorities. A disadvantage of the channel approach is that there are a limited number of hardware channels. If more logical threads exist than physical channels, then a software mechanism must once again be deployed to take care of the resource contention.
DMA controllers must also maintain a certain level of atomicity with respect to the execution of operations. This means that operational sequences must complete in the order programmed by software. This is typically accomplished using a run-to-completion model whereby a DMA channel completes all operations of a first CD, before moving onto the next CD. However, a brute-force run-to-completion methodology may prevent the data moving engine from performing un-related operations (operations from different CD lists) in parallel, even if the engine is capable.
The communication between software and the DMA controller hardware is typically handled through a programmed input/output (IO) interface. That is, the software device driver programs control registers within the DMA channel, which causes the DMA to carry out the desired action. When the DMA controller is finished it communicates back to software, either through use of a hardware interrupt request, or through setting of a status bit that is polled by software. The software must wait until the current instruction sequence is complete before programming the next sequence. During the software/hardware handshake period the DMA channel is idle waiting for the next CD, thus resulting in dead time that could have been used for real work. To overcome this dead time, DMA controllers may deploy the concept of control descriptor sequences (CDS) and descriptor rings.
The descriptor ring provides a form of FIFO where the software adds new items in memory for the DMA channel at the tail of the ring, while the DMA controller processes CDs from the head of the ring. In this way, the software manages the tail pointers and the hardware (HW) manages the head pointer. Such schemes have the disadvantage of requiring software overhead to keep track of pointers.
It would be advantageous if the programming model limitations of having fixed DMA channels as main linkage between software and hardware could be eliminated, while preserving thread independence, intra-thread atomicity, and a quality of service capability.