The present invention relates generally to improvements in array processing, and more particularly to advantageous techniques for providing improved data transfer control.
Various prior art techniques exist for the transfer of data between system memories or between system memories and input/output (I/O) devices. FIG. 1 shows a conventional data processing system 100 comprising a host uniprocessor 110, processor local memory 120, I/O devices 130 and 140, a system memory 150, which is usually a larger memory store than the processor local memory and has a longer access latency, and a direct memory access (DMA) controller 160.
The DMA controller 160 provides a means for transferring data between processor local memory and system memory or I/O devices concurrent with uniprocessor execution. DMA controllers are sometimes referred to as I/O processors or transfer processors in the literature. System performance is improved since the host uniprocessor can perform computations while the DMA controller is transferring new input data to the processor local memory and transferring result data to output devices or the system memory. A data transfer is typically specified with the following minimum set of parameters: source address, destination address, and number of data elements to transfer. Addresses are interpreted by the system hardware and uniquely specify I/O devices or memory locations from which data must be read or to which data must be written. Sometimes additional parameters are provided, such as element size. In addition, some means of initiating the data transfer is provided, as is a means for the DMA controller to notify the host uniprocessor when the transfer is complete. In some conventional DMA controllers, transfer initiation may be carried out by programming specific registers within the DMA controller. Others are designed to fetch their own "transfer descriptors", which might be stored in one of the system memories. These descriptors contain the information required to carry out a specific transfer. In the latter case, the DMA controller is provided a starting address from which to fetch transfer descriptors, and there must be some means for controlling the fetch operation. End-of-transfer (EOT) notification in conventional DMA controllers may take the form of signaling the host uniprocessor so that it generates an interrupt, which may then be handled by an interrupt service routine. In other notification approaches, the DMA controller writes a notification value to a specified memory location which is accessible by the host uniprocessor.
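By way of illustration only, the minimum parameter set and the memory-write form of EOT notification described above may be modeled in software as follows. This is a minimal sketch and not a description of any particular controller: the names TransferDescriptor, run_transfer, notify_addr and notify_value are hypothetical, and a Python bytearray stands in for a flat address space.

```python
from dataclasses import dataclass


@dataclass
class TransferDescriptor:
    """Minimum parameter set named in the text, plus optional element size."""
    src_addr: int          # address of the first element to read
    dst_addr: int          # address of the first element to write
    count: int             # number of data elements to transfer
    element_size: int = 1  # bytes per element (the optional extra parameter)


def run_transfer(memory, desc, notify_addr, notify_value=1):
    """Copy desc.count elements, then write an end-of-transfer value.

    `memory` is a bytearray standing in for a flat address space; this is
    an assumption of the sketch, not a detail taken from the text.
    """
    nbytes = desc.count * desc.element_size
    src = memory[desc.src_addr:desc.src_addr + nbytes]  # slicing copies the data
    memory[desc.dst_addr:desc.dst_addr + nbytes] = src
    # EOT notification by writing a value to a location the host can poll.
    memory[notify_addr] = notify_value


memory = bytearray(64)
memory[0:8] = bytes(range(1, 9))
desc = TransferDescriptor(src_addr=0, dst_addr=32, count=4, element_size=2)
run_transfer(memory, desc, notify_addr=63)
```

In this model the host would poll location 63 to learn that the transfer has completed; an interrupt-based notification would replace the final write with a call into a handler.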
One of the limitations of conventional DMA controllers is that address generation capabilities for the data source and data destination are often constrained to be the same. For example, when only a source address, destination address and a transfer count are specified, the implied data access pattern is block-oriented, that is, a sequence of data words from contiguous addresses starting with the source address is copied to a sequence of contiguous addresses starting at the destination address. Another limitation of conventional DMA controllers is the overhead required to manage the DMA controller in terms of transfer initiation, data flow control during a transfer, and handling EOT notification.
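The block-oriented access pattern implied by a three-parameter transfer can be made concrete with a short sketch. The function name block_addresses and the example addresses are hypothetical, and one address unit per data word is assumed for simplicity.

```python
def block_addresses(src_addr, dst_addr, count):
    """Yield the (source, destination) address pairs implied by a transfer
    specified only by source address, destination address and count.

    Both sides advance by one word per element, so the source and
    destination patterns are necessarily the same: contiguous blocks.
    """
    for i in range(count):
        yield src_addr + i, dst_addr + i


pairs = list(block_addresses(src_addr=100, dst_addr=200, count=4))
```

Because a single index drives both addresses, a pattern such as reading every other word while writing contiguously cannot be expressed in this form.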
With the advent of the ManArray architecture, it has been recognized that it will be advantageous to have improved techniques for carrying out such functions tailored to this new architecture.
As described in detail below, the present invention addresses a variety of advantageous methods and apparatus for improved data transfer control within a data processing system. In particular, improved mechanisms are provided for initiating and controlling the sequence of data transfers; decoupling source and destination address generation through the use of independent specification of source and destination transfer descriptors (hereafter referred to as "DMA instructions" to distinguish them from a specific type of instruction called a "transfer instruction" which performs the data movement operation); executing multiple "source" transfer instructions for each "destination" transfer instruction, or multiple "destination" transfer instructions for each "source" transfer instruction; intra-transfer control of the flow of data (control that occurs while a transfer is in progress); EOT notification; and synchronizing of data flow with a compute processor and with one or more control processors through the use of SIGNAL and WAIT operations on semaphores.
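The decoupling of source and destination address generation may be illustrated with a simplified software model in which each transfer instruction produces its own address stream and the two sides are joined only by a queue of data. The sketch below is illustrative and rests on stated assumptions: the strided addressing mode, the function names strided and execute, and the use of a Python deque as the intermediate queue are all hypothetical and are not taken from the description above.

```python
from collections import deque


def strided(base, count, stride):
    """Address stream for one hypothetical transfer instruction."""
    return [base + i * stride for i in range(count)]


def execute(memory, source_instrs, dest_instrs):
    """Run source and destination instructions against a shared queue.

    For simplicity all reads finish before any write begins; hardware
    would normally overlap the two sides.
    """
    fifo = deque()
    for addrs in source_instrs:   # source side: read into the queue
        for a in addrs:
            fifo.append(memory[a])
    for addrs in dest_instrs:     # destination side: drain the queue
        for a in addrs:
            memory[a] = fifo.popleft()


memory = list(range(32))
# Two "source" instructions gather from differently strided regions...
sources = [strided(0, 4, 2), strided(16, 4, 1)]
# ...and one "destination" instruction writes all eight words contiguously.
dests = [strided(24, 8, 1)]
execute(memory, sources, dests)
```

Here two source instructions supply the data consumed by a single destination instruction, which is one of the pairings named above; swapping the roles of the two lists gives the opposite case.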
Additionally, the present invention provides a DMA controller implemented as a multiprocessor consisting of multiple transfer controllers, each supporting its own instruction thread. It allows cooperation between transfer controllers, as seen in the DMA-to-DMA method addressed further below. It also addresses a single thread of control for dual transfer units or execution units. Execution control of a transfer instruction may advantageously be based on a flag in the instruction itself. Multiple instructions may execute in one unit while a single instruction executes in the other. Independent transfer counters for the CTU and STU are provided. Conditional SIGNAL instructions, which can send messages on a control bus, generate interrupts or update semaphores, are advantageously provided, as is a conditional WAIT instruction which is executed based on the state of a semaphore. When a wait condition becomes false, the semaphore is updated according to the instruction. Further aspects include the use of transfer conditions in branch, SIGNAL and WAIT instructions (STUEOT, CTUEOT, notSTUEOT, notCTUEOT). Further, the use of semaphores is addressed as the basis for conditional execution. A generalization of these techniques allows dual-CTU or dual-STU transfer controllers. A dual-CTU transfer controller might be used to perform DMA transfers from one cluster's DMA bus to another cluster's DMA bus. Further, a restart capability based on RESTART commands, Load-transfer-count-and-restart commands, or a semaphore update from an SCB master is addressed.
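The general idea of synchronizing data flow through SIGNAL and WAIT operations on semaphores may be sketched in software as follows. This is only an analogy under stated assumptions: Python's threading.Semaphore stands in for the hardware semaphores, with release() playing the role of SIGNAL and acquire() the role of WAIT, and the names buffer_ready, transfer_done, transfer_controller and compute_processor are hypothetical. It does not model the conditional forms of the instructions or the exact semaphore update rules described above.

```python
import threading

buffer_ready = threading.Semaphore(0)   # signalled by the compute processor
transfer_done = threading.Semaphore(0)  # signalled by the transfer controller
log = []


def transfer_controller(n_blocks):
    for block in range(n_blocks):
        buffer_ready.acquire()           # WAIT: stall until a buffer is ready
        log.append(("transfer", block))  # stands in for the data movement
        transfer_done.release()          # SIGNAL: report completion


def compute_processor(n_blocks):
    for block in range(n_blocks):
        log.append(("compute", block))   # stands in for producing a buffer
        buffer_ready.release()           # SIGNAL: buffer may be transferred
        transfer_done.acquire()          # WAIT: do not reuse the buffer early


worker = threading.Thread(target=transfer_controller, args=(3,))
worker.start()
compute_processor(3)
worker.join()
```

Each side blocks on the other's semaphore, so the two threads alternate strictly; a practical arrangement would use more than one buffer so that computation and transfer overlap.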
These and other advantages of the present invention will be apparent from the drawings and the Detailed Description which follow.