1. Field of the Invention
The present invention generally relates to an apparatus and method for performing DMA data transfer, and more particularly to an apparatus and method for performing DMA data transfer in which a transfer destination device determines the termination of a DMA data transfer when multiple DMA data transfers are performed.
2. Description of the Related Art
DMA (direct memory access) allows data to be transferred without intervention by a CPU. Therefore, even while data is being transferred between a memory and a device, the CPU can perform other processes concurrently, which improves the throughput of the system as a whole. In recent years, memory bandwidth (the data transfer capability of the memory) has increased, and device bandwidth has also increased, as with 10 Gbit Ethernet (trademark of Xerox Corporation) in the field of data communication. Higher performance is therefore desired for DMA as well.
For example, it is known that data can be transferred among a plurality of memories at high speed by continuously performing DMA data transfers (data transfers by DMA) among the plurality of memories using a transfer count parameter (the number of transfers to perform) (refer to Japanese Patent Laid-Open No. 4-236649).
It is also known, for example, that the power consumption of a DMA controller can be reduced by reducing the group of registers for holding addresses (refer to Japanese Patent Laid-Open No. 2003-316722).
With DMA, large data can be transferred. In this case, the CPU must divide the data into pieces of block data, each of a size that the DMA controller can transfer, and issue data transfer instructions to the DMA controller one after another. When the data transfer source is mapped by virtual addresses, the data must also be divided so that no piece of block data spans a page boundary of the virtual addresses.
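The division described above can be sketched as follows. This is only an illustrative sketch, not the method of any cited reference: the function name `split_into_blocks` and its parameters (`max_block`, `page_size`) are hypothetical, and addresses are treated as plain integers.

```python
def split_into_blocks(start_addr, length, max_block, page_size):
    """Split a transfer into (address, size) blocks.

    Hypothetical helper: each block is at most max_block bytes long
    and never crosses a page boundary of size page_size.
    """
    if length < 0 or max_block <= 0 or page_size <= 0:
        raise ValueError("length must be >= 0; sizes must be positive")

    blocks = []
    addr = start_addr
    remaining = length
    while remaining > 0:
        # Bytes left before the next page boundary.
        to_page_end = page_size - (addr % page_size)
        # A block is limited by the data left, the controller limit,
        # and the distance to the page boundary.
        size = min(remaining, max_block, to_page_end)
        blocks.append((addr, size))
        addr += size
        remaining -= size
    return blocks


if __name__ == "__main__":
    # A transfer that starts 0x100 bytes before a 4 KiB page boundary.
    for addr, size in split_into_blocks(0x0F00, 0x2300, 0x1000, 0x1000):
        print(hex(addr), hex(size))
```

In this example the first and last blocks are shorter than the others, which is one reason the last block of a transfer may finish before the block preceding it.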
The CPU itself also runs a plurality of processes that issue DMA instructions by time sharing, for example in processing that uses multiple threads. As a result, the CPU issues a plurality of (or multiple) DMA requests for different transfer data to a DMA controller.
As described above, a DMA controller must increase the number of DMA engines, increase the number of data transfers performed concurrently (the multiplicity of data transfer), and manage the transfers without fail. Otherwise, high-performance processing cannot be achieved. The inventor studied a DMA data transfer apparatus 200 shown in FIGS. 8 to 11. The DMA data transfer by the DMA data transfer apparatus 200 is briefly described below.
FIG. 8 shows a structure of the DMA data transfer apparatus 200 as the background of the present invention. FIG. 9 shows an operation sequence of the DMA data transfer by the DMA data transfer apparatus shown in FIG. 8.
The following description focuses on one piece of transfer data A. When there are a plurality of pieces of transfer data, the DMA data transfer described below is performed concurrently on each of them. In the following description, the data divided and transferred by a DMA engine 221 is referred to as block data. A1, A2, . . . , Am (where m is a positive integer, here and below) are block data. The smaller the index, the closer the block data is to the head of the transfer data. In FIG. 9, the read of data from a memory controller 230 to each DMA engine 221 is omitted.
The CPU 210 issues a transfer instruction (DMA data transfer request) for the block data A1 (A1 transfer instruction) to a DMA engine 221 to which no data transfer has yet been assigned (step S100). The DMA engine 221 which has received the transfer instruction for the block data A1 transfers the block data A1 to a transfer destination device 240 (step S101). Afterwards, as in the step S100, the CPU 210 issues a transfer instruction for the block data A2 (A2 transfer instruction) (step S102). Upon receipt of the transfer instruction, the DMA engine 221 transfers the block data A2 as in the step S101 (step S103). Then, when the block data A1 or A2 has been transferred, the corresponding DMA engine 221 issues a transfer termination notice (A1 transfer termination notice or A2 transfer termination notice) to a transmission check processing unit 222 (steps S104 and S105).
The block data transfer process is repeated as described above. The CPU 210 issues a transfer instruction for the block data Am-1, which immediately precedes the last block data Am (Am-1 transfer instruction), as in the step S100 (step S106), and the block data Am-1 is transferred (step S107). Furthermore, the CPU 210 issues a transfer instruction for the last block data Am (Am transfer instruction) as in the step S100 (step S108), and the block data Am is transferred (step S109). In the step S108, the CPU 210 also notifies the DMA engine 221 that transfers the block data Am that the block data Am is the last block data.
For example, when the last block data Am is smaller than the immediately preceding block data Am-1, the transfer of the block data Am can terminate earlier than the transfer of the block data Am-1, as shown in FIG. 9. In that case, the DMA engine 221 which transfers the block data Am terminates its transfer before the transfer of the block data Am-1 terminates, and issues a last block data transfer termination notice to the transmission check processing unit 222 (step S110). Afterwards, the DMA engine 221 which transfers the block data Am-1 terminates its transfer, and issues a transfer termination notice for the block data Am-1 to the transmission check processing unit 222 (step S111).
The transmission check processing unit 222 receives a last block data transfer termination notice from the DMA engine 221 which transferred the block data Am, confirms that all block data has been transferred by receiving the transfer termination notice for the block data Am-1, and then issues an all data transfer termination notice for the transfer data A to the transfer destination device 240 (step S112). The transfer destination device 240 has a data buffer 241 for storing the received block data A1 to Am. Upon receipt of the all data transfer termination notice from the transmission check processing unit 222, the transfer destination device 240 determines that the transfer of the transfer data A has been completed, and performs another process.
To perform the DMA data transfer described above, the transmission check processing unit 222 must have a DMA transfer management table 224 together with a DMA transfer control unit 223, as shown in FIG. 10. The DMA transfer management table 224 has a management area in the form of a matrix of (the number of transfer data) × (the number of DMA engines), and has a last block data transfer flag for each piece of transfer data, as shown in FIG. 11. In each cell of the management area matrix, a status of “unassigned” or “being transferred” is recorded. The last block data transfer flag is set when the DMA transfer control unit 223 receives a last block data transfer termination notice from the DMA engine 221 which transferred the last block data.
In the transmission check processing unit 222, the DMA transfer control unit 223 updates the DMA transfer management table 224. That is, when a DMA engine 221 is assigned to a transfer by a data transfer instruction from the CPU 210, the status of the corresponding cell of the DMA transfer management table 224 is changed from “unassigned” to “being transferred”. When a DMA engine 221 which has terminated the transfer of block data is released, the status of the corresponding cell of the DMA transfer management table 224 is changed from “being transferred” to “unassigned”. When the last block data transfer termination notice is received from the DMA engine 221 which transferred the last block data, the last block data transfer flag of the corresponding transfer data in the DMA transfer management table 224 is changed to “1”.
The DMA transfer control unit 223 checks whether or not the last block data transfer flag of each piece of transfer data in the DMA transfer management table 224 is “1”. For transfer data whose last block data transfer flag is “1”, the DMA transfer control unit 223 issues an all data transfer termination notice for that transfer data to the transfer destination device 240 when no DMA engine 221 has the status “being transferred” for that transfer data.
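The bookkeeping described above can be sketched in software as follows. This is a minimal model under stated assumptions, not the circuit of FIGS. 10 and 11: the class name `DmaTransferManagementTable` and its methods are hypothetical, and the statuses “unassigned” and “being transferred” are represented as `False` and `True`.

```python
class DmaTransferManagementTable:
    """Hypothetical software model of the table in FIG. 11."""

    def __init__(self, num_transfers, num_engines):
        # One row per piece of transfer data, one column per DMA engine.
        # False = "unassigned", True = "being transferred".
        self.busy = [[False] * num_engines for _ in range(num_transfers)]
        # One last block data transfer flag per piece of transfer data.
        self.last_flag = [False] * num_transfers

    def assign(self, transfer, engine):
        """Record that an engine was assigned by a transfer instruction."""
        self.busy[transfer][engine] = True

    def release(self, transfer, engine, was_last_block=False):
        """Handle a transfer termination notice from an engine.

        Returns True when an all data transfer termination notice
        should be issued for this transfer data.
        """
        self.busy[transfer][engine] = False
        if was_last_block:
            self.last_flag[transfer] = True
        return self.all_done(transfer)

    def all_done(self, transfer):
        # Complete only if the last block has been reported AND no
        # engine is still transferring a block of this transfer data.
        return self.last_flag[transfer] and not any(self.busy[transfer])


if __name__ == "__main__":
    table = DmaTransferManagementTable(num_transfers=2, num_engines=4)
    table.assign(0, 0)  # engine 0 transfers Am-1
    table.assign(0, 1)  # engine 1 transfers Am (last block)
    print(table.release(0, 1, was_last_block=True))  # Am finishes first
    print(table.release(0, 0))                       # Am-1 finishes later
```

The usage lines reproduce the ordering of FIG. 9: the notice for the last block Am arrives first, so completion is reported only after the notice for Am-1 arrives.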
The inventor further studied the DMA data transfer apparatus shown in FIGS. 8 to 11 in an attempt to improve the performance of the DMA controller 220, and found that the following problems occur in the transmission check processing unit 222.
It is necessary to check, for every DMA engine 221 related to the transfer, whether or not the transfer of its block data to the transfer destination device 240 has been completed. Therefore, when the multiplicity of the DMA engines 221 is raised (that is, the number of DMA engines 221 is increased), the number of entries corresponding to DMA engines, that is, the number of columns, increases in the DMA transfer management table 224.
When a larger number of pieces of transfer data are processed simultaneously, the number of entries corresponding to transfer data, that is, the number of rows, increases in the DMA transfer management table 224. This is because, when there are many pieces of transfer data and they are managed dynamically, a new data transfer cannot be started immediately if all management entries for transfer data are in use, even though the status of some DMA engines 221 is “unassigned”.
In addition, when the number of DMA engines and the number of pieces of transfer data that can be processed simultaneously are increased, the related processing also increases. That is, the processing for acquiring and releasing resources, such as assigning a DMA engine 221 that has terminated a transfer to the next transfer, increases in proportion to (the number of DMA engines) × (the number of transfer data). The resulting circuit is therefore complicated.
Furthermore, to improve the performance of the DMA, the number of DMA engines and the number of transfer data must be increased in appropriate balance with each other. However, when the performance of the DMA data transfer is improved by increasing both the number of DMA engines and the number of transfer data, the DMA data transfer circuit becomes large in scale and complicated, as described above.
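The growth described above can be illustrated with a short calculation. This is an assumption-laden sketch: it supposes one status bit per cell of the matrix and one flag bit per piece of transfer data, which the description does not specify, and the function name `table_state_bits` is hypothetical.

```python
def table_state_bits(num_engines, num_transfers):
    """Estimate the state held by the management table.

    Assumes (hypothetically) one status bit per matrix cell plus one
    last block data transfer flag bit per piece of transfer data.
    """
    matrix_bits = num_engines * num_transfers
    flag_bits = num_transfers
    return matrix_bits + flag_bits


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, n, table_state_bits(n, n))
```

Under these assumptions, doubling both the number of DMA engines and the number of transfer data roughly quadruples the matrix portion of the table.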