The present invention relates in general to a system and method for efficiently performing direct memory access operations, and in particular to a method and system that utilizes an enhanced direct memory access controller to perform both a desired data transfer operation and one or more data queue directory updates to properly reflect such desired data transfer.
Computer systems are heavily relied on today for performing a variety of tasks. Such computer systems are often required to handle data in some manner. For example, data is often transferred from one memory location (or address) to another memory location (or address). For instance, data may be transferred from one device to another device, from one device to memory, from a software application to a device, from a software application to memory, etcetera. Computer systems generally include at least one central processing unit (CPU or processor), which acts as the electronic "brain" of a computer device. As is well known, the CPU is responsible for performing most calculations/instructions, and is often relied on for performing a transfer of data from one memory location to another memory location. In early computer systems, the CPU was responsible not only for the execution of programs, but was also responsible for transferring data to and from various memory locations (e.g., transferring data to and from peripheral devices, etcetera). For instance, the CPU typically operates on data stored in a main memory. Because there are practical size limitations on such main memory, bulk memory storage devices may be provided in addition to and separately from the main memory. When the CPU wants to make use of data stored in such a bulk storage device, such as a hard disk, for example, the data is typically moved from the hard disk into the main memory.
Utilizing the CPU to perform such data transfers is very inefficient because such data transfers prevent the CPU from performing other tasks, thereby hindering the overall efficiency of the computer system. Accordingly, direct memory access (DMA) is commonly utilized to enable computer systems to cut out the "middle man," thereby allowing the CPU to perform other tasks. For example, a DMA chip (or DMA controller) is commonly included in computer systems to enable a peripheral device to effectively transfer data itself, leading to increased performance of the computer system. Prior art DMA systems and methods are well known by those of ordinary skill in the art, and therefore will only briefly be described hereafter.
DMA circuitry generally provides "channels," along with circuitry to control such channels, which allow the transfer of data without the CPU controlling every aspect of the transfer. Such circuitry is commonly part of the system chipset on the motherboard of a personal computer (PC), for example. When a device desires to move a block of data, the DMA controller receives descriptor information from the CPU as to the base location from where bytes are to be moved (i.e., the "source address"), the address to where the bytes should be moved (i.e., the "destination address"), and the number of bytes to move (i.e., the "length" of the block of data). Once it receives such descriptor information, the DMA controller oversees the transfer of the data within the computer system. Once the data move is complete, the DMA controller notifies the CPU of such completion. Normally, DMA operations are used to move data between input/output (I/O) devices and memory.
Turning to FIG. 1, a relatively simple example of a data move operation performed by a DMA is shown. As shown, a computer system 100 includes a first memory location 102 and a second memory location 104. For example, memory location 102 may be included within a hard disk or some other type of peripheral device, and memory location 104 may be the main memory of computer system 100. When a device or an application desires to transfer a block of data from memory location 102 to memory location 104, CPU 114 provides to DMA 106 the necessary descriptor information for identifying the desired transfer. That is, CPU 114 provides descriptor information that includes the source address 108 (i.e., the base address from where bytes are to be moved), the destination address 112 (i.e., the address to where the bytes should be moved), and the length 110 of the block of data to be moved. Based on the received descriptor information, DMA 106 performs the identified data transfer operation from memory location 102 to memory location 104. Once complete, DMA 106 notifies CPU 114 of the completion of the requested data transfer operation.
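By way of illustration only, the single-descriptor transfer of FIG. 1 may be sketched in software as follows. The sketch assumes that both memory locations are modeled as regions of one flat byte array; the names `Descriptor` and `simple_dma_transfer` are hypothetical and do not correspond to any actual DMA hardware or driver interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    """Hypothetical model of the descriptor information of FIG. 1."""
    source: int       # base address from where bytes are to be moved
    destination: int  # address to where the bytes should be moved
    length: int       # number of bytes to move


def simple_dma_transfer(memory: bytearray, desc: Descriptor) -> bool:
    """Copy desc.length bytes within the modeled memory.

    Returns True to stand in for the completion notice sent to the CPU.
    """
    src_end = desc.source + desc.length
    dst_end = desc.destination + desc.length
    if (desc.length < 0 or desc.source < 0 or desc.destination < 0
            or max(src_end, dst_end) > len(memory)):
        raise ValueError("descriptor falls outside the modeled memory")
    # Snapshot the source bytes first so overlapping regions copy correctly.
    memory[desc.destination:dst_end] = bytes(memory[desc.source:src_end])
    return True
```

In this sketch the caller plays the role of the CPU: it builds one descriptor, hands it to the transfer routine, and does nothing further until the routine reports completion.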
DMA 106 in FIG. 1 may be referred to as a "simple DMA," in that it performs a data transfer that is identifiable by a single descriptor (e.g., a single source, destination, and length). However, a more complex DMA, which may be referred to as a "chaining DMA," is also available in the prior art. Such a chaining DMA is capable of performing a data transfer of a block of data that is not identifiable by a single descriptor. Turning to FIG. 2, an example of a data move operation that requires multiple descriptors for identification to be performed by a chaining DMA is shown. As shown, a computer system 200 includes a first memory location 202 and a second memory location 204, similar to that of FIG. 1 described above. For example, memory location 202 may be included within a hard disk or some other type of peripheral device, and memory location 204 may be the main memory of computer system 200. A device or application may desire to transfer data, which such device or application logically views as a block 208. That is, data may be treated as a logical "block" 208 by an application and/or device, but such logical block 208 may not actually be a contiguous block within the physical memory. As shown in the example of FIG. 2, logical block 208 is actually divided among three separate memory locations (or sub-blocks) 210A, 210B, and 210C within first memory location 202. That is, logical block 208 comprises three separate source addresses 210A, 210B, and 210C.
Furthermore, each source address may have a different length. That is, the portion of data block 208 starting at source address 210A may include contiguous data having length 211A, the portion of data block 208 starting at source address 210B may include contiguous data having length 211B, and the portion of data block 208 starting at source address 210C may include contiguous data having length 211C, wherein lengths 211A, 211B, and 211C may be different. Additionally, each sub-block 210A, 210B, and 210C must have a different destination address. Otherwise, one sub-block would overwrite all or a portion of another of the sub-blocks. For example, if sub-blocks 210A, 210B, and 210C were all written to the exact same destination address, the latter sub-blocks to be written to such destination address would overwrite all or a portion (depending on the length of each sub-block) of the earlier written sub-blocks. Thus, multiple descriptors are required to identify the data transfer operation of logical block 208. More specifically, three separate source addresses, three separate destination addresses, and three separate lengths are required to identify the data transfer of block 208 from memory 202 to memory 204.
Accordingly, when a device or an application desires to transfer block 208 from memory location 202 to memory location 204, CPU 214 provides to DMA 206 the multiple descriptors necessary to identify such a data transfer. The multiple descriptors are referred to as being "chained" together because the DMA 206 must complete all of the multiple data transfers before indicating to CPU 214 that the transfer of block 208 is complete. Thus, in the example of FIG. 2, DMA 206 will receive the three chained descriptors and perform the necessary operations to move the data of block 208 from first memory location 202 to second memory location 204.
A "scatter/gather" algorithm is commonly utilized in this situation to cause DMA 206 to move the non-contiguous (or scattered) sub-blocks 210A, 210B, and 210C from memory location 202 to memory location 204 in a manner that "gathers" the sub-blocks as a contiguous block of memory, shown as 212. Of course, in other data transfers, the opposite may be true. For example, data block 208 may be a contiguous block of data within memory location 202, but may need to be "scattered" into two or more sub-blocks when moved to memory location 204. For instance, memory location 204 may not have a sufficiently large contiguous block of memory available for writing block 208, and therefore may be required to "scatter" block 208 as separate sub-blocks within memory location 204. In either case, multiple descriptors that are "chained" are supplied to DMA 206 to accomplish the data transfer. Once all of the data transfer operations identified by the chained descriptors are complete, DMA 206 notifies CPU 214 of the completion of the requested data transfer of block 208. It should be understood that to the device or application requesting the move of data block 208, the data transfer appears as a single move of data, even though in reality DMA 206 performs multiple moves of non-contiguous sub-blocks of data to accomplish the requested data transfer.
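The gather case described above may be sketched as follows, purely as an illustrative assumption of how chained descriptors could be built and serviced. The helper names `build_gather_chain` and `run_chain` are hypothetical, and both memory locations are again modeled as regions of one flat byte array.

```python
from typing import List, NamedTuple, Tuple


class Descriptor(NamedTuple):
    source: int       # start address of one sub-block
    destination: int  # where that sub-block is written
    length: int       # contiguous bytes in that sub-block


def build_gather_chain(sub_blocks: List[Tuple[int, int]],
                       destination_base: int) -> List[Descriptor]:
    """Build one descriptor per (source, length) sub-block.

    Each sub-block receives its own destination address, placed directly
    after the previous one, so that no sub-block overwrites another and
    the result is contiguous.
    """
    chain = []
    offset = 0
    for source, length in sub_blocks:
        chain.append(Descriptor(source, destination_base + offset, length))
        offset += length
    return chain


def run_chain(memory: bytearray, chain: List[Descriptor]) -> bool:
    """Service every descriptor, then report completion exactly once."""
    for d in chain:
        data = bytes(memory[d.source:d.source + d.length])
        memory[d.destination:d.destination + d.length] = data
    return True  # single completion notice for the whole logical block
```

The requester sees one completion for the whole logical block, although the routine performs one move per sub-block. A scatter operation would follow the same pattern with a single contiguous source and several destination addresses.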
Thus, simple DMAs capable of receiving a single descriptor and performing a data move identified by such descriptor are available in the prior art. Also, "chaining DMAs" are available in the prior art, which are capable of receiving "chained" descriptors and performing data moves identified by such chained descriptors.
Given that the DMA controller may receive more data transfer requests than it can service at one time, descriptors are generally queued in a "descriptor queue," which is generally managed by some type of queue management software. That is, descriptors are held in the descriptor queue (or command queue), and the DMA then services each descriptor (or each data transfer request identified by the descriptors) in turn. Turning to FIG. 3, an example illustrating this point is provided. As shown in FIG. 3, computer system 300 includes multiple "clients" (e.g., devices and/or applications) that may each request a data move to be performed by DMA 306, which may be either a simple DMA or a chaining DMA.
For example, clients A, B, and C are included in computer system 300, each of which may interact with CPU 314 to request data move operations to be performed by DMA 306. Also included in computer system 300 is queue management software (which may be referred to as a "driver") 308. Queue management software 308 receives requests for data move operations from CPU 314, wherein such requests include descriptor(s) for identifying a desired data move operation, and queue management software 308 queues such requests in descriptor queue (or command queue) 310. Queue management software 308 then supplies each descriptor, in turn, to DMA 306, which performs each desired data move.
Suppose, for instance, that client A first requests a data move operation, while DMA 306 is busy performing a previously received data move request. CPU 314 communicates the request (i.e., the descriptor) from client A to queue management software 308, which queues the request, shown as request 312A, in descriptor queue 310. Thereafter, client B requests a data move operation. CPU 314 communicates the descriptor information to queue management software 308, which queues the request 312B in descriptor queue 310. Client A then requests another data move operation, which queue management software 308 queues as request 312C in descriptor queue 310, and thereafter, client C requests a data move operation, which queue management software 308 queues as request 312D in descriptor queue 310. When DMA 306 completes a data move request it notifies queue management software 308, which in turn notifies CPU 314 of such completion, and queue management software 308 then sends the next pending descriptor from queue 310 to DMA 306.
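The queuing behavior of FIG. 3 may be sketched as a first-in, first-out queue, as shown below. This is a simplified model under stated assumptions: the class name `DescriptorQueueManager` and its methods are hypothetical, requests are represented by plain labels rather than real descriptors, and the DMA is assumed to service one request at a time.

```python
from collections import deque


class DescriptorQueueManager:
    """Hypothetical model of queue management software 308."""

    def __init__(self):
        self.pending = deque()  # descriptor queue 310
        self.in_flight = None   # request the DMA is currently servicing
        self.completed = []     # completions reported back to the CPU

    def submit(self, request):
        """Dispatch immediately if the DMA is idle; otherwise queue."""
        if self.in_flight is None:
            self.in_flight = request
        else:
            self.pending.append(request)

    def on_dma_complete(self):
        """Record the completion, then send the next pending descriptor."""
        if self.in_flight is None:
            raise RuntimeError("no request is in flight")
        finished = self.in_flight
        self.completed.append(finished)
        self.in_flight = self.pending.popleft() if self.pending else None
        return finished
```

With a previously received request in flight, submitting requests 312A through 312D in order leaves all four queued; each completion notice then releases the oldest pending request to the DMA.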
Oftentimes a device may have associated with it a buffer or queue for the actual data that it transmits to another location and/or receives from another location. Such a queue (or buffer) may be quite limited in size. That is, the data queue associated with a particular device may only be able to store a relatively small amount of data. Accordingly, in the prior art, queue management software (which may be referred to herein as a driver) is typically utilized to manage the data within a device's data queue. For example, to efficiently utilize the limited amount of memory available in a device's data queue, such data queue may be implemented as a circular queue. Turning to FIG. 4, an exemplary logical diagram of a circular data queue 400 is provided. As shown, data queue management software typically utilizes a "head" pointer 402, which indicates the beginning of the data stored in circular queue 400, and data queue management software typically utilizes a "tail" pointer 404, which indicates the end of the data stored in circular queue 400. For example, a first block of data 406 is stored in circular queue 400, followed by a second block of data 408, and then a third block of data 410. Tail pointer 404 indicates the location at which a new block of data is to be added to circular queue 400. Thus, if data is to be added to circular queue 400, the queue management software utilizes tail pointer 404 to indicate the proper location for such data to be added within circular queue 400. Additionally, if block 406 is transferred to another memory location, then the queue management software moves head pointer 402 to correspond to the beginning of data block 408. Thus, the queue management software continually updates the head and tail pointers as data is transferred to/from queue 400. Such circular queue management technique is well known in the prior art.
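The head and tail pointer handling of FIG. 4 may be sketched as follows. The sketch is a minimal byte-oriented model; the class name `CircularQueue`, the `used` counter for telling a full queue from an empty one, and the choice to raise an error on overflow are assumptions made for illustration.

```python
class CircularQueue:
    """Fixed-size circular data queue with head and tail pointers."""

    def __init__(self, capacity: int):
        self.buf = bytearray(capacity)
        self.head = 0  # beginning of the stored data
        self.tail = 0  # location at which new data is added
        self.used = 0  # bytes currently stored

    def enqueue(self, data: bytes) -> None:
        """Write data at the tail, wrapping around the end of the buffer."""
        if len(data) > len(self.buf) - self.used:
            raise BufferError("not enough free space in the queue")
        for byte in data:
            self.buf[self.tail] = byte
            self.tail = (self.tail + 1) % len(self.buf)
        self.used += len(data)

    def dequeue(self, count: int) -> bytes:
        """Read count bytes from the head and advance the head pointer."""
        if count > self.used:
            raise BufferError("not enough data in the queue")
        out = bytearray()
        for _ in range(count):
            out.append(self.buf[self.head])
            self.head = (self.head + 1) % len(self.buf)
        self.used -= count
        return bytes(out)
```

Removing the first block moves the head pointer to the start of the second block, and space freed at the front of the buffer becomes available again once the tail pointer wraps around.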
A general desire exists for a system and method for efficient data transfer management. As described in greater detail hereafter, prior art DMA implementations typically require that queue management software (or a driver) remember a requested data transfer transaction during performance of such data transfer by the DMA, and upon completion of a data transfer by the DMA, the queue management software updates records (or "directories") of data queues associated with the source and/or destination. Such an implementation that requires the queue management software to remember a requested transaction and update data source records to reflect the data transfer upon completion of such data transfer by the DMA is inefficient. Thus, a desire exists for an efficient data transfer management system and method that do not require queue management software to remember a requested transaction and update the records (or "directories") of data queues after such transaction is completed by the DMA. Accordingly, a desire exists for an enhanced DMA, which is capable of not only performing one or more requested data transfer operations, but is also capable of updating one or more data queue directories as needed to properly reflect such data transfer operations.
Accordingly, the present invention is directed to a system and method which utilize an enhanced DMA to both perform a desired data transfer and update data queue directories as needed to properly reflect such data transfer. In a preferred embodiment, such an enhanced DMA is implemented to receive a data transfer request, and in response, performs the desired data transfer and updates one or more data queue directories to properly reflect such data transfer within the data queue directories. More specifically, in a preferred embodiment, the enhanced DMA is implemented to receive a data transfer request, which comprises at least one data transfer descriptor, which indicates the desired data transfer, and at least one record update descriptor, which indicates one or more data queue directories needing to be updated to properly reflect the desired data transfer.
Once the enhanced DMA performs the desired data transfer and necessary data queue directory updates, it notifies the requesting driver that the desired data transfer operation is complete. Because the enhanced DMA performs the necessary data queue directory updates, the requesting driver is not required to remember a requested data transfer operation and perform the data queue directory updates after receiving a completion notice from the enhanced DMA. Instead, once the requesting driver receives a completion notice from the enhanced DMA, the data transfer operation is fully completed, including any necessary data queue directory updates. Thus, no additional "clean-up" work is required to be performed by the requesting driver after receiving completion notice from the enhanced DMA. As a result, the data transfer management of a preferred embodiment is much more efficient than typical data transfer management of the prior art, wherein a requesting driver is required to perform "clean-up" (e.g., updating data queue directories to properly reflect a data transfer operation performed by the DMA) after receiving a completion notice from the DMA. Additionally, a preferred embodiment may enable a requesting driver to have a less complex implementation than is typically required in the prior art. For instance, requesting drivers of the prior art are typically required to "remember" a requested data transfer operation, and update data queue directories to reflect the requested data transfer after receiving notice from the DMA that the requested data transfer has been performed. However, in a preferred embodiment, a requesting driver is not required to "remember" a requested data transfer operation or update data queue directories to reflect the requested data transfer operation, as the enhanced DMA performs both the data transfer and the necessary data queue directory updates. Thus, a preferred embodiment may enable a less complex requesting driver to be implemented.
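As a rough sketch only, a request that combines data transfer descriptors with record update descriptors might be serviced as shown below. The format of a record update descriptor is not specified in this summary, so the fields used here (a directory name, a pointer field, and a byte count by which to advance it) are assumptions, as are the names `TransferDescriptor`, `RecordUpdateDescriptor`, and `enhanced_dma`.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TransferDescriptor:
    source: int
    destination: int
    length: int


@dataclass
class RecordUpdateDescriptor:
    """Assumed format: which directory pointer to advance, and by how much."""
    directory: str  # name of the data queue directory to update
    field: str      # "head" or "tail"
    advance: int    # bytes by which to advance the pointer


def enhanced_dma(memory: bytearray,
                 directories: Dict[str, Dict[str, int]],
                 transfers: List[TransferDescriptor],
                 updates: List[RecordUpdateDescriptor]) -> bool:
    """Perform the transfers, then the directory updates, then notify once."""
    for t in transfers:
        data = bytes(memory[t.source:t.source + t.length])
        memory[t.destination:t.destination + t.length] = data
    for u in updates:
        entry = directories[u.directory]
        # Pointers wrap at the queue size, as in a circular data queue.
        entry[u.field] = (entry[u.field] + u.advance) % entry["size"]
    return True  # completion notice; no driver clean-up remains
```

In this model the driver submits the whole request and keeps no record of it: when the completion notice arrives, the source queue's head pointer and the destination queue's tail pointer already reflect the transfer.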
The foregoing has outlined rather broadly the features and technical advantages of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention. It should be appreciated by those skilled in the art that the conception and specific embodiment disclosed may be readily utilized as a basis for modifying or designing other structures for carrying out the same purposes of the present invention. It should also be realized by those skilled in the art that such equivalent constructions do not depart from the spirit and scope of the invention as set forth in the appended claims. The novel features which are believed to be characteristic of the invention, both as to its organization and method of operation, together with further objects and advantages will be better understood from the following description when considered in connection with the accompanying figures. It is to be expressly understood, however, that each of the figures is provided for the purpose of illustration and description only and is not intended as a definition of the limits of the present invention.