This invention relates to the field of computer systems, and in particular to data access techniques.
The processing of data often requires the movement of data from one form of storage to another. As a program executes, for example, data is read from a memory device into registers, or temporary storage devices, for subsequent processing. The resultant processed data, contained in the same or different registers, is thereafter written back to the same or a different memory device.
Conventionally, data processing devices contain memory access devices that facilitate the transfer of data from memory to registers within the device. These memory access devices ease the programming task by providing high-level commands to effect the data transfer. For example, such devices will provide a “MOVE N, Src, Dest” command. This command moves N data elements from a source device, starting at the Src address, into a destination device, starting at the Dest address. The particular structure of a MOVE N command will vary among systems. For ease of understanding, the term MOVE N command used herein implies the transfer of N data elements, regardless of the data element size. Common synonymous terms for this MOVE N function are block data transfers, block reads, block writes, etc. Note that the very structure of this command necessitates a presumed ordering of the data elements to be moved, within the context of this command. That is, it is presumed that the transfer will be of N data elements from N contiguous locations in the source, starting from the Src address, to N contiguous destination addresses, starting from the Dest address. There is also a presumed or predefined direction of sequential address determination. That is, it is predefined whether the N contiguous data elements are located at addresses counting up from the Src address, or counting down from the Src address.
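The semantics of the MOVE N command described above can be sketched in a few lines of Python. This is only an illustrative model, not the command set of any particular device: the function name move_n, the use of Python lists to stand in for the source and destination devices, and the step argument that selects the counting direction are all assumptions introduced for this sketch.

```python
def move_n(n, src_mem, src, dest_mem, dest, step=1):
    """Model of a MOVE N, Src, Dest block transfer (hypothetical helper).

    Copies n elements from n contiguous source addresses starting at src
    to n contiguous destination addresses starting at dest.
    step=+1 counts addresses up; step=-1 counts addresses down.
    """
    if step not in (1, -1):
        raise ValueError("step must be +1 (count up) or -1 (count down)")
    last_src = src + (n - 1) * step
    last_dest = dest + (n - 1) * step
    # Reject transfers that would run off either end of a device.
    for mem, first, last in ((src_mem, src, last_src), (dest_mem, dest, last_dest)):
        if n > 0 and not (0 <= first < len(mem) and 0 <= last < len(mem)):
            raise IndexError("block transfer runs outside the device")
    for i in range(n):
        dest_mem[dest + i * step] = src_mem[src + i * step]


# Example: MOVE 3 from memory address 1 into registers starting at 2.
memory = [10, 20, 30, 40, 50]
registers = [0] * 8
move_n(3, memory, 1, registers, 2)
```

In this model, both the source and the destination ranges are necessarily contiguous, which is the presumption the command structure carries.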
Block transfers provide for both programming and memory access efficiencies. For example, the memory device may be a spinning disk. If the memory access device accesses each data element independently, each transfer will incur a “seek” delay while waiting for the appropriate disk location to arrive at the head used to transfer the information. A block data transfer for a disk drive is designed to incur the seek delay for the first data element, then access sequential data from the disk at a rate commensurate with the rotation of the disk. Similarly, data access from memory devices such as RAM or ROM devices can be made more efficient by block data transfers. Often, there is a time delay associated with gaining access to the device, for example due to contention for a data bus of a computer system. Once access is granted, efficiencies can be achieved by transferring as many data elements as needed, thereby avoiding access delays for each data element.
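The efficiency argument above can be made concrete with a simple cost model. The sketch below assumes that every access pays one fixed delay (a seek or a bus arbitration) and that each element then takes a fixed time to transfer; the function name and the numeric values are hypothetical and chosen only to illustrate the comparison.

```python
def transfer_time(n, access_delay, per_element_time, block=True):
    """Total time to move n elements under a simple, assumed cost model.

    block=True:  the access delay is paid once, then n elements stream.
    block=False: every element is accessed independently and pays the delay.
    """
    if block:
        return access_delay + n * per_element_time
    return n * (access_delay + per_element_time)


# Hypothetical figures, in microseconds: 8000 to gain access, 10 per element.
block_us = transfer_time(16, 8000, 10, block=True)        # 8000 + 16 * 10
individual_us = transfer_time(16, 8000, 10, block=False)  # 16 * (8000 + 10)
```

Under these assumed figures, the individual transfers are dominated almost entirely by the repeated access delay, which is the cost a block transfer avoids.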
Special purpose processing devices or applications commonly structure the processing of data to be consistent with the ordering of the registers used to send and receive the information to and from memory. That is, for example, if it is known that data elements A, B, and C are stored in memory contiguously at addresses M, M+1, and M+2, a first register address R in the processing device will be designated as the location that will contain the data element A; the next contiguous register, at R+1, will contain the data element B; and the next contiguous register, at R+2, will contain the data element C. In this manner, a “MOVE 3, M, R” command will effect the direct transfer of the information A, B, C, from the memory to the registers. Thereafter, subprocesses within the processing device will access registers R, R+1, and R+2 whenever the data elements A, B, and C, respectively, are required.
As processing devices and applications become more and more complex, the number of data elements comprising the set of parameters used by the processing device or application increases dramatically. Consider, for example, the area of computer graphics. Relatively primitive graphics devices circa 1980 accepted a set of three (X, Y, RGB) parameters, and illuminated a picture element (pixel) on a display at location X, Y with a color combination corresponding to the red R, green G, and blue B color values. The Microsoft DirectX™ standard that is used for graphic devices circa 1997 includes up to 16 different parameters, with ‘reserved’ capabilities to add others. (DirectX is a Trademark of Microsoft Corporation.) These parameters allow for three-dimensional renderings with visibility determination, texturing, lighting, and so on. The graphics devices used to display these renderings have internal registers which are structured to contain each of these parameters, commonly structured to match the data structure in memory that conforms to the DirectX™ standard. In this manner, a single MOVE 16 command effects the direct and efficient transfer of these parameters from the memory to the registers, as discussed above.
The transfer of all the parameters that are supported by a processing device for each processing operation can be very inefficient. For example, a simple illumination of a pixel only requires three parameters, as discussed above. The transfer of 16 parameters each time a pixel is to be simply illuminated results in a significant data transfer inefficiency. For this reason, “variable format” specifications have been developed. Commonly, a format command is communicated to the receiving processing device; this format command indicates which of the plurality of parameters will be communicated in a series of subsequent commands. In the simple illumination case, for example, the format command will indicate that only the X, Y, and RGB parameters need be communicated. To optimize data transfer, the process that is providing the X, Y, and RGB data elements for each pixel illumination places them in contiguous memory locations, independent of the data structure used to contain the full set of parameters. Thereafter, a MOVE 3 command will effect the transfer of these contiguously located data elements. This format is utilized to transfer subsequent data elements X, Y, and RGB until a new format command is received.
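The variable format scheme can be sketched as follows. The representation is an assumption made for illustration: the format command is modeled as an ordered list of parameter names, and the helper name pack is hypothetical. Only X, Y, and RGB are named in the text; no real DirectX structure is reproduced here.

```python
def pack(values, fmt):
    """Lay out only the parameters named in fmt, contiguously, in fmt order."""
    return [values[name] for name in fmt]


# The format command for simple illumination selects three parameters.
fmt = ["X", "Y", "RGB"]

# Two pixels to illuminate; each carries only the selected parameters.
pixels = [
    {"X": 1, "Y": 2, "RGB": 0xFF0000},
    {"X": 3, "Y": 4, "RGB": 0x00FF00},
]

# The producing process writes each pixel's elements to contiguous memory,
# so each pixel can later be fetched with a single MOVE 3.
memory = []
for pixel in pixels:
    memory.extend(pack(pixel, fmt))

elements_per_pixel = len(fmt)  # N for the MOVE N that fetches one pixel
```

The same fmt stays in effect for every subsequent pixel until a new format command replaces it.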
Unfortunately, the use of the MOVE 3 command will provide efficiency for transferring the data elements from the memory, but will place these data elements in register locations which will not necessarily correspond to the structure of the internal registers assigned to contain these data elements. That is, the registers assigned for containing the X, Y, and RGB data elements may not be contiguous. Because the format for receiving data elements is variable and will not necessarily match the structure of the registers assigned to contain particular data elements, conventional memory access devices utilize a MOVE N command to receive the variably formatted data elements into contiguous “temporary” registers. Subsequent individual MOVE commands are then executed to move the data elements from the temporary registers to the appropriate assigned register locations. Such a memory access device, however, requires enough temporary registers to contain the maximum number of possible parameters, effectively doubling the number of registers required to contain the parameters. To minimize the number of registers required, another conventional approach is to forgo the efficiencies of a MOVE N command and issue individual commands to move each data element from memory to its assigned register location. Some optimization may be provided, for example by using a block transfer for those data elements which are contained in the format command and also are assigned contiguous register locations, and individual commands for data elements which are assigned non-contiguous register locations, but such an approach obviates the programming advantages provided by a single MOVE N command.
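The conventional temporary-register approach described above can be modeled as one block move followed by one individual move per data element. The register assignments below (X at 0, Y at 1, RGB at 5) are hypothetical and chosen only so that the assigned registers are non-contiguous; the function name and the count of 16 registers are likewise assumptions for this sketch.

```python
# Hypothetical assignment of parameters to internal registers (non-contiguous).
REGISTER_INDEX = {"X": 0, "Y": 1, "RGB": 5}
NUM_REGISTERS = 16  # assumed maximum number of parameters


def load_via_temporaries(memory, src, fmt, registers, temp):
    """Stage a block into temporary registers, then scatter element by element.

    Returns the number of move commands issued: one MOVE N plus one
    individual MOVE per data element.
    """
    n = len(fmt)
    temp[0:n] = memory[src:src + n]  # MOVE N into contiguous temporaries
    commands = 1
    for i, name in enumerate(fmt):
        registers[REGISTER_INDEX[name]] = temp[i]  # individual MOVE
        commands += 1
    return commands


memory = [7, 9, 0x0000FF]          # X, Y, RGB stored contiguously
registers = [0] * NUM_REGISTERS    # assigned parameter registers
temp = [0] * NUM_REGISTERS         # temporaries sized for the worst case
commands = load_via_temporaries(memory, 0, ["X", "Y", "RGB"], registers, temp)
```

In this model the temporary bank is as large as the assigned register bank, which reflects the doubling of register count noted above, and the single MOVE 3 still has to be followed by three further commands.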
Therefore, a need exists for a memory access device that can provide the access and transfer efficiencies of block transfers when moving data elements that are contiguously stored in memory to and from non-contiguous register locations. A need also exists for the memory access device to facilitate such a block transfer via a single block transfer command, such as MOVE N, Src, Dest.