Efficient modification of data in memory relevant to display rendering plays a central role in determining the performance of graphics processing operations. Data stored in a designated portion of memory may correspond directly with pixels associated with an image. For example, if 32 bits of data are used to represent each pixel in the image, each pixel may correspond with four bytes of storage within the designated memory space. A rectangular image region that is 1400 pixels by 1050 pixels, for instance, would occupy 5.88 megabytes of memory storage (1400 × 1050 × 4 bytes = 5,880,000 bytes). The data in memory corresponding to each pixel may be used to represent one or more values, such as color values, depth values, stencil values, opacity values, etc., associated with that pixel. By modifying the associated data stored in the designated memory space, the image itself may be correspondingly modified. Here, the term “pixel” is used in a general sense to refer to an elemental unit of an image. In some cases, the image may be presented to a viewer on a display device. In other cases, the image may not be directly displayed to any viewer at all. For example, texture mapping involves the application of a two-dimensional surface onto a three-dimensional object. This process may be analogized as “wallpapering” or “tiling” the two-dimensional surface onto the three-dimensional object. The two-dimensional surface is composed of units commonly referred to as “texels,” and the collection of texels making up the two-dimensional surface is commonly referred to as a texture bitmap. Thus, an example of an image referred to here may include a texture bitmap. Also, an example of a pixel may include a texel that is part of a texture bitmap.
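The storage arithmetic above, and the correspondence between a pixel and its bytes in memory, may be sketched as follows. This is a minimal illustration only. It assumes a row-major layout with no padding between rows, which the text does not specify, and the names image_size_bytes, pixel_offset, and framebuffer are hypothetical.

```python
BYTES_PER_PIXEL = 4  # 32 bits per pixel, as in the example above


def image_size_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Storage needed for a width x height pixel region."""
    return width * height * bytes_per_pixel


def pixel_offset(x, y, width, bytes_per_pixel=BYTES_PER_PIXEL):
    """Byte offset of pixel (x, y), assuming row-major rows with no padding."""
    return (y * width + x) * bytes_per_pixel


size = image_size_bytes(1400, 1050)
print(size)              # 5880000 bytes
print(size / 1_000_000)  # 5.88 (megabytes, using 1 MB = 1,000,000 bytes)

# Modifying the bytes for one pixel modifies that pixel of the image.
framebuffer = bytearray(image_size_bytes(4, 2))  # tiny 4 x 2 image
off = pixel_offset(2, 1, width=4)
framebuffer[off:off + BYTES_PER_PIXEL] = bytes([255, 0, 0, 255])
```

How the four bytes are divided among color, depth, stencil, or opacity values is left open here, as it is in the text.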
FIG. 1 is a block diagram of an illustrative computer system 100 capable of modifying data in memory corresponding to an image. As shown, computer system 100 includes a graphics card 102, a central processing unit (CPU) 104, a chipset comprising a northbridge chip 106 and a southbridge chip 108, system memory 110, PCI slots 112, disk drive controller 114, universal serial bus (USB) connectors 116, audio CODEC 118, a super I/O controller 120, and keyboard controller 122. As shown in FIG. 1, graphics card 102 includes a graphics processing unit (GPU) 124 and local memory 126. Also, graphics card 102 is connected to a display 128 that may be part of computer system 100. Here, GPU 124 is a semiconductor chip designed to perform graphics processing operations associated with rendering an image that may be presented on display 128.
A portion of memory space in local memory 126 may be used to correspond to a particular image such as a screen area on display 128. Thus, data stored at certain storage locations in the portion of memory may be modified, in order to effectuate changes to corresponding pixel areas within the image. This may occur in real time such that a viewer would nearly instantaneously see the changes occur to the corresponding pixel areas on display 128. The coordination of which memory locations in local memory 126 to modify and the carrying out of those modifications, to effectuate the desired changes to the corresponding image, may be handled by GPU 124. Alternatively or additionally, system memory 110 may also be used to correspond to a particular image such as a screen area on display 128. Thus, certain storage locations in a portion of system memory 110 may be modified, in order to effectuate changes to corresponding pixel areas within a particular image. Again, GPU 124 may handle the coordination of which memory locations to modify and the carrying out of those modifications, to effectuate the desired changes to the corresponding image. In this case, data and control signals may need to traverse greater distances in computer system 100, such as through northbridge chip 106. Thus, use of system memory 110 for storing data corresponding to an image may involve longer delays than use of local memory 126. GPU 124 is described here merely as an example of equipment used to perform graphics and memory operations. Such operations may be performed by other types of equipment, such as a general purpose processor, field programmable gate array (FPGA), application-specific integrated circuit (ASIC) and/or others. Computer system 100 and its components shown in FIG. 1 are presented here simply for illustrative purposes.
GPU 124 may modify data in memory corresponding to an image in a variety of different ways. For example, such memory modifications may be performed one pixel at a time. That is, for an image represented by a group of pixels, it is possible to make a modification to the image by issuing instructions to GPU 124 to modify data in memory corresponding to each pixel. Also, memory modifications may be made one pixel area at a time. Here, for an image represented by a group of pixels within an area, such as a rectangular pixel area, it may be possible to make a modification to the image by issuing a single instruction to GPU 124 to modify data in memory corresponding to the pixel area. For example, a BLIT operation copies a source pixel area to a destination pixel area in the image. GPU 124 may respond to an instruction to perform a BLIT operation by performing a read operation to read data in memory locations corresponding to the source pixel area, followed by a write operation to write that data to memory locations corresponding to the destination pixel area. The instruction for a BLIT operation may specify coordinates to identify the source pixel area, as well as coordinates to identify the location of the destination pixel area. Of course, there may be variations in the manner in which such parameters are specified.
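The read-then-write behavior of a BLIT operation described above may be sketched in software as follows. This is a minimal sketch under stated assumptions, not a description of how GPU 124 is implemented: memory is modeled as a flat, row-major Python list with one entry per pixel, the function name blit and its parameters are hypothetical, and no bounds checking is performed.

```python
def blit(mem, width, src, dst, w, h):
    """Copy a w x h pixel area from src=(x, y) to dst=(x, y).

    mem is a flat, row-major list with one entry per pixel and
    `width` pixels per row (an assumed layout).
    """
    sx, sy = src
    dx, dy = dst

    # Read operation: fetch every row of the source pixel area first.
    rows = []
    for r in range(h):
        start = (sy + r) * width + sx
        rows.append(mem[start:start + w])

    # Write operation: store the fetched data at the destination pixel area.
    for r, row in enumerate(rows):
        start = (dy + r) * width + dx
        mem[start:start + w] = row


# A 4 x 3 image whose pixel values are simply 0..11.
image = list(range(12))
blit(image, width=4, src=(0, 0), dst=(2, 1), w=2, h=2)
print(image)  # [0, 1, 2, 3, 4, 5, 0, 1, 8, 9, 4, 5]
```

Because the whole source area is read before any data is written, the copy in this sketch remains correct even when the source and destination pixel areas overlap.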
Operations such as BLITs have traditionally been conducted in a purely serial manner. For example, a BLIT operation would not be allowed to start until all previous BLIT operations have completed. Because the source of one BLIT operation may depend on the destination of a prior BLIT operation, such serial execution has been adopted to prevent errors in the sequencing of read and write operations for multiple BLIT operations. However, these read and write operations may require relatively large amounts of time to complete. As a result, purely serial execution of BLIT operations can be highly inefficient. What is needed is a technique for processing operations such as BLITs in a more parallel fashion, without incurring errors in the proper sequencing of associated read and write operations. Such an enhancement would have a significant and positive impact on the performance of graphics systems.
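The dependency noted above, in which the source of one BLIT operation overlaps the destination of a prior BLIT operation, may be illustrated with a simple rectangle overlap test. The following is a minimal sketch only. The names Rect, Blit, overlaps, and depends_on are hypothetical, and the sketch is not presented as the technique for parallel processing called for above; it only shows how such a dependency between two BLIT operations could be recognized.

```python
from collections import namedtuple

# Hypothetical descriptions of a pixel area and of a BLIT operation.
Rect = namedtuple("Rect", "x y w h")
Blit = namedtuple("Blit", "src dst")  # dst has the same w and h as src


def overlaps(a, b):
    """True if rectangles a and b share at least one pixel."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


def depends_on(later, earlier):
    """True if `later` reads pixels that `earlier` writes."""
    return overlaps(later.src, earlier.dst)


first = Blit(src=Rect(0, 0, 2, 2), dst=Rect(4, 0, 2, 2))
second = Blit(src=Rect(5, 1, 2, 2), dst=Rect(0, 4, 2, 2))
third = Blit(src=Rect(0, 2, 2, 2), dst=Rect(8, 8, 2, 2))

print(depends_on(second, first))  # True: second reads pixel (5, 1), which first writes
print(depends_on(third, first))   # False: third's source does not touch first's destination
```

In this example, the read for the second BLIT operation would have to wait for the write of the first BLIT operation, while the read for the third would not. Other orderings, such as a later write landing on pixels that an earlier BLIT operation still has to read, would also need to be considered in practice.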