In a typical graphics processing system, graphics data may be provided to a shader. The graphics data may be computed using various graphics processing techniques and thereupon provided to the shader, wherein the shader performs shading operations in accordance with known shading techniques.
In some graphics processing systems, the shader may be an application specific integrated circuit (ASIC), which requires the graphics data to be in a specific format prior to shading. FIG. 1 illustrates a prior art processing system 100 for providing graphics data to a shader 102. A processor 104, such as an ASIC, receives a plurality of input pixels 106 from a pixel input 108. In one embodiment, the pixel input 108 may be a memory device, such as, but not limited to, a single memory, a plurality of memory locations, shared memory, CD, DVD, ROM, RAM, EEPROM, optical storage, microcode, or any other storage medium. In another embodiment, the pixel input 108 may represent one or more busses within a pipeline for data processing.
The processor 104 thereupon performs a functional operation on the input pixels 106 to generate the output pixels 110, which are provided to a pixel output 112. In one embodiment, the pixel output 112 may be a memory device, such as, but not limited to, a single memory, a plurality of memory locations, shared memory, CD, DVD, ROM, RAM, EEPROM, optical storage, microcode, or any other storage medium. In another embodiment, the pixel output 112 may represent one or more busses within a pipeline for data processing.
In accordance with known graphics processing techniques, the shader 102 may thereupon perform shading operations on the graphics data and provide an output object (not shown) to a display or other graphics processing system.
The system 100 of FIG. 1 contains inefficiencies. For example, the processor 104 has limited bandwidth for transferring the input pixels 106 from the pixel input 108 to the pixel output 112. In one embodiment, the input pixels 106 may be arranged in a linear fashion, whereas the pixel output 112 requires the pixels in a matrix array format for the graphics shader 102. For example, the input pixels 106 may be linear due to encoding or compression requirements, and must therefore be reformatted by the processor 104.
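By way of illustration only, the linear-to-matrix reformatting described above may be sketched as follows. This is a minimal sketch under stated assumptions and is not the operation of the processor 104 itself: the function name `linear_to_matrix`, the row-major ordering, and the `width` parameter are hypothetical and do not appear in FIG. 1.

```python
def linear_to_matrix(pixels, width):
    """Re-arrange a linear pixel sequence into a row-major matrix.

    Assumptions (hypothetical, for illustration only): pixels are stored
    row after row, and `width` is the number of pixels per matrix row.
    """
    if width <= 0 or len(pixels) % width != 0:
        raise ValueError("pixel count must be a positive multiple of width")

    height = len(pixels) // width
    matrix = [[None] * width for _ in range(height)]

    # Each pixel is placed individually, mirroring the pixel-by-pixel
    # re-arrangement that the prior art system performs.
    for index, pixel in enumerate(pixels):
        row, col = divmod(index, width)
        matrix[row][col] = pixel
    return matrix


if __name__ == "__main__":
    # Six linear pixels re-arranged into two rows of three.
    print(linear_to_matrix([0, 1, 2, 3, 4, 5], 3))
```

The loop touches every pixel once, so the work grows with the pixel count regardless of how large a single transfer the hardware could otherwise handle.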
As recognized by one having ordinary skill in the art, a central processing unit (not illustrated) may be utilized to provide an input of data to the pixel input 108. Re-arrangement of this data may be desired prior to output by the pixel output 112. The processor 104 is capable of handling large data transfers, but typically the processor 104 is required to perform data re-arrangement on a pixel-by-pixel basis, which provides for an extremely inefficient system. One proposed solution is the off-loading of this data re-arrangement processing to a central processing unit. This solution, however, introduces other inefficiencies due to the usage of CPU cycles for performing these operations. Another proposed solution is to create a large number of small primitives for the graphics pipeline to carry out the data movement. However, due to the small amount of data in each transfer, overhead inefficiencies become very noticeable in a processing system.
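The overhead concern with many small transfers may be illustrated with a simple cost model. This is a hedged sketch only: the function `transfer_cost`, its parameters, and the numeric values in the example are hypothetical, use arbitrary units, and are not measurements of any actual system.

```python
def transfer_cost(total_pixels, pixels_per_transfer,
                  overhead_per_transfer, cost_per_pixel):
    """Estimate total cost of moving pixels in fixed-size transfers.

    All parameters are hypothetical and in arbitrary units. Each transfer
    pays a fixed overhead in addition to a per-pixel cost.
    """
    if pixels_per_transfer <= 0:
        raise ValueError("pixels_per_transfer must be positive")

    # Ceiling division: a partially filled final transfer still pays overhead.
    transfers = -(-total_pixels // pixels_per_transfer)
    return transfers * overhead_per_transfer + total_pixels * cost_per_pixel


if __name__ == "__main__":
    # Illustrative values only: 1024 pixels, fixed overhead of 10 units.
    print(transfer_cost(1024, 1, 10, 1))    # one pixel per transfer
    print(transfer_cost(1024, 256, 10, 1))  # 256 pixels per transfer
```

Under this model, the fixed overhead dominates the total whenever each transfer carries little data, which corresponds to the inefficiency noted for small primitives.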
Therefore, there exists a need for a method and apparatus for re-arranging graphics data in preparation for shading operations.