Graphics systems sometimes include a transposer to change the format of data. In particular, memory sometimes stores data in a representation different from the one required for certain data operations. As one example, memory may store a representation of vertex data in an array-of-structures format, i.e., vertex attributes X, Y, and Z stored on a vertex-by-vertex basis. However, certain data operations may be more conveniently handled by changing the data format so that all of the X attributes, Y attributes, and Z attributes are grouped by attribute rather than by vertex. If the vertex data is viewed as laid out in rows and columns, with one row per vertex, this operation is equivalent to interchanging rows and columns, which is a transposition operation. The transposition operation is useful, for example, in converting data from memory into a format suitable for performing primitive assembly and vertex processing. For example, consider vertices 1, 2, 3 . . . n, each with attributes X, Y, and Z. The data can then be represented in memory in a table format as:

        X1 Y1 Z1
        X2 Y2 Z2
        X3 Y3 Z3
        . . .
        Xn Yn Zn
        Example 1

where each vertex can have an arbitrary number, M, of attributes (here only 3 for clarity). A transposer may be used to convert the data into a format in which the data is transposed (the rows of the table become columns and the columns become rows). For instance, starting with the table in Example 1, a transposer may convert the source data into the table format:

        X1 X2 . . . Xn
        Y1 Y2 . . . Yn
        Z1 Z2 . . . Zn
        Example 2

where the rows and columns have been transposed.
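The conversion from Example 1 to Example 2 can be illustrated in software. The following is a minimal sketch only, not the transposer hardware described here: the function name `transpose` and the use of string labels in place of attribute values are assumptions made for illustration.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def transpose(rows: Sequence[Sequence[T]]) -> List[List[T]]:
    """Interchange rows and columns of a rectangular table.

    Input: one row per vertex, M attributes per row (Example 1 layout).
    Output: one row per attribute, one entry per vertex (Example 2 layout).
    """
    if not rows:
        return []
    width = len(rows[0])
    # Every vertex must carry the same number of attributes.
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same number of attributes")
    # Output row j collects attribute j from every vertex, in vertex order.
    return [[row[j] for row in rows] for j in range(width)]


# Hypothetical data: labels stand in for the attribute values of 3 vertices.
by_vertex = [
    ["X1", "Y1", "Z1"],
    ["X2", "Y2", "Z2"],
    ["X3", "Y3", "Z3"],
]
by_attribute = transpose(by_vertex)
# by_attribute == [["X1", "X2", "X3"], ["Y1", "Y2", "Y3"], ["Z1", "Z2", "Z3"]]
```

Applying `transpose` a second time to `by_attribute` returns the original per-vertex layout, which is a quick check that the operation is a transposition and not some other rearrangement.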
One problem with performing transposition in a graphics system is determining how to efficiently schedule read operations for the transposer. In particular, a transposer may be designed to read blocks of data on each pass of the transposer. If the data is fed into the transposer in an inefficient order, additional passes (and therefore additional clock cycles) will be required to perform the transposition. Conventionally, a greedy algorithm is used to schedule work for a transposer. However, the greedy algorithm can result in inefficient data transfer.
Therefore, in light of the above-described problem, the apparatus, system, and method of the present invention were developed.