Matrices are used in many computing processes, and a number of different operations may be applied to them. One of the most important is the transpose operation, in which the rows and columns of an input matrix are interchanged. In formal terms, if $A$ is any matrix whose $(i,j)$-entry is $a_{ij}$, the transpose of $A$ (denoted $A^T$) is the matrix whose $(i,j)$-entry is $a_{ji}$, that is, the matrix obtained by reflecting $A$ in its main diagonal. This new matrix may then be denoted $B$. Thus, if the original matrix $A$ has $M$ rows and $N$ columns (i.e., $A_{M \times N}$), the transpose operation results in $B_{N \times M}$.
For example:
$$
\begin{bmatrix} 0 & 2 & 4 \\ 1 & 3 & 5 \end{bmatrix}^T
=
\begin{bmatrix} 0 & 1 \\ 2 & 3 \\ 4 & 5 \end{bmatrix}
$$
The transpose operation is used frequently in remote-sensing data processing, seismic data processing, signal processing, and image processing, as well as for data rearrangement for more efficient computation in other transformations and applications such as Fast Fourier Transform (FFT). If the application requires large data matrices, in-place transpose must be used as there is often not enough memory to hold both a large matrix and its transpose.
The transpose operation may be performed either out-of-place or in-place. In out-of-place transpose, the input and output matrices reside in two distinct memory locations. The input matrix is not changed, and the result of the operation is placed in the output matrix, which must be at least as large as the input matrix. This can be done by simply copying every element of the input matrix to its transposed position in the output matrix. The processing is trivial but inefficient. In-place transpose is much more complicated. In in-place transpose, the input and output matrices occupy the same memory location. A workspace, normally much smaller than the input matrix, may be required as a cache to improve performance. While out-of-place transpose is faster than in-place transpose, it requires more available memory. Thus, the choice between out-of-place and in-place transpose in a particular system depends largely on which is more valuable in that system: time or space.
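The out-of-place copy described above can be sketched in a few lines. This is a minimal illustration, assuming the matrix is held in a flat Python list in column-major order (columns stored one after another); the function name `transpose_out_of_place` is hypothetical and not taken from the text.

```python
def transpose_out_of_place(src, m, n):
    """Return the transpose of an m x n column-major matrix as a new list.

    Assumption: element (i, j) of the input lives at index i + j * m.
    The input list is left unchanged.
    """
    dst = [None] * (m * n)
    for j in range(n):
        for i in range(m):
            # Element (i, j) of the input becomes element (j, i) of the
            # n x m output, which lives at index j + i * n.
            dst[j + i * n] = src[i + j * m]
    return dst


# The 2 x 3 matrix [[0, 2, 4], [1, 3, 5]] stored column by column.
a = [0, 1, 2, 3, 4, 5]
b = transpose_out_of_place(a, 2, 3)
print(b)  # [0, 2, 4, 1, 3, 5], i.e. [[0, 1], [2, 3], [4, 5]] column by column
```

The sketch allocates a second buffer of the same size as the input, which is exactly the memory cost that motivates the in-place approach.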
The in-place transpose operation for a square matrix, that is, a matrix in which the number of rows equals the number of columns, may be accomplished by swapping the elements about the main diagonal. It should be pointed out that this is a very inefficient approach. For a rectangular matrix, a method known as cyclic permutation may be used to perform the in-place transpose.
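For the square case, swapping about the main diagonal might look like the following sketch. It assumes an n×n matrix stored as a flat column-major list; the name `transpose_square_in_place` is illustrative only.

```python
def transpose_square_in_place(buf, n):
    """Transpose an n x n column-major matrix stored in `buf`, in place."""
    for j in range(n):
        for i in range(j + 1, n):
            # (i, j) is below the diagonal, (j, i) is its mirror above it.
            below = i + j * n
            above = j + i * n
            buf[below], buf[above] = buf[above], buf[below]
    return buf


m = list(range(9))            # [[0, 3, 6], [1, 4, 7], [2, 5, 8]] column by column
transpose_square_in_place(m, 3)
print(m)                      # [0, 3, 6, 1, 4, 7, 2, 5, 8]
```

No element ever needs more than a pairwise swap here, which is why the square case does not require the cycle-following machinery used for rectangular matrices.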
Consider a 3×4 matrix:
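$$
A = \begin{bmatrix} 0 & 3 & 6 & 9 \\ 1 & 4 & 7 & 10 \\ 2 & 5 & 8 & 11 \end{bmatrix}
$$

and its transpose:

$$
A^T = \begin{bmatrix} 0 & 1 & 2 \\ 3 & 4 & 5 \\ 6 & 7 & 8 \\ 9 & 10 & 11 \end{bmatrix}
$$

When stored in computer memory, the elements reside sequentially as:

$$
A_m = [\,0\ 1\ 2\ 3\ 4\ 5\ 6\ 7\ 8\ 9\ 10\ 11\,]
$$

and

$$
A^T_m = [\,0\ 3\ 6\ 9\ 1\ 4\ 7\ 10\ 2\ 5\ 8\ 11\,]
$$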
where m indicates the memory layout of the matrix. The value of each element also serves as the index of the element in the computer memory.
If one compares $A_m$ and $A^T_m$, it can be seen that the transpose may actually be performed by two cyclic permutations:

$$
(3 \leftarrow 9 \leftarrow 5 \leftarrow 4 \leftarrow 1 \leftarrow 3 \leftarrow)
\quad\text{and}\quad
(8 \leftarrow 2 \leftarrow 6 \leftarrow 7 \leftarrow 10 \leftarrow 8 \leftarrow)
$$

where the arrows indicate the direction of the element movement in order to achieve the transformation. These permutation cycles are independent of each other. The first and last elements are not moved. The permutation cycles may be kept in a vector called the permutation vector. The permutation vector for the example above would be:

$$
(3\ 9\ 5\ 4\ 1\ 0\ 8\ 2\ 6\ 7\ 10\ 0\ 0)
$$

where a single 0 represents the termination of a permutation cycle and a double 0 indicates the termination of the permutation vector. This implementation, however, requires checking for both the cycle terminator and the vector terminator for every element to be moved when the permutation is actually performed.
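As an illustration of how such a terminated permutation vector could be consumed, the sketch below walks the vector and moves elements along each cycle. The function name `apply_permutation_vector` is hypothetical. Using 0 as a terminator is safe because index 0 (the first element) never moves, so it never appears inside a cycle.

```python
def apply_permutation_vector(buf, vec):
    """Apply a 0-terminated permutation vector to `buf` in place.

    Within a cycle (c0 c1 ... ck), position c0 receives the element at c1,
    c1 receives the element at c2, and so on; ck receives the original c0.
    A single 0 ends a cycle; a 0 where a cycle would start ends the vector.
    """
    i = 0
    while vec[i] != 0:                 # vector terminator check
        saved = buf[vec[i]]            # element displaced by the first move
        while vec[i + 1] != 0:         # cycle terminator check
            buf[vec[i]] = buf[vec[i + 1]]
            i += 1
        buf[vec[i]] = saved            # close the cycle
        i += 2                         # step over the cycle terminator
    return buf


a_m = list(range(12))                  # memory image of the 3 x 4 example
v = [3, 9, 5, 4, 1, 0, 8, 2, 6, 7, 10, 0, 0]
apply_permutation_vector(a_m, v)
print(a_m)                             # [0, 3, 6, 9, 1, 4, 7, 10, 2, 5, 8, 11]
```

Both terminator tests sit on the path of every element move, which is the per-element overhead noted above.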
Representing the permutations as matrices is helpful in understanding the processing, even though at the computer level the matrices are represented as vectors.
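$$
\begin{aligned}
A &= \begin{bmatrix} 0 & 3 & 6 & 9 \\ 1 & 4 & 7 & 10 \\ 2 & 5 & 8 & 11 \end{bmatrix}
\overset{\text{First Permutation}}{\Longrightarrow}
\begin{bmatrix} 0 & \underline{9} & 6 & \underline{5} \\ \underline{3} & \underline{1} & 7 & 10 \\ 2 & \underline{4} & 8 & 11 \end{bmatrix} \\
&\overset{\text{Second Permutation}}{\Longrightarrow}
\begin{bmatrix} 0 & 9 & \underline{7} & 5 \\ 3 & 1 & \underline{10} & \underline{8} \\ \underline{6} & 4 & \underline{2} & 11 \end{bmatrix}
= \begin{bmatrix} 0 & 1 & 2 \\ 3 & 4 & 5 \\ 6 & 7 & 8 \\ 9 & 10 & 11 \end{bmatrix} \\
&= A^T
\end{aligned}
$$

where underlined elements are those moved during the permutations. The notation $V_{M,N}$ may be used for a permutation vector used to transpose an $M \times N$ matrix.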
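The text does not say how a permutation vector is derived, so the following is only one possible sketch. It assumes column-major storage, under which position q of the transposed layout must receive the element at index (q mod N)·M + ⌊q/N⌋ of the original layout, and follows that mapping until each cycle closes. The name `permutation_cycles` is hypothetical, and the cycles may come out rotated relative to those printed in the text (a rotated cycle describes the same movement).

```python
def permutation_cycles(m, n):
    """Cycles that transpose an m x n column-major matrix in place.

    Each cycle [c0, c1, ..., ck] means: position c0 receives the element
    at c1, c1 receives the element at c2, ..., ck receives the original c0.
    """
    size = m * n
    visited = [False] * size
    cycles = []
    # The first and last elements never move, so they are skipped.
    for start in range(1, size - 1):
        if visited[start]:
            continue
        cycle = []
        q = start
        while True:
            visited[q] = True
            q = (q % n) * m + q // n   # index whose element belongs at q
            cycle.append(q)
            if q == start:
                break
        if len(cycle) > 1:             # drop elements that stay in place
            cycles.append(cycle)
    return cycles


print(permutation_cycles(3, 4))  # [[3, 9, 5, 4, 1], [6, 7, 10, 8, 2]]
print(permutation_cycles(3, 2))  # [[3, 4, 2, 1]]
print(permutation_cycles(2, 2))  # [[2, 1]]
```

For the 3×4 case the second cycle is produced as (6 7 10 8 2), a rotation of (8 2 6 7 10). The `visited` list costs one flag per element; it is used here only to keep the sketch short.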
However, applying a permutation vector in the computing realm is not a trivial matter. Simply applying a permutation vector on a large matrix is very expensive. A much better method, a four-step method, can be used. This method is best described using the following example.
Consider a 6×4 matrix:
$$
A = \begin{bmatrix}
0 & 6 & 12 & 18 \\
1 & 7 & 13 & 19 \\
2 & 8 & 14 & 20 \\
3 & 9 & 15 & 21 \\
4 & 10 & 16 & 22 \\
5 & 11 & 17 & 23
\end{bmatrix}
$$
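This matrix can be partitioned into a 3×2 matrix of submatrices of size 2×2:

$$
A = \begin{bmatrix}
\begin{bmatrix} 0 & 6 \\ 1 & 7 \end{bmatrix} & \begin{bmatrix} 12 & 18 \\ 13 & 19 \end{bmatrix} \\
\begin{bmatrix} 2 & 8 \\ 3 & 9 \end{bmatrix} & \begin{bmatrix} 14 & 20 \\ 15 & 21 \end{bmatrix} \\
\begin{bmatrix} 4 & 10 \\ 5 & 11 \end{bmatrix} & \begin{bmatrix} 16 & 22 \\ 17 & 23 \end{bmatrix}
\end{bmatrix}
$$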
In step 1, the partitioned matrix A may be treated as two 3×2 submatrices of vectors of length 2, with 6 vectors per submatrix:
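$$
A = \begin{bmatrix}
\begin{bmatrix}
\langle 0\ 1 \rangle & \langle 6\ 7 \rangle \\
\langle 2\ 3 \rangle & \langle 8\ 9 \rangle \\
\langle 4\ 5 \rangle & \langle 10\ 11 \rangle
\end{bmatrix}
&
\begin{bmatrix}
\langle 12\ 13 \rangle & \langle 18\ 19 \rangle \\
\langle 14\ 15 \rangle & \langle 20\ 21 \rangle \\
\langle 16\ 17 \rangle & \langle 22\ 23 \rangle
\end{bmatrix}
\end{bmatrix}
$$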
Then each of the two submatrices is transposed by permutation using the permutation vector $V_{3,2} = (3\ 4\ 2\ 1)$ (the terminators are omitted in this example to improve readability) to get:
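$$
A \Rightarrow \begin{bmatrix}
\begin{bmatrix}
\langle 0\ 1 \rangle & \langle 8\ 9 \rangle \\
\langle 6\ 7 \rangle & \langle 4\ 5 \rangle \\
\langle 2\ 3 \rangle & \langle 10\ 11 \rangle
\end{bmatrix}
&
\begin{bmatrix}
\langle 12\ 13 \rangle & \langle 20\ 21 \rangle \\
\langle 18\ 19 \rangle & \langle 16\ 17 \rangle \\
\langle 14\ 15 \rangle & \langle 22\ 23 \rangle
\end{bmatrix}
\end{bmatrix}
= A_1
$$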
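Step 1 can be reproduced with a small sketch that applies a single cycle to fixed-width groups of elements rather than to individual elements. It assumes the 6×4 matrix is stored column-major as the list 0..23, so that each half of the buffer holds six length-2 vectors; the helper name `permute_blocks` and its parameters are illustrative, not from the text.

```python
def permute_blocks(buf, offset, cycle, width):
    """Apply one permutation cycle to blocks of `width` consecutive elements.

    Block k occupies buf[offset + k*width : offset + (k+1)*width].
    For a cycle [c0, c1, ..., ck], block c0 receives block c1, and so on;
    block ck receives the original block c0.
    """
    def block(k):
        return slice(offset + k * width, offset + (k + 1) * width)

    saved = buf[block(cycle[0])]               # copy of the first block
    for dst, src in zip(cycle, cycle[1:]):
        buf[block(dst)] = buf[block(src)]
    buf[block(cycle[-1])] = saved


a = list(range(24))                            # memory image of the 6 x 4 matrix
v_3_2 = [3, 4, 2, 1]                           # V(3,2) without terminators
for half in (0, 12):                           # the two 3 x 2 submatrices
    permute_blocks(a, half, v_3_2, 2)
print(a)
# [0, 1, 6, 7, 2, 3, 8, 9, 4, 5, 10, 11,
#  12, 13, 18, 19, 14, 15, 20, 21, 16, 17, 22, 23]
```

Because the unit of movement is a two-element vector, the cycle has only four entries per half instead of one entry per matrix element.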
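In step 2, $A_1$ is treated as a 1×6 matrix of submatrices of size 2×2:

$$
A_1 = \begin{bmatrix}
\begin{bmatrix} 0 & 6 \\ 1 & 7 \end{bmatrix} &
\begin{bmatrix} 2 & 8 \\ 3 & 9 \end{bmatrix} &
\begin{bmatrix} 4 & 10 \\ 5 & 11 \end{bmatrix} &
\begin{bmatrix} 12 & 18 \\ 13 & 19 \end{bmatrix} &
\begin{bmatrix} 14 & 20 \\ 15 & 21 \end{bmatrix} &
\begin{bmatrix} 16 & 22 \\ 17 & 23 \end{bmatrix}
\end{bmatrix}
$$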
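Permutations using the permutation vector $V_{2,2} = (2\ 1)$ may then be performed on each of the submatrices to get:

$$
A_1 \Rightarrow \begin{bmatrix}
\begin{bmatrix} 0 & 1 \\ 6 & 7 \end{bmatrix} &
\begin{bmatrix} 2 & 3 \\ 8 & 9 \end{bmatrix} &
\begin{bmatrix} 4 & 5 \\ 10 & 11 \end{bmatrix} &
\begin{bmatrix} 12 & 13 \\ 18 & 19 \end{bmatrix} &
\begin{bmatrix} 14 & 15 \\ 20 & 21 \end{bmatrix} &
\begin{bmatrix} 16 & 17 \\ 22 & 23 \end{bmatrix}
\end{bmatrix}
= A_2
$$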
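Since $V_{2,2} = (2\ 1)$ simply exchanges the second and third elements of a four-element block, step 2 reduces to one swap per 2×2 submatrix. The sketch below assumes the column-major memory image of $A_1$ (each submatrix occupying four consecutive elements); the function name `transpose_2x2_blocks` is hypothetical.

```python
def transpose_2x2_blocks(buf):
    """Transpose every consecutive 2 x 2 column-major block of `buf` in place."""
    for base in range(0, len(buf), 4):
        # Within a block [a, b, c, d] = [[a, c], [b, d]], swap b and c.
        buf[base + 1], buf[base + 2] = buf[base + 2], buf[base + 1]
    return buf


# Assumed memory image of A1, read column by column from each submatrix.
a1 = [0, 1, 6, 7, 2, 3, 8, 9, 4, 5, 10, 11,
      12, 13, 18, 19, 14, 15, 20, 21, 16, 17, 22, 23]
a2 = transpose_2x2_blocks(a1)
print(a2)
# [0, 6, 1, 7, 2, 8, 3, 9, 4, 10, 5, 11,
#  12, 18, 13, 19, 14, 20, 15, 21, 16, 22, 17, 23]
```

Each swap touches only four adjacent memory locations, so this step works on small, contiguous pieces of the buffer.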
In step 3, $A_2$ is treated as a 3×2 matrix of submatrices of size 2×2. A permutation using the permutation vector $V_{3,2} = (3\ 4\ 2\ 1)$ may then be performed on the matrix, moving each submatrix as a unit, to get:
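$$
A_2 \Rightarrow \begin{bmatrix}
\begin{bmatrix} 0 & 1 \\ 6 & 7 \end{bmatrix} &
\begin{bmatrix} 12 & 13 \\ 18 & 19 \end{bmatrix} &
\begin{bmatrix} 2 & 3 \\ 8 & 9 \end{bmatrix} &
\begin{bmatrix} 14 & 15 \\ 20 & 21 \end{bmatrix} &
\begin{bmatrix} 4 & 5 \\ 10 & 11 \end{bmatrix} &
\begin{bmatrix} 16 & 17 \\ 22 & 23 \end{bmatrix}
\end{bmatrix}
= A_3
$$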
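The block-level move of this step can be checked with a short sketch in which each 2×2 submatrix is treated as an indivisible group of four consecutive elements. The starting list is the assumed column-major memory image of $A_2$, and the helper name `permute_groups` is illustrative only.

```python
def permute_groups(buf, cycle, width):
    """Apply one permutation cycle to groups of `width` consecutive elements.

    Group k occupies buf[k*width : (k+1)*width]. For a cycle
    [c0, c1, ..., ck], group c0 receives group c1, and so on; group ck
    receives the original group c0.
    """
    def group(k):
        return slice(k * width, (k + 1) * width)

    saved = buf[group(cycle[0])]               # copy of the first group
    for dst, src in zip(cycle, cycle[1:]):
        buf[group(dst)] = buf[group(src)]
    buf[group(cycle[-1])] = saved
    return buf


# Assumed memory image of A2: six 2 x 2 submatrices of four elements each.
a2 = [0, 6, 1, 7, 2, 8, 3, 9, 4, 10, 5, 11,
      12, 18, 13, 19, 14, 20, 15, 21, 16, 22, 17, 23]
a3 = permute_groups(a2, [3, 4, 2, 1], 4)       # V(3,2) on whole submatrices
print(a3)
# [0, 6, 1, 7, 12, 18, 13, 19, 2, 8, 3, 9,
#  14, 20, 15, 21, 4, 10, 5, 11, 16, 22, 17, 23]
```

Read four elements at a time, the result lists the submatrices in the same left-to-right order as $A_3$.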
In step 4, A3 is treated as three 2×2 submatrices of vectors of length 2:
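$$
A_3 =
\left[
\begin{bmatrix}
\langle 0\ \ 6 \rangle & \langle 12\ \ 18 \rangle \\
\langle 1\ \ 7 \rangle & \langle 13\ \ 19 \rangle
\end{bmatrix}
\begin{bmatrix}
\langle 2\ \ 8 \rangle & \langle 14\ \ 20 \rangle \\
\langle 3\ \ 9 \rangle & \langle 15\ \ 21 \rangle
\end{bmatrix}
\begin{bmatrix}
\langle 4\ \ 10 \rangle & \langle 16\ \ 22 \rangle \\
\langle 5\ \ 11 \rangle & \langle 17\ \ 23 \rangle
\end{bmatrix}
\right]
$$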
Each of the three submatrices is transposed by permutation using the permutation vector V2,2 = (2 1) to finish the transpose processing:
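$$
A_3 \Rightarrow
\left[
\begin{bmatrix}
\langle 0\ \ 6 \rangle & \langle 1\ \ 7 \rangle \\
\langle 12\ \ 18 \rangle & \langle 13\ \ 19 \rangle
\end{bmatrix}
\begin{bmatrix}
\langle 2\ \ 8 \rangle & \langle 3\ \ 9 \rangle \\
\langle 14\ \ 20 \rangle & \langle 15\ \ 21 \rangle
\end{bmatrix}
\begin{bmatrix}
\langle 4\ \ 10 \rangle & \langle 5\ \ 11 \rangle \\
\langle 16\ \ 22 \rangle & \langle 17\ \ 23 \rangle
\end{bmatrix}
\right]
=
\begin{bmatrix}
0 & 1 & 2 & 3 & 4 & 5 \\
6 & 7 & 8 & 9 & 10 & 11 \\
12 & 13 & 14 & 15 & 16 & 17 \\
18 & 19 & 20 & 21 & 22 & 23
\end{bmatrix}
= A^T
$$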
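Step 4 can be sketched in code. The following is a minimal illustration, not the method's actual implementation: it assumes the data is held in a flat buffer in column-major order (which is what the vectors 〈0 6〉, 〈1 7〉, and so on suggest), and the function name `transpose_blocks_of_vectors` and its parameters are hypothetical. Each 2×2 submatrix of length-2 vectors is transposed by exchanging its two off-diagonal vectors.

```python
def transpose_blocks_of_vectors(buf, num_sub, n, vec_len):
    """Transpose, in place, each of num_sub consecutive n-by-n submatrices
    whose elements are vectors of vec_len scalars.

    Assumption: submatrices and vectors are stored column-major in buf.
    """
    sub_size = n * n * vec_len
    for s in range(num_sub):
        base = s * sub_size
        for i in range(n):
            for j in range(i + 1, n):
                # Offsets of vector (i, j) and vector (j, i) in column-major order.
                a = base + (j * n + i) * vec_len
                b = base + (i * n + j) * vec_len
                # Exchange the two vectors scalar by scalar.
                for k in range(vec_len):
                    buf[a + k], buf[b + k] = buf[b + k], buf[a + k]
    return buf


# A3 flattened in the assumed column-major order, before step 4.
a3 = [0, 6, 1, 7, 12, 18, 13, 19,
      2, 8, 3, 9, 14, 20, 15, 21,
      4, 10, 5, 11, 16, 22, 17, 23]

result = transpose_blocks_of_vectors(list(a3), num_sub=3, n=2, vec_len=2)

# Column-major storage of the 4x6 matrix A^T shown above.
expected = [r * 6 + c for c in range(6) for r in range(4)]
```

Under these assumptions, `result` equals `expected`, that is, the buffer holds the 4×6 matrix AT column by column.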
In this four-step method, at least two permutation vectors have to be computed, and permutation processing using these two vectors has to be performed on the matrix four times. A special case also exists in which a simplified transpose method may be applied: when the matrix is square. A square matrix AN×N is partitioned into an n×n matrix of square submatrices Aij, each of size p×p, where N=n*p. That is:
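$$
A =
\begin{bmatrix}
A_{00} & A_{01} & \cdots & A_{0,n-1} \\
A_{10} & A_{11} & \cdots & A_{1,n-1} \\
\vdots & \vdots & \ddots & \vdots \\
A_{n-1,0} & A_{n-1,1} & \cdots & A_{n-1,n-1}
\end{bmatrix}
$$
then simply:
$$
A^T =
\begin{bmatrix}
A_{00}^T & A_{10}^T & \cdots & A_{n-1,0}^T \\
A_{01}^T & A_{11}^T & \cdots & A_{n-1,1}^T \\
\vdots & \vdots & \ddots & \vdots \\
A_{0,n-1}^T & A_{1,n-1}^T & \cdots & A_{n-1,n-1}^T
\end{bmatrix}
$$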
The transpose is accomplished by transposing each Aij and then swapping Aij and Aji. This processing may be accomplished by copying column-wise Aij and Aji into a cache/workspace, respectively, and then reading row-wise from the cache/workspace and storing column-wise into their final destination.
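The square-matrix case can be illustrated with a short sketch. It assumes column-major storage in a flat list and uses two p×p workspaces; the name `blocked_square_transpose` and its parameters are hypothetical and chosen only for this example.

```python
def blocked_square_transpose(a, N, p):
    """In-place transpose of an N-by-N matrix stored column-major in the
    flat list a, processed as an n-by-n grid of p-by-p blocks (N = n * p).

    Assumption: element (row, col) lives at index col * N + row.
    """
    assert N % p == 0, "block size p must divide N"
    n = N // p
    for bi in range(n):
        for bj in range(bi, n):
            # Copy blocks A_ij and A_ji column-wise into two workspaces.
            w1 = [a[(bj * p + c) * N + bi * p + r]
                  for c in range(p) for r in range(p)]
            w2 = [a[(bi * p + c) * N + bj * p + r]
                  for c in range(p) for r in range(p)]
            # Read row-wise from each workspace and store column-wise
            # into the position of the opposite block.
            for c in range(p):
                for r in range(p):
                    a[(bi * p + c) * N + bj * p + r] = w1[r * p + c]  # A_ji <- A_ij^T
                    a[(bj * p + c) * N + bi * p + r] = w2[r * p + c]  # A_ij <- A_ji^T
    return a


# Usage: a 4x4 matrix split into 2x2 blocks.
N = 4
original = list(range(N * N))
transposed = blocked_square_transpose(list(original), N, p=2)
```

For diagonal blocks (i = j) the two workspaces hold the same data, so the block is simply transposed onto itself. No permutation vector is needed in this case, which is what makes the square case simpler than the four-step method.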
Computing the permutation vector is a task accomplished via serial processing, and may take a significant amount of time depending on the size and shape of the matrix and the block size used to partition the matrix. As parallel computing grows in popularity, this delay becomes even more significant. The time spent in computing the permutation vectors may be longer than that spent on moving the elements. Thus, reducing the number of permutation vectors required to perform the transpose would make the operation much more efficient.
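To make the serial nature of this computation concrete, the sketch below derives the cycles of the transposition permutation by following each cycle element by element. It is an illustration under assumptions, not the document's own definition of a permutation vector: it assumes column-major storage of an M×N matrix and uses the well-known index mapping in which the element at index k moves to index (k·N) mod (M·N − 1), with the first and last elements fixed. The function names are hypothetical.

```python
def transposition_cycles(M, N):
    """Return the non-trivial cycles of the in-place transposition
    permutation for an M-by-N column-major matrix.

    Assumption: the element at flat index k moves to (k * N) % (M * N - 1);
    indices 0 and M * N - 1 never move.
    """
    size = M * N
    visited = [False] * size
    cycles = []
    for start in range(1, size - 1):
        if visited[start]:
            continue
        cycle = []
        k = start
        # Each step depends on the previous one, so a cycle is followed serially.
        while not visited[k]:
            visited[k] = True
            cycle.append(k)
            k = (k * N) % (size - 1)
        if len(cycle) > 1:
            cycles.append(cycle)
    return cycles


def apply_cycles(buf, cycles):
    """Move the element at cycle[i] to cycle[i + 1], wrapping around."""
    for cycle in cycles:
        carry = buf[cycle[-1]]
        for idx in cycle:
            buf[idx], carry = carry, buf[idx]
    return buf


# A 2x2 matrix yields one cycle exchanging positions 1 and 2.
small = transposition_cycles(2, 2)

# A 6x4 matrix stored column-major, transposed to 4x6 in place.
out = apply_cycles(list(range(24)), transposition_cycles(6, 4))
```

The 2×2 result is a single exchange of positions 1 and 2, which is in the spirit of V2,2 = (2 1) above, although whether the document's vectors use exactly this indexing is an assumption. The loop that follows a cycle cannot start step k+1 before step k finishes, which illustrates why this part is hard to parallelize.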
What is needed is a solution which reduces the number of permutation vectors required to perform a transpose of a matrix to allow for better parallel processing of transpose operations.