Matrices are used in many applications and fields to solve a wide variety of problems, including fast Fourier transforms (FFT), deep learning, and linear algebra. One of the fundamental operations for manipulating matrices is the transpose. This operation transforms an M×N array into an N×M array by reflecting the elements across the main diagonal, so that the element at row i, column j moves to row j, column i. Performing the transpose can be computationally expensive and can require significant hardware resources.
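As a concrete illustration of the index swap that defines the transpose, the following is a minimal software sketch. The function name `transpose` and the list-of-rows representation are assumptions made for this example only; they do not describe any particular hardware or library implementation.

```python
def transpose(matrix):
    """Return the N x M transpose of an M x N matrix given as a list of rows.

    Illustrative sketch only: assumes a rectangular list of lists.
    """
    if not matrix:
        return []
    rows, cols = len(matrix), len(matrix[0])
    # Element (i, j) of the input becomes element (j, i) of the output.
    return [[matrix[i][j] for i in range(rows)] for j in range(cols)]


a = [[1, 2, 3],
     [4, 5, 6]]          # 2 x 3 input
print(transpose(a))      # [[1, 4], [2, 5], [3, 6]], a 3 x 2 result
```

Even in this small form, every one of the M·N elements is read and written once, and the reads walk down columns while the writes fill rows, which hints at why data movement tends to dominate the cost of the operation.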