A wide range of applications, such as computer graphics, medical imaging, and telecommunications, can utilize signal processing techniques. Signal processing techniques may involve high-speed mathematical operations performed in real time (i.e., the signal may be a continuous function of time that is sampled, digitized, and analyzed in real time for monitoring or control purposes). Some signal processing operations, such as a discrete cosine transform (DCT) and an inverse DCT (IDCT), repeatedly transpose matrices. Transposition operations may also be used in other areas, such as linear algebra, spectral methods for partial differential equations, quadratic programming, and the like.
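To illustrate why a DCT may transpose matrices repeatedly, the following is a minimal sketch of one common way to compute a two-dimensional DCT: the row-column decomposition, in which a one-dimensional DCT is applied to each row, the intermediate result is transposed, the one-dimensional DCT is applied again, and the result is transposed back. The function names (`dct_1d`, `transpose`, `dct_2d`) and the choice of an unnormalized DCT-II are assumptions made for illustration and are not taken from the text above.

```python
import math


def dct_1d(x):
    """Unnormalized DCT-II of a sequence (illustrative, O(n^2))."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def dct_2d(block):
    """2-D DCT by row-column decomposition; note the two transposes."""
    rows = [dct_1d(r) for r in block]   # 1-D DCT along each row
    cols = transpose(rows)              # first transpose: columns become rows
    cols = [dct_1d(c) for c in cols]    # 1-D DCT along each original column
    return transpose(cols)              # second transpose restores orientation


if __name__ == "__main__":
    block = [[1.0] * 4 for _ in range(4)]   # constant 4x4 block
    coeffs = dct_2d(block)
    # For a constant block, only the coefficient at [0][0] is non-zero.
    print(round(coeffs[0][0], 6))
```

In this sketch each two-dimensional transform performs two transpositions, so a pipeline that processes many blocks, or that applies a DCT followed by an IDCT, may transpose matrices many times.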
Some approaches to transposing a matrix may consume many clock cycles, for example by repeatedly reading and writing a RAM-based cache memory. RAM-based approaches can lead to high latency and can incur a high cost in terms of power consumption. Furthermore, RAM-based approaches may present architectural challenges, for example in reducing processing pipeline bubbles.