Transpose memories provide an efficient, specifically adapted processing component for transposing N×N matrices. Transposition of N×N matrices is used, for instance, in two-dimensional transform processes such as the Discrete Cosine Transform for image and video compression.
Transposition of matrices may be implemented on the basis of conventional single port memory. However, when using conventional single port memory, all elements of the matrix to be transposed have to be written into the single port memory before the transposed matrix can be read out of it. This means that the transpose operation requires N×N dead cycles, assuming that one matrix element is written per cycle.
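By way of illustration only, the single port behavior described above can be modeled with a short cycle-counting sketch. The function name `transpose_single_port`, the flat row-major word layout, and the convention of counting every write cycle as a dead cycle are assumptions made for this sketch; they are not taken from any particular implementation.

```python
def transpose_single_port(matrix):
    """Model a single port buffer: one access (read OR write) per cycle.

    Hypothetical model: an N*N word store, written row-major and then
    read column-major. Returns (transposed, total_cycles, dead_cycles).
    """
    n = len(matrix)
    memory = [None] * (n * n)  # assumed flat layout, one element per word
    cycles = 0
    dead_cycles = 0

    # Write phase: one element per cycle. The single port is busy writing,
    # so no transposed output can be produced during these cycles.
    for row in range(n):
        for col in range(n):
            memory[row * n + col] = matrix[row][col]
            cycles += 1
            dead_cycles += 1

    # Read phase: column-major addressing yields the transposed matrix.
    transposed = [[None] * n for _ in range(n)]
    for col in range(n):
        for row in range(n):
            transposed[col][row] = memory[row * n + col]
            cycles += 1

    return transposed, cycles, dead_cycles
```

Under these assumptions, a 2×2 matrix takes 8 cycles in total, of which 4 (that is, N×N) produce no output.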
In another approach, two conventional single port memories may be used, each buffering a different matrix to be transposed. The data elements of a first matrix may be written into one of the memories, after which the data elements of the first matrix can be read out in transposed form from that memory while the data elements of a next matrix are written to the other memory. During the next write/read operation the roles of the memories are swapped. Such a memory configuration is also designated as ping-pong memory. Conventional single ported random access memories can be utilized for the aforementioned ping-pong memory configuration. Dead cycles are avoided, but at the cost of a doubled memory capacity requirement.
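As a rough illustration of the ping-pong configuration, the following sketch models two banks that swap roles after every matrix. The function name `transpose_ping_pong`, the flat row-major layout of each bank, and the cycle accounting are assumptions of this sketch rather than a description of a specific device.

```python
def transpose_ping_pong(matrices):
    """Model two single port banks that alternate write and read roles.

    Hypothetical model: in each cycle one bank accepts one new element
    while the other bank delivers one element of the previous matrix in
    transposed order. Returns (outputs, total_cycles, memory_words).
    """
    n = len(matrices[0])
    banks = [[None] * (n * n), [None] * (n * n)]
    outputs = []
    cycles = 0
    write_bank = 0

    # One extra pass is needed to drain the last matrix from its bank.
    for index in range(len(matrices) + 1):
        incoming = matrices[index] if index < len(matrices) else None
        read_bank = 1 - write_bank
        result = [[None] * n for _ in range(n)] if index > 0 else None

        for step in range(n * n):
            if result is not None:
                # Column-major read of the previously filled bank.
                col, row = divmod(step, n)
                result[col][row] = banks[read_bank][row * n + col]
            if incoming is not None:
                # Row-major write of the next matrix into the other bank.
                row, col = divmod(step, n)
                banks[write_bank][row * n + col] = incoming[row][col]
            cycles += 1

        if result is not None:
            outputs.append(result)
        write_bank = read_bank  # swap roles for the next matrix

    return outputs, cycles, 2 * n * n
```

In this model, K matrices take (K+1)×N×N cycles, so once the first bank is filled an input element is accepted on every cycle, while the storage requirement is 2×N×N words instead of N×N.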
Hence, there is a need for an improved transpose memory which operates with a minimized memory capacity requirement and avoids undesired dead cycles, so as to ensure maximized throughput.