Signal processing systems, such as those for video, audio, and graphics, use interface paths to transmit data from one or more media sources, a high-capacity storage medium, or both, to a signal processing subsystem. The data received in the signal processing subsystem is typically stored locally in a number of different patterns and is accessed from this local storage for algorithmic processing. These data patterns may not be in the best order for efficient algorithmic processing. In addition, when the data is processed by a series of algorithms, each stage may produce results in a pattern that is not an efficient order for the next stage. As a result, the processing system can spend a considerable amount of time reordering data to fit the algorithms that are used. This inefficiency reduces performance and increases power consumption.
Many signal analysis techniques use matrix and data sorting operations and could benefit from data swapping or exchange operations. In a processor, a swap operation can be specified to read the contents of two registers and then write each value to the other register's address. For efficient programming with register files or local memories, it can be advantageous to also provide the ability to swap the contents of groups of locations. Swapping a block of data, transposing a matrix stored in either registers or memory, and implementing permutations on a set of registers are all examples of algorithmic capabilities that are desirable to support efficiently.
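The operations named above can be illustrated with a minimal software sketch. It assumes a register file modeled as a flat Python list, and the function names (`swap`, `swap_blocks`, `transpose_in_place`, `permute`) are hypothetical, chosen only for illustration. It shows the intended data movement, not any particular hardware implementation.

```python
def swap(regs, a, b):
    """Read both registers, then write each value to the other's address."""
    va, vb = regs[a], regs[b]      # read phase
    regs[a], regs[b] = vb, va      # write phase


def swap_blocks(regs, a, b, count):
    """Swap `count` consecutive registers starting at a with those starting at b."""
    if abs(a - b) < count:
        raise ValueError("blocks overlap")
    for i in range(count):
        swap(regs, a + i, b + i)


def transpose_in_place(regs, base, n):
    """Transpose an n x n row-major matrix held in regs[base : base + n*n]."""
    for r in range(n):
        for c in range(r + 1, n):
            # Exchange each element above the diagonal with its mirror image.
            swap(regs, base + r * n + c, base + c * n + r)


def permute(regs, base, perm):
    """Register base+i receives the old value of register base+perm[i]."""
    values = [regs[base + p] for p in perm]   # read all sources first
    for i, v in enumerate(values):            # then write all destinations
        regs[base + i] = v


if __name__ == "__main__":
    regs = list(range(9))          # 3 x 3 matrix: rows [0,1,2], [3,4,5], [6,7,8]
    transpose_in_place(regs, 0, 3)
    print(regs)                    # [0, 3, 6, 1, 4, 7, 2, 5, 8]
```

Each routine in this sketch separates its reads from its writes, which mirrors the read-then-write description of the swap operation. The transpose is built entirely from pairwise swaps, one way a block operation could be composed from the basic primitive.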