There is an increasing demand for microprocessor architectures adapted to meet the requirements of various multimedia processing tasks and algorithms. The pursuit of higher performance, however, must be reconciled with the need to limit power consumption and code size growth.
Vectorial and/or SIMD (Single Instruction, Multiple Data) architectures are thus used in applications with massive data parallelism, while VLIW (Very Long Instruction Word) architectures are well suited to applications with high instruction parallelism.
The multi-dimensional microprocessor described in U.S. published patent application no. 2005/0283587 is exemplary of a microprocessor with SIMD/vectorial capabilities based on a VLIW machine. As mentioned in that application, an example architecture for digital media processing was introduced by Intel with its MXP5800/MXP5400 processor architecture. A multi-dimensional microprocessor architecture improves significantly over this more conventional architecture. For instance, processors in the MXP5800/MXP5400 architecture require an external PCI-based host processor for downloading microcode, register configuration, register initialization, and interrupt servicing. Conversely, in a multi-dimensional microprocessor architecture these tasks are allotted to one computational unit for each column.
Moreover, compared with a multi-dimensional microprocessor, the basic computational block in the MXP5800/MXP5400 processors is inevitably more complex. It includes five Processing Elements (PEs), each of which has its own registers and its own instruction memory. This entails a significant area and high power consumption, particularly because no power management unit is used to power down inactive PEs.
While prior art arrangements as disclosed in U.S. published patent application no. 2005/0283587 are satisfactory for a number of applications, different types of data and instruction processing may co-exist within the same application. The processor core should therefore support them dynamically by adapting its behavior at run time, i.e., while the algorithm is running.