A Very Long Instruction Word (VLIW) processor exploits instruction-level parallelism in programs and can thus execute more than one operation at a time. A single VLIW instruction specifies multiple independent operations. A VLIW processor uses a set of independent functional units to execute these operations in parallel.
Limitations of VLIW processing include limited hardware resources, limited parallelism and a large increase in code size. The limited hardware resources may be the functional units, the central register file or the communication network. Addressing these limitations by adding more resources has some serious drawbacks. When the number of functional units is increased, the memory size and the register file bandwidth have to increase as well. Furthermore, a large number of read and write ports are necessary for accessing the register file, imposing a bandwidth that is difficult to support without a large cost in the size of the register file and a degradation in clock speed. Increasing the size of the register file may create critical timing paths and therefore limit the cycle time of the processor. Moreover, as the number of directly addressable registers increases, the number of bits needed to specify the registers within the instructions increases as well.
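The scaling described above can be illustrated with a small back-of-the-envelope calculation. The following Python sketch is only an illustration under stated assumptions: each operation is assumed to read two source registers and write one destination register, every issue slot is assumed to need its own ports on the central register file, and the function names are hypothetical rather than taken from any particular processor.

```python
def register_specifier_bits(num_registers, operands_per_op, issue_slots):
    """Bits per VLIW instruction spent on naming registers (assumed encoding)."""
    # Bits needed to address one of num_registers registers: ceil(log2(n)).
    bits_per_operand = (num_registers - 1).bit_length()
    return bits_per_operand * operands_per_op * issue_slots


def central_register_file_ports(issue_slots, reads_per_op=2, writes_per_op=1):
    """Read and write ports needed if every slot accesses one central file."""
    return issue_slots * reads_per_op, issue_slots * writes_per_op


# Going from 32 to 128 registers raises each register field from 5 to 7 bits.
small = register_specifier_bits(num_registers=32, operands_per_op=3, issue_slots=4)
large = register_specifier_bits(num_registers=128, operands_per_op=3, issue_slots=4)

# Doubling the issue slots doubles the ports on the central register file.
ports_4 = central_register_file_ports(issue_slots=4)
ports_8 = central_register_file_ports(issue_slots=8)
```

Under these assumptions, the register fields of a four-slot instruction grow from 60 to 84 bits when the register file grows from 32 to 128 entries, and the port count of the central register file grows linearly with the number of issue slots.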
The scalability of a VLIW processor can be improved by using several register files, i.e. a distributed register file, instead of a central register file. An advantage of a distributed register file is that it requires fewer read and write ports per register file segment, resulting in a smaller register file bandwidth. The functional units and the distributed register file are coupled by a communication network, which allows data produced by the functional units to be passed to the distributed register file. Usually, this communication network is partially connected, i.e. not every functional unit is coupled to every register file segment, because a fully connected communication network is too expensive in terms of code size and power consumption, and also results in a decrease of the clock frequency.
In the case of a VLIW processor with a distributed register file and a partially connected communication network, it cannot be guaranteed that a communication path exists from every functional unit output to every functional unit input. Therefore, it may turn out that some applications cannot be run on such a VLIW processor.
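The missing-path problem can be made concrete with a small model. The sketch below assumes that a communication path from one functional unit to another exists when at least one register file segment can be written by the producer and read by the consumer; this definition, the function name missing_paths and the unit and segment names are all hypothetical and serve only as an illustration.

```python
def missing_paths(writes_to, reads_from):
    """List (producer, consumer) pairs that share no register file segment.

    writes_to:  maps each functional unit to the segments its output can write.
    reads_from: maps each functional unit to the segments its inputs can read.
    """
    missing = []
    for producer, written in writes_to.items():
        for consumer, readable in reads_from.items():
            # A path exists only if some segment is both written and readable.
            if not (written & readable):
                missing.append((producer, consumer))
    return missing


# Hypothetical partially connected network: three units, two segments.
writes_to = {"alu0": {"rf0"}, "alu1": {"rf1"}, "mul0": {"rf0", "rf1"}}
reads_from = {"alu0": {"rf0"}, "alu1": {"rf1"}, "mul0": {"rf1"}}

unreachable = missing_paths(writes_to, reads_from)
```

In this example network, a result produced by alu0 cannot reach alu1 or mul0, and a result produced by alu1 cannot reach alu0, so a program that requires one of these transfers could not be scheduled without some additional mechanism.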