Very long instruction word (VLIW) processor datapaths have traditionally depended on a large centralized register file to store variables for all functional units of the machine. This approach does not scale as the issue width of the machine grows. The Texas Instruments C6000 family of digital signal processors tackled this problem by partitioning the datapath into two clusters. Inter-cluster communication was provided by a cross-cluster path that was used implicitly whenever a functional unit in one cluster read an operand from, or wrote an operand to, the register file of the other cluster. This does not scale well to the larger number of clusters needed to achieve even wider issue widths. Transferring data between a large number of clusters of functional units is limited by the interconnect delays in deep submicron silicon processes. Implicit operand transfer combined with short-latency functional unit operation impedes performance for a large number of functional unit clusters. Thus there is a problem in the art concerning inter-cluster communication and register file structure in wide-issue VLIW processors.