In the central processing unit of any data processing apparatus, a critical speed path involves reading a register file to get data, operating on the data, and writing the results back to the register file. The register file read and write delays limit the speed of the processor. Register file bypass addresses this problem by providing a second route for the data used by the functional units. The result data from a functional unit is routed to the register file and also directly to a functional unit operand input if that result is required in the immediately following central processing unit cycle.
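The bypass route described above can be illustrated with a minimal behavioral sketch. It is not taken from any particular design: the names `Forward` and `read_operand`, the eight-entry register file, and the single in-flight result are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Forward:
    """Hypothetical record of a result produced in the previous cycle
    that has not yet been written into the register file."""
    dest: int   # destination register number
    value: int  # result data


def read_operand(regfile: List[int], src: int, fwd: Optional[Forward]) -> int:
    """Return the operand for register `src`, preferring the bypassed result."""
    if fwd is not None and fwd.dest == src:
        return fwd.value      # bypass route: functional unit output to operand input
    return regfile[src]       # normal route: register file read


# Assumed 8-entry register file with two initialized registers.
regfile = [0] * 8
regfile[1] = 5
regfile[2] = 7

# Cycle 1: r3 = r1 + r2. The result is still in flight.
fwd = Forward(dest=3, value=regfile[1] + regfile[2])

# Cycle 2: r4 = r3 + r1. r3 comes from the bypass, r1 from the register file.
a = read_operand(regfile, 3, fwd)
b = read_operand(regfile, 1, fwd)
regfile[fwd.dest] = fwd.value  # the cycle 1 result is also written back
result = a + b
```

In this sketch the second instruction obtains the value of r3 without waiting for the register file write and subsequent read, which is the delay that bypassing is meant to hide.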
Register file bypass solves this speed problem but introduces other problems. One new problem created by register file bypassing is detecting when the bypass should be triggered. In a very long instruction word (VLIW) data processor this detection requires on the order of n² circuits, where n is the number of ports of the register file. This detection logic must provide a path from any register file port to any register file port. This adds a new level of complexity and cost. In a VLIW central processing unit with four 2-input functional units, a total of 4×2 bypass networks are needed. Generally about 40% to 50% of all result data have a register lifetime of a single cycle. Thus nearly half of the time a value written to a register file is read only once, in the immediately following central processing unit cycle. Consequently, much of the detection and forwarding logic required by register file bypassing is wasted. In addition, the detection and forwarding logic presents a speed path to the predication feature, that is, the ability to abort an instruction. Thus known register file bypass techniques are costly in terms of integrated circuit area, power use and operation. Most prior art designs either use register file bypass or simply rely on circuit design techniques to minimize the problems.
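The growth of the detection logic can be made concrete with a small sketch. It assumes a simplified model in which every operand input compares its source register number against the destination register of every functional unit result from the previous cycle; the function name `detect_bypass` and the register numbers are hypothetical, and reading the 4×2 figure as one bypass network per operand input is an interpretation rather than something the text states.

```python
from typing import List, Optional, Tuple


def detect_bypass(
    src_regs: List[Tuple[int, ...]],
    dest_regs: List[Optional[int]],
) -> Tuple[List[List[Optional[int]]], int]:
    """Model of bypass detection under the assumptions stated above.

    src_regs[u][i] is the register read by operand input i of unit u this cycle.
    dest_regs[w] is the register written by unit w in the previous cycle,
    or None if that unit produced no result.
    Returns the bypass selection per operand input and the comparison count.
    """
    comparisons = 0
    select: List[List[Optional[int]]] = []
    for unit_srcs in src_regs:
        row: List[Optional[int]] = []
        for src in unit_srcs:
            hit = None
            for w, dest in enumerate(dest_regs):
                comparisons += 1          # one comparator per (input, result) pair
                if dest is not None and dest == src:
                    hit = w               # forward from unit w instead of the register file
            row.append(hit)
        select.append(row)
    return select, comparisons


# Four 2-input functional units: 4 x 2 = 8 operand inputs, 4 result ports.
src_regs = [(3, 1), (4, 5), (6, 3), (0, 2)]
dest_regs = [3, None, 6, None]
select, comparisons = detect_bypass(src_regs, dest_regs)
```

Under this model the eight operand inputs and four result ports need 8 × 4 = 32 comparisons per cycle, and the count grows with the product of read and write ports, which is consistent with the order n² estimate.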