In data processing applications, a significant class of computations is described by nested loops. A nested loop includes an inner loop, which performs multiple iterations of a computation, and an outer loop, which performs occasional control operations between sets of iterations. These control operations include, for example, addressing adjustments or the extraction of partial answers.
In particular, some nested loops take the form:

    for (outer_count iterations)
        outer-pre
        for (inner_count iterations)
            inner
        outer-post

where ‘inner’ denotes the group of instructions in the inner loop, ‘outer-pre’ denotes a group of instructions preceding the inner loop and ‘outer-post’ denotes a group of instructions performed after the inner loop. The ‘outer-pre’ and ‘outer-post’ groups are allowed to be empty.
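As an illustration only, the nested-loop form above can be sketched with a row-major matrix-vector product. The function name `matvec_nested` and its parameters are hypothetical and chosen for this example; they do not come from the source. Here the 'outer-pre' group is assumed to perform an addressing adjustment and an accumulator reset, the 'inner' group is a multiply-accumulate, and the 'outer-post' group extracts a partial answer.

```python
def matvec_nested(matrix_flat, vector, rows, cols):
    """Row-major matrix-vector product in outer-pre / inner / outer-post form.

    Hypothetical example: names and the choice of computation are assumptions.
    """
    results = []
    for row in range(rows):          # outer loop: outer_count = rows
        # outer-pre: addressing adjustment and accumulator reset
        base = row * cols
        acc = 0
        for col in range(cols):      # inner loop: inner_count = cols
            # inner: the repeated computation (multiply-accumulate)
            acc += matrix_flat[base + col] * vector[col]
        # outer-post: extraction of the partial answer for this row
        results.append(acc)
    return results


if __name__ == "__main__":
    # 2x3 matrix [[1, 2, 3], [4, 5, 6]] times vector [1, 0, 2]
    print(matvec_nested([1, 2, 3, 4, 5, 6], [1, 0, 2], rows=2, cols=3))  # [7, 16]
```

In this sketch only the multiply-accumulate runs on every inner iteration, while the control operations run once per outer iteration. Either control group could be empty, for example if the accumulator were never reset.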
The inner loop may be executed on a hardware accelerator such as a programmable, very long instruction word (VLIW) computer. Such computers use software pipelining to introduce parallelism into the computation of software loops. VLIW computers allow pipelined implementations of various loop constructs to operate with high throughput. An example of such a computer is the Reconfigurable Streaming Vector Processor (RSVP), which is a statically scheduled VLIW computer that executes dataflow graphs on vector data (data streams) in a highly pipelined fashion.