1. Field of Invention
The invention relates generally to a method of, and apparatus for, stream scheduling in pipelined hardware. More particularly, the invention relates to a method of, and apparatus for, generating a hardware design for a pipelined parallel stream processor.
2. Background of Technology
Computer systems are often used to implement computational models of a particular physical system, region or event. Commonly, such computational models require iterative numerical calculations to be solved for a large number of data areas or data points. This requires an extremely large number of calculations to be performed, consuming large amounts of computational resources and requiring a significant time period to complete the necessary calculations.
A processor such as a central processing unit (CPU) is found in most computing systems. However, whilst a CPU is able to process such calculations, the time period required may be prohibitive unless powerful computing systems are used.
Traditionally, the performance of a computing system has been increased by increasing the operating frequency of the CPU (i.e. by increasing the number of operations the CPU can carry out per second) and by reducing the size of the individual transistors on the CPU so that more transistors can be accommodated per unit area. However, due to power constraints, increasing the CPU frequency may in future deliver only modest performance improvements. Further, it is becoming increasingly difficult to reduce the size of transistors due to the limitations of lithographic processes and material capabilities.
An alternative approach to increasing the speed of a computer system for specialist computing applications is to use additional or specialist hardware accelerators. These hardware accelerators increase the computing power available and concomitantly reduce the time required to perform the calculations. In certain cases, a specialist hardware accelerator may increase the performance of highly parallel applications by an order of magnitude or more.
One example of such a system is a stream processing accelerator having a dedicated local memory. The accelerator may be located on an add-in card which is connected to the computer via a bus such as Peripheral Component Interconnect Express (PCI-E). The bulk of the numerical calculations can then be handled by the specialised accelerator.
A useful type of stream processor accelerator can be implemented using Field-Programmable Gate Arrays (FPGAs). FPGAs are reprogrammable hardware chips which can implement digital logic. FPGAs comprise no intrinsic functionality and, instead, include a plurality of gates, flip-flops and memory elements which are configurable through use of appropriate software elements.