Many new applications being planned for mobile devices (multimedia, graphics, image compression/decompression, etc.) involve a high percentage of streaming vector computations. In vector processing, it is common for a set of operations to be repeated for each element of a vector or other data structure. This set of operations is often described by a data-flow graph. For example, a data-flow graph may be used to describe all of the operations to be performed on elements of the data structure for a single iteration of a program loop. It may be necessary to execute these operations a number of times during the processing of an entire stream of data (as in audio or video processing, for example). Computing machines that do this processing would benefit from a representation of the data-flow graph that can be executed directly.
It would also be beneficial if the representation were expressive enough for execution on a range of computing machines with different parallel processing capabilities. Consequently, the representation must be both a series of computations for linear execution on a sequential computing machine and also a list of operational dependencies within and between iterations for concurrent execution on a parallel computing machine.
In a conventional (von Neumann) computer, a program counter (PC) is used to sequence the instructions in a program. Program flow is explicitly controlled by the programmer. Data objects (variables) may be altered by any number of instructions, so the order of the instructions cannot be changed without the risk of invalidating the computation.
In a data-flow description, data objects are described as the results of operations, so an operation cannot be performed until the data is ready. Apart from this requirement, the order in which the operations are carried out is not specified.
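The readiness rule described above can be illustrated with a minimal Python sketch. The graph, the node names (`sum`, `diff`, `out`) and the function `run_dataflow` are hypothetical and chosen only for illustration; they are not taken from any particular machine or prior-art system. Each node names the nodes whose results it consumes, and a node fires as soon as all of its operands are available.

```python
import operator

# Hypothetical graph for one loop iteration: out = (a + b) * (a - b).
# Each node maps to (operation, names of the nodes whose results it consumes).
GRAPH = {
    "sum": (operator.add, ("a", "b")),
    "diff": (operator.sub, ("a", "b")),
    "out": (operator.mul, ("sum", "diff")),
}


def run_dataflow(graph, inputs):
    """Fire every node whose operands are available until none remain."""
    values = dict(inputs)   # results produced so far, keyed by node name
    pending = dict(graph)   # nodes that have not fired yet
    waves = []              # groups of nodes that became ready together
    while pending:
        ready = [
            name
            for name, (_, sources) in pending.items()
            if all(src in values for src in sources)
        ]
        if not ready:
            raise ValueError("graph has a cycle or a missing input")
        # Nodes in the same wave do not depend on one another, so a
        # parallel machine could execute them concurrently.
        for name in ready:
            op, sources = pending.pop(name)
            values[name] = op(*(values[src] for src in sources))
        waves.append(sorted(ready))
    return values, waves


values, waves = run_dataflow(GRAPH, {"a": 5, "b": 3})
```

In this sketch `sum` and `diff` become ready in the same wave because neither consumes the other's result; only `out` has to wait. A sequential machine may fire the nodes of a wave in any order, while a parallel machine may fire them together.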
It is possible to represent the operations of a data-flow graph as a series of operations from a known computer instruction set, such as the instruction sets for the Intel x86 or Motorola M68K processors. However, the resulting programs are difficult to execute in a parallel manner because unnecessary dependencies often force serialization of the operations. These unnecessary dependencies arise because all results of operations must be stored in a small set of named registers before being used in subsequent operations. This creates resource contention and results in serialization, even for computing machines that have additional registers. The use of named registers to pass results also obscures the differences between data dependencies within an iteration and data dependencies between iterations. If it is known that there are no dependencies between iterations, then all iterations of a loop can be executed simultaneously; the parallelism is limited only by the amount of resources on the computing machine.
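The effect of named registers can be made concrete with a small Python sketch. The four-instruction program, the register names (`r1`, `r2`, `r3`) and the function `classify_dependencies` are hypothetical and serve only as an illustration; they do not model any real instruction set. The sketch separates true dependencies (an instruction reads a result that an earlier instruction produced) from dependencies that exist only because a register name is reused.

```python
def classify_dependencies(program):
    """Split dependencies into true (data) and false (register-reuse) ones.

    `program` is a list of (destination_register, source_registers) pairs.
    Each dependency is reported as (earlier_index, later_index).
    """
    last_writer = {}   # register -> index of the latest instruction writing it
    readers = {}       # register -> indices that read it since that write
    true_deps, false_deps = [], []
    for j, (dest, sources) in enumerate(program):
        for reg in sources:
            if reg in last_writer:
                # Read-after-write: j really needs the value produced earlier.
                true_deps.append((last_writer[reg], j))
            readers.setdefault(reg, []).append(j)
        # Overwriting `dest` must wait for earlier readers (write-after-read)
        # and for the earlier writer (write-after-write) of the same name.
        for i in readers.get(dest, []):
            if i != j:
                false_deps.append((i, j))
        if dest in last_writer:
            false_deps.append((last_writer[dest], j))
        last_writer[dest] = j
        readers[dest] = []
    return true_deps, false_deps


# Two unrelated computations that happen to share the register name r1.
PROGRAM = [
    ("r1", ("a", "b")),   # 0: r1 = a + b
    ("r2", ("r1", "c")),  # 1: r2 = r1 * c
    ("r1", ("d", "e")),   # 2: r1 = d + e   (reuses r1)
    ("r3", ("r1", "f")),  # 3: r3 = r1 * f
]

true_deps, false_deps = classify_dependencies(PROGRAM)
```

Under these assumptions, the only true dependencies are 0 to 1 and 2 to 3, so the two pairs of instructions could run side by side. The reuse of `r1` adds the dependencies 1 to 2 and 0 to 2, which force instruction 2 to wait even though it consumes nothing that instructions 0 or 1 produce.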
Consequently, there is an unmet need for a method for describing a data-flow graph that represents both operational dependencies and data dependencies whilst avoiding the use of named registers.