This invention relates to parallel processing of data and more particularly to computer-aided specification of parallel computation graphs.
Complex computations can often be expressed as a data flow through a directed graph, with components of the computation being associated with the vertices of the graph and data flows between the components corresponding to links (arcs, edges) of the graph. A system that implements such graph-based computations is described in U.S. Pat. No. 5,966,072, EXECUTING COMPUTATIONS EXPRESSED AS GRAPHS. 
Referring to FIG. 1A, an example of a computation graph 100 includes an input file 110 and an output file 140. Input file 110 is the source of a series of work elements, such as data records each associated with a separate transaction in a transaction processing system. Each work element is first processed by a component A 120, passed over a serial link 125, and then processed by a component B 130. The outputs of component B are stored in output file 140.
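The serial graph of FIG. 1A can be illustrated with a minimal sketch, in which an in-memory list stands in for input file 110 and the returned list stands in for output file 140. The names `run_serial_graph`, `component_a`, and `component_b` are hypothetical and chosen only for illustration; they are not taken from the referenced system.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def run_serial_graph(
    work_elements: Iterable[T],
    component_a: Callable[[T], T],
    component_b: Callable[[T], T],
) -> List[T]:
    """Pass each work element through component A, then component B.

    Hypothetical stand-in for computation graph 100: the function call
    chain plays the role of serial link 125.
    """
    output = []
    for element in work_elements:
        intermediate = component_a(element)       # component A 120
        output.append(component_b(intermediate))  # component B 130
    return output


# Example usage with two toy components (assumptions, not from the source).
results = run_serial_graph([1, 2, 3], lambda x: x + 1, lambda x: x * 2)
```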
It can be desirable to implement a computation graph using multiple instances of individual components. For example, each instance of a component may be hosted on a different processor, thereby achieving a coarse-grain parallelism that provides an overall increase in computation capacity. Referring to FIG. 1B, a specification of a parallelized computation graph 101 includes input file 110 and output file 140 as in the serial computation graph 100. A parallel component A 121 represents m instances of component A 120 arranged in parallel, and a parallel component B 131 represents m instances of component B 130 arranged in parallel. A parallel link 126 joins parallel component A 121 and parallel component B 131. In the representation of parallel computation graphs, such as the one in FIG. 1B, parallel components are drawn with bold lines, with optional indicators of the degree of parallelism (e.g., “m” in FIG. 1B) placed adjacent to the components.
Referring to FIG. 1C, parallelized computation graph 101 is represented in explicit serial form, with m instances of component A 120 (labeled A1 through Am) arranged in parallel. In order to distribute work elements from input file 110, a 1:m partition element 115 is inserted between input file 110 and the m instances of component A 120 that make up parallel component A 121. Partition element 115 takes work elements on one input and sends each work element to one of the m outputs, for example, in a round-robin manner. An m:1 gather element 135 takes the outputs of the m instances of component B 130 on m input lines and merges them, for example according to their arrival times, for output to output file 140. Partition element 115 and gather element 135 provide similar functionality in the example of FIG. 1B, though there they are implicit, just as the multiple instances represented by the parallel components are not explicit in FIG. 1B as they are in FIG. 1C. Parallel link 126 shown in FIG. 1B is represented in this example of FIG. 1C as a parallel combination of serial links joining corresponding instances of component A and component B.
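The partition and gather steps of FIG. 1C can be sketched as follows. This is an illustrative sketch only: all function names are hypothetical, the m lanes are processed one after another rather than on separate processors, and the gather step interleaves lane outputs in a fixed order as a deterministic stand-in for merging by arrival time.

```python
from itertools import zip_longest
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
_MISSING = object()  # sentinel marking exhausted lanes during the merge


def partition_round_robin(work_elements: Sequence[T], m: int) -> List[List[T]]:
    """1:m partition: send element i to output i mod m."""
    lanes: List[List[T]] = [[] for _ in range(m)]
    for i, element in enumerate(work_elements):
        lanes[i % m].append(element)
    return lanes


def gather(lanes: Sequence[Sequence[T]]) -> List[T]:
    """m:1 gather: interleave lane outputs (assumed stand-in for arrival order)."""
    merged: List[T] = []
    for group in zip_longest(*lanes, fillvalue=_MISSING):
        merged.extend(x for x in group if x is not _MISSING)
    return merged


def run_parallel_graph(
    work_elements: Sequence[T],
    component_a: Callable[[T], T],
    component_b: Callable[[T], T],
    m: int,
) -> List[T]:
    """Partition, run instance pairs (A_k, B_k) per lane, then gather."""
    lanes = partition_round_robin(work_elements, m)
    # Each lane models one serial link joining A_k to B_k.
    processed = [[component_b(component_a(e)) for e in lane] for lane in lanes]
    return gather(processed)


# Example usage with toy components (assumptions, not from the source).
parallel_results = run_parallel_graph(
    [1, 2, 3, 4, 5], lambda x: x + 1, lambda x: x * 2, m=3
)
```

With round-robin partitioning and this particular interleaving gather, the output order happens to match that of the serial graph; a gather driven by actual arrival times would not guarantee that ordering.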