Message passing is used to solve parallel programming problems or requests. In this use, a network of processes communicates through a message passing system to solve the problem.
In the current technology of messaging and parallel development support, there is one technique for coordinating such parallel processing topologies, called "scripting." This technique is provided by Workflow products.
Scripting provides a technique for describing the topology and conditions for completion to solve a particular problem or flow of work. The script is interpreted and each step may initiate a flow of messages to a set of target tasks (or processes) in the next step of the topology. The script ends with the description of the conditions for completion. In this way, scripting describes one and only one static topology for the solution of a given problem. Though awareness of the topology is not required in each of the tasks, none of the tasks can conditionally change the topology. This means that variations of a given script must be created for these conditions, and the conditions must be determined prior to execution so that the proper script for this unique problem can be chosen. This can require considerable effort that in many cases would have been best done when the task encounters the condition during execution.
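The static nature of scripting can be illustrated with a minimal sketch. The script layout, the task names (`validate`, `price`, `check_stock`, `confirm`) and the function `run_script` are hypothetical, chosen only for illustration; they are not taken from any particular workflow product. The sketch shows an interpreter walking a fixed list of steps and then testing a completion condition fixed in the script, with no way for a task to alter the steps.

```python
# Hypothetical static script: a fixed sequence of steps, each naming the
# target tasks for that step, followed by the condition for completion.
SCRIPT = {
    "steps": [
        ["validate"],
        ["price", "check_stock"],  # conceptually sent to both targets in parallel
        ["confirm"],
    ],
    "expected_results": 4,  # completion condition, fixed before execution
}

# Hypothetical task bodies. A task sees only its message; it has no means
# of adding, removing, or redirecting steps in the script.
TASKS = {
    "validate": lambda msg: f"validated:{msg}",
    "price": lambda msg: f"priced:{msg}",
    "check_stock": lambda msg: f"stock-ok:{msg}",
    "confirm": lambda msg: f"confirmed:{msg}",
}


def run_script(script, tasks, message):
    """Interpret the script step by step and test the completion condition."""
    results = []
    for step in script["steps"]:
        # Each step initiates a flow of messages to its set of target tasks.
        for task_name in step:
            results.append(tasks[task_name](message))
    complete = len(results) == script["expected_results"]
    return results, complete


results, complete = run_script(SCRIPT, TASKS, "order-1")
```

Under these assumptions, handling a condition that needs a different set of steps means writing a second script and choosing between the two before `run_script` is called.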
There are also a number of roll-your-own techniques, which are described below. In general, these techniques are designed for single-stage parallelism and predetermined topologies. They rely on complicated and fragile methods for determining the completion of parallel tasks. Using current message passing systems, much of the awareness of the network topology is "hardcoded" in the tasks, so changes to the topology require changes to the underlying tasks. The current technology is:
1. To create a set of slots which accommodates the predetermined number of expected results for the topology. Each task would place its result in a predetermined slot. With this design, results can be collected for only one problem at a time. Otherwise, the result of a subsequent problem might fill or overwrite a slot belonging to the prior problem. In this methodology, the originating tasks would have to coordinate the initiation of the next problem with the completion of the current problem. This technique is called "cycling."
2. To reproduce the entire network of processes and queues for each request. In this methodology, each problem has its own tasks, queues and slots. This solution limits the number of concurrent problems that can be worked on, because the system must replicate the tasks, queues and storage used to solve each instance of the problem.
3. To place all results in one queue. The initiating process identifies each new problem that it starts and passes that identifier with the message to all of the parallel processes. Each result is stored in the next available slot, not a predetermined slot. A completion checking task looks through all of the collected results for all of the problems in process and counts the number of results for a given problem. When a specified number of results are found for that problem, it is considered complete. Some task is then notified to take the results off the queue and process them. Clearly, this solution has efficiency and contention problems, and possibly resource problems. It also makes it impossible to alter the topology dynamically: a change to the number of results expected for a given type of problem would affect problems of this type that are currently "in flight." The completion checking routine would have to distinguish between results from problems initiated before and after the alteration.
This problem is usually solved by deferring the initiation of all new problems of this type until the last in-flight problem completes, and then informing the completion task of the change in the number of expected results for this type of problem. New problems of the given type can then be initiated and their results checked. This is called "cycling the network."
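The single-queue technique and its difficulty with in-flight problems can be sketched as follows. The class `CompletionChecker`, its method names, and the problem identifiers `"A"` and `"B"` are hypothetical, introduced here only to illustrate the counting scheme described above; this is a simplified single-process model, not an implementation from any existing system.

```python
from collections import Counter


class CompletionChecker:
    """Counts results per problem identifier in one shared result queue."""

    def __init__(self, expected_results):
        # A single expected count applies to every problem of this type.
        self.expected_results = expected_results
        # Shared queue of (problem_id, result), stored in arrival order.
        self.results = []

    def put(self, problem_id, result):
        """Store a result in the next available slot."""
        self.results.append((problem_id, result))

    def completed(self):
        """Scan all collected results and list problems with enough results."""
        counts = Counter(problem_id for problem_id, _ in self.results)
        return sorted(
            problem_id
            for problem_id, count in counts.items()
            if count >= self.expected_results
        )

    def take(self, problem_id):
        """Remove and return the results of one problem from the queue."""
        taken = [r for pid, r in self.results if pid == problem_id]
        self.results = [(pid, r) for pid, r in self.results if pid != problem_id]
        return taken


def demo():
    checker = CompletionChecker(expected_results=3)
    # Problem "A" was initiated when three results were expected.
    checker.put("A", "a1")
    checker.put("A", "a2")
    checker.put("B", "b1")
    before = checker.completed()
    # The topology is altered while "A" is still in flight: problems of
    # this type now produce only two results.
    checker.expected_results = 2
    after = checker.completed()
    return before, after


before, after = demo()
```

In this sketch `before` is empty, while `after` lists problem `"A"` even though its third result has not arrived. The checker has no record of which expected count was in force when a problem was initiated, which is why the network has to be cycled before the count is changed.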
Based on the foregoing, a need exists for an improved message processing facility that allows for the widening of parallel execution of messages. Further, a need exists for a facility that can coordinate the results of a network of processes that has been widened in its parallelism without creating subproblems. Additionally, a need exists for a facility that enables a message to know what request it pertains to and where in the processing sequence of that request it belongs.