1. Field of the Invention
The present invention generally relates to a distributed parallel computer network. More particularly, the invention relates to parallel processing networks in which processes are created (“spawned”) based on the type and nature of the features available in the network.
2. Background of the Invention
A computer generally executes a sequence of predefined instructions (“software”). A “serial” computer, such as most older standalone personal computers and many present-day computers, includes a single processing unit that performs calculations one after another (i.e., “serially”). The processing unit is usually a “microprocessor” or “central processing unit” (CPU). By contrast, a “parallel” processing computer architecture includes two or more processing units that can execute software instructions concurrently and thereby complete a task generally much faster than a serial computer can.
Parallel processing architectures are particularly well-suited to solving problems or performing tasks that would take an undesirably long period of time to be completed by a serial computer. For example, financial modeling techniques are used to predict trends in the stock markets, to perform risk analysis, and to carry out other relevant financial tasks. These types of financial tasks generally require a rapid assessment of value and risk over a large number of stocks and portfolios. These tasks include computations that are largely independent of each other and thus readily lend themselves to parallel processing. By way of additional examples, parallel processing is particularly useful for predicting weather patterns, determining optimal moves in a chess game, and performing any other type of activity that requires manipulating and analyzing large data sets in a relatively short time.
Parallel processing generally involves decomposing a data set into multiple portions and assigning multiple processing units in the parallel processing network to process those portions using an application program, with each processing unit generally processing a different portion. Accordingly, each processing unit preferably runs a copy of the application program (a “process”) on a portion of the data set. Some processes may run concurrently while, if desired, other processes run sequentially. By way of example, a data set can be decomposed into ten portions with each portion assigned to one of ten processing units. If the portions are equal in size, each processing unit processes ten percent of the data set and does so concurrently with the other nine processing units. Because processes can run concurrently, rather than sequentially, parallel processing systems generally reduce the total amount of time required to complete the overall task. The present invention relates to improvements in how processes are created in a parallel processing network.
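By way of illustration only, the ten-portion decomposition described above might be sketched in Python as follows. The function names `decompose`, `process_portion`, and `run_parallel` are hypothetical and do not appear in this specification, the summation merely stands in for an arbitrary application program, and a thread pool stands in for the separate processing units of a parallel network.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(data, parts):
    """Split data into `parts` contiguous portions of nearly equal size."""
    base, extra = divmod(len(data), parts)
    portions = []
    start = 0
    for i in range(parts):
        # The first `extra` portions each take one additional element.
        end = start + base + (1 if i < extra else 0)
        portions.append(data[start:end])
        start = end
    return portions


def process_portion(portion):
    # Assumption: a stand-in for the application program; it sums its portion.
    return sum(portion)


def run_parallel(data, units=10):
    """Process each portion on its own worker and combine the partial results."""
    portions = decompose(data, units)
    with ThreadPoolExecutor(max_workers=units) as pool:
        partial_results = list(pool.map(process_portion, portions))
    return sum(partial_results)
```

With a data set of one hundred elements and ten workers, each worker in this sketch receives ten elements, which corresponds to the ten percent share in the example above.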
A parallel processing system can be implemented with a variety of architectures. For example, an individual computer may include two or more microprocessors running concurrently. The Pentium® II architecture supports up to four Pentium® II CPUs in one computer. Alternatively, multiple machines may be coupled together through a suitable high-speed network interconnect. Giganet cLAN™ and Tandem ServerNet are examples of such network interconnects. Further, each machine itself in such a parallel network may have one or more CPUs.
One of the issues to be addressed when processing data in a parallel processing network is how to decompose the data and then how to assign processes to the various processing units in the network. One conventional technique for addressing this issue requires the system user to create a “process group” text file which includes various parameters specifying the number of processes to be spawned and how those processes are to be distributed throughout the network. A process group file thus specifies which CPUs are to run the processes.
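By way of illustration only, a process group file and its interpretation might be sketched in Python as follows. The file layout assumed here (one machine name and one process count per line, with `#` introducing a comment) is hypothetical and is not the format of any particular parallel processing product, and the names `parse_process_group` and `expand_to_processes` are likewise assumptions made for this sketch.

```python
def parse_process_group(text):
    """Parse lines of the assumed form '<machine> <process_count>'."""
    assignments = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"line {line_no}: expected '<machine> <count>'")
        machine, count = fields[0], int(fields[1])
        if count < 1:
            raise ValueError(f"line {line_no}: process count must be positive")
        assignments.append((machine, count))
    return assignments


def expand_to_processes(assignments):
    """List the machine on which each individual process is to be spawned."""
    return [machine for machine, count in assignments for _ in range(count)]
```

In this sketch the user fixes both the total number of processes and their placement in advance, which is why the user must already know the layout of the network.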
A process group file implementation requires the system user to have a thorough understanding of the network. The system user must know exactly how many machines and CPUs are available, the type of CPUs, and various other configuration information about the network. Further, the user must know which machines or CPUs are fully operational and which have malfunctioned. If the user, not knowing that a machine has malfunctioned, specifies in the process group file that the malfunctioning machine should run a particular process, the entire system may lock up when other machines attempt to communicate with the broken machine, or may experience other undesirable results. Thus, an improved technique for spawning processes in a parallel processing architecture is needed.
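By way of illustration only, the failure mode described above might be sketched in Python as follows. The sketch assumes that a set of machines known to be operational is somehow available, which is precisely the knowledge the conventional technique requires the user to supply by hand. The name `find_stale_assignments` and the machine names are hypothetical.

```python
def find_stale_assignments(assignments, operational_machines):
    """Return the (machine, count) entries naming a machine not known to be up."""
    return [
        (machine, count)
        for machine, count in assignments
        if machine not in operational_machines
    ]


# Assumed example data: the user's file still names node-b, which has failed.
assignments = [("node-a", 2), ("node-b", 1)]
operational_machines = {"node-a"}
stale = find_stale_assignments(assignments, operational_machines)
```

Nothing in a static process group file performs such a check, so in this sketch the process assigned to `node-b` would be spawned onto a machine that cannot respond.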