Despite the rapid increase in the power of individual computer processors, there are many present and potential applications which could benefit from much greater computing power than can be provided by any individual present or foreseeable processor. The major approach to such greater computing power is to use parallel computers, that is, computers having more than one processor. Many different types of parallel computers have been designed, ranging from Symmetric Multi-Processing systems, in which multiple processors, each with some cache memory, share main memory and all of the computer's other resources, to so-called shared-nothing systems, where each processor has its own separate, often relatively large, main memory, often has its own mass storage device, and is connected to other processors only by a computer network. The number of processors in current parallel computers varies from two to tens of thousands.
Parallel computers can provide a huge amount of raw computational power, as measured by the total number of instructions per second which their multiple processors can execute. The major problem restricting the use of parallel computing has been the difficulty in designing programming techniques that can efficiently and reliably use the computational power of parallel computers to perform useful tasks.
This problem arises for multiple reasons. First, most useful tasks which one might want to accomplish through a parallel computer require that processes be distributed to the various processors, or nodes, of the computer and that those processes then communicate with each other. This requires that the code for a process be made available to the node that it is to run on, that a command be given to run that process on that node, that the process determine the nodes on which all other processes it is to talk to are running, and then that it establish communication links with those other processes. If a given individual task is to be parallelized, a decision has to be made as to which portion of the data to be processed by that task should be routed to each of the processes that are executing it. In addition, there are many other details that have to be attended to for a task of any reasonable complexity to be programmed to run on multiple processors. Thus, it has traditionally been a very complex task to write programs for parallel computers.
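The data routing decision described above can be illustrated with a minimal sketch. It assumes one common choice, hash partitioning on a record key; the function name hash_partition, the record layout, and the node count are hypothetical and are not taken from any particular system.

```python
import zlib


def hash_partition(records, num_nodes, key):
    """Assign each record to one of num_nodes partitions by hashing its key.

    Hypothetical helper: real systems may instead use round-robin,
    range, or other partitioning schemes.
    """
    partitions = [[] for _ in range(num_nodes)]
    for record in records:
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        digest = zlib.crc32(str(key(record)).encode("utf-8"))
        partitions[digest % num_nodes].append(record)
    return partitions


# Illustrative data: ten records routed to three nodes by their "id" field.
records = [{"id": i, "name": f"row{i}"} for i in range(10)]
parts = hash_partition(records, 3, key=lambda r: r["id"])
```

Records with equal keys always land in the same partition, which is what allows each process to work on its portion of the data independently. The sketch covers only the routing decision; distributing code to nodes, starting processes, and establishing communication links remain separate concerns.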
Not only is it difficult to write programs for parallel computers, but it can also be extremely difficult to make such programs work properly. This is because the execution of a parallel program, instead of involving only one process, as do most programs, involves many different processes, each of which might run at differing rates and have differing behaviors each time the program is executed. This means that there are all sorts of synchronization problems which can result between processes; it means that execution is much more complex and, thus, more difficult to fully understand; and it means that finding errors in parallel programs, that is, debugging them, can be much more complex.
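One of the simplest synchronization problems of this kind is a shared counter updated by several concurrent workers. The following is a minimal sketch, assuming threads within a single process as a stand-in for the processes of a parallel computer; the function name count_with_lock is hypothetical.

```python
import threading


def count_with_lock(num_threads, increments):
    """Increment a shared counter from several threads, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # The read-modify-write below is not atomic. Without the lock,
            # two threads can read the same old value and one update is lost,
            # and whether that happens can differ from run to run.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


total = count_with_lock(4, 1000)  # 4 threads x 1000 increments each
```

With the lock in place, total is always 4000. Removing the lock may or may not produce a wrong result on any given run, which is the kind of timing-dependent behavior that makes such errors hard to reproduce and debug.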
Over the years there have been many attempts to deal with the problem of programming parallel computers. One approach has been to design parallel programming languages having constructs designed to facilitate the description of all the complexities necessary for parallelization. But even with such languages, the complexity of parallel programming remains considerable. Another approach has been to have parallel compilers which take code which could run in a single process and automatically parallelize it to run on multiple processors. While such compilers do a very good job of removing the complexity of parallelization from the programmer, they usually make very inefficient use of a parallel computer. This is because such compilers' parallelization mechanisms are very general, and, thus, they are often ill suited to provide efficient parallelization for a particular piece of code.
Parallel relational data base management systems (herein "RDBMSs" for short) use another approach to dealing with the complexity of parallel programming. Such systems enable a user to issue a statement in a data base query language, such as Structured Query Language, or SQL. The system then parses this statement and automatically derives from it a corresponding data flow graph which is executed in a parallel manner. The data flow graph comprises a sequence of one or more operators, each of which has an associated subroutine, some of which are parallelizable. The graph connects its operators together with data flow links through which records from the data base are processed. The RDBMS automatically parallelizes the graph, causing a separate instance of each parallelizable operator in the graph to be run on each of a plurality of nodes. Different partitions of the data base table can be fed through the data links to different instances of the same operator, defining a multi-branched tree. Such RDBMSs make good use of parallelism, but their capabilities are limited to reading from and writing to parallel data bases in response to statements in an RDBMS language. Generality is thereby restricted.
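The execution model described above can be illustrated with a minimal sketch. It is not the implementation of any actual RDBMS: the Operator class and run_graph function are hypothetical, threads in a pool stand in for nodes, and the graph is assumed to be a simple linear sequence of operators.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Operator:
    """A graph operator with its associated subroutine (hypothetical type)."""
    name: str
    subroutine: Callable[[List[dict]], List[dict]]
    parallelizable: bool


def run_graph(operators, partitions):
    """Push partitions of records through a linear sequence of operators."""
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        for op in operators:
            if op.parallelizable:
                # One instance of the operator per partition.
                partitions = list(pool.map(op.subroutine, partitions))
            else:
                # Non-parallelizable operators see all records in one instance.
                merged = [r for part in partitions for r in part]
                partitions = [op.subroutine(merged)]
    return [r for part in partitions for r in part]


# Roughly analogous to: SELECT SUM(amount) FROM t WHERE amount > 10
select_large = Operator(
    "select", lambda rows: [r for r in rows if r["amount"] > 10], True)
total = Operator(
    "sum", lambda rows: [{"total": sum(r["amount"] for r in rows)}], False)

partitions = [[{"amount": 5}, {"amount": 20}], [{"amount": 30}], [{"amount": 1}]]
result = run_graph([select_large, total], partitions)  # [{"total": 50}]
```

The selection operator runs as three independent instances, one per partition, while the summing operator runs once over the merged output. The user supplies only the query; the partitioning and the per-partition instances are handled by the system.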