The present invention relates to a computer system and a method for parallel processing of program instructions.
Many parallel architectures start by defining the programming model they support as strictly functional. Computer architects find that the strictly functional approach makes their task easier, because the order in which computation is performed, once the basic dependencies are satisfied, is irrelevant. Building highly parallel machines which execute strictly functional languages then becomes relatively straightforward. Strictly functional languages certainly have advantages, including the relative ease of proving their correctness and the ease of thinking and reasoning about certain simple kinds of programs, as is disclosed in Backus, "Can Programming be Liberated from the von Neumann Style? A Functional Style and its Algebra of Programs" Communications of the ACM Vol. 21 no. 8 pp. 613-641 (August 1978).
Unfortunately, there is a class of algorithms for which the strictly functional programming style appears to be inadequate. The problems are rooted in the difficulty of interfacing a strictly functional program to a non-functional outside environment. It is impossible to program a task such as an operating system, for instance, in the unmodified strictly functional style. Clever modifications of the strict functional style have been proposed which cure this defect. For example, McCarthy's AMB operator, which returns whichever of two inputs is ready first, appears to be adequate to solve this and similar problems. Several equivalent operators have been proposed in Peter Henderson, "Is It Reasonable to Implement a Complete Programming System in a Purely Functional Style?" Technical memo PMM/94, The University of Newcastle upon Tyne Computing Laboratory (December 1980).
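The behavior described for the AMB operator can be approximated in Python. This is only a sketch under stated assumptions: the names `amb` and `after` are hypothetical, the two inputs are modeled as zero-argument functions, and Python's standard `concurrent.futures` module stands in for whatever mechanism an actual functional language would use.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def amb(thunk_a, thunk_b):
    """Return the result of whichever thunk is ready first (hypothetical sketch)."""
    pool = ThreadPoolExecutor(max_workers=2)
    try:
        promises = [pool.submit(thunk_a), pool.submit(thunk_b)]
        done, _pending = wait(promises, return_when=FIRST_COMPLETED)
        # If both finished together, either one may be picked.
        return next(iter(done)).result()
    finally:
        # Do not block on the slower input; let it finish in the background.
        pool.shutdown(wait=False)


def after(seconds, value):
    """Hypothetical helper: a thunk that yields value after a pause."""
    def thunk():
        time.sleep(seconds)
        return value
    return thunk
```

A call such as `amb(after(0.5, "slow"), after(0.0, "fast"))` returns the value of the input that completes first. The result depends on timing rather than on the arguments alone, which is exactly the non-functional behavior at issue.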
However, the introduction of such non-functional operators into the programming language destroys many of its advantages. Architects can no longer ignore the order of execution of functions. Proving program correctness again becomes a very much more difficult task. Users also find that programming with such operators is tedious and error prone. In particular, as shown by Agha in "Actors: A Model of Concurrent Computation in Distributed Systems" M.I.T. Artificial Intelligence Laboratory Technical Memo 844, (1985), the presence of such an operator allows the definition of a new data type, the cell, which can be side effected just as normal memory locations can in non-functional languages. Thus the introduction of the AMB operator, while adequate to solve the problem of the lack of side effects, re-introduces many of the problems which the functional language advocates are attempting to avoid.
Conventional thought has therefore indicated a choice between defining weak systems, such as purely functional languages, about which one can prove theorems, and defining strong systems about which one can prove little.
Much more serious than any of the theoretical difficulties of strictly functional languages is the difficulty programmers face in implementing certain types of programs. It is often more natural to think about an algorithm in terms of the internal state of a set of objects as, for example, discussed in chapter 3 of Abelson and Sussman, Structure and Interpretation of Computer Programs, M.I.T. Press, Cambridge, (1985). This object-oriented viewpoint in programming language design is incompatible with the strictly functional approach to computation. The same algorithms can be implemented in either style, and both systems are surely universal, but the fact that programmers find one representation for programs easier to think about, easier to design, and easier to debug is, in itself, a powerful motivation for a programming language to provide that representation.
If this were merely a representation issue, there would be hope that suitable compiler technology could eventually lead to a programming language which provides support for side effects, but which compiles into purely functional operations. As has been discussed above, however, in the general case this is not possible.
Side effects cause problems. Architects find it difficult to build parallel architectures for supporting them. Verification software finds such programs difficult to reason about. And programming styles, with an abundance of side effects, are difficult to understand, modify and maintain.
In accordance with the present invention, instead of completely eliminating side effects, it is proposed that their use be severely curtailed. Most side effects in conventional programming languages like Fortran are gratuitous. They are not solving difficult multi-tasking communications problems. Nor are they improving the large scale modularity or clarity of the program. Instead, they are used for trivial purposes such as updating a loop index variable.
Elimination of these unnecessary side effects can make code more readable, more maintainable, and as set forth hereinafter in accordance with the invention, faster to execute.
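As an illustration of a gratuitous side effect, the following sketch contrasts a loop that repeatedly updates an index variable with an equivalent formulation that has no programmer-visible assignments. Python is used purely for illustration, and both function names are hypothetical.

```python
def sum_squares_with_side_effects(n):
    """Sum of squares below n, written with an explicitly updated loop index."""
    total = 0
    i = 0
    while i < n:
        total = total + i * i  # side effect on the accumulator
        i = i + 1              # side effect on the loop index
    return total


def sum_squares_without_side_effects(n):
    """The same result expressed without any programmer-visible assignment."""
    return sum(map(lambda i: i * i, range(n)))
```

In the second form the individual squares do not depend on one another, so in principle they could be computed in any order.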
Many of these same issues motivate Halstead's Multi-Lisp, a parallel variant of Scheme as disclosed in "MultiLisp: A Language for Concurrent Symbolic Computation" ACM Transactions on Programming Languages and Systems, (1985). Multi-Lisp provides access to parallel computation by adding programmer-visible primitives to the language. The three primitives which distinguish it from conventional Lisp are:
The future primitive which allows an encapsulated value to be evaluated, while simultaneously returning to its caller a promise for that value. The caller can continue to compute with this returned object, incorporating it into data structures, and passing it to other functions. Only when the value is examined is it necessary for the parallel computation of its value to complete;
the pcall primitive which allows the parallel evaluation of arguments to a function. Halstead teaches that pcall can be implemented as a simple macro which expands into a sequence of futures. It can be thought of as syntactic sugar for a stylized use of futures; and
the delay primitive which allows a programmer to specify that a particular computation be delayed until the result is needed. Similar in some respects to a future, delay returns a promise to compute the value, but does not begin computation of a value until the result is needed. Thus the delay primitive is not a source of parallelism in the language. It is a way of providing lazy evaluation semantics to the programmer.
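The three primitives listed above can be approximated in Python. This is a minimal sketch, not Multi-Lisp itself: the names `future`, `pcall`, `Delay`, and the shared pool `_POOL` are hypothetical, computations are modeled as zero-argument functions, and the standard `concurrent.futures` module stands in for the parallel runtime.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shared worker pool standing in for the parallel machine.
_POOL = ThreadPoolExecutor(max_workers=4)


def future(thunk):
    """Begin evaluating thunk in parallel; return a promise immediately."""
    return _POOL.submit(thunk)


def pcall(fn, *arg_thunks):
    """Evaluate the argument thunks in parallel, then apply fn to the results."""
    promises = [future(t) for t in arg_thunks]
    return fn(*[p.result() for p in promises])


class Delay:
    """A promise whose computation starts only when it is first forced."""

    def __init__(self, thunk):
        self._thunk = thunk
        self._forced = False
        self._value = None

    def force(self):
        """Compute the value on first use; return the cached value afterwards."""
        if not self._forced:
            self._value = self._thunk()
            self._forced = True
            self._thunk = None
        return self._value
```

Here `pcall` is written as a stylized use of `future`, in line with the macro expansion described above. `Delay` creates no new thread, so it adds lazy evaluation but no parallelism. The sketch is not thread-safe for `Delay`, and nesting futures inside a small fixed pool can exhaust its workers.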
Both delay and future result in an order of execution different from applicative order computation. In the absence of side effects, both will result in the same value as the equivalent program without the primitives, since they affect only the order in which the computation is performed. This is not strictly true for delay since its careful use can allow otherwise non-terminating programs to return a value.
In the presence of side effects, it is difficult to predict the behavior of a future, since its value may be computed in parallel with other computations.
While the value of a delay is deterministic, since it does not introduce additional parallelism, its time of computation is dependent on when its value is first examined. This can be very non-intuitive and difficult to think about while writing programs.
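The dependence on the time of first examination can be seen in a small sketch. The names `delay` and `run` are hypothetical and Python is used only for illustration. The delayed computation has a side effect (it appends to a log), and where that side effect lands depends on where the value is first examined.

```python
def delay(thunk):
    """Return a force function that runs thunk once, on its first call."""
    cache = []

    def force():
        if not cache:
            cache.append(thunk())
        return cache[0]

    return force


def run(examine_early):
    """Record the order of side effects for two placements of the first use."""
    log = []

    def delayed_work():
        log.append("delayed")  # side effect inside the delayed computation
        return len(log)

    promise = delay(delayed_work)
    if examine_early:
        promise()              # first examination happens before the main work
    log.append("main")
    promise()                  # otherwise the first examination happens here
    return log
```

`run(True)` yields `["delayed", "main"]`, while `run(False)` yields `["main", "delayed"]`. Each run is deterministic, yet the order of the side effects is fixed by the location of the first use, not by the location of the `delay`.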
Halstead implements the future and delay primitives by returning to the caller an object of data-type future. This object contains a pointer to a slot, which will eventually contain the value returned by the promised computation. The future may be stored in data structures and passed as a value. Computations which attempt to reference the value of the future prior to its computation are suspended. When the promised computation completes, the returned value is stored in the specified slot, and the future becomes determined. This allows any pending suspended procedures waiting for this value to run, and any further references to the future will simply return the now computed value.
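The slot-based mechanism described above can be sketched as follows. This is an illustration under stated assumptions, not Halstead's implementation: the names `SlotFuture`, `determine`, `touch`, and `spawn_future` are hypothetical, Python threads stand in for Multi-Lisp tasks, and only the future case is shown (a delay would start its computation on the first `touch` instead).

```python
import threading


class SlotFuture:
    """Hypothetical future object: a value slot plus a 'determined' flag."""

    def __init__(self):
        self._slot = None
        self._determined = threading.Event()

    def determine(self, value):
        """Store the computed value and release every suspended reader."""
        self._slot = value
        self._determined.set()

    def touch(self):
        """Reference the value; the caller is suspended until it is determined."""
        self._determined.wait()
        return self._slot


def spawn_future(thunk):
    """Run thunk on its own thread and return its future immediately."""
    fut = SlotFuture()
    worker = threading.Thread(
        target=lambda: fut.determine(thunk()), daemon=True
    )
    worker.start()
    return fut
```

The returned object can be placed in a list or dictionary and passed around before the computation finishes. Only `touch` blocks, and once the future is determined every later `touch` returns the stored value at once. The sketch does not handle a thunk that raises an exception, in which case readers would wait indefinitely.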
The use of future or delay can be thought of as a declaration. Their use declares that either the computation done within the scope of the delay or future has no side effects, or that, if it does, the order in which those side effects are done relative to other computations is irrelevant. There is also a guarantee with such a declaration that no free or shared variable referenced by the computation is side effected by some other parallel-executing portion of the program.
Like all declarations, the use of future or delay is a very strong assertion. In a way similar to type declarations, their use is difficult to check automatically, is error prone, has a significant performance impact if omitted, and may function correctly for many test cases, but fail unexpectedly on others.
Advocates of strong type checking in compiled languages have attempted for years to build compilers capable of proving the type correctness of programs. It is believed that they have failed. All languages sophisticated enough for serious programming require at least some dynamic checks for type safety at execution time. These checks are implemented as additional instructions on conventional architectures, or as part of the normal instruction execution sequence on more recent architectures as discussed in Moon, "The Architecture of the Symbolics 3600" The 12th Annual International Symposium on Computer Architecture pp. 76-83 (1985). One alternative, declaring the types and relying on the word of the programmer, while leading to good performance on conventional architectures, is dangerous, error-prone, and inappropriate for sophisticated modern programming environments.