A modern computer system typically comprises a single central processing unit (CPU), and other supporting hardware such as system memory, communications busses, input/output controllers, storage devices, etc. The CPU is the heart of the system. It executes the instructions which comprise a computer program and directs the operation of the other system components.
In the early years of computer development, the CPU was the most expensive part of the system. As a result, systems were constructed around the CPU, to optimize its usage. Multi-tasking systems, capable of serving a number of users performing various tasks simultaneously, were a result of this development history. Multi-tasking allows multiple users and tasks to share the CPU. Although the system may be capable of serving a number of users performing various tasks simultaneously, only one task can be running in the CPU at any instant in time. If a particular task needs the CPU and the CPU is busy, the task must wait. Thus, while multi-tasking permits greater utilization of the CPU, it also means that the CPU is more likely to be a bottleneck to overall system performance.
With the advent of integrated circuits, the cost of processors relative to other system components has declined. As a result, computer systems are being designed with more processors. For example, it has been standard for a number of years to perform certain low level peripheral functions in slave processors, such as disk drive controller processors, workstation controller processors, etc. As the relative cost of such peripheral processors has declined, system designers have expanded their use, reducing the workload burden on the CPU.
In recent years, this availability of inexpensive processors has led to the development of parallel and distributed processing systems, containing multiple processors performing the functions traditionally performed by a single CPU. The processors in such a multi-processor system have separate address spaces, and may have their own storage and their own internal data bus and I/O. The processors may be coupled tightly through a shared bus and shared memory, or more loosely via communication networks or other I/O controllers.
A special case of such a multi-processor system is the use of a numeric-intensive co-processor with a general purpose main processor. The architecture of the numeric-intensive co-processor is optimized for performing applications requiring intensive computation (usually floating point), while the main processor is optimized for handling a typical instruction mix of data moves, compares, I/O, etc.
One of the problems with such multi-processor systems is that most programs designed for execution on a computer system are inherently single-thread. As used herein, "single-thread" means that the program contains a single flow of control, whereby at any instant in time, a single sequence of instructions is executing. Such a sequence may loop or jump to a different point in the code, but it always follows a single path. Such a single-thread program is to be distinguished from multiple threads of control, in which program flow may divide, as a road does at a fork, and proceed down both paths simultaneously. A single-thread program does not adapt well to execution on multiple processors.
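The distinction between a single flow of control and multiple threads of control can be illustrated by the following minimal sketch. It is not part of any described system; the function names `single_thread` and `multi_thread` are hypothetical, and Python's standard `threading` module is used only to show the shape of the control flow (in some Python implementations the two threads may not actually run at the same instant).

```python
import threading


def single_thread(values):
    # One flow of control: each step runs only after the previous one finishes.
    total = 0
    for v in values:
        total += v * v
    return total


def multi_thread(values):
    # Control "forks": two threads each process half of the data, then rejoin.
    half = len(values) // 2
    partial = [0, 0]

    def worker(slot, chunk):
        # Each thread writes only to its own slot, so no locking is needed.
        partial[slot] = sum(v * v for v in chunk)

    threads = [
        threading.Thread(target=worker, args=(0, values[:half])),
        threading.Thread(target=worker, args=(1, values[half:])),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return partial[0] + partial[1]
```

Both functions compute the same result, but the second had to be written explicitly around the fork and join. A program written in the first form does not divide itself in this way.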
Where a single-thread program is to be executed on a multi-processor system containing different types of processors, portions of the program must be allocated to the different processors for execution. One alternative is to re-write single-thread code to support a different flow of control, enabling more effective use of multiple processors. Certain computer languages support such multi-processing, although only a small fraction of existing computer programs are written in these languages. For example, the SIMULA language supports the use of co-routines, which enable multiple simultaneous threads of program execution. However, this solution is not always possible, and even where possible, re-writing existing code tends to be very expensive.
Another method for allocating portions of a program to multiple processors is the client-server model, which is commonly used in distributed processing systems. Each program part executes on some processor (the client). When it needs the services of another processor (the server), which has different capabilities than the client processor, it issues a request that the server do some work on its behalf. The server returns control to the client when done, along with the results of its work if required. The client-server model allows different processors to cooperate in executing a program, but the degree of cooperation is constrained. The client must provide the server with all information that may be needed before the server begins execution, usually before the client knows what information will actually be needed. Existing client-server models are unidirectional in nature; the server lacks capability to issue a call back to the client.
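The unidirectional constraint described above can be illustrated by the following minimal sketch. The `Client` and `Server` classes, the request layout, and the lookup-table workload are all hypothetical and chosen only for illustration; they are not drawn from any particular client-server system.

```python
class Server:
    """Hypothetical server: works only from what the request carries."""

    def handle(self, request):
        # The server holds no reference to the client, so it cannot call
        # back to ask for data that was left out of the request.
        table = request["table"]
        keys = request["keys"]
        return sum(table[k] for k in keys)


class Client:
    """Hypothetical client that owns the data the server needs."""

    def __init__(self, server, table):
        self.server = server
        self.table = table

    def run(self, keys):
        # The client ships the whole table up front, because in general it
        # cannot know in advance which entries the server will read.
        request = {"table": dict(self.table), "keys": list(keys)}
        return self.server.handle(request)
```

In this sketch the server reads only the entries named in `keys`, yet the client must still copy the entire table into the request, since control passes in one direction only.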
It is desirable to allocate different parts of a program to different processors in a multi-processor system without extensive alteration to the code. In particular, in the case of a system having a general purpose main processor and a numeric-intensive co-processor, it is desirable to execute the numeric-intensive procedures on the co-processor, and other procedures on the main processor. Unfortunately, prior art mechanisms restrict the ability of the system to allocate procedures in an optimal fashion.
It is therefore an object of the present invention to provide an enhanced method and apparatus for executing programs on a multi-processor computer system.
Another object of this invention is to provide an enhanced method and apparatus for allocating portions of a program to different processors in a multi-processor computer system.
Another object of this invention is to increase the flexibility of allocating portions of a program to different processors in a multi-processor computer system.
Another object of this invention is to increase the efficiency of processes running on a multi-processor computer system.
Another object of this invention is to reduce the amount of alteration required of a single-thread program to enable it to run efficiently on a multi-processor computer system.
Another object of this invention is to reduce the cost of executing programs on a multi-processor computer system.
Another object of this invention is to provide an enhanced method and apparatus for executing programs on a computer system comprising a general purpose processor and a numeric-intensive co-processor.