When a program is executed on a multiple-resource system, many resources may be used to run it to completion, and the kinds of resources required depend on the requirements of the program. For example, a program requires processing resources for executing the program and for manipulating data; it also requires memory to store intermediate and final results; and it may require resources of a file system. A program may be constructed so that multiple resources of the same kind can be used in order to speed up the program execution or to handle larger problem sizes and/or larger data sets. The resources used by a program may be allocated at the beginning of program execution or may be allocated during the course of execution just prior to their use. For example, all memory used by a program during the course of execution might be allocated only once at the beginning of program execution, or instead might be allocated during execution just prior to the generation of data and then deallocated when no longer necessary. Resources may be requested by a program explicitly or implicitly. In an explicit allocation, the program explicitly requests specific resources; for example, a program may request a certain amount of memory or may request a specific set of processors prior to commencing the computations. Implicit resource allocation takes place as a result of the use or allocation of some other resource (which may itself have been allocated explicitly or implicitly). An example of implicit allocation is the allocation of additional pages (in a virtual memory environment) as a result of explicit memory allocation by the program. Another example is the implicit allocation of a larger amount of memory to a parallel program when it is run on a larger number of processors in a distributed memory environment.
In the context of a uniprocessor environment, allocation of resources to a sequential program is well defined. Resource allocation requests can be made explicitly by an application program or can be inferred by a compiler from language constructs. A run-time environment or the operating system can make these resources available dynamically either in anticipation or on demand. In a uniprocessor environment resources have well defined boundaries; i.e., the number of resources of each kind is fixed (usually one of each kind) and the size of a particular resource (e.g., real or virtual memory size) is also fixed. Compilers and run-time systems (operating system and extensions of it) can take advantage of these predefined resource boundaries and optimize the execution of the code and/or the utilization of the system resources.
In a parallel execution environment, the word "resource" takes on a broader meaning and calls for a more demanding apparatus for manipulating resources. The kinds of resources encountered are physical resources such as processors, memory, interconnects, devices for I/O and mass storage, visualization and other special purpose instruments, etc. Typically, the resources are shared by multiple parallel applications. Moreover, it is desirable that programs written for such environments be able to adapt to the scalable nature of these parallel environments. Because of these considerations, in a scalable multiprocessor environment, the resource boundaries cannot be fixed at the time applications are written or even at compile time. For example, the number of processors on which an application may be run cannot be fixed a priori, or it may not be desirable to do so in order to realize the flexibility associated with scalable architectures. Furthermore, it has been observed that the data input to an application can have a large impact on the performance of the computations, since concurrency and data distribution are both affected by the particular problem being solved. See J. Saltz, H. Berryman, and J. Wu, "Multiprocessing and Run-Time Compilation", Concurrency: Practice and Experience, vol. 3(6), pp. 573-592, December, 1991. In such cases, the actual resource requirements to solve a problem to completion can be known only after the inputs to the problem are defined, and the utilization of these resources may be determined only during the course of the program execution. When multiprocessor systems are multiprogrammed, a new dimension is added to the scheduling problem as multiple parallel jobs compete dynamically for resources. In some research systems, resources are rearranged during the lifetime of a parallel job, as discussed in C. Polychronopoulos, "Multiprocessing versus Multiprogramming", Proceedings of the 1989 International Conference on Parallel Processing, Aug. 8-12, 1989, pp. II-223-230; A. Gupta, A. Tucker, and L. Stevens, "Making Effective Use of Shared-Memory Multiprocessors: The Process Control Approach", Technical Report CSL-TR-91-475A, Computer Systems Laboratory, Stanford University, 1991; and S. Leutenegger and M. Vernon, "Multiprogrammed Multiprocessor Scheduling Issues", Research Report RC-17642, IBM Research Division, February 1992. In the presence of multiple applications, all vying for the same resources, some form of efficient dynamic scheduling of resources is essential.
The scalable nature of parallel environments requires that an application be able to adapt to a particular configuration of the underlying system whenever it is invoked to solve a particular problem. Not only should the program as a whole be able to reconfigure, but to achieve flexibility and efficiency, the components of a program should be reconfigurable with respect to one another. For example, for any specified level of resources, the program data structures may have to be distributed suitably and the bounds for loops executed by each processor may have to be adjusted accordingly.
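As a minimal sketch of the loop-bound adjustment mentioned above, the following Python fragment computes the range of iterations each processor would execute under a simple block distribution. The function names `block_bounds` and `partition`, and the choice of a block distribution, are assumptions made for illustration; the text does not prescribe a particular distribution scheme.

```python
from typing import List, Tuple


def block_bounds(n_items: int, n_procs: int, rank: int) -> Tuple[int, int]:
    """Return the half-open range [lo, hi) of loop iterations owned by `rank`.

    Hypothetical helper: assumes a contiguous block distribution of
    `n_items` iterations over `n_procs` processors.
    """
    if n_procs <= 0 or not 0 <= rank < n_procs:
        raise ValueError("need n_procs > 0 and 0 <= rank < n_procs")
    base, extra = divmod(n_items, n_procs)
    # The first `extra` processors each take one additional iteration.
    lo = rank * base + min(rank, extra)
    hi = lo + base + (1 if rank < extra else 0)
    return lo, hi


def partition(n_items: int, n_procs: int) -> List[Tuple[int, int]]:
    """Loop bounds for every processor at the given resource level."""
    return [block_bounds(n_items, n_procs, r) for r in range(n_procs)]


# The same 10-iteration loop configured for two different resource levels.
bounds_on_4 = partition(10, 4)
bounds_on_2 = partition(10, 2)
```

Recomputing the bounds when the processor count changes is the simplest form of reconfiguration; redistributing the data that each range refers to would be a separate step not shown here.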
In summary, parallel applications developed for scalable systems with multiple resources have the following important characteristics:
i. Dynamism: Resource requirements change dynamically during the course of computations.
ii. Reconfigurability: Each stage of computations can be designed to operate under multiple levels of resources.
iii. Shareability: Applications often are required to share data and physical resources.
Thus, any resource management system for controlling the resources associated with a parallel environment must have the following characteristics:
i. Dynamism: It should be possible to acquire and release resources dynamically on demand.
ii. Reconfigurability: It should be possible to reconfigure the allocated resources to individual applications.
iii. Shareability: It should be possible to dynamically partition the resources both in space and time.
To realize these characteristics, it is necessary for the end users and for the system itself to monitor the resources allocated to each segment of computations in an application and to steer the program so as to best meet their respective performance goals. In view of the above, a run-time system is necessary to integrate the resource manager and the parallel application in an intelligent manner.
i. A scheme for annotating and instrumenting application program segments such that (a) each segment can operate at multiple resource levels, and (b) the program segment can be reconfigured at run-time to execute with a specified resource level. These annotations and the associated instructions may be generated by a programmer, by a pre-processor, by a compiler, by a library call, by a run-time system, or by a combination of these. All are within the scope of this invention.
ii. A run-time system that monitors the progress of a program and provides an interface to a user and/or to a system-wide resource coordinator at each point in the program at which resource revisions and reconfigurations are feasible.
iii. A run-time system that takes a given allocation of resources during the course of a program execution and reconfigures the data and control structures as dictated by the annotations.
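As an illustration of how a run-time system might sit between a resource coordinator and an annotated application, the following Python sketch runs a sequence of program segments and consults a coordinator at each segment boundary. Everything named here (`Segment`, `choose_level`, `execute`, the `offers` table) is hypothetical and chosen only for this example; it is a sketch under those assumptions, not the mechanism described in this document.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Segment:
    """Hypothetical program segment annotated with its usable resource levels."""
    name: str
    allowed_levels: Sequence[int]   # processor counts the segment can run with
    run: Callable[[int], None]      # executes the segment on n processors


def choose_level(segment: Segment, offered: int) -> int:
    """Pick the largest annotated level that fits within the offered allocation."""
    fitting = [lvl for lvl in segment.allowed_levels if lvl <= offered]
    if not fitting:
        raise RuntimeError(
            f"{segment.name}: no annotated level fits {offered} processors"
        )
    return max(fitting)


def execute(
    segments: Sequence[Segment], coordinator: Callable[[str], int]
) -> List[Tuple[str, int]]:
    """Run segments in order, treating each boundary as a reconfiguration point."""
    history = []
    for seg in segments:
        offered = coordinator(seg.name)      # ask how many processors are available
        level = choose_level(seg, offered)   # reconfigure to an annotated level
        seg.run(level)
        history.append((seg.name, level))
    return history


# Assumed coordinator: a fixed table of processors offered per segment.
offers = {"setup": 2, "solve": 7, "output": 1}
used = []  # records the level each segment actually ran with
segments = [
    Segment("setup", (1, 2, 4), used.append),
    Segment("solve", (1, 2, 4, 8), used.append),
    Segment("output", (1,), used.append),
]
history = execute(segments, offers.__getitem__)
```

In this sketch the "solve" segment is offered 7 processors but is annotated only for 1, 2, 4, or 8, so it runs with 4. A fuller system would also redistribute data structures at each reconfiguration point, which is omitted here.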