The present invention relates to executing software on a parallel, distributed data processing system, and more particularly relates to performing a software operation on one or more nodes of a parallel, distributed data processing system.
U.S. Pat. No. 5,359,730 issued Oct. 25, 1994 to Marron for METHOD OF OPERATING A DATA PROCESSING SYSTEM HAVING A DYNAMIC SOFTWARE UPDATE FACILITY and discloses non-disruptive installation of updated portions of a computer operating system while that operating system continues to run and to support the application load on the system.
U.S. Pat. No. 5,421,009 issued May 30, 1995 to Platt for METHOD OF REMOTELY INSTALLING SOFTWARE DIRECTLY FROM A CENTRAL COMPUTER and deals with remote installation of software on a computer system. Disclosed is a method of installing a client portion of client-server software on client nodes without first manually preparing those client nodes with any type of software such as download software.
U.S. Pat. No. 5,471,617 issued Nov. 28, 1995 to Farrand et al. for COMPUTER MANAGEMENT SYSTEM AND ASSOCIATED MANAGEMENT INFORMATION BASE and discloses a method of managing a plurality of networked manageable devices, with a management information base for use in managing hardware objects.
U.S. Pat. No. 5,555,416 issued Sep. 10, 1996 to Owens et al. for AUTOMATED SOFTWARE INSTALLATION AND OPERATING ENVIRONMENT CONFIGURATION FOR A COMPUTER SYSTEM BASED ON CLASSIFICATION RULES and is directed to remote, automated, rules based installation to automatically install software products on a computer system, and configure the operating environment of the computer system.
AIX NETWORK INSTALLATION MANAGEMENT GUIDE AND REFERENCE, SC23-1926-00, available from International Business Machines Corporation, provides information about managing the installation and configuration of software by using a network interface. Network Installation Management (NIM) enables the centrally managed installation of the AIX base operating system, the IBM version of the UNIX operating system, and optional software on machines within a networked environment.
The installation of operating system software on parallel, distributed computing system hardware is typically a complex and time-consuming procedure. For a modern, full-function operating system such as the AIX operating system, numerous files must be placed on the system. As well, numerous files must be newly created or updated, numerous procedures must execute to successful completion on the involved systems, and other complex functions must be completed. NIM provides the base function to install a single system remotely, that is, without requiring direct interaction with the target system. The IBM Parallel System Support Program (PSSP version 2.1) utilizes NIM to provide parallel, remote installation of multiple systems. PSSP installation provides automated installation of multiple systems from a single point of control. Much of the PSSP function is embodied in a single program which invokes numerous NIM, Kerberos and other PSSP functions to configure the installation server system to prepare it to install its client system(s).
However, due to the complexity of the installation process, the networking requirements of both the master and client (target) systems, and the complexity of configuring the installation server system, the installation of a remote system can fail for any of a large variety of reasons. In particular, because the installation server configuration function is contained within a single program which does not record the various states through which the server has progressed, if the installation fails the server and client systems can be left in such a state as to require significant detailed analysis and manual intervention to restore the systems to their previous states. It is not always possible to correct the initial problem and rerun the program because the various states through which the server has progressed are not recorded. Thus, it takes careful analysis and effort to restore the server to its original state. Even in cases where the server configuration program can be rerun, it consumes unnecessary time and resources to rerun all configuration steps when only the remaining steps need be completed.
The present invention builds upon the existing base of NIM and PSSP version 2.1. The single installation server configuration program is replaced with a collection of single-function programs, each of which performs a single configuration step. The single-function programs are referred to herein as "wrappers". A "wrapper" is a program or script which is "wrapped" around a single function (e.g., a standalone NIM command) and which provides additional state and error checking before and after that single function. This additional checking makes the "wrapper" more suitable for use in automated scripts. Each of these new single-function programs acts independently to ensure prerequisite conditions are met, perform a single configuration step, and leave the installation server in a specific state for subsequent use by a succeeding program. A new "overall" server configuration program called "setup-server" invokes each of these new single-function programs in the correct order. The administrator is now also free to invoke the single-function programs individually, in whatever order is necessary and appropriate, thus making the parallel, remote installation much more flexible and eliminating unnecessary steps.
The modular approach greatly aids remote, parallel installation by:
Making remote parallel installation more flexible. The administrator is now free to invoke only the necessary single-step programs. This makes it much easier to recover from an installation error.
Reducing the effort required to remotely install a single system (node). The new single-function programs allow for specific identification of the server or target system from which to perform the configuration operation. This can result in a significant reduction of time and resources when the administrator needs to install a single remote system or a small number of remote systems.
Reducing overall (re)installation time in the event of an installation failure. After correcting the problem, the administrator can complete the installation by simply rerunning the remaining steps, bypassing the previously completed steps.
Improving reliability. Since each component standalone program performs its own state analysis and error checking, errors are caught sooner and are easier to diagnose and correct.
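The recovery benefit listed above can be illustrated with a minimal Python sketch. It is not the PSSP implementation: the function `run_remaining`, the `completed` set used to remember finished steps, and the three step names are all hypothetical and chosen only for illustration. The sketch assumes each step signals failure by raising an exception.

```python
from typing import Callable, List, Set, Tuple

StepFn = Callable[[], None]


def run_remaining(steps: List[Tuple[str, StepFn]], completed: Set[str]) -> None:
    """Run steps in order, skipping any whose name is already in `completed`.

    A step signals failure by raising; the exception propagates so the
    administrator can correct the problem and call this function again.
    """
    for name, step in steps:
        if name in completed:
            continue  # finished in an earlier run; do not repeat it
        step()
        completed.add(name)  # record only after the step succeeds


def demo() -> List[str]:
    """Simulate one failed run followed by a successful rerun."""
    log: List[str] = []
    fault = {"present": True}  # stands in for a correctable problem

    def define_network() -> None:
        log.append("define_network")

    def create_resources() -> None:
        if fault["present"]:
            raise RuntimeError("create_resources failed")
        log.append("create_resources")

    def define_clients() -> None:
        log.append("define_clients")

    steps = [
        ("define_network", define_network),
        ("create_resources", create_resources),
        ("define_clients", define_clients),
    ]
    completed: Set[str] = set()
    try:
        run_remaining(steps, completed)
    except RuntimeError:
        fault["present"] = False  # the administrator corrects the problem
    run_remaining(steps, completed)  # only the remaining steps execute
    return log
```

In this sketch the first step runs only once even though the sequence is started twice, which corresponds to rerunning only the remaining steps after an installation failure.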
The present invention provides parallel, remote migration where migration is defined as the ability to upgrade the operating system to a later release while preserving user data. Parallel migration is the ability to migrate many nodes simultaneously. Remote migration is the ability to initiate the node migration from any node in the system. By using resources strategically copied throughout a parallel, distributed computer system, the invention allows for wholesale migration of nodes from one release of AIX to another.
The present invention provides modular installation with repeatable, externally-invocable steps. Modular installation refers to the ability to define networks, resources and clients in relatively small software steps. Each step checks for entry conditions, and if met, executes the main body of the step. Upon completion of the main body, it checks for exit conditions. If exit conditions are met, it exits to the user. If not met, it undoes any partially successful steps and exits. Modular installation provides the ability to break up the installation of many workstations into easily repeatable steps. If any step fails, the system is left in an easily correctable state, and the step is ready to be rerun.
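The structure of a single step, as described above, might be sketched in Python as follows. This is an illustration under stated assumptions rather than the actual wrapper code: the `Step` class, the `StepError` exception, the `make_resource_step` helper, and the names stored in the `defined` set are hypothetical, and server state is modeled as a simple set of names instead of real NIM objects.

```python
from dataclasses import dataclass
from typing import Callable, Set


class StepError(Exception):
    """Raised when a step cannot start or does not finish cleanly."""


@dataclass
class Step:
    name: str
    entry_ok: Callable[[Set[str]], bool]   # are prerequisites in place?
    body: Callable[[Set[str]], None]       # the single configuration action
    exit_ok: Callable[[Set[str]], bool]    # did the action fully succeed?
    undo: Callable[[Set[str]], None]       # remove partial results

    def run(self, defined: Set[str]) -> None:
        if not self.entry_ok(defined):
            raise StepError(f"{self.name}: entry conditions not met")
        self.body(defined)
        if not self.exit_ok(defined):
            self.undo(defined)  # leave the state as it was on entry
            raise StepError(f"{self.name}: exit conditions not met")


def make_resource_step(partial_failure: bool = False) -> Step:
    """Build a hypothetical 'define resource' step that needs a network."""

    def body(defined: Set[str]) -> None:
        defined.add("resource.files")
        if not partial_failure:
            defined.add("resource.object")

    def undo(defined: Set[str]) -> None:
        defined.discard("resource.files")
        defined.discard("resource.object")

    return Step(
        name="define_resource",
        entry_ok=lambda d: "network" in d,
        body=body,
        exit_ok=lambda d: {"resource.files", "resource.object"} <= d,
        undo=undo,
    )
```

Because the undo action restores the state seen on entry whenever the exit check fails, a failed step in this sketch can be rerun after the underlying problem is corrected.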
It is thus an object of the present invention to provide a program product recorded on a computer readable medium which includes a method of performing a software operation on a target of one or more processors in a distributed processing system wherein another processor is designated as a server. The method includes running a configuration program on the server to condition the server to serve, on the target, the software operation, which includes resource creation and object definitions; testing entry conditions in the configuration program for determining if entry conditions are met to serve the software operation on the target; if the entry conditions are met, serving the software operation on the target; at the completion of the software operation, testing exit conditions in the configuration program for determining if the software operation on the target completed successfully; if the exit conditions are met, exiting the software operation; returning to the configuration program to serve a second software operation on the target; and repeating until all software operations are served on the target.
It is another object of the invention to provide a program product which includes a method that issues an error message to the server and exits from the software operation if the entry conditions are not met.
It is another object of the present invention to provide a program product which includes a method that undoes any partial resource creation or object definition performed during the software operation, issues an error message to the server, and exits from the software operation if the exit conditions are not met.
The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of the preferred embodiment of the invention as illustrated in the drawings.