1. Field of the Invention
The present invention generally relates to data processing and more particularly to a method for process migration based on service availability in a multi-node environment.
2. Description of the Related Art
Powerful computers may be designed as highly parallel systems where the processing activity of hundreds, if not thousands, of processors (CPUs) is coordinated to perform computing tasks. These systems are highly useful for a broad variety of applications, including financial modeling, hydrodynamics, quantum chemistry, astronomy, weather modeling and prediction, geological modeling, prime number factoring, and image processing (e.g., CGI animations and rendering), to name but a few examples.
For example, one family of parallel computing systems has been (and continues to be) developed by International Business Machines (IBM) under the name Blue Gene®. The Blue Gene/L architecture provides a scalable, parallel computer that may be configured with a maximum of 65,536 (2^16) compute nodes. Each compute node includes a single application specific integrated circuit (ASIC) with two CPUs and memory. The Blue Gene/L architecture has been successful, and on Oct. 27, 2005, IBM announced that a Blue Gene/L system had reached an operational speed of 280.6 teraflops (280.6 trillion floating-point operations per second), making it the fastest computer in the world at that time. Further, as of June 2005, Blue Gene/L installations at various sites world-wide accounted for five of the ten most powerful computers in the world.
In a multi-node or highly distributed environment, security can be implemented by assigning tasks or jobs to a certain set of nodes within the cluster. In a massively parallel computing system, like a Blue Gene system, it is often necessary to assign pools of nodes to perform different tasks. For example, a database application might assign a first nodal pool to receive database requests and forward them to a second nodal pool configured to perform manipulation on a data set. Carrying on with this example, a third nodal pool could be tasked with writing any results to the database.
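By way of illustration only, the division of a cluster into task-specific nodal pools may be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any Blue Gene interface: the names NodePool and partition_nodes, the pool names, and the pool sizes are all hypothetical, and nodes are represented simply as integer identifiers.

```python
from dataclasses import dataclass, field


@dataclass
class NodePool:
    """A named group of compute nodes dedicated to one kind of task (hypothetical)."""
    name: str
    task: str
    nodes: list = field(default_factory=list)


def partition_nodes(node_ids, pool_specs):
    """Split node_ids into consecutive, non-overlapping pools.

    pool_specs is a list of (name, task, size) tuples. Raises ValueError
    if the requested sizes exceed the number of available nodes.
    """
    node_ids = list(node_ids)
    needed = sum(size for _, _, size in pool_specs)
    if needed > len(node_ids):
        raise ValueError(f"need {needed} nodes, only {len(node_ids)} available")
    pools, start = {}, 0
    for name, task, size in pool_specs:
        pools[name] = NodePool(name, task, node_ids[start:start + size])
        start += size
    return pools


# Hypothetical 12-node cluster split into the three pools of the example above.
pools = partition_nodes(
    range(12),
    [
        ("intake", "receive database requests", 2),
        ("compute", "manipulate the data set", 8),
        ("writer", "write results to the database", 2),
    ],
)
```

Because each node identifier appears in exactly one pool, a job assigned to a pool is confined to that pool's nodes, which reflects the node-set restriction described above.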