1. Field of the Invention
The present invention relates to a resource acquisition system and method for distributed computing resources.
2. Description of the Related Art
As described in a document titled “A Resource Management Architecture for Metacomputing Systems”, Karl Czajkowski et al., Proc. 4th IPPS/SPDP Workshop on Job Scheduling Strategies for Parallel Processing, pp. 62-82, 1998, the prior art resource acquisition system for distributed computational resources is essentially composed of a broker, an information service provider and a number of resource allocation managers located in respective administrative domains, all of which are interconnected by a communications network. The information service provider maintains resource information, which is periodically updated by the resource allocation managers. For the management of its local computing resources, each resource allocation manager is provided with a gatekeeper, a job manager, a local resource manager and a reporter. In resource acquisition, a resource user's data is entered into a client terminal, and a resource acquisition request is sent from the terminal via the network to the broker, asking it for computing resources that satisfy the client's requirements. The broker acquires the necessary information from the information service provider, selects appropriate resources, and sends information indicating the selected resources back to the client.
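The selection step described above can be pictured with a minimal sketch. It is an illustration only, not the implementation of the cited system: the class names `ResourceRecord`, `InformationService` and `Broker`, the attributes `cpus` and `memory_gb`, and the sample server and domain names are all hypothetical and were chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class ResourceRecord:
    # Hypothetical description of one computing resource.
    name: str
    domain: str
    cpus: int
    memory_gb: int


class InformationService:
    """Holds resource records that resource allocation managers refresh."""

    def __init__(self):
        self._records = {}

    def update(self, record):
        # A later update for the same (domain, name) overwrites the old record.
        self._records[(record.domain, record.name)] = record

    def query(self):
        return list(self._records.values())


class Broker:
    """Selects resources that satisfy a client's requirements."""

    def __init__(self, info_service):
        self._info = info_service

    def select(self, min_cpus, min_memory_gb):
        # The broker pulls all records and filters them centrally.
        return [
            r for r in self._info.query()
            if r.cpus >= min_cpus and r.memory_gb >= min_memory_gb
        ]


def demo_broker():
    info = InformationService()
    info.update(ResourceRecord("server-a", "domain-1", 8, 32))
    info.update(ResourceRecord("server-b", "domain-2", 2, 4))
    info.update(ResourceRecord("server-c", "domain-2", 16, 64))
    broker = Broker(info)
    return [r.name for r in broker.select(min_cpus=4, min_memory_gb=16)]
```

In this sketch every record from every domain passes through the single `Broker.select` call, which is the centralized arrangement the prior art relies on.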
Next, the client terminal requests one or more resource allocation managers to perform reconfiguration on the selected resources by specifying a particular job such as “starting up of an application program” or “guaranteeing a network bandwidth”. In each resource allocation manager, the gatekeeper is responsible for receiving the client's job and activating the job manager, which in turn hands it over to the local resource manager to perform the job on the target resources. However, the following shortcomings exist in the prior art system.
First, if a resource reconfiguration involves a consecutive series of operations whose execution is constrained to a particular order by hierarchically dependent relationships (such as between the selection of servers and the selection of routes for linking the servers), the reconfiguration usually takes a long time to complete. If one operation fails, subsequent operations cannot proceed, and a rollback or compensation must be performed on all reconfigured resources to restore them to their original configuration. This represents a substantial waste of time for both users and resource providers.
Therefore, a need exists to allow users to check in advance whether all steps of a reconfiguration can proceed successfully.
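The first shortcoming can be illustrated with a minimal sketch, assuming each operation is modeled as a pair of callables, one that applies the change and one that undoes it. The names `run_with_rollback`, `StepFailed` and the step functions are hypothetical and are not taken from the cited architecture.

```python
class StepFailed(Exception):
    """Raised by a step that cannot be carried out (hypothetical)."""


def run_with_rollback(steps):
    """Run (name, apply, undo) steps in order.

    If a step fails, undo the already completed steps in reverse order.
    Returns (success, log).
    """
    done = []
    log = []
    for name, apply, undo in steps:
        try:
            apply()
        except StepFailed:
            log.append(f"failed:{name}")
            # Work already performed on earlier resources is now wasted.
            for done_name, done_undo in reversed(done):
                done_undo()
                log.append(f"undo:{done_name}")
            return False, log
        done.append((name, undo))
        log.append(f"applied:{name}")
    return True, log


def demo_rollback():
    state = {"servers": False, "routes": False}

    def select_servers():
        state["servers"] = True

    def release_servers():
        state["servers"] = False

    def select_routes():
        # Route selection depends on the servers and fails in this example.
        raise StepFailed("no route with enough bandwidth")

    def release_routes():
        state["routes"] = False

    ok, log = run_with_rollback([
        ("select_servers", select_servers, release_servers),
        ("select_routes", select_routes, release_routes),
    ])
    return ok, log, state
```

Here the server selection succeeds and is then undone only because the dependent route selection fails afterwards; an advance check of all steps would avoid performing the first operation at all.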
Second, if target resources are scattered over different domains and the intended reconfiguration is such that hierarchically dependent relations exist between different domains, a central management entity, such as the broker, would take responsibility for obtaining resource information from all domains. However, the burden placed on the central management entity in selecting resources would become enormous as the distributed computing system grows in size and complexity.
Therefore, there exists a need to allow inter-domain reconfiguration operations to proceed without concentrating processing loads on a single management entity.