1. Statement of the Technical Field
The present invention relates to the field of distributed computing, including Web services and grid services, and more particularly to grid service failover.
2. Description of the Related Art
Web services represent the leading edge of distributed computing and are viewed as the foundation for developing a truly universal model for supporting the rapid development of component-based applications over the World Wide Web. Web services are known in the art to include a stack of emerging standards that describe a service-oriented, component-based application architecture. Specifically, Web services are loosely coupled, reusable software components that semantically encapsulate discrete functionality and are distributed and programmatically accessible over standard Internet protocols.
Conceptually, Web services represent a model in which discrete tasks within processes are distributed widely throughout a value net. Notably, many industry experts consider the service-oriented Web services initiative to be the next evolutionary phase of the Internet. Typically, Web services can be defined by an interface expressed in a language such as the Web Services Description Language (WSDL), and can be implemented according to the interface, though the implementation details matter little so long as the implementation conforms to the Web services interface. Once a Web service has been implemented according to a corresponding interface, the implementation can be registered with a Web services registry, such as Universal Description, Discovery and Integration (UDDI), as is well known in the art. Upon registration, the Web service can be accessed by a service requestor through the use of any supporting messaging protocol, including, for example, the Simple Object Access Protocol (SOAP).
In a service-oriented application environment supporting Web services, locating reliable services and integrating those reliable services dynamically in real time to meet the objectives of an application has proven problematic. While registries, directories and discovery protocols provide a base structure for implementing service detection and service-to-service interconnection logic, registries, directories, and discovery protocols alone are not suitable for distributed interoperability. Rather, a more structured, formalized mechanism can be necessary to facilitate the distribution of Web services in the formation of a unified application.
Notably, the physiology of a grid mechanism through the Open Grid Services Architecture (OGSA) can provide protocols both for the discovery and for the binding of Web services, hereinafter referred to as “grid services”, across distributed systems in a manner which would otherwise not be possible through the exclusive use of registries, directories and discovery protocols. As described both in Ian Foster, Carl Kesselman, and Steven Tuecke, The Anatomy of the Grid, Intl J. Supercomputer Applications (2001), and also in Ian Foster, Carl Kesselman, Jeffrey M. Nick and Steven Tuecke, The Physiology of the Grid, Globus.org (Jun. 22, 2002), a grid mechanism can provide distributed computing infrastructure through which grid service instances can be created, named and discovered by requesting clients.
Grid services extend mere Web services by providing enhanced resource sharing and scheduling support, support for long-lived state commonly required by sophisticated distributed applications, as well as support for inter-enterprise collaborations. Moreover, while Web services alone address discovery and invocation of persistent services, grid services support transient service instances which can be created and destroyed dynamically. Notable benefits of using grid services can include a reduced cost of ownership of information technology due to the more efficient utilization of computing resources, and an improvement in the ease of integrating various computing components. Thus, the grid mechanism, and in particular, a grid mechanism which conforms to the OGSA, can implement a service-oriented architecture through which a basis for distributed system integration can be provided—even across organizational domains.
Within the services grid, a service providing infrastructure can provide processing resources for hosting the execution of distributed services such as grid services. The service providing infrastructure can include a set of resources, including server computing devices, storage systems, including direct attached storage, network attached storage and storage area networks, processing and communications bandwidth, and the like. Individual transactions processed within the service providing infrastructure can consume a different mix of these resources.
Notably, the OGSA defines an architecture in which service instances can be deployed to one or more varying locations within the services grid. Correspondingly, client requests to access an instance of a specific service can be routed to what is considered to be the optimal instance of the specific service for the request. To that end, individual service instances can be replicated to different nodes in the services grid in a strategic manner based upon optimization criteria. The optimization criteria typically can resolve to nodes having access to specific resources, nodes having service instances which have been co-located with other important service instances, locality with respect to a particular client, and the like.
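By way of illustration only, the routing of a request to the instance considered optimal under such criteria can be sketched in Python as follows. The `ServiceInstance` class, the `score` and `select_instance` functions, and the numeric weights are hypothetical names and values chosen for this sketch; they are not defined by the OGSA or by any grid toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceInstance:
    """Hypothetical description of one replicated service instance."""
    node: str
    has_required_resources: bool  # node can reach the resources the request needs
    colocated_services: int       # count of related service instances on the same node
    network_distance: int         # hops between the node and the client; lower is closer


def score(instance: ServiceInstance) -> int:
    """Combine the three criteria into one number (weights are assumptions)."""
    resource_term = 10 if instance.has_required_resources else 0
    colocation_term = 2 * instance.colocated_services
    return resource_term + colocation_term - instance.network_distance


def select_instance(instances: list[ServiceInstance]) -> ServiceInstance:
    """Return the highest-scoring instance for a request."""
    if not instances:
        raise ValueError("no instances of the requested service are deployed")
    return max(instances, key=score)


candidates = [
    ServiceInstance("node-a", True, 1, 5),
    ServiceInstance("node-b", False, 3, 1),
    ServiceInstance("node-c", True, 2, 2),
]
chosen = select_instance(candidates)
```

In this sketch a single weighted sum stands in for whatever optimization policy a given services grid actually applies; a real deployment could weigh the criteria differently or add others.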
When a service instance or a node hosting a service instance in the services grid fails, for whatever reason, a failover strategy can become a critical aspect of the operation of the services grid. In this regard, it can be imperative that, when a failure has been detected in a node or service instance in the services grid, subsequent requests to access service functions within service instances in the failed node are re-routed elsewhere in the services grid to other instances of the desired service. Importantly, such re-routing must occur transparently so as not to disturb the virtual organization aspect of the services grid. Still, while failover re-routing is known in the art, little attention has been paid to the re-deployment of a failed service instance in the services grid.
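The conventional failover re-routing described above can be sketched, for illustration only, in Python as follows. The `route_request` function, the `NoHealthyInstanceError` exception, and the dictionary used as an instance registry are hypothetical constructs assumed for this sketch and do not correspond to any particular grid implementation. The sketch covers re-routing only and does not re-deploy the failed instance.

```python
class NoHealthyInstanceError(Exception):
    """Raised when every node hosting the requested service has failed."""


def route_request(service_name, instances_by_service, failed_nodes):
    """Return the first node hosting the service that is not marked as failed.

    instances_by_service: dict mapping a service name to an ordered list of
        node names hosting an instance of that service (assumed structure).
    failed_nodes: set of node names in which a failure has been detected.
    """
    for node in instances_by_service.get(service_name, []):
        if node not in failed_nodes:
            return node
    raise NoHealthyInstanceError(service_name)


registry = {"pricing": ["node-a", "node-b"]}

# Before any failure, requests go to the first listed node.
before_failure = route_request("pricing", registry, set())

# After node-a is detected as failed, the same call is re-routed to node-b.
after_failure = route_request("pricing", registry, {"node-a"})
```

Because the caller issues the same call in both cases and receives only a node to contact, the re-routing remains transparent to the requesting client in this sketch.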