The growth of network services, for example Internet services or intranet services, has made significant demands on the availability and performance of Internet and intranet sites and the computer servers supporting the sites. Growth in the demands is related to increasing numbers of users, increasing complexity of applications, and increasing demands for better service. To address performance and reliability issues associated with the growth in demand, the sites use one or more switches to assign requests from multiple users to multiple servers.
Users access the network services using a client having a browser. The browser provides a user interface between the user and the client, and the sites. Typically, a user is permitted to run a single business session (e.g., a shopping cart) on a single browser. If the user wants to run a new business session (e.g., a new shopping cart), the user typically needs to end the current business session on the browser and start a new business session on the browser. The user may also run the new business session by opening a new browser.
Some software applications support running multiple, concurrent, business sessions on a single browser. A challenge in implementing these applications is determining how to assign each business session to one of the multiple servers. The server holds state information (otherwise referred to as “stateful information”) related to user requests for one or more business sessions on behalf of the client. Executing stateful business sessions on more than one server can cause the servers to fail to retrieve the correct information, since the desired information might reside on a different server. For example, when running a shopping cart business session on two servers, each server may only have part of the orders in the shopping cart.
Prior systems implemented server assignments at different levels by using different methods, such as those based on an internet protocol (IP) address, a session cookie, and a uniform resource locator (URL) session identification (ID).
The IP address method provides assignment of a server at the client level. A content switch balances the load depending on different IP addresses (and/or port number) of a client. When each client has an independent, different IP address, the load can be balanced among the servers and the business sessions from the same client can be assigned to the same server.
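As a minimal sketch of the IP address method, the following Python fragment hashes the client IP address to select a server, so that every request from one address reaches the same server. The server names and the `assign_by_ip` function are hypothetical, and the use of SHA-256 is an assumption made for illustration; the prior systems described here do not specify a particular hashing scheme.

```python
import hashlib

# Hypothetical server pool; the names are placeholders for illustration.
SERVERS = ["server-a", "server-b", "server-c"]


def assign_by_ip(client_ip: str, servers=SERVERS) -> str:
    """Map a client IP address to one server using a stable hash."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer and reduce it
    # modulo the pool size to obtain a server index.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```

Because the mapping in this sketch depends only on the client address, all business sessions from one address are assigned to one server, and clients that share an address are assigned together.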
The session cookie method provides assignment of a server at the browser or user level. The session cookie is an identifier passed together with a client request to a server to identify a session and a corresponding request. With the session cookie, the server can know which session the request is from. The content switch detects the session cookie from a user's browser and assigns (i.e., “sticks”) the requests from the same Hyper Text Transfer Protocol (HTTP) session to a server. If the cookie timeout is not set, the session cookie will be available until a user closes a browser. Thus, the requests from the newly opened browser can be re-distributed among the servers. If the cookie timeout is set, when this user session ends, the HTTP requests from the browser are re-distributed.
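The sticking behavior of the session cookie method can be sketched as follows. This is an illustrative model only: the `CookieStickySwitch` class, its method names, and the round-robin choice for a cookie that has not been seen before are assumptions, not a description of any particular content switch.

```python
import itertools


class CookieStickySwitch:
    """Toy content switch that sticks each session cookie to one server."""

    def __init__(self, servers):
        self._next_server = itertools.cycle(servers)  # assumed round-robin choice
        self._sticky = {}  # session cookie value -> assigned server

    def route(self, session_cookie: str) -> str:
        """Return the server for a request carrying this session cookie."""
        if session_cookie not in self._sticky:
            # First request of this HTTP session: pick the next server.
            self._sticky[session_cookie] = next(self._next_server)
        return self._sticky[session_cookie]

    def expire(self, session_cookie: str) -> None:
        """Forget a cookie, e.g. after a timeout or after the browser closes."""
        self._sticky.pop(session_cookie, None)
```

After `expire` is called, a later request from the same browser is treated as new and may be re-distributed to any server.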
The URL session ID method provides assignment of a server by using a business session ID as a parameter of the URL. This method may involve having a dedicated server that generates the business session IDs and assigns a business session ID for each new business session. Hence, a client requests a new business session ID before starting each new business session, which generates additional communication between the client and the server. The client that has requested to start the business process receives the business session ID, and includes the business session ID as a parameter in the URLs that start the business session or make subsequent requests. The content switch assigns these requests to a server based upon the evaluation of the business session ID by a sorting method in the content switch.
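The evaluation of a business session ID carried in the URL might look like the sketch below. The query parameter name `bsid`, the function name, and the use of CRC-32 as the sorting method are all assumptions introduced for illustration; the source does not name the parameter or the sorting method.

```python
import zlib
from urllib.parse import parse_qs, urlparse


def assign_by_url_session_id(url, servers, param="bsid"):
    """Pick a server from the business session ID in the URL query string.

    Returns None when the URL carries no business session ID.
    """
    query = parse_qs(urlparse(url).query)
    values = query.get(param)
    if not values:
        return None
    # Assumed sorting method: CRC-32 of the ID, reduced modulo the pool size.
    index = zlib.crc32(values[0].encode("utf-8")) % len(servers)
    return servers[index]
```

Two requests that carry the same business session ID are assigned to the same server regardless of their paths or other parameters.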
Load balancing permits the network load to be distributed dynamically and efficiently to each of multiple network service servers according to its status. Since loads are balanced based upon information from the clients or users, the load may not be evenly distributed.
In recent years, as network services have increased with the rapid spread of the Internet and intranets, demand has increased for more efficient utilization of client-server systems and for more stable server services. In particular, there is a demand for an environment that avoids concentrated access to a single World Wide Web (WWW) server and in which failures are hidden. For this reason, some systems provide two or more servers (or nodes) to perform one service (e.g., file transfer protocol (FTP), HTTP, telnet, or the like).
To provide stable services, the services must be distributed suitably among the servers. At the same time, network services have become increasingly diversified, complicated, and advanced, and changes to the configuration of a server group and to the service distribution method have become more frequent. Demand has also increased for avoiding the termination of services when some servers go down unexpectedly. Existing techniques for distributing services to multiple servers include Round-robin Domain Name System (DNS), load distribution hardware, and an agent.
In the Round-robin DNS service, an entry table is set up in which the Internet Protocol (IP) addresses of multiple servers are mapped to one domain name. When a client makes an inquiry about a server IP address, servers are allocated to the client on a round-robin basis. According to the entry table and the IP addresses of the allocated servers, servers are presented to the client so that services are distributed among multiple servers. However, in the Round-robin DNS service, services are distributed to servers equally or at simple fixed ratios, and each server has to perform the services allocated to it irrespective of its capabilities and dynamic load conditions. This produces differences in load condition among the servers, resulting in reduced efficiency of the whole system. Further, when a server goes down, the server group configuration has to be changed manually to delete that server from the entry table. This change is made each time a server goes down. It is therefore difficult to cope with such a situation immediately. As a result, the whole system may have to be stopped temporarily.
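The entry table and round-robin allocation described above can be modeled in a few lines. This is a simplified sketch rather than a DNS implementation: the `RoundRobinDNS` class and its methods are hypothetical, and the addresses used with it are documentation-range placeholders.

```python
from collections import deque


class RoundRobinDNS:
    """Toy entry table mapping one domain name to several server IP addresses."""

    def __init__(self):
        self._table = {}  # domain name -> deque of server IP addresses

    def add_entry(self, domain, ips):
        self._table[domain] = deque(ips)

    def resolve(self, domain):
        """Return the next IP address for the domain in round-robin order."""
        ips = self._table[domain]
        ip = ips[0]
        ips.rotate(-1)  # move the returned address to the back of the queue
        return ip

    def remove_server(self, domain, ip):
        """Manual step needed when a server goes down."""
        self._table[domain].remove(ip)
```

Note that `resolve` never consults server capability or load, and a failed server keeps being returned until `remove_server` is called by hand, which mirrors the two drawbacks described above.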
In using load distribution hardware, a hardware device is placed between a server group and a network to relay communications between clients and servers. Load-measuring communications are exchanged between the hardware device and each server. Packets to be relayed are monitored to measure the number of connections to each server and its response time, thereby detecting the load condition of each server and distributing services to the servers accordingly. However, the hardware has high implementation costs. The employment of this system is limited because the hardware is not incorporated into each server. In addition, since load-measurement communications are needed with each server, extra load, separate from the original communications, is imposed on each server, which further increases traffic and may cause servers to go down. Furthermore, since the load is measured on a packet-by-packet basis, the servers may be switched even in mid-service, causing errors to occur.
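One way a relay device could use the measured connection counts is a least-connections policy, sketched below in software. The `LeastConnectionsRelay` class is hypothetical and considers only the number of open connections; the response-time measurement mentioned above is omitted for brevity, and the source does not state which selection policy such hardware uses.

```python
class LeastConnectionsRelay:
    """Toy relay that sends each new connection to the least-loaded server."""

    def __init__(self, servers):
        # Number of currently open connections per server.
        self.active = {server: 0 for server in servers}

    def open_connection(self):
        """Choose the server with the fewest open connections (first wins ties)."""
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def close_connection(self, server):
        """Record that a connection to the given server has ended."""
        self.active[server] -= 1
```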
In the agent technique, an agent residing on each server in a server group measures the load on the server's central processing unit (CPU) and its disk utilization to determine the server's load condition. The load distribution system is notified of the load condition of each server and distributes services to the servers accordingly. However, since the agent function resides on each server, the server has to be modified at the time the agent is installed. The agent must also be compatible with the server's operating system (OS). The load measurement is made on each server, resulting in an increase in the load on the server. Since the load is measured on a packet-by-packet basis, the servers may be switched even in mid-service, causing errors to occur, as with the hardware device.
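The agent technique can be illustrated by a distributor that receives one load report per server and picks the least-loaded one. The `LoadReport` structure, the `pick_server` function, and the 0.7/0.3 weighting of CPU load against disk utilization are assumptions chosen for illustration; the source does not say how the two measurements are combined.

```python
from dataclasses import dataclass


@dataclass
class LoadReport:
    """Load condition reported by the agent on one server."""
    server: str
    cpu: float   # CPU load as a fraction between 0.0 and 1.0
    disk: float  # disk utilization as a fraction between 0.0 and 1.0


def pick_server(reports, cpu_weight=0.7, disk_weight=0.3):
    """Return the name of the server with the lowest weighted load score."""
    def score(report):
        return cpu_weight * report.cpu + disk_weight * report.disk

    return min(reports, key=score).server
```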
Draining a server involves gradually clearing the processing of users' requests on the server for service maintenance. Terminating the processing of users' requests on the server interrupts user applications. Draining a server in a user-based, load-balancing environment can cause existing business sessions to be interrupted. Interrupted users may have to log in again and restart their business sessions, which can lead to the loss of previously entered data.
Servers may be drained by stopping servers from accepting new HTTP connections, while the servers continue processing the requests from existing HTTP connections for a predetermined time, or by removing a server from a content switch rule that starts a business session.
In the first method, a server is stopped from accepting any new HTTP connections for a predetermined time, such as twenty minutes, which is a default time-out for a session cookie. After the predetermined period expires, the server is suspended from service. For example, Microsoft® Application Center 2000 uses this method. This method works for stateless web applications that can execute properly regardless of the application state on the server. However, the first method may cause a user application to be interrupted under any one of the following three circumstances. In a first circumstance, a stateful business session running on the server might be interrupted, because the existing HTTP connection for the session might be closed before the session ends. The server load or other external factors (e.g., the server is configured not to use persistent HTTP connections) may cause the connection to be closed before the session ends. In a second circumstance, a user of an application uses one Microsoft® active server page (ASP) session to create application-specific stateful sessions. Since an application session might be created during an ASP session and might run longer than the ASP session timeout, the HTTP connection might be closed before the application business session ends. In a third circumstance, a client device or a load-balancing device forces a close of the HTTP connection before the business session ends. This occurs when the load-balancing device is configured to check certain HTTP requests.
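The first draining method can be sketched as a server that refuses new connections once draining starts and stops serving existing ones when the predetermined period expires. The `DrainingServer` class and its method names are hypothetical, and time is passed in explicitly as seconds so the behavior is easy to follow; this is not the behavior of any specific product.

```python
class DrainingServer:
    """Toy server that drains by refusing new HTTP connections."""

    def __init__(self, drain_seconds=20 * 60):
        self.drain_seconds = drain_seconds  # predetermined period, e.g. twenty minutes
        self.drain_started_at = None        # None means the server is not draining
        self.connections = set()            # IDs of open HTTP connections

    def start_drain(self, now):
        self.drain_started_at = now

    def is_suspended(self, now):
        """True once the predetermined period has expired."""
        return (self.drain_started_at is not None
                and now >= self.drain_started_at + self.drain_seconds)

    def accept(self, conn_id):
        """Accept a new connection only if draining has not started."""
        if self.drain_started_at is not None:
            return False
        self.connections.add(conn_id)
        return True

    def handle(self, conn_id, now):
        """Serve a request on an existing connection until suspension."""
        if self.is_suspended(now):
            self.connections.clear()  # any session still running is cut off
            return False
        return conn_id in self.connections
```

In this sketch, a stateful business session that is still active when the period expires loses its connection, and a session whose connection closes early cannot reconnect, which corresponds to the interruptions described above.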
In the second method, a system/network administrator accesses a content service switch device and modifies its rules of operation to remove a server from the content switch rule that starts the business session. The removed server does not start any new business session. Typically, the group of people maintaining the content switch is different from the group maintaining the application servers. Therefore, this second method requires modification of the content switch and may cause coordination problems among different maintenance groups.
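The second method can be modeled as a content switch that keeps a separate rule listing the servers allowed to start new business sessions. The `ContentSwitch` class, its method names, and the round-robin choice among eligible servers are assumptions for illustration and do not describe a specific device.

```python
class ContentSwitch:
    """Toy content switch with a rule that governs where new sessions start."""

    def __init__(self, servers):
        self.start_rule = list(servers)  # servers allowed to start new sessions
        self.sessions = {}               # business session ID -> assigned server
        self._starts = 0                 # count of sessions started so far

    def remove_from_start_rule(self, server):
        """Administrator action: stop starting new sessions on this server."""
        self.start_rule.remove(server)

    def route(self, session_id):
        """Keep existing sessions on their server; start new ones per the rule."""
        if session_id not in self.sessions:
            if not self.start_rule:
                raise RuntimeError("no server available to start a session")
            server = self.start_rule[self._starts % len(self.start_rule)]
            self._starts += 1
            self.sessions[session_id] = server
        return self.sessions[session_id]

    def end_session(self, session_id):
        self.sessions.pop(session_id, None)

    def is_drained(self, server):
        """True when the server starts no new sessions and holds no active ones."""
        return (server not in self.start_rule
                and server not in self.sessions.values())
```

Existing sessions continue on the removed server until they end, but the change has to be made on the switch itself, which is the coordination burden noted above.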
It would be desirable to have a system that drains servers holding stateful data before removing them from a service pool. Accordingly, there is a need for a system that progressively reduces server workload to support server maintenance and that overcomes these and other disadvantages of the prior systems.