1. Technical Field
This invention generally relates to networked computer systems, and more specifically relates to mechanisms and methods for dealing with failure of a server in a networked computer system.
2. Background Art
The widespread proliferation of computers prompted the development of computer networks that allow computers to communicate with each other. With the introduction of the personal computer (PC), computing became accessible to large numbers of people, and networks were developed to connect personal computers as well.
Computer networks allow computer systems or programs known as “clients” to request information or services from other computer systems or programs known as “servers”. Different types of servers are known in the art. For example, a web server delivers, or “serves”, a web page to a requesting client. An application server hosts software applications that may be invoked by client computer systems or programs. A database server delivers data in response to database requests (or queries) to a database. Note that the labels “web server”, “application server” and “database server” are used in the art to describe a specific function for a server, but these functions are not mutually exclusive. Thus, a single server could perform the functions of a web server, an application server, and a database server.
Servers often need high availability, meaning that multiple servers are provided, and a failure in one server causes fail-over procedures to be followed so that processing continues notwithstanding the failure. In such a scenario, a load balancer is typically used to distribute work to each of the servers. When a server goes down, the load balancer detects the failure and attempts to compensate by routing all of the requests to the remaining, non-failed servers. However, the remaining servers may not be able to handle the additional workload caused by the failure. As a result, the entire system slows down, potentially providing performance that is too slow to meet design objectives or too slow to provide adequate customer response. Without a way to allow server computer systems to better compensate for a failure of one of the servers, the computer industry will continue to suffer from reduced and potentially unacceptable performance when a server fails.
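The fail-over behavior described above can be illustrated with a minimal sketch. It is not taken from any particular load balancer product. The class name RoundRobinBalancer, the helper per_server_load, the server names, and the request counts are all hypothetical, and round-robin routing is assumed only for simplicity.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical round-robin balancer that skips servers marked as failed."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.failed = set()
        self._ring = cycle(self.servers)

    def mark_failed(self, server):
        # Stands in for failure detection, which is not modeled here.
        self.failed.add(server)

    def route(self):
        # Consider each server at most once per request.
        for _ in range(len(self.servers)):
            server = next(self._ring)
            if server not in self.failed:
                return server
        raise RuntimeError("no healthy servers available")


def per_server_load(balancer, requests):
    """Count how many of `requests` requests each server receives."""
    return Counter(balancer.route() for _ in range(requests))


balancer = RoundRobinBalancer(["server_a", "server_b", "server_c"])
before = per_server_load(balancer, 300)  # 100 requests per server

balancer.mark_failed("server_c")
after = per_server_load(balancer, 300)   # 150 requests per remaining server
```

In this sketch, losing one of three servers raises the load on each remaining server by 50 percent. If each server had been sized for, say, 120 requests per interval (an assumed figure), both survivors would now be over capacity, which is the slowdown described above.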