Application servers may provide remote clients with access, via a communications network, to applications executing on the application server. An application server is often configured to execute multiple concurrent processes. For example, an application server may be configured to service a plurality of clients simultaneously, provide a plurality of services to a single client, or a combination of these. The ability to execute multiple concurrent processes can be achieved by employing a plurality of internal processing units, each of which may independently execute one or more server applications and independently manage traffic with an external network. The internal processing units may be physically distinct (e.g., separate microprocessors) and/or logically distinct (e.g., virtual machines).
It is desirable that the communications load of an application server be balanced among the internal processing units so that all of the resources available to the application server are used efficiently. When a new connection is created with a client, the connection may be allocated to an available internal processing unit by a load balancing algorithm that keeps the load approximately balanced across all of the internal processing units.
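The allocation step described above can be illustrated with a minimal sketch. It assumes a least-connections policy, which is only one of many possible load balancing algorithms and is not named in the text; the `ProcessingUnit` class and the `allocate_connection` function are hypothetical names introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProcessingUnit:
    """Hypothetical stand-in for one internal processing unit."""
    name: str
    ready: bool = True
    active_connections: int = 0


def allocate_connection(units):
    """Assign a new connection to the ready unit with the fewest connections.

    Least-connections is an assumed policy chosen for this sketch.
    """
    candidates = [u for u in units if u.ready]
    if not candidates:
        raise RuntimeError("no internal processing unit is ready to receive traffic")
    # min() returns the first unit among ties, so ties go to the earliest unit.
    chosen = min(candidates, key=lambda u: u.active_connections)
    chosen.active_connections += 1
    return chosen


# Ten new connections spread over four ready units.
units = [ProcessingUnit(f"unit-{i}") for i in range(4)]
for _ in range(10):
    allocate_connection(units)
counts = [u.active_connections for u in units]
print(counts)  # [3, 3, 2, 2]
```

When every unit is ready, the connection counts differ by at most one, which is the approximate balance the paragraph describes.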
When an application server is initialized (e.g., booted up), the internal processing units often become operational at different times. Because each internal processing unit may be executing different types of processes, each unit may intrinsically require a different initialization period. Additionally, various hardware and software logistical considerations (e.g., bus queues and shared resources) may cause further differences between the times at which the individual internal processing units are ready to receive traffic.
Conventional methods for managing traffic to an application server while the server is initializing begin accepting connections as soon as any internal processing unit is available. Ideally, this management technique would minimize the delay between the system restart and the time at which the application server can begin to accept traffic.
However, when the application server first becomes available (i.e., when the first internal processing unit is ready to receive traffic), many of the internal processing units may still be unavailable. Traffic that would otherwise be handled by the unavailable internal processing units is distributed over the internal processing units that are ready to receive traffic, according to the application server's failover principles. This may result in an unbalanced situation (e.g., a situation where some internal processing units process many transactions in a given period of time while other units process few, if any, transactions during that same period). This uneven distribution can negatively affect system performance.
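The imbalance described above can be made concrete with a small simulation. This is a sketch under stated assumptions, not a model of any particular server: traffic arrives at a constant rate per tick, the server accepts traffic as soon as any unit is ready, and each tick's requests are spread round-robin over whichever units are ready at that tick. The function name `simulate_startup` and all numeric values are hypothetical.

```python
def simulate_startup(ready_times, arrivals_per_tick, ticks):
    """Count requests handled per unit during a staggered startup.

    ready_times[i] is the tick at which unit i becomes ready (assumed values).
    Traffic is accepted as soon as at least one unit is ready, and each tick's
    requests are spread round-robin over the units ready at that tick.
    """
    handled = [0] * len(ready_times)
    for tick in range(ticks):
        ready = [i for i, t in enumerate(ready_times) if t <= tick]
        if not ready:
            continue  # the server is not yet accepting traffic
        for k in range(arrivals_per_tick):
            handled[ready[k % len(ready)]] += 1
    return handled


# Unit 0 is ready immediately, units 1 and 2 at tick 5, unit 3 at tick 9.
handled = simulate_startup(ready_times=[0, 5, 5, 9], arrivals_per_tick=12, ticks=10)
print(handled)  # [79, 19, 19, 3]
```

Over the same ten ticks, the first unit to become ready handles 79 of the 120 requests while the last handles 3, which is the kind of uneven distribution the paragraph refers to.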
This situation is exacerbated because an application server will often restart in response to an overload situation. For example, when traffic is heavy, there is a risk that the system may exhaust its available memory or that latency may exceed acceptable limits. The system will attempt to recover by initiating a system restart. However, the external stimuli (e.g., traffic from clients) may still be too high when some, but not all, of the internal processing units are ready to receive traffic. This high amount of traffic (which was high enough to overtax the entire application server) may be concentrated among the more limited resources of the internal processing units that are first ready to receive traffic after the system restart. Furthermore, the application server will also be executing whatever processing is necessary for restarting the system, including the internal processing units. This high demand situation is likely to cause another system restart.
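The concentration effect behind this restart loop can be shown with simple arithmetic. The sketch below assumes that offered load is split evenly over the ready units and that each unit has a fixed capacity; it ignores the additional cost of the restart processing itself. The capacity and load figures, and the names `per_unit_load` and `would_overload`, are hypothetical.

```python
def per_unit_load(total_load, ready_units):
    """Load carried by each ready unit, assuming an even split."""
    return total_load / ready_units


def would_overload(total_load, ready_units, unit_capacity):
    """True if the even share of the load exceeds one unit's capacity."""
    return per_unit_load(total_load, ready_units) > unit_capacity


total_units = 8          # assumed number of internal processing units
unit_capacity = 100.0    # assumed requests per second one unit can sustain
offered_load = 900.0     # assumed load that overtaxed the full server (8 * 100 = 800)

# Check every stage of the staggered startup, from one ready unit to all eight.
overloaded = {
    n: would_overload(offered_load, n, unit_capacity)
    for n in range(1, total_units + 1)
}
print(per_unit_load(offered_load, 2))            # 450.0
print(per_unit_load(offered_load, total_units))  # 112.5
print(all(overloaded.values()))                  # True
```

With these assumed figures, a load that exceeds the capacity of the full server by about 12 percent exceeds the capacity of two ready units by a factor of 4.5, so the units that come up first are overloaded well before the remaining units can share the traffic.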