Application servers, such as Java 2 Enterprise Edition (J2EE) servers, often fail due to problems caused by unexpected increases in workload, new or unexpected usage patterns, or changes to the applications running on the server itself. These failures often occur because the application server is improperly tuned for such conditions.
Most application server products currently available on the market handle a failure by automatically restarting the failed instance. Restarting the instance with the same tuning values as before often causes the same problems to recur, leading to another failure or a string of failures. This cycle of failing and restarting can sometimes result in a “thrash” condition. In a server farm environment, these failures are often exacerbated because they cascade through the entire farm: when one application server fails, the workload on the remaining servers increases. Those remaining servers in turn begin to fail under the increased volume of traffic they receive until the first instance is restarted. Thus, existing approaches do not adequately address the failure of one or more instances and the concomitant restart of the affected servers, nor do they provide any type of “learning” from the possible patterns of failed instances and restarts.
In view of the foregoing, a need exists to overcome one or more of the deficiencies in the related art.