Field
The disclosed embodiments relate to techniques for monitoring and analyzing computer systems. More specifically, the disclosed embodiments relate to techniques for performing intelligent inter-process communication latency surveillance and prognostics.
Related Art
As electronic commerce becomes more prevalent, businesses are increasingly relying on enterprise computing systems to process ever-larger volumes of electronic transactions. A failure in one of these enterprise computing systems can be disastrous, potentially resulting in millions of dollars of lost business. More importantly, a failure can seriously undermine consumer confidence in a business, making customers less likely to purchase goods and services from the business. Hence, it is important to ensure reliability and/or high availability in such enterprise computing systems.
Not all failures in computer systems are caused by hardware issues. Software aging in enterprise computing systems may also result in problems such as hangs, crashes, and reduced performance. Such software aging may be caused by resource contention, memory leaks, accumulation of round-off errors, latching in shared memory pools, and/or other sources of software performance degradation. To manage software aging in complex enterprise computing systems, a multivariate pattern-recognition technique may be applied to performance parameters collected from the enterprise computing systems to trigger software rejuvenation when software aging is detected. Such proactive prediction and management of software aging is described in U.S. Pat. No. 7,100,079 (issued 29 Aug. 2006), by inventors Kenny C. Gross and Kishore S. Trivedi, entitled “Method and Apparatus for Using Pattern Recognition to Trigger Software Rejuvenation.”