1. Field
The present invention relates to a method and apparatus for recording performance parameters from a computer system.
2. Related Art
As electronic commerce becomes more prevalent, businesses are increasingly relying on enterprise computing systems to process ever-larger volumes of electronic transactions. A failure in one of these enterprise computing systems can be disastrous, potentially resulting in millions of dollars of lost business. More importantly, a failure can seriously undermine consumer confidence in a business, making customers less likely to purchase goods and services from the business. Hence, it is important to ensure high availability in such enterprise computing systems.
To achieve high availability, it is necessary to capture unambiguous diagnostic information that can quickly locate faults in hardware or software. If a system performs too little event monitoring, service engineers may be unable to quickly identify the source of a problem that arises at a customer site. This can lead to increased downtime.
Fortunately, high-end computer servers are now equipped with a large number of sensors that measure physical performance parameters such as temperature, voltage, current, vibration, and acoustics. Software-based monitoring mechanisms also monitor software-related performance parameters, such as processor load, memory and cache usage, system throughput, queue lengths, I/O traffic, and quality of service. Typically, special software analyzes the collected performance data and issues alerts when there is an anomaly. In addition, it is important to archive historical performance data to allow long-term monitoring and to facilitate detection of slow system degradation.
One challenge in archiving historical performance data is that a computer typically has limited storage space. If real-time performance data is stored cumulatively, it will eventually fill the assigned storage space. One way to resolve this problem is to use a circular buffer, in which the oldest stored performance data is discarded to make room for newly collected data. However, this approach maintains a historical archive of only the last x days. Performance data from more than x days ago is permanently lost, so it is difficult to know how a system performed before then. Furthermore, it is difficult to maintain a historical archive of past performance data while efficiently using the allocated storage space.
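The discard-oldest buffer described above, and its limitation, can be illustrated with a minimal Python sketch. The class name `RingArchive`, its methods, and the one-sample-per-day temperature readings are hypothetical and chosen only for illustration; they are not taken from any particular monitoring system.

```python
from collections import deque


class RingArchive:
    """Fixed-capacity archive that discards the oldest sample when full.

    Hypothetical illustration of the discard-oldest buffer approach;
    capacity is counted in samples rather than bytes for simplicity.
    """

    def __init__(self, capacity):
        # A deque with maxlen drops items from the left (oldest end)
        # automatically once the capacity is reached.
        self._samples = deque(maxlen=capacity)

    def record(self, day, value):
        """Store one (day, value) sample, evicting the oldest if full."""
        self._samples.append((day, value))

    def oldest(self):
        """Return the oldest retained sample, or None if empty."""
        return self._samples[0] if self._samples else None

    def __len__(self):
        return len(self._samples)


# Assumed workload: one temperature reading per day for five days,
# archived in a buffer that can hold only three samples.
archive = RingArchive(capacity=3)
for day in range(1, 6):
    archive.record(day, 40.0 + day)

# Days 1 and 2 have been overwritten and cannot be recovered.
oldest_day, oldest_value = archive.oldest()
```

With a capacity of three samples, the readings for days 1 and 2 are gone once day 5 is recorded, which is the loss of older history noted above.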
Hence, what is needed is a method and an apparatus for recording performance parameters without the problems described above.