Various embodiments of the present invention relate to the management and maintenance of a computer application, and more specifically, to a method and apparatus for predicting anomalies and incidents in a computer application.
With the development of computer hardware and software technology, computer applications are able to support many aspects of people's production and daily life. However, as the functions of computer applications grow increasingly complex and their number and variety continue to increase, many problems may arise in managing and maintaining them. For example, multiple computer applications that are independent of each other, or that have a dependence relationship (e.g., a call relationship) among them, might be running on one or more physical devices. How to keep these computer applications in a healthy running state has become a focus of attention.
A common solution is as follows: when an incident occurs in a computer application, users of the computer application discover that the computer application has problems only after a given time interval has elapsed, and these users may then report the problems to a provider of the computer application by telephone, email or other means.
It has been found that anomalies in traffic metrics associated with a computer application might be associated with incidents of the computer application. For example, constant interruptions and re-connections of a network connection might indicate a network adapter incident. Therefore, a problem exists with respect to how to predict potential future anomalies and further discover the causes behind the anomalies (e.g., network adapter incidents, etc.). However, there is currently no method capable of conveniently and accurately predicting anomalies and incidents in a computer application.
Note that users of a computer application typically submit an incident ticket upon discovering a traffic anomaly, but the association relationship between the anomaly and the incident is unclear. For example, users' feedback often lags behind the anomaly; for another example, daily maintenance operations performed by the provider of a computer application (e.g., updating or upgrading application packages, etc.) might cause some traffic anomalies, although these anomalies encountered by users are not caused by incidents; for still another example, errors in manual operations might also confuse the association relationship between anomalies and incidents. Therefore, how to predict anomalies and incidents in a computer application has become a focus of attention.