This invention relates to an identification apparatus, identification method, and identification program for identifying candidates for the cause of a performance failure in a computer system.
Performance management of a computer system is one of the important roles of an operation department. The operation department (A) detects performance deterioration in a service that the computer system provides to end users, and (B) identifies and deals with the cause of the deterioration.
Technologies with which (A) is accomplished include end user monitoring. This technology involves monitoring processing requests that are issued by end users to the computer system and detecting a delay in the fulfillment of the processing requests. In the case where the computer system is a Web system, packets of HTTP requests sent from end users to the computer system are monitored, and a response time is calculated from a time difference between a processing request packet and its response packet in order to detect whether or not there is a delay.
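The response-time calculation described above can be sketched as follows. This is only an illustrative sketch: the `Exchange` record, its field names, the sample URLs, and the 3-second threshold are assumptions made for the example and do not come from any particular monitoring product.

```python
from dataclasses import dataclass


@dataclass
class Exchange:
    """One HTTP request paired with its response (hypothetical record layout)."""
    url: str
    request_time: float   # capture time of the request packet, in seconds
    response_time: float  # capture time of the response packet, in seconds


def response_seconds(exchange: Exchange) -> float:
    # Response time is the difference between the two packet capture times.
    return exchange.response_time - exchange.request_time


def delayed(exchanges, threshold_seconds):
    # A request counts as delayed when its response time exceeds the threshold.
    return [x for x in exchanges if response_seconds(x) > threshold_seconds]


exchanges = [
    Exchange("/cart", 10.0, 10.25),
    Exchange("/search", 11.0, 14.5),
]
slow = delayed(exchanges, threshold_seconds=3.0)
print([x.url for x in slow])
```

With the sample data, only the `/search` request exceeds the assumed threshold. A real monitor would also have to match each response packet to its request, which this sketch takes as already done.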
The work of identifying the cause of a performance failure in (B) is in most cases performed on databases, which are particularly prone to performance failures. The computer system's operation administrator monitors processing requests (SQL statements) sent to a database, and identifies an SQL statement that is suspected to be the cause of a performance failure from records of the execution times of the monitored SQL statements.
The HTTP request monitoring of (A) and the SQL statement monitoring of (B) are conducted separately and therefore cannot be associated with each other. Consequently, a delay in the processing of HTTP requests that is detected in (A) does not lead to efficient identification of an SQL statement that has caused the delay.
The analysis apparatus of JP 2012-198818 A, on the other hand, calculates the probability that a second pair exists between the request and the response that constitute a first pair, and extracts the second pair that is associated with a given first pair based on the calculated probability. In other words, this analysis apparatus extracts a second pair that corresponds to a given first pair based on probability, without using a model that defines associated pieces of information in advance. Accordingly, when a change in system specifications or the like creates a new association between a first pair and a second pair, the analysis apparatus associates the new first pair and second pair with each other.
In JP 2012-198818 A, the probability of an HTTP request and an SQL statement that are observed independently of each other being observed in the same period (i.e., correlation) is calculated in this manner. When the probability of a particular SQL statement appearing in the processing period of an HTTP request is high, a correlation between the HTTP request and the SQL statement is acknowledged.
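The idea of a co-occurrence probability can be illustrated with a minimal sketch. It is a simplification written for this explanation and is not the calculation used in JP 2012-198818 A. The function name, the record layouts, and the sample data are all assumptions. Here the probability for a pair is taken to be the fraction of an HTTP request's processing periods in which the SQL statement was observed at least once.

```python
from collections import defaultdict


def cooccurrence_probability(http_records, sql_records):
    """Estimate how often each SQL statement appears while each URL is processed.

    http_records: list of (url, start_time, end_time)   -- hypothetical layout
    sql_records:  list of (statement, observed_time)    -- hypothetical layout
    Returns {(url, statement): fraction of that URL's requests whose
    processing period contains at least one observation of the statement}.
    """
    request_counts = defaultdict(int)
    hit_counts = defaultdict(int)
    for url, start, end in http_records:
        request_counts[url] += 1
        # Every SQL record is checked against every request period.
        seen = {stmt for stmt, t in sql_records if start <= t <= end}
        for stmt in seen:
            hit_counts[(url, stmt)] += 1
    return {key: n / request_counts[key[0]] for key, n in hit_counts.items()}


http_records = [("/cart", 0.0, 1.0), ("/cart", 5.0, 6.0), ("/search", 2.0, 3.0)]
sql_records = [
    ("SELECT cart", 0.5),
    ("SELECT cart", 5.5),
    ("SELECT item", 2.5),
    ("SELECT item", 5.2),
]
probabilities = cooccurrence_probability(http_records, sql_records)
for pair, p in sorted(probabilities.items()):
    print(pair, p)
```

In this sample, `SELECT cart` appears in both processing periods of `/cart`, so that pair would be treated as correlated, while `SELECT item` appears in only one of the two. The sketch compares every request with every SQL record, so its cost grows with the product of the two record counts.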
Records of HTTP requests and SQL statements, which are processed at a rate of several tens to several hundreds per second in some cases, amount to a huge volume of data. Calculating the probability for every possible pairing of an HTTP request and an SQL statement therefore requires reading a great amount of data and performing a tremendous amount of calculation. The resulting calculation load makes it difficult to complete the calculation within a realistic time frame. This problem is not limited to the relation between an HTTP request and an SQL statement, and arises between any pieces of data that are observed independently of one another.