In a well-known recent operation and maintenance system for an IT system, for example, failure cases including the symptoms and causes of failures are managed and used to search for the cause corresponding to a symptom, and possible causes are presented as the search results. In addition, ITIL (IT Infrastructure Library), produced by the UK Office of Government Commerce in the late 1980s, has recently come into use in the technical field of operation and maintenance. With ITIL, there is a demand to reduce the time from the occurrence of a failure until its restoration so that service quality is improved and cost is reduced.
A conventional failure case search system will be described next. FIG. 18 is a diagram illustrating an exemplary conventional failure case search system. A failure case search system 300 illustrated in FIG. 18 is connected to a database 301 in which failure cases, including their related symptoms and causes, have been registered. An administrator accesses the failure case search system 300 using a terminal 302 and inputs a keyword describing the symptom of a current failure. The failure case search system 300 searches the database 301 for the causes of the failure cases corresponding to the keyword and then displays the search results as a list of possible causes on the display screen of the terminal 302.
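The keyword search described above can be sketched as follows. This is only an illustrative sketch: the `FailureCase` record, the `search_causes` function, and the sample cases are hypothetical and are not taken from the conventional system 300, and a simple case-insensitive substring match is assumed in place of whatever matching the actual system performs.

```python
from dataclasses import dataclass


@dataclass
class FailureCase:
    # Hypothetical record layout: a symptom and its cause, as stored in database 301.
    case_id: str
    symptom: str
    cause: str


def search_causes(cases, keyword):
    """Return the causes of all cases whose symptom text contains the keyword."""
    needle = keyword.lower()
    return [case.cause for case in cases if needle in case.symptom.lower()]


# Illustrative data only; these cases are assumptions, not registered failure cases.
cases = [
    FailureCase("001", "Web page response is slow", "real time scan by virus scanning software"),
    FailureCase("002", "Web server returns error 500", "invalid definition file"),
    FailureCase("003", "Batch job terminates abnormally", "insufficient disk space"),
]

possible_causes = search_causes(cases, "web")    # general keyword: several hits
no_causes = search_causes(cases, "kernel panic")  # specific keyword: no hits
```

In this sketch a general keyword such as "web" returns every case that mentions it, while a keyword that does not appear in any free-format symptom text returns nothing.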
The administrator consults the list of possible causes on the display screen to recognize the cause of a failure case similar to the current failure case and can thereby rapidly recover from the failure according to the recognized cause.
However, in the conventional failure case search system 300, the administrator, for example, creates the failure cases in a free format. Therefore, if the keyword used to search for a failure case is too specific, the number of hits for similar failure cases is significantly low, so that these failure cases cannot be narrowed down. Conversely, if the keyword used to search for a failure case is too general, the number of hits for similar failure cases is large, so that again these failure cases cannot be narrowed down. In either case, since the possible failure cases cannot be narrowed down in the failure case search system 300, the cause of the current failure case cannot be identified, and recovery from the failure takes a long time.
In one known failure cause estimation device, the causes of failure cases similar to a currently processed failure case are estimated using a decision tree created from reports describing the examination results for past failure cases without searching the failure cases themselves, and the estimated causes are presented.
FIG. 19 is a diagram illustrating the operation in a failure case registration stage in the above conventional failure cause estimation device, and FIG. 20 is a diagram illustrating an example of the failure case used in the conventional failure cause estimation device. In FIG. 19, upon reception of input of a report including the examination results for a failure case from a user such as the administrator, a report registration unit 401 of the failure cause estimation device registers the report in a failure case database (hereinafter referred to simply as DB) 402 as a failure case. An ID 410A for identifying the failure case, a question 410B about the symptom etc., a reply 410C (for example, the cause) to the question 410B, exchange details 410D including the examination results for the question-reply exchange, and other information are described in the failure case 410.
FIG. 21 is a diagram illustrating the operation in a failure case learning stage in the conventional failure cause estimation device, and FIG. 22 is a diagram illustrating a series of operations until a decision tree is created in the failure case learning stage. A morphological analysis unit 403 of the failure cause estimation device divides text contained in each of failure cases 410 registered in the failure case DB 402 into word units. Then an important word extraction unit 404 extracts important words from the words in the text divided by the morphological analysis unit 403 using, for example, TF-IDF (Term Frequency, Inverse Document Frequency) or a dictionary for collected information.
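The word division and TF-IDF extraction described above can be illustrated with a minimal sketch. Several assumptions are made here: a regular-expression tokenizer stands in for the morphological analysis unit 403, the TF-IDF variant (relative term frequency multiplied by the natural logarithm of the inverse document frequency) and the threshold value are chosen only for illustration, and the function names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Stand-in for morphological analysis: split text into lowercase word units.
    return re.findall(r"[a-z0-9]+", text.lower())


def important_words(docs, threshold):
    """Return, for each document, the set of words whose TF-IDF exceeds the threshold."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents in which each word appears.
    df = Counter()
    for words in tokenized:
        df.update(set(words))
    result = []
    for words in tokenized:
        tf = Counter(words)
        scores = {w: (tf[w] / len(words)) * math.log(n_docs / df[w]) for w in tf}
        result.append({w for w, score in scores.items() if score > threshold})
    return result


# Illustrative texts only (assumed, not taken from the failure case DB 402).
docs = [
    "the web page is slow",
    "the web server returns an error",
    "the batch job is slow",
]
extracted = important_words(docs, threshold=0.1)
```

With this assumed threshold, words that appear in several documents, such as "web" and "slow", score below the cutoff and are not extracted, which shows how a rareness threshold can leave out words that would otherwise be useful.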
Then a decision tree construction unit 405 constructs, using the important words extracted by the important word extraction unit 404 and the causes of the failure cases, a decision tree 405A (see FIG. 22) in which the failure cases are organized in an optimal manner. In the decision tree 405A, higher accuracy can be achieved when the collected information described in each failure case is used as branch conditions. However, the dictionary for collected information is not always provided. Therefore, when TF-IDF instead of the dictionary is used for extraction, the important words are extracted using a threshold value set for the rareness of each word, so that words in the collected information are not necessarily extracted. The collected information is, for example, an error message log, such as a system log or a software log, acquired upon request, a definition file for environment setting for operating software, or a dump outputted when abnormal termination occurs. Then the decision tree construction unit 405 registers the constructed decision tree 405A in a decision tree DB 406. The decision tree construction unit 405 automatically creates the decision tree 405A at predetermined timing.
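A decision tree of this kind could be built, for example, as in the following sketch. The source does not state which construction algorithm the decision tree construction unit 405 uses, so an ID3-style split on information gain is assumed here purely for illustration. The function names, the nested-dictionary tree layout, and all causes other than "real time scan by virus scanning software" are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def build_tree(cases, words):
    """Build a tree from (set_of_important_words, cause) pairs.

    Internal nodes are dicts that test for the presence of one word;
    leaves are cause strings.
    """
    causes = [cause for _, cause in cases]
    majority = Counter(causes).most_common(1)[0][0]
    if len(set(causes)) == 1 or not words:
        return majority

    def gain(word):
        yes = [c for w, c in cases if word in w]
        no = [c for w, c in cases if word not in w]
        if not yes or not no:
            return 0.0
        remainder = (len(yes) * entropy(yes) + len(no) * entropy(no)) / len(cases)
        return entropy(causes) - remainder

    # Sort the candidates so that ties are broken deterministically.
    best = max(sorted(words), key=gain)
    if gain(best) <= 1e-12:
        return majority
    rest = words - {best}
    return {
        "word": best,
        "yes": build_tree([(w, c) for w, c in cases if best in w], rest),
        "no": build_tree([(w, c) for w, c in cases if best not in w], rest),
    }


# Illustrative cases only; the important words and most causes are assumptions.
cases = [
    ({"web", "slow"}, "real time scan by virus scanning software"),
    ({"web", "error"}, "invalid definition file"),
    ({"batch", "slow"}, "insufficient memory"),
    ({"batch", "error"}, "disk full"),
]
tree = build_tree(cases, {"web", "slow", "batch", "error"})
```

Each internal node uses the presence or absence of one important word as its branch condition, and each leaf holds the cause of the failure cases that reach it.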
FIG. 23 is a diagram illustrating the operation in a cause estimation stage in the conventional failure cause estimation device, and FIG. 24 is a diagram illustrating a series of operations in the cause estimation stage until the cause is estimated. When no description of cause 410C is given in the failure case 410 as illustrated in FIG. 24, the failure cause estimation device presents the cause of a similar failure case to the user.
Upon reception of input of a failure case the cause of which is to be estimated, the morphological analysis unit 403 of the failure cause estimation device divides the text of the failure case into word units. Then the important word extraction unit 404 extracts important words from the words in the text divided by the morphological analysis unit 403 using, for example, TF-IDF. A cause estimation unit 407 uses the important words extracted by the important word extraction unit 404 as branch conditions to estimate the cause of the failure case from the decision tree 405A according to the truth or falsity of the branch conditions.
For example, since the important words in the currently processed failure case include the branch conditions “web” and “slow”, the cause estimation unit 407 estimates the cause as “real time scan by virus scanning software” from the decision tree 405A. Then the failure cause estimation device presents the estimation result from the cause estimation unit 407 on a display screen as a candidate cause.
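The traversal performed in the cause estimation stage can be sketched as follows. The nested-dictionary representation of the decision tree 405A, the `estimate_cause` function, and the placeholder causes marked as hypothetical are assumptions made for this illustration. Only the branch conditions "web" and "slow" and the cause "real time scan by virus scanning software" come from the example above.

```python
def estimate_cause(tree, important_words):
    """Walk the tree, following "yes" when the branch word is present, else "no"."""
    node = tree
    while isinstance(node, dict):
        node = node["yes"] if node["word"] in important_words else node["no"]
    return node  # a leaf holds the estimated cause


# Assumed tree shape; leaves other than the virus-scan cause are hypothetical placeholders.
decision_tree = {
    "word": "web",
    "yes": {
        "word": "slow",
        "yes": "real time scan by virus scanning software",
        "no": "cause B (hypothetical)",
    },
    "no": "cause C (hypothetical)",
}

# Important words extracted from the failure case whose cause is to be estimated.
estimated = estimate_cause(decision_tree, {"web", "slow", "browser"})
```

Because each branch condition is tested by exact word match in this sketch, a failure case that expresses the same symptom with a different word (for example, "sluggish" instead of "slow") would follow the "no" branch and reach a different leaf.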
In the above conventional failure cause estimation device, the collected information is in a predetermined format such as an error message format. Therefore, it is not necessary to take notational variants into consideration, and the cause of a failure case can be estimated from the decision tree even when the collected information is used for branch conditions. However, in the above conventional failure cause estimation device, notational variants can occur if questions, replies, etc. that cannot be obtained from collected information and are obtained from, for example, interviews are used, because general words are used in such questions and replies. When these questions and replies are used as branch conditions, it is difficult to estimate the cause of the failure case from the decision tree. Therefore, even when questions and replies are obtained from interviews, collected information must be obtained later, and the time from the estimation of the cause until the recovery of the failure is long.
Patent Document 1: Japanese Laid-open Patent Publication No. 2005-251091
Patent Document 2: Japanese Laid-open Patent Publication No. 2007-323558
Non-Patent Document 1: Yixin Diao, Hani Jamjoom, David Loewenstern, “Rule-Based Problem Classification in IT Service Management”, cloud, pp. 221 to 228, 2009 IEEE International Conference on Cloud Computing, 2009