Software applications are composed of individual software components, each of which performs a set of behaviorally related functions. One example of a software component is a software module that controls the GPS receiver of a mobile phone. Another example is a software module that controls the loudspeaker of a mobile device.
Software failures are common, and their impact can be very harmful. As a result, there is a growing need for automated tools that identify and diagnose software failures and isolate the faulty software components, such as classes and functions, that caused the failure. For example, a failure of a navigation application may be caused by a bug in the GPS receiver software module or in the loudspeaker software module.
Software fault prediction algorithms use machine learning techniques to predict which software components are likely to contain faults. In contrast, software diagnosis algorithms use model-based or spectrum-based approaches to identify the faulty software components that caused a failure.
One method of automated diagnosis is Model Based Diagnosis (MBD), which uses a model of the diagnosed system to infer possible explanations of the observed system failures. Traditionally, when diagnosing a software system, an MBD algorithm receives as input: a) a formal description of the system, i.e., a logical model of the correct functionality of each component in the system, b) the set of components in the system that may be faulty, and c) a set of observed test executions, each test labeled “passed” or “failed”. While MBD has been applied successfully to a range of domains, it has not been applied successfully to software, because there is usually no formal model of a developed software application.
Spectrum-based Fault Localization (SFL) is another method of diagnosing software faults, one that does not require a logical model of the correct functionality of each software component in the system. SFL analyzes traces of software executions and produces diagnoses by considering the correlation between execution traces and failed executions.
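The correlation between execution traces and failed executions can be illustrated with a minimal sketch. It assumes each trace is simply the set of components a test touched, and it scores components with the Ochiai similarity coefficient, a choice made here for illustration only and not one named in the text above. The function name, component names, and test data are hypothetical.

```python
from math import sqrt


def ochiai_scores(traces, outcomes):
    """Score each component by how strongly its presence in a trace
    correlates with test failure (Ochiai coefficient, assumed here).

    traces   -- list of sets of component names touched by each test
    outcomes -- list of bools, True where the corresponding test failed
    """
    components = set().union(*traces)
    total_failed = sum(outcomes)
    scores = {}
    for c in components:
        # Count failed and passed tests whose trace includes component c.
        failed_with = sum(1 for t, f in zip(traces, outcomes) if f and c in t)
        passed_with = sum(1 for t, f in zip(traces, outcomes) if not f and c in t)
        denom = sqrt(total_failed * (failed_with + passed_with))
        scores[c] = failed_with / denom if denom else 0.0
    return scores


# Hypothetical test runs of a navigation application.
traces = [{"gps", "ui"}, {"speaker", "ui"}, {"gps", "speaker", "ui"}]
outcomes = [True, False, True]  # first and third tests failed

scores = ochiai_scores(traces, outcomes)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy data the hypothetical "gps" component appears in every failed trace and in no passed trace, so it ranks first.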
Abreu et al. (Spectrum-based Multiple Fault Localization, Automated Software Engineering, 2009. ASE '09. 24th IEEE/ACM International Conference on, 2009) propose a scalable software diagnosis algorithm called BARINEL (Bayesian AppRoach to dIagnose iNtErmittent fauLts) for diagnosing software faults. BARINEL combines the MBD and SFL approaches. The key concept of BARINEL is to treat the traces of tests with a failed outcome as conflicts, and to return their hitting sets as diagnoses. A hitting set is a set of components that intersects every conflict and therefore explains all observed discrepancies. BARINEL often outputs a large set of diagnoses, and thus provides only weak guidance to a programmer assigned to resolve an observed bug. To address this problem, Abreu et al. proposed a Bayesian approach that computes a likelihood score for each diagnosis returned by the algorithm. The diagnoses are then prioritized according to their likelihood scores.
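The notion of a hitting set over conflicts can be made concrete with a small sketch. It enumerates minimal hitting sets by brute force, which is practical only for a handful of components and is not presented as the search procedure BARINEL itself uses. The function name and the component names are hypothetical.

```python
from itertools import combinations


def minimal_hitting_sets(conflicts):
    """Return all minimal sets of components that intersect every conflict.

    conflicts -- list of sets, each the trace of one failed test
    Brute-force enumeration by increasing size (illustrative assumption).
    """
    components = sorted(set().union(*conflicts))
    found = []
    for size in range(1, len(components) + 1):
        for candidate in combinations(components, size):
            cand = frozenset(candidate)
            # Skip supersets of a hitting set already found: not minimal.
            if any(prev <= cand for prev in found):
                continue
            # Keep the candidate only if it shares a component with every conflict.
            if all(cand & conflict for conflict in conflicts):
                found.append(cand)
    return found


# Two hypothetical failed tests and the components each one touched.
conflicts = [{"gps", "ui"}, {"speaker", "ui"}]
diagnoses = minimal_hitting_sets(conflicts)
```

Here two diagnoses explain both failures: the single component "ui", or "gps" and "speaker" together. Even this tiny case yields more than one candidate, which is why a ranking of diagnoses is needed.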
For a given diagnosis, BARINEL computes the likelihood score as the posterior probability that the diagnosis is correct, given the observed test passes and failures. As a Bayesian approach, BARINEL also requires some assumption about the prior probability that each component is faulty. Traditionally, the prior probabilities used by BARINEL represent the a priori probability that a component is faulty, without considering any observed system behavior.
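A simplified sketch of such a posterior computation follows. It assumes that components fail independently, that every component has the same prior fault probability, and that a faulty component still behaves correctly in a given test with a fixed probability (called goodness below). These are simplifying assumptions for illustration and not a statement of how BARINEL sets or estimates these values; all names and numbers are hypothetical.

```python
def diagnosis_posteriors(diagnoses, traces, outcomes, prior=0.1, goodness=0.5):
    """Rank diagnoses by posterior probability given test outcomes.

    diagnoses -- list of sets of components assumed faulty
    traces    -- list of sets of components touched by each test
    outcomes  -- list of bools, True where the corresponding test failed
    prior     -- assumed prior probability that any one component is faulty
    goodness  -- assumed probability that a faulty component behaves correctly
    """
    components = set().union(*traces)
    weights = []
    for d in diagnoses:
        # Prior of the diagnosis: members of d faulty, all others healthy.
        w = 1.0
        for c in components:
            w *= prior if c in d else (1.0 - prior)
        # Likelihood of each observed outcome under this diagnosis.
        for trace, failed in zip(traces, outcomes):
            involved = len(d & trace)  # faulty components the test touched
            p_pass = goodness ** involved
            w *= (1.0 - p_pass) if failed else p_pass
        weights.append(w)
    # Normalize so the scores of the candidate diagnoses sum to one.
    total = sum(weights)
    return [w / total if total else 0.0 for w in weights]


# Hypothetical data: two failed tests and one passed test.
traces = [{"gps", "ui"}, {"speaker", "ui"}, {"gps"}]
outcomes = [True, True, False]
diagnoses = [frozenset({"ui"}), frozenset({"gps", "speaker"})]

posteriors = diagnosis_posteriors(diagnoses, traces, outcomes)
```

Under these assumptions the single-component diagnosis receives the higher score, both because its prior is larger and because the passed test did not touch it. Changing the prior assigned to individual components would shift this ranking, which is the lever the prior probabilities provide.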
It is therefore an object of the present invention to improve the accuracy of BARINEL software diagnosis.
Other objects and advantages of this invention will become apparent as the description proceeds.