1. Field of the Invention
The present invention generally relates to a system for diagnosing program failure, and, in particular, to a hierarchical categorization of customer error reports.
2. Description of the Related Art
Software programs often fail by “crashing” or reaching error conditions that cause them to terminate. To improve product quality, it is important to diagnose the reasons for such failures.
Operating systems often generate crash data for software programs, wherein the crash data can be analyzed in an attempt to diagnose the reasons for failure. For example, MICROSOFT WINDOWS operating systems create a “full dump” or “minidump” file, and UNIX or LINUX operating systems create a “core dump” file, when a program terminates due to unhandled error conditions.
It is well known for software program vendors to provide users with a set of tools for capturing and analyzing program crash data. In their simplest form, these tools comprise an error reporting mechanism that alerts the user when a failure occurs and provides an opportunity to forward the crash data, known as a Customer Error Report (CER), to the vendor for further analysis. The vendor can then use the forwarded crash data to troubleshoot problems, ultimately leading to more robust and crash-resistant programs.
Part of the data collected about a particular crash is the application's stack trace, which comprises a sequential ordering of modules, objects, functions and offsets, starting from the operating system, and extending to an offset into a function of an object of a module of the application where the failure occurred. That offset can correspond directly to a line number in a source code file, if all necessary information is available for that crash data.
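By way of illustration only, a stack trace of this kind might be represented as an ordered list of frames, each holding a module, object, function and offset. The following is a minimal sketch: the textual frame format `module!Object::function+0xOFFSET`, the names `Frame`, `parse_frame` and `parse_trace`, and the convention that the failing frame is listed first are all assumptions made for the example and are not taken from any particular crash-reporting tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Frame:
    """One entry of a stack trace (hypothetical representation)."""
    module: str
    obj: str
    function: str
    offset: int


def parse_frame(line: str) -> Frame:
    # Assumed format: "module!Object::function+0xOFFSET".
    location, offset_text = line.rsplit("+", 1)
    module, qualified = location.split("!", 1)
    obj, function = qualified.split("::", 1)
    # int(..., 16) accepts an optional "0x" prefix.
    return Frame(module, obj, function, int(offset_text, 16))


def parse_trace(text: str) -> List[Frame]:
    # Assumption: the first non-blank line is the top of the stack,
    # i.e. the frame in which the failure occurred.
    return [parse_frame(line.strip())
            for line in text.splitlines() if line.strip()]
```

Under these assumptions, `parse_trace(text)[0]` is the top line of the stack and the remaining elements are the lower entries.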
Often, a large number of CERs are collected by a vendor. To expedite the prioritization of resources in resolving the CERs, vendors usually sort the stack traces by the top line of the stack, which indicates the module, object, function and offset at which the failure occurred. This means that two or more CERs from two or more different customers that have the same top-level module, object, function and offset are categorized as belonging to the same group or “bucket” of failures (the process for sorting CERs is also referred to as a “bucketing algorithm”).
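The top-line bucketing described above can be sketched as follows. This is an illustrative sketch only, not any vendor's actual implementation: each frame is assumed to be a `(module, object, function, offset)` tuple, each trace is assumed to list the failing frame first, and the name `bucket_by_top_frame` and the report identifiers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed frame shape: (module, object, function, offset).
Frame = Tuple[str, str, str, int]


def bucket_by_top_frame(
    reports: Dict[str, List[Frame]]
) -> Dict[Frame, List[str]]:
    """Group report identifiers by the top line of their stack trace."""
    buckets: Dict[Frame, List[str]] = defaultdict(list)
    for report_id, trace in reports.items():
        if not trace:
            continue  # a report without a stack trace cannot be bucketed
        buckets[trace[0]].append(report_id)
    return dict(buckets)
```

Only `trace[0]` is consulted, so every lower entry of the stack trace is ignored when the bucket is chosen.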
An unfortunate side effect of this bucketing algorithm is that two CERs that are generated from two different failures may be categorized as belonging to the same bucket because they have identical top lines in their stack traces. This is usually a symptom of different failures being directed into the same error handler. The error handler appears on the top line of the stack, and the true source of the failure is hidden in lower lines or entries of the stack trace.
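To make this side effect concrete, the following sketch reports every top line whose CERs disagree about the entries beneath it. It is a diagnostic illustration only: the `(module, object, function, offset)` tuple shape, the failing-frame-first ordering, and the name `mixed_buckets` are assumptions, and differing lower entries are merely a hint, not proof, that distinct failures share a bucket.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Assumed frame shape: (module, object, function, offset).
Frame = Tuple[str, str, str, int]


def mixed_buckets(reports: Dict[str, List[Frame]]) -> Dict[Frame, int]:
    """Map each top frame to its number of distinct lower-frame sequences,
    keeping only top frames that have more than one such sequence."""
    lower_by_top: Dict[Frame, Set[Tuple[Frame, ...]]] = defaultdict(set)
    for trace in reports.values():
        if trace:
            # Everything below the top line, kept in order.
            lower_by_top[trace[0]].add(tuple(trace[1:]))
    return {top: len(lower)
            for top, lower in lower_by_top.items() if len(lower) > 1}
```

For instance, if two reports both show a shared error handler on the top line but reach it from different functions, the handler's frame is returned with a count of 2.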
Consequently, there is a need in the art for a mechanism by which information that helps diagnose failures can be intelligently supplied from lower lines or entries of the stack trace. Specifically, there is a need in the art for a sub-bucketing algorithm that uses additional information from the stack trace to help identify the failures associated with customer error reports.