Software tracing provides developers with data logs of useful information for program debugging, development, and maintenance. Data logs resulting from software tracing, or debugging, are used both during the development cycle and after the software is released. Because software tracing is low-level, there are often many types of messages written to the data log. The information written to the data log represents the developers' commentary as the application is running. The information in the data log can represent entry or exit messages, variable values, unusual events that occurred, or error conditions that should not occur when the program is operating properly. The unstructured, developer-created messages included in a data log often represent the positive responses, negative opinions, emotions, and other developer evaluations of software execution. Because software tracing is performed at a low level, data logs can be quite large, making traditional analysis of the data logs to glean sentiment data difficult or impossible.
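By way of illustration only, the kinds of trace messages described above (entry and exit messages, variable values, unusual events, and error conditions) can be sketched with Python's standard `logging` module. The names `make_trace_logger`, `traced`, and `divide`, as well as the message wording, are hypothetical and chosen for this example; they are not taken from any particular tracing system.

```python
import functools
import io
import logging


def make_trace_logger(stream):
    """Build a logger that writes low-level trace messages to `stream`."""
    logger = logging.getLogger("trace_example")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def traced(logger):
    """Decorator that records entry, exit, argument values, and errors."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            logger.debug("enter %s args=%r", func.__name__, args)
            try:
                result = func(*args, **kwargs)
            except Exception:
                # An error condition that should not occur in normal operation.
                logger.error("error in %s: this should never happen", func.__name__)
                raise
            logger.debug("exit %s result=%r", func.__name__, result)
            return result
        return wrapper
    return decorator


log_buffer = io.StringIO()  # stands in for the data log
trace_log = make_trace_logger(log_buffer)


@traced(trace_log)
def divide(a, b):
    if b == 0:
        # Free-form developer commentary about an unusual event.
        trace_log.warning("unusual event: divisor is zero, bad caller input")
    return a / b


divide(6, 3)
try:
    divide(1, 0)
except ZeroDivisionError:
    pass
```

After these two calls, `log_buffer` holds five lines of mixed severity. Even this small example mixes structured fields with unstructured, opinionated developer text, which suggests how quickly a real data log can grow.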
With the increased usage of computing networks, such as the Internet, humans are currently inundated and overwhelmed by the amount of information available to them from various structured and unstructured sources, such as the information presented in a traditional data log. However, information gaps abound as users try to piece together what they can find that they believe to be relevant during evaluation of information on various subjects, such as when analyzing entries in a traditional data log. To assist with such evaluations, recent research has been directed to generating knowledge management systems which may take an input, analyze it, and return results indicative of the most probable response to the input. Knowledge management systems provide automated mechanisms for searching through a knowledge base with a large set of sources of content, e.g., electronic documents, and analyzing them with regard to an input to determine a result and a confidence measure as to how accurate the result is in relation to the input.
One such knowledge management system is the Watson™ system available from International Business Machines (IBM) Corporation of Armonk, N.Y. The Watson™ system is an application of advanced natural language processing, information retrieval, knowledge representation and reasoning, and machine learning technologies to the field of open domain question answering. The Watson™ system is built on IBM's DeepQA™ technology used for hypothesis generation, massive evidence gathering, analysis, and scoring. DeepQA™ takes an input question, analyzes it, decomposes the question into constituent parts, generates one or more hypotheses based on the decomposed question and results of a primary search of answer sources, performs hypothesis and evidence scoring based on a retrieval of evidence from evidence sources, performs synthesis of the one or more hypotheses, and based on trained models, performs a final merging and ranking to output an answer to the input question along with a confidence measure.
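The general shape of such a pipeline (decompose the question, generate hypotheses from a primary search, score each hypothesis against evidence, then merge and rank with a confidence measure) can be sketched as a toy example. This is not the DeepQA™ implementation or its API. The in-memory corpora, the keyword-overlap scoring, and all function names below are assumptions made for illustration. A simple normalized score stands in for the trained models, and the synthesis step is omitted.

```python
# Hypothetical stop list used to reduce a question to its content terms.
STOPWORDS = {"what", "is", "the", "of", "a", "an", "which", "in"}

# Hypothetical answer sources: candidate answer -> passage mentioning it.
ANSWER_SOURCES = {
    "Paris": "Paris is the capital of France",
    "Lyon": "Lyon is a large city in France",
    "Berlin": "Berlin is the capital of Germany",
}

# Hypothetical evidence sources consulted during scoring.
EVIDENCE_SOURCES = [
    "The capital of France is Paris",
    "Paris hosts the French government",
    "Berlin is the capital of Germany",
]


def decompose(question):
    """Split the question into constituent content terms."""
    words = question.lower().strip("?").split()
    return [w for w in words if w not in STOPWORDS]


def generate_hypotheses(terms):
    """Primary search: keep candidates whose passage shares a term."""
    hypotheses = []
    for candidate, passage in ANSWER_SOURCES.items():
        passage_words = set(passage.lower().split())
        if any(term in passage_words for term in terms):
            hypotheses.append(candidate)
    return hypotheses


def score_evidence(candidate, terms):
    """Count question terms in evidence passages that mention the candidate."""
    score = 0
    for passage in EVIDENCE_SOURCES:
        words = set(passage.lower().split())
        if candidate.lower() in words:
            score += sum(1 for term in terms if term in words)
    return score


def answer(question):
    """Return (candidate, confidence) pairs, best first."""
    terms = decompose(question)
    scores = {h: score_evidence(h, terms) for h in generate_hypotheses(terms)}
    total = sum(scores.values())
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(h, score / total if total else 0.0) for h, score in ranked]


ranking = answer("What is the capital of France?")
```

Here `ranking` lists "Paris" first with a confidence of about 0.67, followed by "Berlin" and "Lyon". The confidence is only the candidate's share of the total evidence score, which is far simpler than a learned ranking model but shows how an answer and a confidence measure can be produced together.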