Modern communication systems involve a delicate interplay of network components running multiple services, with multiple processes within each service. These systems are vital to business operations, and downtime can impose significant costs on the business. The impact of network failures, even minor ones lasting only minutes, can be measured in thousands or even millions of dollars. These systems log a vast amount of information so that, in the event of a failure, the cause can be determined and corrected. However, troubleshooting from log files is often a complex process that involves searching through large amounts of information for the specific transactions that failed at a specific point in time. The difficulty is compounded as services move to cloud computing models, in which log information is spread across multiple log files in multiple locations. Further, the log files can contain valuable information about transactions that did not result in a traditional error yet still did not complete successfully. This valuable information is often deleted without being analyzed when the log files are no longer needed.
Based on the foregoing, there is a need for an approach that extracts valuable information from log files and provides faster troubleshooting.