A computing environment includes one or more running processes. The processes can all be running on a single computing system, or the processes can be spread across multiple computing systems in a network-based or cloud-based computing environment. Each process in the computing environment can write diagnostic information to a log file. The log file of each process is private to the process that writes to it and is not accessible by other user-based processes in the computing environment.
Each process in a computing environment typically includes multiple libraries and multiple frameworks that each generate one or more threads of execution, and each of those threads writes diagnostic information to the log file for that process. In addition, each thread in a first process can request a service from a second process on the same computing system or on another computing system in the computing environment. The second process can generate multiple threads that each write diagnostic information to the log file of the second process. Thus, within the log file for each process, a substantial number of different threads are writing diagnostic information for the process, and that information may be related to other processes.
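The per-process logging arrangement described above can be illustrated with a minimal sketch. It assumes Python's standard `logging` and `threading` modules; the function name `run_process_with_threads`, the in-memory buffer standing in for the private log file, and the message wording are all hypothetical and chosen only for illustration.

```python
import io
import logging
import threading


def run_process_with_threads(process_name, thread_count):
    """Simulate one process whose threads all write to the same private log."""
    # Assumption: an in-memory buffer stands in for the process's log file.
    buffer = io.StringIO()
    logger = logging.getLogger(f"sketch.{process_name}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep entries out of any other logger
    handler = logging.StreamHandler(buffer)
    handler.setFormatter(logging.Formatter("%(threadName)s %(message)s"))
    logger.addHandler(handler)

    def worker(task_id):
        # Each thread writes free-form text strings to the shared log.
        logger.info("task %d started", task_id)
        logger.info("task %d finished", task_id)

    threads = [
        threading.Thread(target=worker, args=(i,), name=f"{process_name}-t{i}")
        for i in range(thread_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    logger.removeHandler(handler)
    return buffer.getvalue().splitlines()


if __name__ == "__main__":
    for line in run_process_with_threads("procA", 4):
        print(line)
```

In this sketch the entries of all threads end up interleaved in a single log in whatever order the threads happen to run, which is the situation a technician faces when reading the log file of a real process.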
Diagnostic information can include a wide variety of information about the state of threads that are executing within a process. This information is typically written to the log file as a free-form text string. Writing a text string message to a log is computationally expensive relative to the execution of the thread for which the log entry is written. In addition, most threads complete successfully, without a failure. Therefore, substantial computing resources are consumed in generating and storing a large amount of diagnostic information that is not related to a system failure and is of little or no assistance to a technician in determining the cause of a failure.
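The proportion of log entries that relate to a failure can be made concrete with a simplified simulation. The sketch below is an assumption-laden illustration, not a measurement: the helper names `write_logs` and `entries_for_failed_threads`, the three-messages-per-thread pattern, and the thread counts are hypothetical, and the threads are simulated sequentially rather than run concurrently.

```python
def write_logs(thread_count, failing_ids):
    """Return the free-form log entries produced by simulated threads."""
    log = []
    for tid in range(thread_count):
        # Every thread logs its progress, whether or not it later fails.
        log.append(f"thread {tid}: started")
        log.append(f"thread {tid}: working")
        if tid in failing_ids:
            log.append(f"thread {tid}: FAILED")
        else:
            log.append(f"thread {tid}: completed")
    return log


def entries_for_failed_threads(log, failing_ids):
    """Select only the entries written by threads that failed."""
    # The trailing colon keeps "thread 1:" from matching "thread 10:".
    prefixes = tuple(f"thread {tid}:" for tid in failing_ids)
    return [line for line in log if line.startswith(prefixes)]


if __name__ == "__main__":
    failing = {7, 420}  # hypothetical failing thread ids
    log = write_logs(1000, failing)
    relevant = entries_for_failed_threads(log, failing)
    print(f"{len(relevant)} of {len(log)} entries relate to a failed thread")
```

With these hypothetical numbers, 6 of the 3000 entries relate to a failed thread; every remaining entry was generated and stored for a thread that completed successfully.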
In the event of a failure of a thread in a process, a technician collects all of the logs from all of the processes within the computing environment to determine the root cause of the failure, including the logs of processes that were not involved in the activity of the thread that failed.