Locating problems, or "bugs," in software can be very time-consuming and expensive because of the inherent complexity of software. To reduce the time required to "debug" software, complex software systems frequently include the ability to continuously or intermittently generate diagnostic logging information that describes the internal processes of the software. Such diagnostic information is typically written to a log file, where it can be analyzed with appropriate tools to help determine what caused the software to malfunction. Such a tool typically retrieves the logging information from the log file, formats it, and displays it on a computer monitor where it can be analyzed by a software technician. Because a large amount of diagnostic information can be generated, the software technician typically requests only the diagnostic information that was logged during the time range in which the problem occurred, and the analysis tool extracts from the log file those log records that fall within the requested time range.
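The time-range extraction performed by such an analysis tool can be illustrated with a minimal sketch. The log line layout (an ISO-8601 timestamp, a space, then the message) and the names `LogRecord`, `parse_line`, `records_in_range`, and `format_record` are assumptions made for illustration only; they do not describe any particular logging system.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Iterator


@dataclass
class LogRecord:
    timestamp: datetime
    message: str


def parse_line(line: str) -> LogRecord:
    # Assumed (hypothetical) layout: "<ISO-8601 timestamp> <message>"
    stamp, _, message = line.rstrip("\n").partition(" ")
    return LogRecord(datetime.fromisoformat(stamp), message)


def records_in_range(
    lines: Iterable[str], start: datetime, end: datetime
) -> Iterator[LogRecord]:
    # Yield only the records whose timestamp lies in [start, end].
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = parse_line(line)
        if start <= record.timestamp <= end:
            yield record


def format_record(record: LogRecord) -> str:
    # Format a record for display to the technician.
    return f"[{record.timestamp:%Y-%m-%d %H:%M:%S}] {record.message}"


if __name__ == "__main__":
    sample = [
        "2024-01-01T09:59:00 task started",
        "2024-01-01T10:01:30 replication failed",
        "2024-01-01T10:07:00 task stopped",
    ]
    window_start = datetime(2024, 1, 1, 10, 0, 0)
    window_end = datetime(2024, 1, 1, 10, 5, 0)
    for rec in records_in_range(sample, window_start, window_end):
        print(format_record(rec))
```

In this sketch the filter reads every line of the log, so the whole log file must already be available wherever the tool runs.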
One category of complex software systems is distributed systems that operate in a network environment. Distributed systems typically have tasks that execute simultaneously on different network servers. Some distributed systems are referred to as disconnectable distributed systems, and include, for example, electronic mail systems, distributed directory services, management services, replicated file systems, and replicated databases. The term "disconnectable" indicates that the communication paths between the various distributed tasks may be very slow, or may even be disconnected. This can occur because the communications links between servers may be relatively slow and/or unreliable, or because the software itself is inoperable at a particular time.
Some distributed systems include the ability to generate diagnostic logging information that can be used to debug software problems. Typically, each task in the distributed system maintains one or more log files on the network server on which it runs. However, the log files are typically analyzed on a local computer where the technical staff is located, and that computer may be connected to the network server via a relatively slow communications path. To access the diagnostic logging information from the local computer, the diagnostic information must be transferred from the network server to the local computer. Frequently, the transfer of log file information over relatively slow Wide Area Network (WAN) communication lines leads to significant delay in receiving the log file information. Moreover, conventional log file analysis tools typically must interact with the distributed system on the remote server to acquire the diagnostic information, so no analysis is possible if the distributed system is unavailable.
Consequently, analyzing log files generated by a distributed system can be frustrated by long data transfer delays, and may even be temporarily impossible if the distributed system is unavailable. Moreover, it is common to access the diagnostic information in log files repeatedly when analyzing software problems. In conventional logging systems, such diagnostic information must be transferred over the network to the computer on which the analysis tool is running each time the information is requested.
It is apparent that a diagnostic logging system that enables access to the diagnostic logging information regardless of whether the distributed system is available, and that eliminates the need to repeatedly transfer the same diagnostic information over relatively slow WAN links, would be desirable.