The present invention is generally directed to the creation of log files in data processing systems. More particularly, the present invention is directed to systems and methods which provide an increased level of message granularity during the storage of log messages. Even more particularly, the present invention is directed to systems and methods which prevent erasures of log messages which are important for the reconstruction of event chains which describe the course of events leading to system problems.
A process running on a data processing system, including but not limited to distributed or parallel processing systems, may produce a running log which provides details associated with various events which occur during the process. These processes produce event logs or activity history logs whose size cannot be determined beforehand. While the processes that generate such logs generally fall into the category of non-interactive processes such as daemons, interactive processes are also capable of generating messages and event descriptions that are stored in a log file. These log files, or more commonly “logs,” are especially useful for postmortem debugging and problem analysis. Some long-running processes, such as daemon processes distributed over many nodes in a distributed data processing system, may generate very large activity logs, which require an appropriate mechanism for storage and later retrieval, if necessary. However, it is not desirable, and it is sometimes completely unacceptable, to produce log files of an unlimited or even indeterminately large size. Log files of uncontrollably large size are undesirable since they consume storage, inhibit performance and add to the administrative overhead and burden of data processing systems.
Some data processing applications solve the problem of log file size management through the use of techniques which limit the size of the log file. This may be accomplished in several ways. In a first approach, the file is restricted to a certain maximum size and, once that size is reached, entries are handled in a first-in-first-out manner (a finite-sized queue). In a variant of this approach, early file entries are overwritten when the maximum file size is reached. In yet another approach to this problem, a rotating file structure is provided so that, if the log file reaches a certain limit, subsequent log entries (also referred to herein as “messages,” “log messages,” “message entries,” or “log message entries”) are written to a completely new file. For example, if the current log file exceeds the predetermined limit for log file size, the current log file is renamed as a backup file, and another log file is created with the current log file name. Yet another approach to this problem is simply to reduce the number of log entries that are generated. However, this approach defeats the very purpose of maintaining an accurate and detailed event history. Although such abbreviated files are more easily managed, their content is often significantly lacking in the details desired for report generating purposes. While all of these approaches provide some help in limiting the amount of storage utilized, there are still several problems that are not solved by any of these methods.
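By way of illustration only, the rotating file structure described above might be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular system: the function name `append_with_rotation`, the `max_bytes` parameter, and the single backup file with a `.bak` suffix are all hypothetical choices made for the example.

```python
import os


def append_with_rotation(path, message, max_bytes=1024):
    """Append one message to the log at `path`, rotating the file first
    if it has already reached `max_bytes`.

    Assumption for this sketch: exactly one backup is kept, named
    "<path>.bak", and any earlier backup is silently replaced.
    """
    if os.path.exists(path) and os.path.getsize(path) >= max_bytes:
        # The current log file becomes the backup file ...
        os.replace(path, path + ".bak")
    # ... and a new file is created under the current log file name
    # (opening in append mode creates the file if it does not exist).
    with open(path, "a", encoding="utf-8") as log:
        log.write(message + "\n")
```

Because only one backup is retained in this sketch, every second rotation discards the oldest file entirely, regardless of what its entries contain.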
For example, when the log file is truncated and wrapped many times, it is very often not possible to track certain important event or activity entries. The “wrapping” approach is thus seen to be particularly disadvantageous if a problem occurs at a customer site or at a remote site and the lost log entries provide the key elements needed to determine solutions to an underlying problem. In such circumstances, the major drawbacks of this approach are clear.
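The loss described above can be illustrated with a small sketch in which a fixed-capacity log discards its oldest entry whenever a new entry arrives at a full log. The function name `wrapped_log`, the capacity of three entries, and the sample messages are hypothetical and chosen only for the example.

```python
from collections import deque


def wrapped_log(entries, capacity):
    """Return the entries that survive in a log holding at most
    `capacity` entries, where the oldest entry is dropped when full."""
    log = deque(maxlen=capacity)  # deque discards from the left when full
    for entry in entries:
        log.append(entry)
    return list(log)


# Hypothetical event stream: one early, important entry followed by
# routine entries that eventually wrap the log.
events = ["ERROR disk failure on node 3"] + [
    f"INFO heartbeat {i}" for i in range(5)
]
surviving = wrapped_log(events, capacity=3)
```

After the run, `surviving` holds only the three most recent heartbeat entries; the early error entry, which is the one most likely to matter for later analysis, is no longer present.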
Another significant disadvantage of conventional logging approaches is that they do not provide any granularity based upon the absolute or even relative importance of the log entries. Certain event or activity log entries may be more important than other entries. These more important entries, as created by the running application or process, tend to be especially valuable for after-the-fact debugging and/or analysis.