1. Technical Field
The present invention relates in general to the field of computers, and in particular to event logs in computers. Still more particularly, the present invention relates to a method and system for identifying temporal granularity of multiple event log streams to aid in the organization of an aggregate event log.
2. Description of the Related Art
In computing systems, a record of events (e.g., completion of an operation, an input/output operation, an error signal, a flag setting, a system crash, etc.) is generated and logged by a large number of independent hardware and software components. This record can be useful in analyzing or predicting system failures, particularly when combined into a single, chronological merged log. For example, a record showing an input (software event) from an unknown source immediately followed by a disk crash (hardware event) is a good indicator that the input from the unknown source caused the disk to crash.
In many instances, the precision of the clocks involved in generating the record of events varies greatly. For example, hardware counters may be accurate to the microsecond, while records of software events may be accurate only to the millisecond. As a result of this varying precision, properly ordering events from different sources becomes impossible based upon clock information alone. As an example, consider a record of software events compared to a record of hardware events, as shown in FIG. 1a. A log stream 102a is a record of software events, and a log stream 104a is a record of events from a specific piece of hardware. As shown, log stream 102a has a temporal granularity of 1.0 units of time (such as milliseconds), while log stream 104a has a temporal granularity of 0.1 units of time. Events A-E occur after time T1 and before time T2, although not necessarily at the positions on the time line shown for log stream 102a. For example, events “A” and “B” are both marked as having occurred at time T1 but may actually have occurred at any time between time T1 and time T2. Furthermore, events A-E may or may not have occurred in the order shown, depending on the capability of the log generator that created log stream 102a.
As noted, log stream 104a has a temporal granularity of 0.1 units of time. Thus, it is certain from viewing log stream 104a that event “1” occurred before event “2,” because event “2” falls in a time frame subsequent to the time frame in which event “1” occurred. Similarly, event “2” occurred before events “3” and “4.” Event “3” may or may not have occurred before event “4,” since both fall in the same time frame, again depending on the capability of the log generator that created log stream 104a.
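The ordering argument above can be sketched in code. The following is a minimal illustration only, under the assumption that a logged time stamp marks the start of a time frame whose width equals the stream's temporal granularity, so that one event is known to precede another only when its entire time frame ends before the other's begins. The `Event` class, the helper names, and all tick values are hypothetical and are not taken from the figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    stamp: int        # logged time stamp, in ticks of 0.1 time units (assumed)
    granularity: int  # width of the logging time frame, in the same ticks


def definitely_before(a: Event, b: Event) -> bool:
    """True only if a's whole time frame ends no later than b's begins."""
    return a.stamp + a.granularity <= b.stamp


def order(a: Event, b: Event) -> str:
    """Describe what the two time stamps alone allow us to conclude."""
    if definitely_before(a, b):
        return f"{a.name} before {b.name}"
    if definitely_before(b, a):
        return f"{b.name} before {a.name}"
    return "ambiguous"


T1 = 100  # hypothetical value of time T1, in ticks

# Software stream: granularity of 1.0 units (10 ticks).
sw_a = Event("A", T1, 10)
sw_b = Event("B", T1, 10)

# Hardware stream: granularity of 0.1 units (1 tick); stamps are illustrative.
hw_1 = Event("1", 101, 1)
hw_2 = Event("2", 103, 1)
hw_3 = Event("3", 105, 1)
hw_4 = Event("4", 105, 1)

print(order(hw_1, hw_2))  # distinct fine-grained frames
print(order(hw_3, hw_4))  # same fine-grained frame
print(order(sw_a, hw_1))  # coarse frame overlaps the fine frame
```

Under these assumptions, events in distinct fine-grained time frames can be ordered, while events sharing a time frame, or events whose coarse and fine time frames overlap, cannot be ordered from clock information alone.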
Even though there is ambiguity as to when and in what order the events occurred in log stream 102a, the information shown in log stream 104a in FIG. 1a is useful, since events “1-5” are temporally ordered (with the possible ambiguity of events “3” and “4”). However, when creating an aggregate log of log stream 102a and log stream 104a, some type of common time epoch must be used. This commonality is typically obtained by placing all events within a lowest common temporal granularity. Thus, as shown in FIG. 1b, log stream 102a and log stream 104b have the same temporal granularity of 1.0 units of time. While the order of events “1-5” can still be assumed (except possibly for the order of events “3” and “4”), the information describing the temporal spacing of these events is lost. That is, it is no longer known whether some or all of the events occurred near time T1, near time T2, or at some time between times T1 and T2.
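The loss of temporal spacing can be illustrated with a short sketch. It assumes, purely for illustration, that time stamps are integer ticks of 0.1 time units and that coarsening is done by rounding each stamp down to the start of its 1.0-unit time frame; the event stamps and the `coarsen` helper are hypothetical.

```python
COARSE = 10  # ticks per coarse time frame (1.0 units); assumed value

# Hypothetical fine-grained hardware events between T1 (100) and T2 (110).
hardware = [("1", 101), ("2", 103), ("3", 105), ("4", 105), ("5", 108)]


def coarsen(stamp: int, granularity: int) -> int:
    """Round a stamp down to the start of its coarse time frame."""
    return (stamp // granularity) * granularity


def gaps(events):
    """Differences between the stamps of consecutive events."""
    return [later[1] - earlier[1] for earlier, later in zip(events, events[1:])]


coarsened = [(name, coarsen(stamp, COARSE)) for name, stamp in hardware]

print(gaps(hardware))   # spacing recorded at 0.1-unit granularity
print(gaps(coarsened))  # spacing after coarsening to 1.0-unit granularity
```

In this sketch every coarsened event carries the same stamp, so the list order is the only remaining trace of the original sequence and the spacing between events can no longer be recovered.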
Alternatively, the events in log stream 102b can be assigned purely arbitrary time extensions so that the stream appears to have the same temporal granularity as that of log stream 104a, as shown in FIG. 1c. Thus, event “A” is given an arbitrary time of T1.1, which is likely not an accurate representation of when event “A” occurred, since events A-E could have occurred at any time between times T1 and T2. Similarly, every event in log stream 102b may be given an arbitrary time extension, which may be the same or different for each event.
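The risk of arbitrary time extensions can be sketched as follows. This is an illustration only: it assumes integer ticks of 0.1 time units, a hypothetical hardware event “2” logged at tick 103, and hypothetical helpers `pad` and `merged_order`. Two different extensions, each equally consistent with the coarse log entry for event “A,” yield opposite merged orders.

```python
COARSE = 10  # ticks per coarse time frame (1.0 units); assumed value
T1 = 100     # hypothetical value of time T1, in ticks


def pad(stamp: int, extension: int) -> int:
    """Attach an arbitrary sub-frame extension to a coarse time stamp."""
    if not 0 <= extension < COARSE:
        raise ValueError("extension must stay inside the coarse time frame")
    return stamp + extension


def merged_order(extension_for_a: int) -> list:
    """Merge padded software event A with hardware event 2 (tick 103)."""
    events = [("A", pad(T1, extension_for_a)), ("2", 103)]
    return [name for name, _ in sorted(events, key=lambda e: e[1])]


print(merged_order(1))  # A padded to T1.1
print(merged_order(7))  # A padded to T1.7
```

Because nothing in the coarse log favors one extension over another, the order produced by either choice is an artifact of the padding rather than information about when event “A” occurred.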
Thus, in FIG. 1b, information is lost from log stream 104, and in FIG. 1c, potentially erroneous information is introduced into log stream 102.
Another alternative for merging log streams is to combine the two log streams into an aggregate log by using the less-accurate time division (e.g., that used in log stream 102) and feeding all events from both log stream 102 and log stream 104 into the aggregate log. However, as with the method shown in FIG. 1b, the temporal order of events in log stream 104 is lost, and there is still no way to know the temporal order of the events from the two log streams.
What is needed, therefore, is a method and system for merging log streams of disparate temporal granularity into a stream having the least precise common time epoch while maintaining the temporal information about events from the more precise log stream. Preferably, the combined aggregate log should be capable of further refinement to correctly order events that were previously ordered ambiguously.