Data logs and data logging systems are widely used in the computing industry. Data logging is a process of automatically recording certain events in order to provide an audit trail of the activities in a specific part of the computer system. Logging is also one of the oldest and still one of the most widely used techniques for troubleshooting computer systems.
For example, a logging system designed to log and analyze network traffic captured by a so-called network sniffer may show important statistics and contents for each network packet captured by the sniffer.
A typical logging system is shown in FIG. 1. The logging system 10 characteristically comprises a data structure 11 that receives log records being logged from the external data source 9, and a log viewing facility 12 that visually presents logged data.
Actual programming implementation of the logging system 10 can take many shapes and forms. Components of the logging system may be implemented as independent software modules or may be combined with each other. The physical realization of each component may vary as well. For example, the data structure 11 can physically reside in computer RAM or be implemented as a file on a disk. The log viewing facility 12 may comprise a simple terminal output or be presented in a scrollable program window or multiple windows.
Regardless of how the log viewing facility 12 is organized, the representation of the log records typically comprises a sequence of visual data lines. Each visual data line may consist of textual and graphical information. Combined together, all visual data lines form what will herein be referred to as a log view. The log view should not be confused with the log viewing facility. The former is a logical construct that represents all visual data lines corresponding to all existing log records; the latter is the technical means for viewing a portion of the log view. As a logical construct, the log view need not have an actual physical implementation or image in the computer memory. Instead, as the log viewing facility “slides” along the log view, the logging system may reconstruct the visual data lines that become visible.
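The distinction between the log view and the log viewing facility can be illustrated with a minimal Python sketch. All names here (`render`, `log_view`, `visible_lines`) are hypothetical, and the sketch assumes, purely for illustration, that a record is a string whose embedded newlines produce additional visual data lines. The log view is never stored; the lines are regenerated on demand for whatever window is requested.

```python
from itertools import islice

def render(record):
    """Yield the visual data lines for one log record (hypothetical renderer)."""
    # Assumption: a record is a plain string; newlines produce extra lines.
    yield from record.split("\n")

def log_view(records):
    """The log view: every visual data line of every record, produced lazily."""
    for record in records:
        yield from render(record)

def visible_lines(records, top, height):
    """The viewing facility: only the `height` lines starting at line `top`."""
    return list(islice(log_view(records), top, top + height))

records = ["session opened", "line A\nline B\nline C", "session closed"]
window = visible_lines(records, top=1, height=3)
```

This sketch walks the records from the beginning on every call, which is adequate for illustration but would be slow for a long log.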
Certain logging systems must deal with heterogeneous log records. For example, in the logging system for network traffic, it may be necessary to log textual messages such as “TCP session established” and “TCP session terminated”, as well as the actual contents of the data being exchanged in the course of the TCP session. Textual messages and actual data are, in fact, log records of two different types, and this makes the network traffic logging system a heterogeneous logging system.
Requirements for the visual representation of different types of log records in a heterogeneous logging system may differ depending on the record type. For example, a simple message record such as “TCP session established” may be presented as a text string occupying a single visual data line, while a record containing a portion of the actual binary data exchanged in this TCP session may need to occupy several visual data lines and contain both the hexadecimal and ASCII representations of the binary data.
There are two main approaches to representing logged data in the log viewing facility 12 of a heterogeneous logging system.
The first approach is to convert heterogeneous log records into homogeneous textual data. As a result, the log viewing facility 12 only needs to display a simple and homogeneous text file. An example of such textual output is shown in FIG. 2. Textual output 20 consists of messages 21 and blocks of data 22. It can be said that the logging system has records of two types: messages 21 and data blocks 22. Records of both types can be interleaved with each other. Typically, the data block 22 consists of at least two areas: the area 23 showing the hexadecimal (HEX) representation of data, and the area 24 showing the ASCII representation of data. Traditionally, each visual line of such output reflects the values of 16 bytes of data.
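The conversion described above can be sketched in Python as follows. This is an illustrative sketch only: the function names, the `(kind, payload)` record layout, and the exact column spacing are assumptions and do not reproduce the output format of any particular tool.

```python
def hex_dump_lines(data, width=16):
    """Render bytes as text lines with a HEX area and an ASCII area."""
    pad = width * 3 - 1  # width of a full HEX area: "XX " per byte, minus one space
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Non-printable bytes are shown as "." in the ASCII area.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{hex_part:<{pad}}  {ascii_part}")
    return lines

def to_text(records):
    """Flatten heterogeneous records into one homogeneous list of text lines."""
    out = []
    for kind, payload in records:  # assumed layout: ("message", str) or ("data", bytes)
        if kind == "message":
            out.append(payload)
        else:
            out.extend(hex_dump_lines(payload))
    return out

dump = hex_dump_lines(b"GET / HTTP/1.1\r\nHost")
text = to_text([("message", "TCP session established"),
                ("data", b"GET / HTTP/1.1\r\nHost")])
```

Once `to_text` has run, nothing in the resulting list marks where one record ends and the next begins.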
This traditional approach is easy to implement and has been used in many popular logging systems, such as the TCPDUMP program for capturing network traffic.
At the same time, this simple approach has a significant disadvantage. After all types of records have been converted into their textual representation, each record effectively loses its boundaries and identity, thus becoming a simple block of text. Therefore, it becomes impossible to perform certain operations pertaining only to specific record types.
For example, it may be impossible to reliably search for all occurrences of a substring in the data exchanged in the TCP session. Substring 25 may be split between two lines of text, in which case a search of the textual output will not find this occurrence.
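The split-substring problem can be reproduced with a short Python sketch. The payload and the search term below are invented for illustration, and `ascii_column` is a hypothetical helper that produces only the ASCII area of a 16-bytes-per-line dump.

```python
def ascii_column(data, width=16):
    """Return the ASCII area of a dump, one string per visual line."""
    return ["".join(chr(b) if 32 <= b < 127 else "." for b in data[i:i + width])
            for i in range(0, len(data), width)]

payload = b"user=alice&password=hunter2"  # invented sample data
lines = ascii_column(payload)             # "password" straddles the 16-byte boundary
needle = "password"

# Searching the textual output line by line misses the occurrence...
found_in_text = any(needle in line for line in lines)
# ...while searching the original bytes finds it.
found_in_data = needle.encode("ascii") in payload
```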
The second popular approach preserves each log record intact. All log records continue their existence independently and the logging system can identify each record separately and perform operations pertaining only to records of specific types.
In known systems, however, this approach leads to a significant limitation in the way the data is presented in the data viewing facility. In known systems, such as the Wireshark network protocol analyzer, each log record in the data viewing facility is accorded a single visual data line regardless of the type and contents of this record.
This single-line representation works well for simple message records, but does not allow directly showing blocks of data which require multiple visual data lines. As a result, the actual contents of the records of certain types are displayed in a separate area of the data viewing facility, with the obvious result that only the contents of a single such record can be visible at any given time. An example of such output is shown in FIG. 3.
The output 30 consists of two areas 31 and 32. The area 31 displays a list of logged records. Contents of the log records of certain types are displayed directly within the area 31. For other types of records, such as the records containing a block of data, the contents are displayed in a separate area 32. The area 32 always presents the contents of a single log record 33 selected in the main list shown in the area 31.
With the division into records and the record structure preserved, logging systems of the second type can easily perform operations pertaining only to specific record types. For example, it becomes possible to perform substring searches in binary data, an operation which cannot be performed reliably in logging systems using the first approach.
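A type-restricted search of this kind might look as follows in Python. The record classes, the function name, and the sample log contents are all hypothetical and serve only to show that preserved record boundaries let a search ignore records of other types.

```python
from dataclasses import dataclass

@dataclass
class MessageRecord:
    text: str

@dataclass
class DataRecord:
    payload: bytes

def find_in_data(records, needle):
    """Return indices of DataRecord entries whose payload contains `needle`."""
    return [i for i, r in enumerate(records)
            if isinstance(r, DataRecord) and needle in r.payload]

# Invented sample log; the third record is a message that also mentions "password".
log = [MessageRecord("TCP session established"),
       DataRecord(b"user=alice&password=hunter2"),
       MessageRecord("password prompt shown"),
       DataRecord(b"HTTP/1.1 200 OK"),
       MessageRecord("TCP session terminated")]

hits = find_in_data(log, b"password")
```

Only the data record at index 1 is reported; the message record containing the same word is skipped because the search is limited to one record type.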
A significant disadvantage of this approach, which will be immediately obvious to those skilled in the art, is the impossibility of viewing the contents of more than one log record at any given time. One of the reasons for choosing this approach may be the difficulty of implementing efficient scrolling in the data viewing facility of a logging system that allocates a variable number of visual data lines to each log record in accordance with the type and the amount of data in that record.
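To make the scrolling difficulty concrete: when records occupy different numbers of visual data lines, a scroll position expressed as a line number no longer identifies a record directly. The Python sketch below shows one generic way to resolve it, a table of cumulative line counts searched by bisection. It is offered only as an illustration of the problem, not as the method of any system described here, and all names and line counts are assumptions.

```python
from bisect import bisect_right
from itertools import accumulate

def build_index(line_counts):
    """Return starts, where starts[i] is the log-view line at which record i begins."""
    return [0] + list(accumulate(line_counts))[:-1]

def locate(starts, line):
    """Map a log-view line number to (record index, line within that record)."""
    i = bisect_right(starts, line) - 1
    return i, line - starts[i]

line_counts = [1, 4, 1, 2]          # assumed lines per record: message, data, message, data
starts = build_index(line_counts)   # first line of each record within the log view
position = locate(starts, 4)        # which record owns log-view line 4?
```

Such a table must be rebuilt or extended whenever records are added or their line counts change, which is part of what makes variable-height scrolling harder than the one-line-per-record case.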