Currently, many computer systems track the performance of processes and log performance data so that it can be viewed and analyzed by users. The most common technique for retrieving data from a log file is to read the data from the beginning of the log file to its end sequentially. However, this approach has some drawbacks. First, the user cannot obtain a summary view of the entire log file without waiting for the entire file to be read and summarized. Second, it is difficult to view data in a selected time range when the time range is located far from the beginning of the file because all of the preceding records must be read before the desired records can be located.
While these drawbacks are not significant when the file size is small, they become considerable as the size of the log file grows. Log files often record numerous activities over a substantial period of time, so large log files that require analysis are quite common.
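By way of illustration only, the cost of the sequential approach described above may be sketched as follows. The sketch is written in Python and rests on assumptions that are not drawn from any particular logging system: a hypothetical text log holding one record per line in the form `timestamp,payload`, records stored in ascending timestamp order, and a hypothetical helper named `scan_time_range`.

```python
import io


def scan_time_range(log, start, end):
    """Read `log` sequentially from its beginning and return the records whose
    timestamp lies in [start, end], together with the number of records read.

    Assumed (hypothetical) record format: "<integer timestamp>,<payload>".
    """
    matches = []
    records_read = 0
    for line in log:
        records_read += 1
        timestamp_text, _, payload = line.rstrip("\n").partition(",")
        timestamp = int(timestamp_text)
        if timestamp > end:
            break  # valid only because records are assumed to be time-ordered
        if timestamp >= start:
            matches.append((timestamp, payload))
    return matches, records_read


# Hypothetical log of 100,000 records, one per time unit.
log = io.StringIO("".join(f"{t},cpu={t % 100}\n" for t in range(100_000)))

# Request a short time range located near the end of the file.
matches, records_read = scan_time_range(log, 99_990, 99_995)
print(len(matches), records_read)
```

In this sketch only six records fall within the requested range, yet nearly the entire log is read before they are reached, and the number of records read grows with the distance of the range from the beginning of the file.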
The prior art discloses approaches for obtaining summary data and efficiently accessing records in log files and databases. For example, U.S. Pat. No. 5,819,066 (the '066 patent) discloses, inter alia, benchmarking a database server by generating analysis reports from log information stored in transition log files and process log files. U.S. Pat. No. 6,114,967 to Nock (the '967 patent) discloses generation of a custom log analysis framework encapsulating the common attributes needed by log analysis tools. Similarly, U.S. Pat. No. 6,493,699 to Colby et al. (the '699 patent) discloses defining and characterizing an analysis space for analysis on a user defined subset of detail data to reduce analysis time. U.S. Patent Application Publication 2003/0055809 to Bhat (the '809 publication) discloses configuring log files with header information to allow a logging service to directly access various locations of the log file. Furthermore, U.S. Patent Application Publication 2003/0220940 to Futoransky et al. (the '940 publication) discloses secure auditing of information systems that analyze audit log data. U.S. Pat. No. 5,961,598 to Sime (the '598 patent) discloses a system and method for internet gateway performance charting that displays selected performance charts based upon gathered statistics. U.S. Patent Application Publication 2002/0111887 to McFarlane et al. (the '887 application) discloses an employee online activity monitoring system that monitors employee online activity. The '066 patent, the '967 patent, the '699 patent, the '809 publication, the '940 publication, the '598 patent, and the '887 application disclose methods for obtaining summary data, but these approaches do not include generation of summary data inside a log during the logging process.
In addition to the patents and publications discussed above, U.S. Patent Application Publication 2002/0174136 to Cameron et al. (the '136 publication) discloses high-performance transaction processing using a relational database. However, the '136 publication neither maintains summary data within the log file nor improves the efficiency of retrieving data records in a non-sequential way. U.S. Pat. No. 6,789,115 to Singer et al. (the '115 patent) discloses a system that captures, analyzes, stores, and reports system users' usage of multiple internet and/or intranet web servers. However, the system disclosed in the '115 patent does not aid in efficiently retrieving data records in a non-sequential way, and also does not reduce the number of input/output operations for retrieving summary data and individual data records. Furthermore, U.S. Patent Application Publication 2002/0138762 to Horne (the '762 publication) discloses management of log archival and reporting for data network security systems. However, the '762 publication does not generate summary data during the logging process, and does not integrate the archival and analysis processes.
What is needed beyond the prior art is a method to generate summary data from a log file and to locate data in a log file during the logging process so that data records are retrieved efficiently in a non-sequential way, the number of input/output operations for retrieving the summary data is reduced, and the archival process is integrated with the analysis process.