In computer log management and intelligence, log analysis (or system and network log analysis) is a set of techniques for providing insight into computer-generated records (also called log or audit trail records). The process of creating such records is called data logging. People perform log analysis for reasons including the following: (i) compliance with security policies; (ii) compliance with audits; (iii) system troubleshooting; (iv) forensics; and/or (v) security incident response. Logs are emitted by network devices, operating systems, applications and other programmable devices. A stream of messages in time sequence may constitute a log. Logs are sometimes directed to files and stored on disk, or directed as a “network stream” to a log collector. Log messages: (i) are interpreted with respect to the internal state of their source (for example, an application); and (ii) announce security-relevant or operations-relevant events (for example, a system error). Logs are often created by software developers to aid in debugging the operation of a software application. Log analysis interprets messages in the context of an application, vendor, system or configuration (each of which may have its own form and format for generating logs) in order to make useful comparisons among messages from different log sources.
Some logging-related functions include: (i) pattern recognition, which selects incoming messages and compares them with a pattern book in order to filter them or handle them in different ways; (ii) normalization, which converts message parts to the same format (e.g. a common date format or a normalized IP address); (iii) classification and tagging, which orders messages into different classes or tags them with different keywords for later use (e.g. filtering or display); and (iv) correlation analysis, which collects messages from different systems and finds all the messages that belong to a single event (e.g. messages generated by malicious activity on different systems: network devices, firewalls, servers, etc.).
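The four functions above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `PATTERN_BOOK` contents, the message formats, the record layout, and the choice of an IP address as the correlation key are hypothetical and do not come from any particular product. Timestamps are assumed to be in UTC already.

```python
import re
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical pattern book: tag name -> regular expression.
PATTERN_BOOK = {
    "auth_failure": re.compile(r"Failed password for (?P<user>\S+) from (?P<ip>\S+)"),
    "fw_drop": re.compile(r"DROP src=(?P<ip>\S+)"),
}

def normalize_timestamp(raw, fmt):
    """Normalization: convert a source-specific timestamp to ISO 8601.

    Assumes the source timestamp is already expressed in UTC.
    """
    return datetime.strptime(raw, fmt).replace(tzinfo=timezone.utc).isoformat()

def classify(message):
    """Pattern recognition and tagging: return (tag, extracted fields)."""
    for tag, pattern in PATTERN_BOOK.items():
        match = pattern.search(message)
        if match:
            return tag, match.groupdict()
    return "unclassified", {}

def process(raw_logs):
    """Turn (source, timestamp, timestamp format, message) tuples into records."""
    records = []
    for source, raw_ts, fmt, message in raw_logs:
        tag, fields = classify(message)
        records.append({
            "source": source,
            "time": normalize_timestamp(raw_ts, fmt),
            "tag": tag,
            "fields": fields,
        })
    return records

def correlate(records, key="ip"):
    """Correlation: group records from different sources that share a field value."""
    groups = defaultdict(list)
    for record in records:
        value = record["fields"].get(key)
        if value is not None:
            groups[value].append(record)
    return dict(groups)

# Hypothetical sample input from two sources with different date formats.
SAMPLE_LOGS = [
    ("server", "2024-01-05 10:00:00", "%Y-%m-%d %H:%M:%S",
     "Failed password for root from 10.0.0.7"),
    ("firewall", "05/01/2024 10:00:02", "%d/%m/%Y %H:%M:%S",
     "DROP src=10.0.0.7 dst=10.0.0.1"),
    ("server", "2024-01-05 10:05:00", "%Y-%m-%d %H:%M:%S",
     "Service started"),
]
```

Calling `correlate(process(SAMPLE_LOGS))` groups the server and firewall messages under the shared address `10.0.0.7`, while the unmatched "Service started" message is tagged `unclassified` and left out of the correlation. A real deployment would likely correlate on more than one field and within a time window; this sketch uses a single key to keep the idea visible.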
Debugging logs are created by many components of an enterprise including: (i) systems; (ii) switches; (iii) networks; and (iv) software. In a typical log analysis environment, the logs emitted from each of the components of the enterprise are synced to a central search cluster. This collection of debugging logs is performed by programs called “sensors.” Sensors collect machine data from certain enterprise components. Collectors are deployed by an enterprise to collect, aggregate, augment, and/or distribute machine data, including logs, mobile transactions, and/or other files from multiple sensors. A collector that is unavailable may delegate, or fail over, to another collector. Cluster analysis (commonly called “clustering”) is the task of grouping a set of objects in such a way that objects in the same group (called a “cluster”) are more similar to each other than to those in other groups (clusters). A debugging search cluster is the group of objects associated with an abnormal system condition.
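As a rough illustration of the clustering idea, the following Python sketch groups log messages whose wording is similar. The technique shown is an assumption chosen for brevity: a greedy single-pass grouping based on Jaccard similarity of word sets, with a hypothetical `threshold` parameter. It is not a method described in the text above, and production systems typically use more robust algorithms.

```python
def tokens(message):
    """Split a message into a set of lowercase words."""
    return set(message.lower().split())

def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def cluster_messages(messages, threshold=0.5):
    """Greedy clustering: each message joins the first cluster whose
    representative (the cluster's first message) is similar enough,
    otherwise it starts a new cluster.
    """
    clusters = []  # list of (representative token set, member messages)
    for message in messages:
        message_tokens = tokens(message)
        for representative, members in clusters:
            if jaccard(representative, message_tokens) >= threshold:
                members.append(message)
                break
        else:
            clusters.append((message_tokens, [message]))
    return [members for _, members in clusters]
```

For example, `cluster_messages(["disk full on node1", "disk full on node2", "user alice logged in", "user bob logged in"])` places the two disk messages in one group and the two login messages in another, because each pair shares three of five distinct words.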
A collection framework manages the sensors to take actions based on data collected from various sources. For example, a collection framework syncs collected data to a corresponding debugging search cluster.
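The collector failover and the sync step described above might be sketched as follows. The class names, the `cluster_for_component` mapping, the `"default"` cluster name, and the in-memory lists standing in for search clusters are all hypothetical; they only illustrate the flow of data, not a specific implementation.

```python
from collections import defaultdict

class CollectorUnavailable(Exception):
    """Raised when a collector cannot accept data."""

class Collector:
    """Hypothetical collector that buffers records received from sensors."""

    def __init__(self, name, available=True):
        self.name = name
        self.available = available
        self.received = []

    def collect(self, record):
        if not self.available:
            raise CollectorUnavailable(self.name)
        self.received.append(record)

class CollectionFramework:
    """Hypothetical framework: routes records to collectors with failover,
    then syncs buffered records to the search cluster for their component.
    """

    def __init__(self, collectors, cluster_for_component):
        self.collectors = collectors  # ordered, primary collector first
        self.cluster_for_component = cluster_for_component
        self.search_clusters = defaultdict(list)  # stand-in for real clusters

    def submit(self, record):
        """Deliver a record to the first available collector; return its name."""
        for collector in self.collectors:
            try:
                collector.collect(record)
                return collector.name
            except CollectorUnavailable:
                continue  # fail over to the next collector
        raise RuntimeError("no collector available")

    def sync(self):
        """Move buffered records into the cluster mapped to their component."""
        for collector in self.collectors:
            while collector.received:
                record = collector.received.pop(0)
                cluster = self.cluster_for_component.get(
                    record["component"], "default")
                self.search_clusters[cluster].append(record)
```

With a primary collector marked unavailable and a backup in the list, `submit` returns the backup's name, and a later `sync` files the record under whichever cluster name the mapping assigns to its component.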