The present invention relates generally to the field of data stream analysis, and more particularly to extracting relevant data from an incoming log stream.
Security log streams come in many formats. The structure of a security log stream and the information it contains vary widely from device to device. Because of this variability, a method for parsing out desired fields may make use of regular expressions (regexes). Although using regexes can be more efficient than other string comparison methods, it still has downsides and may not be fast enough for real-time analysis of security log streams.
In regular-expression-based parsing methods, a user manually crafts regular expressions to match different pieces of useful information in various kinds of security log streams. When a log format changes or new log types are added, the user must modify the existing regular expressions or add new code containing new regular expressions to support the changes and additions. Moreover, the efficiency of parsing may depend on the technical skill of the user crafting the regular expressions. Correlating and understanding the messages contained within security log streams is crucial to any network's security.
The aforementioned correlation should be done in real time; otherwise, it defeats the purpose of analyzing security threats and vulnerabilities and of reducing the potential harm they cause to the company. Such a swift, real-time approach, while trying not to compromise the accuracy of risk detection, needs to handle log streams generated by the majority, if not all, of the devices present in the network.
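By way of illustration only, the manual regular-expression approach described above might look like the following minimal sketch. The two log formats, the field names, the sample lines, and the `PATTERNS` and `parse_line` names are all hypothetical and are not taken from any particular device or product; they merely show how each supported format requires its own hand-written pattern.

```python
import re

# Hypothetical hand-written patterns, one per supported log format.
# Every new or changed format requires a new or edited entry here.
IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"

PATTERNS = {
    "firewall": re.compile(
        r"^(?P<timestamp>\S+) (?P<host>\S+) action=(?P<action>\w+) "
        rf"src=(?P<src_ip>{IPV4}) dst=(?P<dst_ip>{IPV4})$"
    ),
    "ssh": re.compile(
        r"^(?P<timestamp>\S+) (?P<host>\S+) sshd\[\d+\]: "
        rf"Failed password for (?P<user>\S+) from (?P<src_ip>{IPV4})"
    ),
}


def parse_line(line):
    """Try each pattern in turn; return (format_name, fields) or None."""
    for name, pattern in PATTERNS.items():
        match = pattern.match(line)
        if match:
            return name, match.groupdict()
    return None  # Unsupported format: a new pattern would have to be written.


if __name__ == "__main__":
    # Made-up sample lines for the two hypothetical formats above.
    samples = [
        "2024-01-01T00:00:00Z fw01 action=deny src=10.0.0.5 dst=192.168.1.9",
        "2024-01-01T00:00:01Z web01 sshd[4242]: Failed password for root "
        "from 203.0.113.7 port 22 ssh2",
        "a line in a format nobody has written a pattern for",
    ]
    for sample in samples:
        print(parse_line(sample))
```

In this sketch each incoming line is tested against every pattern until one matches, so the per-line cost grows with the number of supported formats, and a line in an unanticipated format yields no fields at all until a user writes a new pattern for it.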