Managed security services providers (“MSSP”) generally provide real-time monitoring of the networks and system infrastructure of their customers, e.g., network hardware and applications, to proactively search for and address potential security threats, and typically log or otherwise accumulate activity data on such networks and infrastructure. A single MSSP may track and log activity for thousands of individual clients, and in doing so, may ingest logs from hundreds of different device types, including hundreds of thousands of user devices, thereby generating and/or receiving potentially billions of activity logs each day. As these logs come in, they must be quickly and efficiently normalized to a machine-understandable format and analyzed to detect/determine possible security issues. Different clients serviced by an MSSP, however, may format their data or network information in a variety of different ways, e.g., with different syntax, formats, etc., and further may utilize a number of different types of devices running a variety of software programs and/or variations/versions thereof. As a result, though the received logs may contain substantially the same information, they can be received in a wide-ranging variety of different formats, many of which may not be easily translatable.
Accordingly, accumulated logs typically have to be normalized, adapted, or curated for security information event monitoring (“SIEM”), for example, by parsing or other suitable syntactic analysis that employs rule-based systems designed to normalize unstructured/raw log data. Depending on the number of customers, devices, systems, etc., being monitored, this can result in thousands of parsing scripts or schemas being generated to normalize the various differing log formats. The normalization, adaptation, and/or rule curation is a data-intensive and time-critical process, and MSSPs generally have to operate large data engineering teams for generating such parsing scripts and managing the tens of thousands of parsing scripts enabled for the normalization, adaptation, or curation of the incoming logs on their monitoring platform(s). Such teams further must take on extensive work to create new scripts and/or amend existing scripts when log formats are changed or when the MSSP takes on new clients. For example, when software is patched or infrastructure changes, log values and formats also can change, which has the potential to create problems for existing rule-based normalization methods or existing parsing scripts, requiring such scripts to be changed/updated and/or additional parsing scripts to be generated. Such a change or update also can affect security monitoring, for example, leaving clients vulnerable to security attacks during the time it takes to update the scripts, particularly when clients do not inform the MSSP that such changes/updates have occurred, since existing parsing scripts may not be able to properly format the new, unstructured data.
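By way of illustration only, the following minimal Python sketch shows how a rule-based normalizer of the general kind described above might operate, and how it can fail when a log format changes. The vendor names, log layouts, field names, and the `normalize` function are hypothetical assumptions made for this example; they are not taken from any particular MSSP platform or from the present disclosure.

```python
import re
from typing import Optional

# Hypothetical parsing rules: one regex per log format. Each rule uses the
# same named groups so that every match maps onto one common schema.
RULES = {
    "vendor_a": re.compile(
        r"src=(?P<src_ip>\S+) dst=(?P<dst_ip>\S+) action=(?P<action>\w+)"
    ),
    "vendor_b": re.compile(
        r"(?P<action>ALLOW|DENY) from (?P<src_ip>\S+) to (?P<dst_ip>\S+)"
    ),
}


def normalize(raw: str) -> Optional[dict]:
    """Return the first rule match as a normalized record, or None."""
    for name, pattern in RULES.items():
        match = pattern.search(raw)
        if match:
            record = match.groupdict()
            record["action"] = record["action"].lower()
            record["rule"] = name  # which hypothetical rule matched
            return record
    return None  # no existing rule recognizes this format


# Two differently formatted logs carrying substantially the same information.
a = normalize("src=10.0.0.1 dst=10.0.0.2 action=DENY")
b = normalize("DENY from 10.0.0.1 to 10.0.0.2")

# A hypothetical format change (e.g., after a software patch) that no
# existing rule covers, so the log is left unnormalized.
changed = normalize("source:10.0.0.1 dest:10.0.0.2 verdict:DENY")
```

In this sketch, the first two logs normalize to the same fields, while the third returns `None` until a new rule is written, which reflects the maintenance burden noted above: each new or changed format generally requires another hand-written rule.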
Accordingly, it can be seen that a need exists for more efficient ways to manage data accumulation and adaptively monitor incoming security log data. The present disclosure addresses these and other related and unrelated problems in the art.