A typical conventional SIEM system is fully centralized. The centralized SIEM system collects raw log information from monitored remote applications of an enterprise environment, and uses the collected raw log information to build a comprehensive database of application activity. The system subsequently performs correlations on the data stored in the database to determine, for example, if specified patterns are found.
This conventional centralized approach has a number of significant drawbacks. For example, collecting and indexing raw log information in a centralized location increases latency. Also, many of the desired correlations may not require the complete environment context, and so working in a single large centralized database slows the system performance, as much of the data collected and indexed is not relevant to a particular query. In addition, since there are usually many queries to be correlated, and these are sharing a single database, it is unduly complex to prioritize and otherwise schedule resources for a selected subset of queries that might affect a subset of users or services. These and other problems create serious scalability issues for centralized SIEM systems, and as a result it is becoming increasingly difficult to implement such systems in large-scale public or private clouds or using other types of distributed virtual infrastructure.
Another important drawback of the centralized SIEM approach is loss of application context. Since the log information is transmitted to a single collection point, relevant application context may be lost. For example, the log information may not contain all of the disk or memory program context that existed at the moment the log record was persisted. By its nature, a log record is a very specific, limited summary of something that the application chooses to record. It is not everything that is known to the application when the log record was written. More information, such as current power consumption, or the identification of other processes running on the host, may be relevant, but will not be logged natively by the application, and therefore will not be accessible to the centralized SIEM system.
These and other issues are addressed in U.S. patent application Ser. No. 12/982,288, filed Dec. 30, 2010 and entitled “Distributed Security Information and Event Management System with Application-Injected Remote Components,” which is commonly assigned herewith and incorporated by reference herein. Embodiments disclosed therein provide a distributed SIEM system that comprises a centralized portion and a plurality of remote portions, with the remote portions being implemented in respective applications within information technology infrastructure. Each of the remote portions comprises one or more remote components inserted into the corresponding application. At least a subset of the remote components of the remote portion are configured for interaction with one or more corresponding centralized components of the centralized portion of the system.
In such an arrangement, remote components of the SIEM system may be injected directly into applications running on servers or other types of information technology infrastructure, which may comprise distributed virtual infrastructure. The distributed SIEM system is therefore more scalable, more responsive and more autonomic than the conventional centralized SIEM system. The distributed SIEM system can narrow the window between initial suspect activity and the initial SOC response, while at the same time offloading some of the application-specific or routine correlation work from the centralized portion of the SIEM system. This allows the centralized portion of the SIEM system to monitor for activity at a higher level, correlating across a set of richer, application-context aware events.
Despite the advances provided by the above-described techniques, a need remains for further improvements in SIEM systems. For example, security analysts using such systems typically must perform manual security analysis each time a suspicious event occurs. In particular, they will manually seek system-state data that looks out of the ordinary, and attempt to trace connections between events where appropriate. Accordingly, most of the actual understanding and process lies with the human agent interpreting the behaviors witnessed on a network.