The field of security information/event management (SIM or SIEM) is generally concerned with 1) collecting data from networks and networked devices that reflects network activity and/or operation of the devices and 2) analyzing the data to enhance security. For example, the data can be analyzed to identify an attack on the network or a networked device and determine which user or machine is responsible. If the attack is ongoing, a countermeasure can be performed to thwart the attack or mitigate the damage caused by the attack. The data that is collected usually originates in a message (such as an event, alert, or alarm) or an entry in a log file, which is generated by a networked device. Exemplary networked devices include firewalls, intrusion detection systems, and servers.
Each message or log file entry (“event”) is stored for future use. Stored events can be organized in a variety of ways. Each organizational method has its own advantages and disadvantages when it comes to writing event data, searching event data, and deleting event data.
Consider the following scenario: Each event includes an attribute called event receipt time. Since the value of the event receipt time attribute is frequently used for searching, store events based on their event receipt times. For example, create one file for each minute of the day. In order to store an event, determine that event's event receipt time. Append the event to the file that corresponds to the minute in which the event was received.
When subsequent events arrive, their event receipt times will always increase monotonically. This means that writing the subsequent event data will require only append operations. No seeking of the storage medium is necessary. This makes for good efficiency in writing the event data. In order to search the event data based on event receipt times, once the first event has been identified, the subsequent events are available by reading the storage medium in order. Again, no seeking is necessary. This makes for good efficiency in searching the event data based on event receipt time. In order to delete the oldest event data, the oldest files are deleted. If the oldest file is always deleted first, then the storage medium will not become fragmented. This makes for good efficiency in deleting the event data.
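The scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any particular product: the JSON-lines encoding, the zero-padded file names, the field name `receipt_time` (whole seconds), and the helper names `bucket_path`, `store_event`, `search_by_time`, and `delete_oldest` are all hypothetical.

```python
import json
from pathlib import Path
from typing import Optional

MINUTE = 60  # seconds covered by one file (assumption: one file per minute)


def bucket_path(root: Path, receipt_time: int) -> Path:
    """Return the file that holds events received in the same minute."""
    # Zero-padded minute index, so file names sort in chronological order.
    return root / f"{receipt_time // MINUTE:012d}.log"


def store_event(root: Path, event: dict) -> None:
    """Append the event as one JSON line; writing never requires a seek."""
    path = bucket_path(root, event["receipt_time"])
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")


def search_by_time(root: Path, start: int, end: int) -> list:
    """Return events with start <= receipt_time < end.

    Only the minute files that overlap the range are opened, and each
    one is read sequentially from beginning to end.
    """
    hits = []
    for minute in range(start // MINUTE, (end - 1) // MINUTE + 1):
        path = root / f"{minute:012d}.log"
        if not path.exists():
            continue
        with path.open(encoding="utf-8") as f:
            for line in f:
                event = json.loads(line)
                if start <= event["receipt_time"] < end:
                    hits.append(event)
    return hits


def delete_oldest(root: Path) -> Optional[Path]:
    """Delete the oldest minute file and return its path, or None if empty."""
    files = sorted(root.glob("*.log"))
    if not files:
        return None
    files[0].unlink()
    return files[0]
```

In this sketch all three operations stay cheap because the file layout mirrors the order of the receipt times: writes are pure appends, a time-range search opens only the files that overlap the range, and deletion always removes a whole file from the old end.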
The problem with this approach is that searching the event data based on any attribute other than the event receipt time is very time-consuming. For example, assume that each event also includes an attribute that indicates the device or application that generated the event (“event source”). In order to search the event data for events that indicate a particular event source (i.e., events that include a particular value for the event source attribute), the entire storage medium will have to be scanned, because the files are organized only by event receipt time and nothing narrows the search to the events of interest. This is very inefficient, since the cost of the search grows with the total amount of stored event data rather than with the number of matching events.
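The cost of such a search can be made concrete with a short Python sketch. It assumes a hypothetical layout in which a directory holds one `.log` file per minute and each line is a JSON object with a `source` field; the function name `search_by_source` and the `examined` counter are illustrative only.

```python
import json
from pathlib import Path


def search_by_source(root: Path, source: str) -> tuple:
    """Return (matching events, number of events that had to be read).

    The files are organized by receipt time only, so nothing indicates
    which of them contain the requested source: every file is opened and
    every event is decoded and compared.
    """
    hits = []
    examined = 0
    for path in sorted(root.glob("*.log")):
        with path.open(encoding="utf-8") as f:
            for line in f:
                examined += 1
                event = json.loads(line)
                if event.get("source") == source:
                    hits.append(event)
    return hits, examined
```

The `examined` count always equals the total number of stored events, regardless of how few events match.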
What is needed is a way to store security information/events efficiently while supporting querying for different event attributes.