The amount of data to be processed is increasing very rapidly. This growth can be observed in almost every business field, and especially in the area of computer network security. Other business fields, however, also use databases that manage large amounts of data.
The invention can be applied in the field of interactive support systems for security event monitoring, and in particular to systems supporting centralized monitoring of security events and alarms generated by a multiplicity of sensors such as intrusion detection systems. A further application area is the analysis of marketing data in online analytical processing.
Most often, tabular displays of records are used for analyzing multivariable data sets. Each row in the table represents a record, and each column of the table displays the values of one field of the records. For larger numbers of records, such a tabular display will contain only the few records that fit on the screen, the other records being reachable through interactive scrolling, by which the window of visible records is moved. One frequently needed analysis in such contexts is to identify groups of similar records having identical or nearly identical values in at least one of their fields. Once such a group or block of similar records is identified, the second question is how homogeneous the group is, that is, how well the records in the group match in their most important field values so that they can be treated as closely related, or whether a more homogeneous subgroup can be identified.
A usual method for identifying groups of records that have a particular field value in common is to sort the records in the table according to the given field value. Such sorting places all records with the same field value next to each other. Users can then inspect the list of records and their field values more closely and find out how many groups with different field values are present in the list and how homogeneous the identified groups are.
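The sort-then-inspect method described above can be sketched in a few lines. This is only an illustrative sketch: the record layout (dictionaries with `source_ip` and `alarm_type` keys), the sample values, and the helper name `group_by_field` are assumptions made for the example and are not taken from any particular tool.

```python
from itertools import groupby
from operator import itemgetter


def group_by_field(records, field):
    """Sort records by one field and return (value, block) pairs.

    After sorting, records sharing a field value are adjacent, so
    groupby can collect each run of equal values into one block.
    """
    ordered = sorted(records, key=itemgetter(field))
    return [
        (value, list(block))
        for value, block in groupby(ordered, key=itemgetter(field))
    ]


# Hypothetical sample records, for illustration only.
records = [
    {"source_ip": "10.0.0.7", "alarm_type": "port-scan"},
    {"source_ip": "10.0.0.1", "alarm_type": "port-scan"},
    {"source_ip": "10.0.0.7", "alarm_type": "login-failure"},
]

blocks = group_by_field(records, "source_ip")
for value, block in blocks:
    print(value, len(block))
```

The sketch makes the grouping explicit in code, whereas in a sorted table the user still has to find the block boundaries by reading the field values.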
However, with this approach it can be difficult to recognize the groups and assess their homogeneity if field values are visually similar. For example, if only a few digits in a long number differ, these differences are easy to miss when glancing over the list of records. Users who want to be sure that they have correctly assessed the equality or inequality of field values therefore spend considerable time on detailed inspection of those values. Depending on the importance of the decision and the number of fields that can contain visually similar values, the effort needed for this group identification task can be a burden to users and slow them down considerably in their overall task.
With the expansion of the Internet, electronic commerce and distributed computing, the amount of information transmitted via computer networks is continuously increasing. These possibilities have opened many new business horizons. However, they have also resulted in a considerable increase in illegal computer intrusions. That is why intrusion detection has become a rapidly developing domain. An intrusion detection system is composed of hardware components and software components. The hardware components are used for receiving, processing and displaying the so-called events. An event is a multivariable data record having multiple data properties or fields. The events are monitored to determine whether an attack or a potential intrusion has occurred. Given the current state of network intrusion detection systems and event correlation technology, the monitoring of events by human specialists is used to considerably reduce the number of false alarms that network-based intrusion detection systems typically report.
To perform this task more efficiently and effectively, human operators are supported in their work. However, before an event is visualized, it is processed by means of a pattern detection algorithm. This algorithm detects whether an arriving event is part of a given pattern by comparing the fields allocated to that pattern with the fields associated with the arriving event. After this kind of pattern recognition has been used to filter the arriving events, the detected events or alarms are visualized or displayed.
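The field comparison described above could look roughly like the following. The text does not specify how the comparison is carried out, so this sketch assumes the simplest case: a pattern is a set of field values, and an event belongs to the pattern if it has exactly those values in those fields. The names `matches_pattern` and `filter_events` and the sample data are hypothetical.

```python
def matches_pattern(event, pattern):
    """Return True if every field named in the pattern is present
    in the event with exactly the same value (assumed rule)."""
    return all(
        field in event and event[field] == value
        for field, value in pattern.items()
    )


def filter_events(events, patterns):
    """Keep only the events that match at least one pattern."""
    return [
        event for event in events
        if any(matches_pattern(event, pattern) for pattern in patterns)
    ]


# Hypothetical events and pattern, for illustration only.
events = [
    {"source_ip": "10.0.0.7", "destination_ip": "10.0.1.2", "alarm_type": "port-scan"},
    {"source_ip": "10.0.0.9", "destination_ip": "10.0.1.2", "alarm_type": "login-failure"},
]
patterns = [{"alarm_type": "port-scan"}]

detected = filter_events(events, patterns)
print(detected)
```

A real pattern detection algorithm may use ranges, wildcards or correlations over time instead of exact equality; exact matching is used here only to keep the sketch short.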
The alarms are generated by a multiplicity of sensors, and these sensors generate a large number of ‘false positive’ events, that is, events that are not actual indications of a threat to a network. To determine whether an event or set of events can be classified as ‘false positive’, operators inspect one or more of the properties of the events under investigation. Examples of intrusion event properties include source-IP, destination-IP and alarm type.
Typically, operators in such a centralized security operations center monitor a number of sensors in parallel. These sensors generate a number of security events, which are studied to determine whether they imply a potential threat. Frequently, operators will try to assess events at the level of groups or blocks of events that have at least one property in common. For this, operators sort the events in the table according to one of the event properties so that events with the same field value are moved next to each other. They can then investigate the resulting blocks of similar events and in many cases deal with them at the level of event groups, which is faster than at the level of individual events. Often there will be more events to process than can be displayed on a single screen, and even a single block of events with the same field value might span more than one screen. In some situations it is important for operators to know the relative size of the currently viewed block of events compared to other blocks, to know how many different field values (and therefore blocks of events) are represented in the current list of events, and to find the largest block of events. With state-of-the-art tools for security event monitoring, operators scroll through the whole list of events to gain the overview needed to answer these questions.
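The three overview questions named above (relative block size, number of distinct field values, largest block) amount to counting events per field value. The following is a minimal sketch under that reading; the function name `block_overview`, the returned dictionary layout and the sample events are assumptions made for illustration.

```python
from collections import Counter


def block_overview(events, field):
    """Summarize the blocks that sorting by `field` would produce."""
    sizes = Counter(event[field] for event in events)
    total = sum(sizes.values())
    # most_common(1) is empty for an empty event list.
    largest = sizes.most_common(1)[0] if sizes else None
    return {
        "distinct_values": len(sizes),
        "block_sizes": dict(sizes),
        # Share of all events that falls into each block.
        "relative_sizes": {value: n / total for value, n in sizes.items()},
        "largest_block": largest,
    }


# Hypothetical events, for illustration only.
events = [
    {"source_ip": "10.0.0.7"},
    {"source_ip": "10.0.0.1"},
    {"source_ip": "10.0.0.7"},
    {"source_ip": "10.0.0.7"},
]

overview = block_overview(events, "source_ip")
print(overview["distinct_values"], overview["largest_block"])
```

Such a summary answers the questions without scrolling, but it is detached from the table itself; how the overview is presented to the operator is a separate design question.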