The amount of data to be processed is increasing very rapidly. This growth can be found in almost every business field, and especially in the area of computer network security. Other business fields, however, also use databases that manage large amounts of data.
Tabular displays of records are most often used for analyzing multivariable data sets. Each row of the table represents a record, and each column displays the values of one field across the records. For larger numbers of records, such a tabular display shows only the few records that fit on the screen; the other records are reached through interactive scrolling, which moves the window of visible records. One frequently needed analysis in such contexts is to identify groups of similar records having identical or nearly identical values in at least one of their fields. Once such a group or block of similar records is identified, the second question is how homogeneous the group is, that is, whether the records in the group match closely enough in their most important field values to be treated as closely related, or whether a more homogeneous sub-group can be identified.
A common method for identifying groups of records that have a particular field value in common is to sort the records in the table according to that field. Such sorting places all records with the same field value next to each other. Users can then inspect the list of records and their field values more closely to find out how many groups with different field values are present in the list and how homogeneous the identified groups are.
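The sort-then-inspect method described above can be sketched as follows. This is a minimal illustration only: the record layout, the field names (`id`, `source`, `port`) and the helper names `group_by_field` and `distinct_values` are assumptions made for the example and do not come from any particular system. Homogeneity is approximated here simply by counting distinct values of a second field within each group.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical records; field names and values are illustrative only.
records = [
    {"id": 1, "source": "10.0.0.5", "port": 22},
    {"id": 2, "source": "10.0.0.7", "port": 80},
    {"id": 3, "source": "10.0.0.5", "port": 22},
    {"id": 4, "source": "10.0.0.7", "port": 443},
    {"id": 5, "source": "10.0.0.5", "port": 23},
]


def group_by_field(rows, field):
    """Sort rows by `field` so equal values become adjacent, then collect each run."""
    ordered = sorted(rows, key=itemgetter(field))
    return {value: list(run) for value, run in groupby(ordered, key=itemgetter(field))}


def distinct_values(rows, field):
    """Count distinct values of `field` in a group (1 means the group is uniform)."""
    return len({row[field] for row in rows})


groups = group_by_field(records, "source")
for value, rows in groups.items():
    ids = [row["id"] for row in rows]
    print(value, ids, "distinct ports:", distinct_values(rows, "port"))
```

Sorting first matters because `itertools.groupby` only merges adjacent equal keys, which mirrors how a sorted table places matching records next to each other.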
However, the problem with this approach is that it can be difficult to recognize the groups and assess their homogeneity if field values are visually similar. For example, if only a few digits in a long number are different, these differences are hard to spot by glancing over the list of records. Users who want to be sure that they have correctly assessed the equality or inequality of field values therefore spend considerable time on detailed inspection of those values. Depending on the importance of the decision and the number of fields that can contain visually similar values, the effort needed for this group identification task can be a burden to users and slow them down considerably in their overall task.
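The difficulty with visually similar values can be made concrete with a small sketch. The two 16-digit values below are invented for illustration, and `differing_positions` is a hypothetical helper, not part of any existing tool. The values are hard to tell apart at a glance, yet a character-by-character comparison locates the difference immediately.

```python
def differing_positions(a, b):
    """Return the zero-based indices at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("values must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Invented example values that differ only by two swapped digits.
first = "4929381746502918"
second = "4929381746520918"

positions = differing_positions(first, second)
print(positions)  # prints [11, 12]
```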
With the expansion of the internet, electronic commerce and distributed computing, the amount of information transmitted via computer networks is continuously increasing. These developments have opened many new business horizons. However, they have also resulted in a considerable increase in illegal computer intrusions, which is why intrusion detection has become a rapidly developing domain. An intrusion detection system is composed of hardware components and software components. The hardware components are used for receiving, processing and displaying so-called events. An event is a multivariable data record having multiple data properties or fields. The events are monitored to determine whether an attack or a potential intrusion has occurred. Given the current state of network intrusion detection systems and event correlation technology, the monitoring of events by human specialists is used to considerably reduce the number of false alarms that network-based intrusion detection systems typically report.