1. Technical Field
The invention relates to an indexing device and a method for providing an index for a stream of data.
2. Related Art
A group of computers can be interconnected in a computer communication network to exchange information. Frequently, such a network has connections to other networks, thereby forming a hierarchical computer communication network. One common task in managing such a computer network is to analyze communication at certain points of the network.
In many cases, at least a portion of the communication at one of these points must be stored for later reference. For instance, propagation of malicious software (“malware”) such as viruses, worms or trojans can be traced back to the computer from which it originated by analyzing the traffic inside a network or between networks. In another example, security-related communication, such as monetary transaction data or data related to the access of persons to a building, can be analyzed after the fact. Depending on the size of the network and the communication activity of the computers, the number of messages exchanged between networks of moderate size can easily reach one million messages per second.
While means are known to store the voluminous communication data generated by streams of such high throughput, analyzing the stored communication data remains a problem, as vast amounts of data may have to be filtered for information of interest. To speed up a search for a certain pattern in the stored communication data, data indices are used. For this purpose, the data records forming the stored communication data are split into header and body portions. The header includes information on the sender and the recipient of a message. A bitmap index is created for each of the sender and recipient fields. The bitmap index is then compressed such that pattern matching with search patterns containing Boolean operators, such as AND and OR, can be carried out directly on the compressed columns, for example, “records in which the sender's address is in range X AND the recipient's address is in range Y”. One such coding scheme is disclosed in U.S. Pat. No. 6,831,575.
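By way of illustration only, the bitmap index query described above can be sketched in Python as follows. The sketch keeps one uncompressed bitmap per distinct field value, stored as a Python integer, and answers the range query by combining bitmaps with bitwise OR and AND. The class name BitmapIndex, the helper record_ids, the integer stand-ins for addresses and the sample header data are all assumptions made for this example; the compression step, including the coding of the cited patent, is deliberately omitted.

```python
from collections import defaultdict


class BitmapIndex:
    """Uncompressed bitmap index: one bitmap (a Python int) per distinct value."""

    def __init__(self):
        # value -> bitmap; bit i is set when record i carries that value
        self.bitmaps = defaultdict(int)
        self.count = 0  # number of records indexed so far

    def add(self, value):
        """Index the next record, which carries the given field value."""
        self.bitmaps[value] |= 1 << self.count
        self.count += 1

    def match(self, predicate):
        """OR together the bitmaps of all values that satisfy the predicate."""
        result = 0
        for value, bitmap in self.bitmaps.items():
            if predicate(value):
                result |= bitmap
        return result


def record_ids(bitmap):
    """Return the positions of the set bits, i.e. the matching record numbers."""
    ids = []
    position = 0
    while bitmap:
        if bitmap & 1:
            ids.append(position)
        bitmap >>= 1
        position += 1
    return ids


# Hypothetical header data: (sender, recipient), small integers standing in
# for network addresses.
headers = [(10, 200), (11, 300), (50, 210), (12, 205), (10, 400)]

sender_index = BitmapIndex()
recipient_index = BitmapIndex()
for sender, recipient in headers:
    sender_index.add(sender)
    recipient_index.add(recipient)

# "sender's address in range X AND recipient's address in range Y"
in_x = sender_index.match(lambda s: 10 <= s <= 20)
in_y = recipient_index.match(lambda r: 200 <= r <= 250)
hits = record_ids(in_x & in_y)
print(hits)  # [0, 3]
```

In this sketch the Boolean operators of the search pattern map directly onto bitwise operations on whole bitmaps, so the stored records themselves need not be read until the matching record numbers are known. A compressed index would aim to preserve this property while operating on the compressed representation.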
It is an object of the present invention to provide an improved method for providing a compressed index for a stream of communication data. It is a further object of the invention to provide an indexing device for carrying out said method.