1. Field of the Invention
This invention pertains in general to computer security and in particular to techniques for storing signatures representative of malicious behaviors for an intrusion detection system.
2. Background Art
Intrusion detection systems (or intrusion prevention systems) examine network flows arriving at an enterprise to detect various unusual or suspicious data streams (e.g., viruses, hacker attacks, attempts to look for a vulnerability in the enterprise, or other indications of malicious activity). A network flow is defined by a tuple comprising an origination port, a destination port, and a communications protocol. A data stream is the stream of data packets sent over a network flow. Typically, an intrusion detection system stores a set of signatures representative of malicious behaviors. Each network flow may have its own unique set of signatures. The intrusion detection system detects an unusual or suspicious data stream by comparing a data stream with the stored signatures to determine if the data stream has a pattern that is characterized by one or more of the signatures.
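The flow tuple and the per-flow signature comparison described above can be sketched as follows. This is a minimal illustration only, not the method of any particular intrusion detection system: the names `Flow`, `SIGNATURES`, and `matching_signatures`, the port numbers, and the signature byte strings are all hypothetical, and a plain substring test stands in for whatever pattern matching a real system would use.

```python
from typing import Dict, List, NamedTuple


class Flow(NamedTuple):
    """A network flow: origination port, destination port, protocol."""
    src_port: int
    dst_port: int
    protocol: str


# Hypothetical per-flow signature sets (illustrative byte patterns only).
SIGNATURES: Dict[Flow, List[bytes]] = {
    Flow(40000, 21, "tcp"): [b"SITE EXEC"],
    Flow(40001, 80, "tcp"): [b"/etc/passwd"],
}


def matching_signatures(flow: Flow, stream: bytes) -> List[bytes]:
    """Return the signatures stored for `flow` that occur in `stream`."""
    return [sig for sig in SIGNATURES.get(flow, []) if sig in stream]
```

Under these assumptions, a data stream is flagged only when it contains a pattern from the signature set stored for its own flow; the same bytes arriving on a different flow produce no match.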
To increase an intrusion detection system's efficiency, it is desirable to store a given network flow's signatures within a single data structure (e.g., a hash table) to minimize the lookup time for comparing the signatures with a data stream. Specifically, since thousands of data streams may arrive at an enterprise in any given second, inefficient signature lookup may cause the enterprise's users to experience a slow network connection. However, since there may be thousands of possible network flows directed to the enterprise, storing the signatures for each network flow in a separate data structure may not be practical due to limited storage resources. In particular, creating a data structure such as a hash table for each network flow leads to a large signature file that is costly to distribute and to store in memory. Thus, to preserve storage resources, it may be desirable to create a data structure that stores the signatures of multiple network flows. In other words, the intrusion detection system would use a small set of data structures that are shared globally among the signatures of different network flows.
Storing the signatures of multiple network flows in a single hash table creates additional overhead. Specifically, the intrusion detection system uses the hash table to determine whether a given byte sequence in a network flow matches a byte sequence of a signature for that network flow. However, since the hash table is shared among different network flows, when the intrusion detection system detects a match in the hash table, it must also determine whether the match identifies a signature for the current flow rather than for one of the other network flows that share the same hash table. In addition, storing signatures for different network flows in a single hash table may cause a signature for one network flow to be hashed to the same table entry as a byte that occurs frequently in a different network flow. For example, in a hash table storing signatures for both file transfer protocol (FTP) flows and hypertext transfer protocol (HTTP) flows, a signature for the FTP flows may be hashed to the same table entry as 0x20 (the space character), which is a byte that frequently occurs in the HTTP flows. In such a scenario, hash table lookups may produce false hits during the scanning of an HTTP flow: each occurrence of the frequent byte lands on a table entry that is occupied only by another flow's signature, so the lookup reports a hit that must then be checked and discarded.
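The false-hit problem can be made concrete with a small sketch. It rests on stated assumptions rather than on any real system: the table is keyed on a signature's first byte (a deliberately simple stand-in for a hash function), flows are identified by the hypothetical labels "ftp" and "http", and the signature byte strings are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_shared_table(
    signatures_by_flow: Dict[str, List[bytes]]
) -> Dict[bytes, List[Tuple[str, bytes]]]:
    """Store every flow's signatures in one table keyed by first byte."""
    table: Dict[bytes, List[Tuple[str, bytes]]] = defaultdict(list)
    for flow, signatures in signatures_by_flow.items():
        for sig in signatures:
            table[sig[:1]].append((flow, sig))
    return table


def scan(table, flow: str, stream: bytes):
    """Scan `stream`; return (matches, false_hits) for `flow`.

    A false hit is a lookup that lands on an occupied table entry
    holding no signature for the flow being scanned.
    """
    matches: List[Tuple[int, bytes]] = []
    false_hits = 0
    for i in range(len(stream)):
        bucket = table.get(stream[i:i + 1])
        if not bucket:
            continue  # empty entry: cheap miss, no further work
        own = [sig for f, sig in bucket if f == flow]
        if not own:
            false_hits += 1  # entry belongs only to other flows
            continue
        for sig in own:
            if stream.startswith(sig, i):
                matches.append((i, sig))
    return matches, false_hits


# Hypothetical signatures: the FTP one begins with 0x20 (space).
table = build_shared_table({
    "ftp": [b" ~root"],
    "http": [b"/etc/passwd"],
})
matches, false_hits = scan(table, "http", b"GET /etc/passwd HTTP/1.0\r\n")
print(matches)     # [(4, b'/etc/passwd')]
print(false_hits)  # 2
```

Each of the two space characters in the HTTP request line lands on the entry occupied by the FTP signature, so the scanner pays for a hit and an ownership check that can never yield an HTTP match. In a real stream, where 0x20 is far more frequent, this wasted work grows accordingly.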
Therefore, there is a need in the art for a technique for storing intrusion detection signatures in a way that reduces the number of unnecessary lookups during scanning without introducing additional storage overhead.