Networks such as the Internet, Local Area Networks, Extranets and Intranets are common today. Networks typically comprise communication media, routers, network switches, and firewalls. Computers, such as client computers and servers, are connected to each other via networks.
Network security is important, especially when the network is connected to the Internet, which is not secure. There are various types of malicious “intrusions” that can jeopardize a network. Examples of malicious intrusions are viruses, worms, denial of service attacks, and buffer overflow attacks.
There are various known techniques to protect against such intrusions. A firewall at the gateway to the network or within a computer can block (a) messages containing a known electronic signature of a computer virus or worm, (b) all messages from source IP addresses known from experience to be malicious, (c) messages containing words characteristic of spam, (d) some or all messages from a source IP address which is sending an unusually large number of messages to the same destination IP address, (e) all messages sent from a source IP address to a destination IP address which should not receive messages from this source IP address, (f) entire networks which are known to be malicious and (g) entire countries for which there is no legitimate reason to allow network traffic.
An electronic “signature” of a computer virus, worm or other malicious network activity is a series of bits known from experience to be present in the virus, worm or other malicious network activity. An intrusion detection sensor in a firewall, gateway computer or other network device scans incoming messages for the series of bits that comprise the signature of the virus, worm, or other malicious network activity. If this series of bits is found, then the intrusion detection sensor raises an alarm for inspection by security personnel, and in some cases, can block the virus, worm or other malicious network activity. One major problem with this type of intrusion detection system is the large number of false positive alarms generated by the sensors. A false positive results when an innocent message, by chance, includes the same series of bits and is mistakenly identified by the intrusion detection sensor as malicious activity. Consequently, many intrusion detection sensors are programmed to detect, but not block, messages containing malicious signatures, and simply notify a security analyst, who reviews the flagged network traffic to determine whether it requires further action. After review, the security analyst can update a firewall to block subsequent attacks of this nature.
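The signature scanning described above can be illustrated with a minimal Python sketch. It is not taken from any particular intrusion detection product; the signature names, the byte patterns, and the functions `scan_message` and `handle_message` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical signature table: signature name -> byte pattern.
# The patterns are illustrative placeholders, not real malware signatures.
SIGNATURES: Dict[str, bytes] = {
    "example-worm": b"\xde\xad\xbe\xef",
    "example-overflow": b"\x90\x90\x90\x90",
}


def scan_message(payload: bytes, signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Return the names of all signatures whose byte pattern occurs in the payload."""
    return [name for name, pattern in signatures.items() if pattern in payload]


def handle_message(payload: bytes, block: bool = False) -> str:
    """Decide what the sensor does with one message.

    With block=False the sensor only raises an alert for analyst review,
    which is the detect-but-not-block mode described in the text.
    """
    if not scan_message(payload):
        return "pass"
    return "block" if block else "alert"
```

Because the match is a plain substring test, an innocent message that happens to contain the bytes of a signature is flagged exactly like a real attack, which is how false positives arise in this kind of sensor.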
It was known to reduce the number of false positives based on meta alarms or rules which identify known patterns of alarms which have a high probability of representing true attack patterns in alarm streams, as follows. Vendors identify combinations of two or more signatures of two or more respective messages that will occur in certain types of attacks. For example, some attack messages are preceded by “reconnaissance” messages which probe for vulnerable ports, services or operating systems on the victim machine. Both the reconnaissance messages and the subsequent attack messages have characteristic signatures. Security personnel currently identify these combinations of signatures by manual inspection of alarm logs. Subsequently, if a security analyst receives intrusion sensor alerts that two or more messages with these two or more signatures have arrived from the same source IP address on the same day or within a predetermined time window, the security analyst will send an alarm that this source IP address is probably malicious. In response, there will be further investigation of this source IP address, and if the further investigation warrants, action can be taken to block subsequent messages from this source IP address. While this technique is effective, it is limited to predetermined combinations of signatures, and requires a high level of manual inspection to determine new combinations.
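A meta alarm of the kind described above can be sketched as a check that every signature in a predetermined combination has been seen from the same source IP address within a time window. The sketch below is an assumption-laden illustration, not a vendor implementation: the alert tuple layout, the function name `sources_matching_rule`, and the signature names in the usage example are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

# Hypothetical alert record: (timestamp in seconds, source IP, signature name).
Alert = Tuple[float, str, str]


def sources_matching_rule(
    alerts: Iterable[Alert], rule: FrozenSet[str], window_seconds: float
) -> Set[str]:
    """Return source IPs from which every signature in `rule` was observed
    within a single span of at most `window_seconds`."""
    by_source: Dict[str, List[Tuple[float, str]]] = defaultdict(list)
    for ts, src, sig in alerts:
        if sig in rule:  # alerts outside the rule are irrelevant here
            by_source[src].append((ts, sig))

    flagged: Set[str] = set()
    for src, events in by_source.items():
        events.sort()  # order by timestamp
        start = 0
        in_window: Dict[str, int] = defaultdict(int)  # signature -> count in window
        for ts, sig in events:
            in_window[sig] += 1
            # Drop events that fell out of the window ending at `ts`.
            while ts - events[start][0] > window_seconds:
                old = events[start][1]
                in_window[old] -= 1
                if in_window[old] == 0:
                    del in_window[old]
                start += 1
            if len(in_window) == len(rule):
                flagged.add(src)
                break
    return flagged
```

For example, with a rule of `frozenset({"port-scan", "exploit"})` and a one-hour window, a source that sends a reconnaissance message followed by an attack message 100 seconds later is flagged, while a source whose two messages are 5000 seconds apart is not. The rule itself still has to be supplied in advance, which is the limitation noted in the text.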
It was known to determine events that are associated or correlated to each other based on a “support” factor and a “confidence” factor derived from analysis of events in a set. The “support” factor is based on the frequency with which a given combination of events appears in the set. The greater the frequency, the greater the “support” factor. The “confidence” factor is based on how close the numbers of occurrences of the two events are to a one-to-one relationship. For example, if five groups contain either of the two events under investigation, four of the five groups contain both events, and the fifth group contains only one of the events, then the confidence level is ⅘ or eighty percent that these two events are correlated to each other, because both events occurred in four of the five groups. The closer the numbers match to one-to-one, the greater the “confidence” factor. If the confidence and support factors together are high enough, then events are considered correlated to each other as a combination.
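The two factors can be made concrete with a short Python sketch that follows the description above. It assumes that “support” is the fraction of all groups containing both events and that “confidence” is the number of groups containing both events divided by the number containing either, as in the ⅘ example. Other formulations exist (conventional association-rule mining, for instance, defines confidence as a conditional frequency), and the function name `support_and_confidence` is hypothetical.

```python
from typing import Hashable, Iterable, Set, Tuple


def support_and_confidence(
    groups: Iterable[Set[Hashable]], a: Hashable, b: Hashable
) -> Tuple[float, float]:
    """Compute (support, confidence) for the pair of events (a, b).

    support    = groups containing both a and b / all groups
    confidence = groups containing both a and b / groups containing a or b
    These definitions are an assumption based on the worked example in the text.
    """
    groups = list(groups)
    both = sum(1 for g in groups if a in g and b in g)
    either = sum(1 for g in groups if a in g or b in g)
    support = both / len(groups) if groups else 0.0
    confidence = both / either if either else 0.0
    return support, confidence
```

With ten groups in total, of which four contain both events, one contains only the first event, and five contain neither, this returns a support of 0.4 and a confidence of 0.8, matching the eighty percent figure in the example.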
It was also known to provide a table which lists for each destination IP address the source IP addresses of messages containing malicious signatures that were sent to this destination IP address. U.S. Patent Application “System, Method and Program Product for Visually Presenting Data Describing Network Intrusions”, Ser. No. 11/486,742, filed by James Treinen on Jul. 13, 2006, discloses a system which generates a graphical representation (comprising vertices representing IP addresses and edges representing malicious message flows including their direction) of destination IP addresses of a customer site, and the source IP addresses that sent each destination IP address messages which contain malicious signatures. From this graphical representation, a security analyst can identify source IP addresses that are sending to the customer site a large number of messages containing malicious signatures. When this occurs, it is likely that the source IP address is malicious.
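The table of destination and source IP addresses, and the per-source message counts an analyst would read off the graphical representation, can be sketched as follows. This is only an illustrative data structure and does not reproduce the system of the cited application; the function names and the `(source, destination)` alert layout are assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical alert record: (source IP, destination IP) of one message
# that contained a malicious signature. Each pair is a directed edge.
Flow = Tuple[str, str]


def build_destination_table(flows: Iterable[Flow]) -> Dict[str, Set[str]]:
    """Map each destination IP to the set of source IPs that sent it
    messages containing malicious signatures."""
    table: Dict[str, Set[str]] = defaultdict(set)
    for src, dst in flows:
        table[dst].add(src)
    return dict(table)


def messages_per_source(flows: Iterable[Flow]) -> Counter:
    """Count flagged messages per source IP; a high count suggests
    that the source is likely malicious."""
    return Counter(src for src, _ in flows)
```

Calling `messages_per_source(flows).most_common(1)` then gives the source IP address responsible for the largest number of flagged messages.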
A Knowledge Discovery in Database (“KDD”) process is also known. The KDD process comprises the following steps: (1) understanding the application domain, i.e. analyzing the possible data that can be generated by the application, and understanding the information that is contained in this data, (2) integrating and selecting data, i.e. selecting an appropriate set of data for analysis as a means of obtaining the appropriate end information, (3) mining data, i.e. the actual application of the automated data analysis, (4) evaluating patterns, i.e. inspecting resulting information by skilled analysts and (5) presenting knowledge, i.e. displaying the results in a consumable format for the end users. See “Data Mining for Intrusion Detection: A Critical Review,” by K. Julisch, published in Applications of Data Mining in Computer Security in 2002.
While these techniques are effective in identifying malicious messages and reducing the number of false positives, further improvement can be made to reduce false positives based on the presence of malicious signatures.
An object of the present invention is to identify malicious messages based in part on presence of malicious signatures while reducing false positives.
Another object of the present invention is to automatically take corrective action against malicious messages.