Computer network security has become increasingly important as sensitive and confidential information is increasingly transmitted across computer networks. To that end, computers often provide secure environments for storing such information by employing mechanisms that protect the stored information. Unfortunately, with the increasingly ubiquitous use of the Internet, the effectiveness of these security mechanisms is being compromised by a continuous series of malicious attacks directed toward gaining access to the sensitive and confidential information stored within those secure environments.
Blacklisting is a widely used defense practice against malicious traffic on the Internet. Blacklisting encompasses the compiling and sharing of lists of prolific attacking systems in order to predict and block future attacks. The existing blacklisting techniques have focused on the most prolific attacking systems and, more recently, on collaborative blacklisting.
More formally, given a blacklist of length N, a metric of a blacklist's predictiveness may be a hit count defined as the number of attackers in the blacklist that are correctly predicted (i.e., malicious activity from these sources appears in the logs in the next time slot). A blacklist with a higher hit count may be considered more “predictive” compared to a blacklist with a lower hit count.
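As an illustration only, the hit count described above may be sketched as a set intersection. The function name `hit_count`, the use of IP-address strings as attacker identifiers, and the sample data are assumptions made for this sketch and are not taken from any particular implementation.

```python
def hit_count(blacklist, next_slot_attackers):
    """Count blacklisted sources that appear in the logs of the next time slot.

    Both arguments are iterables of attacker identifiers (hypothetical format).
    """
    return len(set(blacklist) & set(next_slot_attackers))


# Hypothetical example: a blacklist of length N = 3.
blacklist = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
# Sources observed attacking in the next time slot (made-up data).
next_slot = {"10.0.0.2", "10.0.0.3", "10.0.0.9"}

hits = hit_count(blacklist, next_slot)  # two of the three entries were correct
```

Under this sketch, a blacklist of the same length N with a larger return value would be considered the more predictive of the two.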
Two upper bounds of prediction may be defined: a global upper bound and a local upper bound. For every victim v, the global upper bound on the hit count of v may be defined as the number of attackers that are both in the training window of any victim and in the testing window of v. Thus, the global upper bound may correspond to the case where the past logs of all victims are available to make a prediction, or when each victim shares information with all other victims. Further, for every victim v, the local upper bound on the hit count of v may be the number of attackers that are both in the training window and in the testing window of v. Thus, the local upper bound represents the upper bound on the hit count when each victim v only has access to its local security logs but does not have access to the logs of other victims. Because the local upper bound may be based upon less information than the global upper bound, it may have a lower total hit count than the global upper bound. Generally, the more predictive a blacklist is, the closer its hit count will be to the upper bounds.
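The two bounds may be illustrated with a minimal sketch, assuming each window is represented as a mapping from a victim to the set of attackers seen in that window. The names `local_upper_bound`, `global_upper_bound`, `train`, and `test`, as well as the sample data, are hypothetical.

```python
def local_upper_bound(train, test, v):
    """Attackers in both the training window and the testing window of v."""
    return len(train[v] & test[v])


def global_upper_bound(train, test, v):
    """Attackers in the training window of any victim and the testing window of v."""
    all_training = set().union(*train.values())
    return len(all_training & test[v])


# Hypothetical windows: victim -> set of attacker identifiers.
train = {"v1": {"a", "b"}, "v2": {"c", "d"}}
test = {"v1": {"b", "c", "e"}, "v2": {"a"}}

local_v1 = local_upper_bound(train, test, "v1")    # only "b" is seen locally
global_v1 = global_upper_bound(train, test, "v1")  # "b" and "c" are seen somewhere
```

In this made-up data, attacker "c" is known only from the logs of v2, so v1 can anticipate it only when logs are shared. Attacker "e" appears in no training window and therefore counts toward neither bound.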
Two blacklisting techniques are the Global Worst Offender List (GWOL) and the Local Worst Offender List (LWOL). GWOLs are blacklists of the top attack sources that generate the highest numbers of attacks globally, as reported at universally reputable repositories, while LWOLs are blacklists of the most prolific attack sources as logged by security devices deployed at a specific site. Each list has benefits, but relying entirely on either one has drawbacks. For example, because GWOLs include the attack sources that generate the highest numbers of attacks globally, they may include sources that are irrelevant to a particular victim network. By contrast, LWOLs may fail to predict attack sources that have not previously attacked the specific site but are troubling attack sources globally. In addition, although an LWOL can be implemented independently by the operator of any network, it is essentially reactive.
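The difference between the two lists may be sketched as follows, assuming the logs are available as (source, victim) pairs with one pair per recorded attack. The function names `gwol` and `lwol`, the log format, and the sample data are assumptions for illustration; real repositories and security devices report in their own formats.

```python
from collections import Counter


def gwol(logs, n):
    """Top-n attack sources by number of attacks across all victims."""
    counts = Counter(src for src, _victim in logs)
    return [src for src, _ in counts.most_common(n)]


def lwol(logs, victim, n):
    """Top-n attack sources by number of attacks logged at one victim."""
    counts = Counter(src for src, dst in logs if dst == victim)
    return [src for src, _ in counts.most_common(n)]


# Hypothetical logs: one (source, victim) pair per recorded attack.
logs = (
    [("a", "v1")] * 2    # source a attacks v1 twice
    + [("b", "v1")]      # source b attacks v1 once
    + [("c", "v2")] * 4  # source c attacks v2 four times
)

global_list = gwol(logs, 1)       # ["c"], although c never attacked v1
local_list = lwol(logs, "v1", 1)  # ["a"], built only from what v1 has seen
```

In this made-up data the one-entry GWOL contains a source that has never attacked v1, while the LWOL of v1 can only list sources that have already attacked it, which mirrors the two drawbacks noted above.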
In another blacklisting technique, called highly predictive blacklisting (HPB), attention shifts from attacker profiles to victim profiles. In other words, future attacks are predicted based not only on a victim's own logs but also on the logs of a few other “similar” victims. Similarity between two victims can be defined as the number of their common attackers, based on empirical observations made earlier. In this manner, predictive blacklisting can be posed as a link-analysis problem, with the focus being on relevance propagation on the victim-victim graph. However, this approach no longer relies on the attacker profile, as the earlier techniques do, and therefore is not all-inclusive.
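The similarity measure described above may be sketched as the edge weights of a victim-victim graph. This sketch covers only the common-attacker count; it does not implement the relevance propagation step of HPB. The function name `victim_similarity` and the sample data are hypothetical.

```python
from itertools import combinations


def victim_similarity(attackers_by_victim):
    """Weight each victim pair by the number of attackers they have in common.

    Returns a dict mapping (victim_u, victim_v) to the common-attacker count,
    omitting pairs with no common attacker (no edge in the graph).
    """
    sim = {}
    for u, v in combinations(sorted(attackers_by_victim), 2):
        common = len(attackers_by_victim[u] & attackers_by_victim[v])
        if common:
            sim[(u, v)] = common
    return sim


# Hypothetical data: victim -> set of attackers seen in its logs.
attackers_by_victim = {
    "v1": {"a", "b", "c"},
    "v2": {"b", "c", "d"},
    "v3": {"e"},
}

edges = victim_similarity(attackers_by_victim)  # v1 and v2 share "b" and "c"
```

Here v3 shares no attacker with the other victims and so has no edge in the graph; under this measure, the logs of v3 would not inform predictions for v1 or v2.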
As can be seen, while the aforementioned approaches for predicting malicious attacks may be able to capture some of the attack patterns of malicious devices, they still leave much room for improvement. Consequently, what is needed are improved systems and methods for blacklisting that can accurately and compactly predict future attacks while minimizing the number of false positives.