Generally speaking, an Intrusion Detection System (IDS) is a system that detects unusual and/or hostile activities in a computer network. An IDS detects and/or prevents activities that may compromise system security, including an attempted hacking of a component within the network while the attempt is in progress. An IDS provides a view of unusual activity and may issue alerts notifying administrators and/or block a suspected connection altogether.
The false positive rate is a fundamental metric used by the IDS industry to measure the performance of an IDS. Under the current state of IDS technology, it is still difficult for an IDS product to achieve a false positive rate that is low in absolute terms.
For an entity, a huge amount of security data may be created by various Intrusion Detection Systems (IDSs) and Intrusion Prevention Systems (IPSs). Analysts may analyze alert data from the security data over long periods of time, including years. Such alert data may include a large number of false positives.
With a large number of false positives to analyze, true positives (actual attacks) may be missed in the analysis. Currently there is no method or system for eliminating the false positives to improve the efficiency of analyzing the alerts.
Due to the characteristics of a security alert data set, the values of variables are more often categorical than numerical. In addition, alerts may have many different attributes depending on the availability of background knowledge and the type of alert itself. This makes classical supervised learning methods, such as a decision tree, a neural network, or the RIPPER rule learner, difficult to apply. The relevance of each independent variable to the target variable varies with the type of alert, which makes traditional feature selection difficult. For example, even within the same data set, alerts may have different sets of relevant attributes (features) based on the type of alert, e.g., whether it is an alert on an application vulnerability exploit or on a network scan.
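By way of illustration only, the following minimal Python sketch shows why heterogeneous, categorical alert attributes are awkward for classical learners. The alert records, attribute names, and values are hypothetical and are not taken from any particular IDS product. The sketch applies a simple one-hot encoding over the union of all attribute/value pairs, and most resulting columns are empty for any given alert.

```python
# Hypothetical alerts: attribute names and values are illustrative assumptions.
alerts = [
    {"type": "app_exploit", "signature": "sql_injection",
     "target_app": "webshop", "patched": "no"},
    {"type": "network_scan", "scanner_ip_zone": "external",
     "ports_probed": "many"},
    {"type": "app_exploit", "signature": "xss",
     "target_app": "intranet", "patched": "yes"},
]


def one_hot_columns(records):
    """Return one column name per distinct attribute=value pair."""
    return sorted({f"{key}={value}" for rec in records for key, value in rec.items()})


def encode(record, columns):
    """Encode a single alert as a 0/1 vector over the given columns."""
    present = {f"{key}={value}" for key, value in record.items()}
    return [1 if col in present else 0 for col in columns]


columns = one_hot_columns(alerts)
matrix = [encode(alert, columns) for alert in alerts]

for alert, row in zip(alerts, matrix):
    # Each alert fills only the few columns that apply to its own type.
    print(alert["type"], sum(row), "of", len(columns), "columns set")
```

In this sketch the network scan alert shares no attribute with the exploit alerts other than its type, so a single fixed feature set serves neither kind of alert well.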
With respect to the type of IDS alert, there are generally four outcomes for an alert:
True positive: an IDS alert is correctly identified as an attack.
True negative: an IDS alert is correctly identified as not being an attack.
False positive: an IDS alert is incorrectly identified as a true attack when it is not a true attack.
False negative: an IDS alert is incorrectly identified as not being an attack when it is a true attack.
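The four outcomes above may be sketched in Python as follows. The sketch assumes that each historical alert carries two hypothetical flags, namely whether it was identified as an attack and whether it actually was one. It also assumes the conventional statistical definition of the false positive rate, FP / (FP + TN), which is not defined in this description; the sample history is invented for illustration.

```python
from collections import Counter


def classify_outcome(flagged_as_attack: bool, is_attack: bool) -> str:
    """Map an (identified, actual) pair to one of the four outcomes."""
    if flagged_as_attack and is_attack:
        return "true_positive"
    if not flagged_as_attack and not is_attack:
        return "true_negative"
    if flagged_as_attack and not is_attack:
        return "false_positive"
    return "false_negative"


def false_positive_rate(counts) -> float:
    """Assumed definition: FP / (FP + TN); 0.0 if there are no non-attacks."""
    fp = counts["false_positive"]
    tn = counts["true_negative"]
    return fp / (fp + tn) if (fp + tn) else 0.0


# Hypothetical history of (flagged_as_attack, is_attack) pairs.
history = [
    (True, True), (True, False), (True, False),
    (False, False), (False, True), (True, False),
]

counts = Counter(classify_outcome(flagged, actual) for flagged, actual in history)
fpr = false_positive_rate(counts)
print(dict(counts), fpr)
```

With this invented history, three of the four alerts that were not attacks were nevertheless identified as attacks, giving a false positive rate of 0.75 under the assumed definition.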
What is needed is an effective method to learn from historical training data and thereby improve the performance of an Intrusion Detection System and of the alert analysis process.