The aim of an IDPS is to protect systems, computers, networks and network-connected devices from a variety of attacks threatening their confidentiality, integrity and availability. The Internet is an active ecosystem that evolves rapidly, and new types of attacks emerge as attackers become more sophisticated. In this context, an IDPS needs to be constantly updated in order to detect novel attacks.
IDPSs can be classified into two major categories, namely anomaly detection and prevention systems and misuse detection and prevention systems. Anomaly detection and prevention systems are designed to identify deviations from a profile of normal behavior in order to detect malicious actions. Even though systems of this kind perform better in detecting previously unseen attacks, they suffer from a high False Positive rate, rendering them impractical solutions for protecting a sensitive infrastructure.
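By way of illustration only, the following minimal sketch shows how a deviation from a normal profile might be detected, and why such a scheme tends to raise false positives. The single feature (bytes per flow), the sample values and the three-standard-deviation threshold are assumptions chosen for the example and are not taken from any particular system.

```python
from statistics import mean, stdev

def build_profile(normal_values):
    """Summarize known-normal observations as (mean, sample standard deviation)."""
    return mean(normal_values), stdev(normal_values)

def is_anomalous(value, profile, k=3.0):
    """Flag a value that lies more than k standard deviations from the mean.

    The threshold k is an assumed tuning parameter.
    """
    mu, sigma = profile
    return abs(value - mu) > k * sigma

# Hypothetical bytes-per-flow values observed during normal operation.
baseline = [480, 510, 495, 505, 520, 490, 500, 515]
profile = build_profile(baseline)

typical_flow = 505      # ordinary traffic
exfiltration = 9000     # assumed malicious bulk transfer
nightly_backup = 2000   # legitimate but unusual transfer

print(is_anomalous(typical_flow, profile))    # not flagged
print(is_anomalous(exfiltration, profile))    # flagged
print(is_anomalous(nightly_backup, profile))  # flagged, although legitimate
```

In this sketch the legitimate backup is flagged together with the assumed attack, which is the kind of false positive referred to above.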
With a misuse IDPS, the detection process is based on known signatures or, in other words, detection rules aiming to distinguish legitimate traffic instances from the malicious ones.
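As a non-limiting sketch of signature-based detection, each rule below pairs an attack label with a condition over flow attributes, and a flow that matches no rule is treated as legitimate. The attribute names (`distinct_ports`, `duration_s`, `failed_logins`) and the thresholds are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Flow = Dict[str, float]

@dataclass
class Rule:
    label: str                          # attack class the rule detects
    predicate: Callable[[Flow], bool]   # condition over flow attributes

# Hypothetical signatures; attribute names and thresholds are assumptions.
RULES: List[Rule] = [
    Rule("port_scan", lambda f: f["distinct_ports"] > 100 and f["duration_s"] < 5),
    Rule("brute_force", lambda f: f["failed_logins"] >= 10),
]

def classify(flow: Flow, rules: List[Rule] = RULES) -> str:
    """Return the label of the first matching rule, or 'normal' if none match."""
    for rule in rules:
        if rule.predicate(flow):
            return rule.label
    return "normal"

scan = {"distinct_ports": 250, "duration_s": 2, "failed_logins": 0}
web = {"distinct_ports": 1, "duration_s": 30, "failed_logins": 0}

print(classify(scan))
print(classify(web))
```

An attack for which no rule exists falls through to "normal", which is why the rule set of a misuse IDPS must be kept up to date.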
Currently, state-of-the-art approaches are able to generate rules for detecting popular classes of attacks, but significantly neglect the minority attack classes. Even if these types of attacks are less common, their impact on the targeted system is considered to be destructive. Attacks such as remote vulnerability exploitations or privilege escalation could lead to a system being compromised by an attacker or to leaks of confidential information, causing financial losses and harming the trustworthiness of the organization.
Analyzing network traffic flows in the context of IDPS is a challenging task mainly because of the nature of the network traffic data. Under realistic conditions, a network is flooded with normal traffic flows and only a small fraction of the traffic may indicate malicious behavior. This leads to a highly unbalanced data set that is difficult to analyze. In addition, a network analysis process must take several features into consideration in order to distinguish legitimate from malicious traffic. The aforementioned data properties combined with the numerous attack types introduce many challenges and affect the detection accuracy to a great extent. In short, in many settings, an IDPS is tasked to function with datasets that are characterized by:
Being multi-classed (several types of attacks),
Being multi-featured (several network traffic attributes), and
Being highly unbalanced (many instances of normal network traffic, but very few instances of rare attacks).
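The effect of such imbalance on the reported accuracy can be illustrated with a minimal sketch. The class names and counts below are invented for the example; the "classifier" simply labels every flow as normal.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_detection_rate(y_true, y_pred):
    """For each true class, the fraction of its instances labeled correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

# Hypothetical unbalanced data set: 970 normal flows, 30 attack flows.
y_true = ["normal"] * 970 + ["dos"] * 25 + ["privilege_escalation"] * 5
# A trivial classifier that labels everything as normal.
y_pred = ["normal"] * len(y_true)

print(accuracy(y_true, y_pred))
print(per_class_detection_rate(y_true, y_pred))
```

Under these assumed counts the overall accuracy is 97% while not a single attack is detected, which is why a per-class measure is more informative than a single accuracy figure when minority attack classes matter.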
Elhag, Salma, et al., “On the combination of genetic fuzzy systems and pairwise learning for improving detection rates on Intrusion Detection Systems,” Expert Systems with Applications 42.1 (2015): 193-202 describe complex classification techniques in the context of Fuzzy Rule Based Classification Systems. However, even with such complex classification techniques, which consume a great deal of computing power in comparison with embodiments of the present invention, an average accuracy of only 89.32% can be achieved for the aforementioned attacks. Additionally, this system cannot be exploited without de-fuzzing steps and considers only a subset of the search area.
Generally, state-of-the-art approaches either apply sampling techniques to the datasets to obtain a subset with specific characteristics, or remove redundant instances. In contrast, in an embodiment discussed below, the present invention advantageously uses all available data to infer attacks.
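For context, the kind of sampling referred to above can be sketched as random undersampling, in which every class is reduced to the size of the smallest class. This is one common variant assumed here for illustration, not the specific procedure of any cited approach, and the data are invented.

```python
import random
from collections import defaultdict

def undersample(instances, label_of, seed=0):
    """Randomly keep, for every class, as many instances as the smallest class has."""
    by_class = defaultdict(list)
    for inst in instances:
        by_class[label_of(inst)].append(inst)
    target = min(len(group) for group in by_class.values())
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sampled = []
    for group in by_class.values():
        sampled.extend(rng.sample(group, target))
    return sampled

# Hypothetical data set of (label, flow_id) pairs.
data = (
    [("normal", i) for i in range(970)]
    + [("dos", i) for i in range(25)]
    + [("privilege_escalation", i) for i in range(5)]
)

balanced = undersample(data, label_of=lambda inst: inst[0])
print(len(data), len(balanced), len(data) - len(balanced))
```

With these assumed counts only 15 of the 1000 instances are retained, so most of the available traffic is discarded before any rule is learned.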