Malicious software, or “malware,” has become a pervasive problem for corporations and individual users alike, as the functionality of most electronic devices is influenced by downloaded data. Normally, electronic devices communicate with trusted resources and download data that is free of malware. However, there are occasions where an electronic device communicates with a particular resource, even a trusted resource, and receives downloaded data that contains malware. Once loaded in an electronic device, malware may adversely affect the device's operability and may attempt to compromise a local network by propagating from the electronic device to other electronic devices connected to the local network.
Given the increased presence of malware, security vendors have been developing systems and tools to protect electronic devices by detecting the presence of malware within data received as input. Due to the continuous and growing volume of released malware, however, security vendors face a substantial challenge in accurately classifying detected malware. Currently, malware is classified by mapping the received (input) data into pre-defined categories of malware (hereinafter “malware classes”). However, because samples may vary greatly from each other, especially due to increasingly sophisticated (and morphing) malware and the lack of conformity in malware classification among security vendors, many conventional classification techniques are not designed to handle malware variants.
In fact, some classification techniques tend to experience a high level of false positives when attempting to cluster (group) different (but highly related) malware into malware classes. This high level of false positives may be due, at least in part, to the presence of “white noise,” namely benign (or non-pertinent) behaviors that have no bearing on whether the malware should be classified as part of a certain class. Reducing false positives by removing this “white noise” would improve overall system reliability and the speed of malware detection.
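By way of illustration only, the effect of “white noise” on grouping may be sketched as follows. The sketch assumes that each sample is represented as a set of observed behavior names and that similarity is measured with a Jaccard index; the behavior names, the benign list, and both helper functions are hypothetical and are not drawn from any particular classification system.

```python
from typing import FrozenSet, Set

# Hypothetical list of benign ("white noise") behaviors, assumed for illustration.
BENIGN_BEHAVIORS: FrozenSet[str] = frozenset(
    {"read_system_clock", "load_common_dll", "query_locale"}
)


def strip_white_noise(behaviors: Set[str]) -> Set[str]:
    """Return only the behaviors that are not on the benign list."""
    return behaviors - BENIGN_BEHAVIORS


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two behavior sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Two hypothetical samples that share only benign behaviors.
sample_a = {"read_system_clock", "load_common_dll", "query_locale", "encrypt_user_files"}
sample_b = {"read_system_clock", "load_common_dll", "query_locale", "open_backdoor_port"}

# With white noise present, the samples share 3 of 5 distinct behaviors.
raw_similarity = jaccard(sample_a, sample_b)

# With white noise removed, the samples share no behaviors at all.
filtered_similarity = jaccard(strip_white_noise(sample_a), strip_white_noise(sample_b))

print(f"raw: {raw_similarity:.2f}, filtered: {filtered_similarity:.2f}")
```

In this sketch, the shared benign behaviors alone make two otherwise unrelated samples appear similar, which is one way a false positive grouping could arise; once those behaviors are removed, the apparent similarity disappears.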