Over the last decade, malicious software (malware) has become a pervasive problem for Internet users. In some situations, malware is a program, file, or digital data object that is embedded within downloadable content and designed to adversely influence (i.e., attack) normal operations of a computer. Examples of different types of malware may include bots, computer viruses, worms, Trojan horses, spyware, adware, or any other programming that operates within the computer without permission.
For instance, content may be embedded with objects associated with a web page hosted by a malicious web site. When this content is downloaded, malware that causes another web page to be requested from a malicious web site may be installed on the computer without the user's knowledge. Similarly, malware may also be installed on a computer upon receipt or opening of an electronic mail (email) message. For example, an email message may contain an attachment, such as a Portable Document Format (PDF) document, with embedded executable malware. Also, malware may exist in files that have been infected through any of a variety of attack vectors and then uploaded from the infected computer onto a networked storage device such as a file share.
As development of malware has progressed, hackers have developed malware objects that share similarities with other malware objects but maintain some dissimilarities. Accordingly, these “similar” malware objects may be in the same malware family, but traditional malware and anti-virus protection systems may fail to properly classify each object in the family as malware because of these differences. For example, traditional malware detection and classification techniques may employ a direct comparison of a suspect object with known malware objects in an attempt to reveal an exact match. However, if the suspect object has not been previously detected and analyzed (e.g., a zero-day malware threat), these direct comparison techniques will fail to classify the object as malware even if “similar” objects have been previously classified as malware. Accordingly, traditional malware classification and analysis techniques may prove inaccurate and inefficient, as these techniques do not account for small differences between malware objects within a family of malware.
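The limitation of direct comparison described above may be illustrated with a minimal sketch. The sketch assumes, purely for illustration, that the direct comparison is implemented as an exact lookup of a SHA-256 digest; the text above does not specify how the comparison is performed. The sample bytes, the `KNOWN_DIGESTS` set, and the `is_known_malware` function are hypothetical names introduced only for this example.

```python
import hashlib

# Hypothetical stand-in for a previously analyzed malware object.
known_sample = b"\x4d\x5a" + b"payload-v1" * 8

# Assumed signature store: digests of objects already classified as malware.
KNOWN_DIGESTS = {hashlib.sha256(known_sample).hexdigest()}


def is_known_malware(obj: bytes) -> bool:
    """Direct comparison: report a match only if the digest is already stored."""
    return hashlib.sha256(obj).hexdigest() in KNOWN_DIGESTS


# A "similar" object from the same family: identical except for the final byte.
variant = known_sample[:-1] + b"2"

# Fraction of byte positions on which the two objects agree.
matching = sum(a == b for a, b in zip(known_sample, variant))
similarity = matching / len(known_sample)

print(is_known_malware(known_sample))  # the stored object itself is matched
print(is_known_malware(variant))       # the one-byte variant is not matched
print(round(similarity, 3))            # yet the two objects are nearly identical
```

In this sketch the variant agrees with the known object at all but one byte position, yet the exact-match lookup does not flag it, which corresponds to the small intra-family differences that direct comparison techniques do not account for.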