Malware, short for “malicious software,” is software that can be used to disrupt computer operations, damage data, gather sensitive information, or gain access to private computer systems without the user's knowledge or consent. Examples of such malware include software viruses, trojan horses, rootkits, and ransomware. A common mechanism used by malware developers is to embed the malware into a file that is made to appear desirable to a user, or that is downloaded and executed when the user visits a web site. For example, malware may be embedded into a software application that appears legitimate and useful. The user downloads the file, and when the file is opened, the malware within the file is executed. A file that contains malware can be referred to as a malicious file.
Malware typically exhibits some kind of pattern or signature in the file that contains it. Detecting malware in files therefore typically involves determining whether a file contains a pattern or signature that is associated with malware. Before malware can be detected in this way, its signature or pattern must first be determined. In the past, this was typically a manual process in which an analyst ran a candidate program to determine whether it exhibited malware behavior, and then determined the signature or pattern of the malware. However, the number of new files requiring analysis can reach the hundreds of thousands per day. As a result, it is no longer practical to manually analyze new files to determine signatures or patterns for new malware.
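As an illustration of the signature matching described above, the following is a minimal sketch in Python. It assumes that a signature is simply a fixed byte sequence and that a file is flagged when any known sequence occurs in its contents. The signature names, the byte patterns, and the functions `scan_bytes` and `scan_file` are hypothetical and chosen only for illustration; practical detection systems may use considerably more elaborate patterns.

```python
from typing import Dict, List

# Hypothetical signature database: signature name -> byte pattern.
# These patterns are made up for illustration and match no real malware.
SIGNATURES: Dict[str, bytes] = {
    "example-dropper": b"\xde\xad\xbe\xef\x13\x37",
    "example-marker": b"EVIL_PAYLOAD_MARKER",
}


def scan_bytes(data: bytes, signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Return the sorted names of all signatures whose pattern occurs in data."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)


def scan_file(path: str, signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Read a file in binary mode and scan its entire contents."""
    with open(path, "rb") as fh:
        return scan_bytes(fh.read(), signatures)
```

A file would be treated as malicious when `scan_file` returns a non-empty list. The sketch also shows the limitation noted above: a pattern can only be matched after someone has determined it and added it to the database.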