The number of malicious applications spreading on the Internet grows every day. To protect computing devices against them, antivirus solutions are often used; these employ one or more detection methods, such as signature analysis or heuristic analysis, to detect malicious applications, for example those downloaded from the Internet.
Yet these detection methods have limitations: heuristic analysis may not be applicable to all types of files, and signature analysis may not be effective at detecting polymorphic malicious applications, that is, applications that execute the same commands but whose corresponding files differ in content. Such polymorphic malicious applications (and, in particular, their files) are often generated automatically: the creator of the malicious application generally uses special development tools that can compile, from a single source code of the malicious application, an enormous number of malicious files, each with a different file body (file content), yet the applications launched from these files behave in the same way. Improving the detection of such files by antivirus solutions often relies on determining their similarity (the resemblance of the files according to some similarity metric). It should be noted that polymorphic malicious files include not only files in the PE (Portable Executable) format, but also files of any other format that allows malicious code to be embedded in the file and executed in one way or another, such as the Portable Document Format (PDF), Microsoft Compound File Binary (OLE2 files), or one of the Office Open XML formats (DOCX, PPTX, and others).
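The difference between exact-match detection and similarity-based detection can be illustrated with a minimal Python sketch. It is not the method of the present invention: the Jaccard index over byte 4-grams is only one possible similarity metric, chosen here for brevity, and `variant_a`, `variant_b`, and `payload` are hypothetical stand-ins for two automatically generated files that share the same code but differ in surrounding content.

```python
import hashlib


def ngrams(data: bytes, n: int = 4) -> set:
    """Return the set of all byte n-grams occurring in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard(a: bytes, b: bytes, n: int = 4) -> float:
    """Jaccard index of the n-gram sets of a and b (1.0 means identical sets)."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga and not gb:
        return 1.0
    return len(ga & gb) / len(ga | gb)


# Hypothetical data: the same "payload" wrapped in different header and padding
# bytes, standing in for two files generated from a single source.
payload = bytes(range(256)) * 4
variant_a = b"HDR-A" + payload + b"\x00" * 8
variant_b = b"HDR-B" + payload + b"\x01" * 8

# An exact digest, used here as a stand-in for a rigid signature, differs.
same_digest = (hashlib.sha256(variant_a).digest()
               == hashlib.sha256(variant_b).digest())

# The similarity metric still reports that the two files are close.
similarity = jaccard(variant_a, variant_b)

print("digests equal:", same_digest)
print("similarity above 0.9:", similarity > 0.9)
```

In this sketch an exact comparison treats the two variants as unrelated files, while the similarity score stays high because most of the content is shared. A practical system would have to choose the metric, the features, and the threshold with far more care than this illustration does.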
Although the known approaches are directed at solving certain problems in the area of protection of computing devices, they may not tackle the problem of detection of malicious compound files or they do so with insufficient effectiveness. The present invention enables a more effective solution to the problem of detection of malicious compound files.