Traditionally, one of the tasks of the antivirus industry is to keep antivirus databases up-to-date. In the short time between when a malicious application has been released and when it has been detected, it can be downloaded hundreds of thousands of times by different users and can infect a large number of computers. Therefore, timely updates of antivirus databases allow malicious software to be countered adequately and quickly. However, it should be noted that the amount of software, including malicious software, is constantly growing, which requires proactive methods (heuristic analysis, code emulation, behavior analysis, etc.) for detecting such applications. To counter unknown malicious applications, antivirus providers have used heuristic detection methods, execution of unknown applications in protected environments (sandbox, honeypot) using virtualization, as well as various methods that limit the functionality of applications based on the analysis of their activity, for example, using a Host-based Intrusion Prevention System (HIPS).
However, all of the aforementioned methods have deficiencies, due both to the specifics of their operation and to their use in antivirus applications in which the user may apply settings that prevent full use of these technologies. For example, when an unknown application is launched, a significant amount of processor time and other computer resources is needed to validate it. Often, before the unknown application is checked, the user will disable its execution in a protected environment (e.g., on a virtual machine) or reduce the time allocated for emulation in order to free those resources for other computing tasks.
Because the aforementioned proactive technologies may operate inefficiently and the number of malicious applications is constantly increasing, so-called whitelists (databases of clean files) are becoming more and more popular. Whitelists are created for objects such as files, applications, links, and email address owners, as well as for user accounts in instant messaging systems, messaging logs, IP addresses, host names, domain names, and so on. Such lists can be built based on many factors, for example: the presence of an electronic digital signature or other manufacturer data; data about the source (where the application was received from); data about the application's relationships (e.g., parent-child relationships); data about the application version (for example, an application can be considered verified because its previous version was also in the whitelist); and data about environment variables (operating system, launch parameters), among others.
Before each release of signature updates for antivirus databases, the release must be checked for possible overlap with the whitelist of files. Currently, the majority of unknown executable files being investigated are so-called PE (Portable Executable) files, the executable format of the Windows operating system family, which is targeted by the majority of malicious software. A PE file includes a header, various sections that constitute an image of the executable application, and an overlay, which comprises the segment that is additionally loaded if needed during execution.
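The three-part layout described above can be illustrated with a minimal sketch. It assumes the standard PE field offsets (the `MZ` marker, the 4-byte header offset stored at 0x3C, the 20-byte COFF file header, and 40-byte section headers) and treats the overlay as any bytes that follow the last section's raw data. The function name `pe_layout` and the returned dictionary are hypothetical and chosen only for illustration.

```python
import struct
from typing import Dict, List, Tuple


def pe_layout(data: bytes) -> Dict[str, object]:
    """Split a PE image into its section list and trailing overlay bytes."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # Offset of the PE signature is stored at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # COFF file header: Machine, NumberOfSections, TimeDateStamp,
    # PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader,
    # Characteristics (20 bytes in total).
    _machine, n_sections, _ts, _symptr, _nsyms, opt_size, _chars = (
        struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    )
    table = pe_offset + 4 + 20 + opt_size  # start of the section table
    sections: List[Tuple[str, int, int]] = []
    end_of_sections = table + 40 * n_sections
    for i in range(n_sections):
        # Each section header is 40 bytes; only the first 24 are read here.
        name, _vsize, _vaddr, raw_size, raw_ptr = struct.unpack_from(
            "<8sIIII", data, table + 40 * i
        )
        label = name.rstrip(b"\0").decode("ascii", "replace")
        sections.append((label, raw_ptr, raw_size))
        end_of_sections = max(end_of_sections, raw_ptr + raw_size)
    # Anything past the last section's raw data is treated as the overlay.
    return {"sections": sections, "overlay": data[end_of_sections:]}
```

A production parser would also validate that the offsets fall inside the file; this sketch omits those checks for brevity.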
Various parts of a file can be used to create a signature for the file. Most often, a code segment is used. However, situations often occur in which an expert erroneously interprets library code or another widely used code segment as part of the malicious code, because the fragment is present in the malicious application. In such a case, the resulting signature contains a file fragment that can be present in a large number of clean files (for example, a fragment of a dynamic library). Such a signature is successfully detected in the malicious application, but also in clean files that contain the same code segment. In such situations, use of the signature causes a false activation, because the signature is detected in a clean file.
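The failure mode described above can be shown with a small sketch. It assumes a signature is simply a byte fragment searched for in file contents, which is a simplification of real antivirus signatures. All file names and byte strings below are hypothetical stand-ins, not real samples.

```python
from typing import Dict, List


def files_matching(signature: bytes, files: Dict[str, bytes]) -> List[str]:
    """Return the names of files whose contents contain the byte signature."""
    return sorted(name for name, data in files.items() if signature in data)


# Hypothetical byte strings standing in for real code fragments.
SHARED_LIBRARY_CODE = b"\x55\x8b\xec\x83\xec\x10"  # widely used library routine
MALICIOUS_PAYLOAD = b"\xde\xad\xbe\xef\x13\x37"    # code unique to the malware

# The malicious sample contains both the library code and its own payload.
malware = {"dropper.exe": b"MZ" + SHARED_LIBRARY_CODE + MALICIOUS_PAYLOAD}

# Hypothetical whitelist of clean files; one of them links the same library.
whitelist = {
    "editor.exe": b"MZ" + SHARED_LIBRARY_CODE + b"\x90\x90",
    "viewer.exe": b"MZ" + b"\x01\x02\x03",
}

# A signature mistakenly built from the library fragment.
bad_signature = SHARED_LIBRARY_CODE
# A signature built from the fragment unique to the malicious sample.
good_signature = MALICIOUS_PAYLOAD

# Checking each candidate against the whitelist before release exposes
# the signature that would cause false activations.
false_activations_bad = files_matching(bad_signature, whitelist)
false_activations_good = files_matching(good_signature, whitelist)
```

Both signatures detect the malicious sample, but only the one built from the shared library fragment also matches a whitelisted file, which is the kind of overlap a pre-release whitelist check is meant to catch.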
The rules, templates, lists, signatures (often created by an expert), and similar items used by antivirus applications all constitute antivirus records. Such antivirus records allow for the detection and removal of malicious software. However, the generation of antivirus records often involves human error, such as an expert creating a signature that treats clean software, specified in the whitelist of files, as malicious. Experts are not the only source of error. For example, systems for the automatic building of antivirus records, when trying to detect as much malicious software as possible, inevitably flag some clean applications as well. As a result, certain non-malicious software required by a user can be blocked by an antivirus application that uses the erroneous antivirus records, and the user might become frustrated and question the reliability of that antivirus application.
Therefore, the timely detection and elimination of false activations are important tasks for the antivirus industry. Various approaches are known today that reduce the number of false activations. For example, International Application Pub. No. WO2007087141 describes a method for reducing the number of false activations. The methods described include multiple checks, first using a list of malicious files, then using a list of clean files. However, the disclosed methods do not allow for the detection of false activations for antivirus records not contained in the list of malicious files or in the list of clean files. Therefore, existing technologies are inefficient and, in some cases, unable to find false activations.
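The general idea of a two-list check, and the gap it leaves, can be sketched as follows. This is not the method of the cited publication; it is a loose illustration that assumes both lists hold SHA-256 digests of whole files. The function name `two_stage_check` and the verdict strings are hypothetical.

```python
import hashlib
from typing import Set


def two_stage_check(data: bytes, malicious: Set[str], clean: Set[str]) -> str:
    """Check a file against a malicious list first, then against a clean list."""
    digest = hashlib.sha256(data).hexdigest()
    if digest in malicious:
        # Second check: a hit that is also whitelisted is a false activation.
        if digest in clean:
            return "false activation"
        return "malicious"
    if digest in clean:
        return "clean"
    # Files in neither list get no verdict, so an erroneous record that
    # fires on such a file goes unnoticed by this scheme.
    return "unknown"


# Hypothetical file contents used only to populate the two lists.
known_bad = b"sample-malware-bytes"
known_good = b"sample-clean-bytes"
mislisted = b"clean-file-listed-as-malicious"

malicious_list = {hashlib.sha256(b).hexdigest() for b in (known_bad, mislisted)}
clean_list = {hashlib.sha256(b).hexdigest() for b in (known_good, mislisted)}
```

In this sketch a file absent from both lists always returns "unknown", which mirrors the limitation noted above: the two lists alone cannot reveal a false activation on a file that neither list contains.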