The structure of today's programs, including malicious ones, is a complex set of instructions: jumps, procedure calls, loops, etc. The complexity of executable files is constantly increasing, owing to the growing popularity of high-level programming languages and to the increasing sophistication of computer hardware and operating systems. Malicious applications can perform a number of specific actions, such as stealing passwords and other confidential user data, connecting a computer to a botnet in order to carry out denial-of-service (DoS) attacks or send spam, interfering with the proper functioning of the system in order to extort money from the user with promises to restore operability (e.g., ransomware), and other actions that are negative and undesirable from the user's point of view.
One of the known methods for examining a potentially malicious program is based on the use of an emulator, applied as part of an antivirus application, to analyze program behavior. There are various methods of program emulation. In one approach, the emulator is programmed to imitate an actual processor, memory, and other devices by creating virtual copies of the processor's registers, memory, and instruction set. This way, program instructions are executed not on an actual processor but on its virtual representation, in which system API function calls are intercepted by the emulator and imitated, e.g., expected replies are sent back to the emulated application.
During emulation, the execution of processor instructions is typically carried out by dynamic translation of instructions. Dynamic translation involves translating the instructions from an initial set (i.e. the original instructions to be emulated) into a dedicated set of instructions to be executed using the emulator. Dynamic translation is discussed below using the translation of one instruction as an example:
Initial instruction:
mov eax, [edi]
The translated pseudocode involves the following set of steps:
1. Reading the value of the edi register
2. Reading the memory at the address obtained in step 1
3. Writing the value read from memory in step 2 to the eax register
Each step of such pseudocode in turn expands into a certain number of machine instructions; as a result, one initial instruction, when translated, can cause the execution of tens or even hundreds of instructions on the processor. Once translated, however, the code does not need to be translated again on subsequent executions, because the translation operation has already been performed. Because most code is executed within loops, this reuse offsets much of the translation cost, which is why dynamic binary translation is a well-known and widely used technique.
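The translate-once, execute-many behavior described above can be illustrated with a minimal Python sketch. It is not the implementation of any particular emulator: the names State, Translator, and run_loop are hypothetical, only the single instruction mov eax, [edi] is supported, and each of the three translated steps is modeled as one Python function rather than as real machine code.

```python
from typing import Callable, Dict, List


class State:
    """Hypothetical virtual CPU state: registers, memory, and scratch values."""

    def __init__(self) -> None:
        self.regs: Dict[str, int] = {"eax": 0, "edi": 0}
        self.mem: Dict[int, int] = {}
        self.tmp: Dict[str, int] = {}


MicroOp = Callable[[State], None]


def translate_mov_eax_mem_edi() -> List[MicroOp]:
    """Translate 'mov eax, [edi]' into the three steps listed in the text."""

    def read_edi(s: State) -> None:      # step 1: read edi
        s.tmp["addr"] = s.regs["edi"]

    def read_mem(s: State) -> None:      # step 2: read memory at that address
        s.tmp["val"] = s.mem.get(s.tmp["addr"], 0)

    def write_eax(s: State) -> None:     # step 3: write the value to eax
        s.regs["eax"] = s.tmp["val"]

    return [read_edi, read_mem, write_eax]


class Translator:
    """Caches translations so each instruction is translated only once."""

    def __init__(self) -> None:
        self.cache: Dict[str, List[MicroOp]] = {}
        self.translations = 0  # how many times translation actually ran

    def get(self, instr: str) -> List[MicroOp]:
        if instr not in self.cache:
            if instr != "mov eax, [edi]":
                raise NotImplementedError(instr)
            self.cache[instr] = translate_mov_eax_mem_edi()
            self.translations += 1
        return self.cache[instr]


def run_loop(iterations: int):
    """Execute the same instruction repeatedly, as code inside a loop would."""
    state = State()
    state.regs["edi"] = 0x1000
    state.mem[0x1000] = 42
    translator = Translator()
    for _ in range(iterations):
        for op in translator.get("mov eax, [edi]"):
            op(state)
    return state.regs["eax"], translator.translations
```

Calling run_loop(100) executes the translated steps 100 times while the translation itself runs only once, which is the saving the passage attributes to code executed within loops.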
To counter program code emulation, creators of malicious programs use various approaches, which tend to exploit limitations of the emulation process and of the emulator design in antivirus solutions. One of these approaches involves adding to the program code a large number of instructions that do not carry a malicious component but require excessive time for emulation. Because the time allocated for program code emulation is limited to avoid user dissatisfaction (typically to a few seconds), the emulation process can stop before the malicious code is executed.
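The effect of such padding can be shown with a small Python sketch. It rests on simplifying assumptions: the time limit is modeled as a maximum instruction count rather than wall-clock seconds, the program is a flat list of strings, and the function name emulate and the "payload" marker are hypothetical.

```python
from typing import List


def emulate(program: List[str], max_instructions: int) -> bool:
    """Return True if the emulator reaches the 'payload' marker within budget.

    The instruction budget stands in for the emulation time limit.
    """
    executed = 0
    for instr in program:
        if executed >= max_instructions:
            return False  # budget exhausted before the payload was seen
        executed += 1
        if instr == "payload":
            return True
    return False


# A short program exposes its payload; a padded one exhausts the budget first.
short_program = ["nop"] * 10 + ["payload"]
padded_program = ["nop"] * 1000 + ["payload"]
```

With a budget of 100 instructions, emulate(short_program, 100) reaches the payload, whereas emulate(padded_program, 100) stops inside the padding and the payload is never observed.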
One of the techniques for countering such an approach is described in U.S. Pat. No. 7,603,713, the disclosure of which is incorporated by reference herein. Its operation includes the execution of a number of instructions on an actual processor, rather than in an emulator utilizing dynamic binary translation, thereby significantly accelerating the emulation of unknown applications.
Although this approach can be quite beneficial in reducing the time needed to execute large numbers of instructions, certain drawbacks remain. One such drawback is that the accelerated execution of instructions on an actual processor stops when, for example, an exception occurs, e.g., when an API function call has to be handled. Because the emulation accelerator needs initialization, which tends to be a resource-consuming process, the accelerator can be of only marginal benefit, or even counter-productive, in cases where it executes only a few instructions before having to return execution to the usual emulator.
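The trade-off can be made concrete with a simple cost model, sketched below in Python. The model and all of its parameters are illustrative assumptions rather than measurements: init_cost is the one-time cost of starting the accelerator, emu_cost is the cost of one instruction in the translating emulator, and native_cost is the cost of one instruction on the actual processor, all in the same arbitrary units.

```python
def emulated_cost(n: int, emu_cost: int) -> int:
    """Cost of running n instructions in the translating emulator."""
    return n * emu_cost


def accelerated_cost(n: int, init_cost: int, native_cost: int) -> int:
    """Cost of running n instructions in the accelerator, including setup."""
    return init_cost + n * native_cost


def break_even(init_cost: int, emu_cost: int, native_cost: int) -> int:
    """Smallest n for which the accelerator is strictly cheaper.

    Derived from init_cost + n * native_cost < n * emu_cost.
    """
    if emu_cost <= native_cost:
        raise ValueError("accelerator never pays off under this model")
    return init_cost // (emu_cost - native_cost) + 1


# Hypothetical figures: setup costs 10000 units, emulation 100 per
# instruction, native execution 1 per instruction.
threshold = break_even(init_cost=10_000, emu_cost=100, native_cost=1)
```

Under these assumed figures the threshold is 102 instructions: if an API call interrupts the accelerator before that many instructions have run, starting it costs more than simply continuing in the usual emulator.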
Accordingly, there is a need for an effective solution that improves the efficiency of emulation acceleration.