Malware applications are increasing in number on both desktop and mobile platforms. Malware is also evolving: malicious developers keep discovering new ways to bypass conventional antivirus/antimalware applications, which provide protection through static analysis of malware applications. Detecting zero-day malware is therefore an important problem for desktop and mobile security products alike.
To detect malicious applications, most security products employ a signature-based approach. Signatures are usually created through static analysis of the application file and therefore contain only static properties of the file. The simplest form of static signature is the hash value of the application file, computed with CRC and/or MD5.
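As a minimal sketch of such a hash-based signature, the following Python snippet computes a CRC32 checksum and an MD5 digest over the raw bytes of a file. The function name `hash_signature` and the sample input are assumptions made for illustration; they do not come from any particular product.

```python
import hashlib
import zlib


def hash_signature(data: bytes) -> dict:
    """Return the CRC32 and MD5 values of a file's raw bytes as hex strings."""
    return {
        # Mask to 32 bits so the checksum is always an unsigned value.
        "crc32": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
        "md5": hashlib.md5(data).hexdigest(),
    }


# Hypothetical file content; a real scanner would read the bytes from disk.
sample = b"hello"
sig = hash_signature(sample)
print(sig)
```

A client would compare these values with the entries of its signature database; any file whose bytes differ produces different values.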
More generic signatures are also created by applying reverse engineering techniques such as decompiling and disassembling the executable, extracting source code, etc. While analyzing the outputs generated by these techniques, a malware analyst tries to identify the most distinctive parts of the code that exhibit malicious behaviour, and creates a signature based on string matching and/or regular expression matching. The signatures created in this way are collected and listed one by one in a file called the “signature database”. This database file is sent to client applications. For each file analyzed, the antivirus application either performs a check directly on the file or extracts the file and decompiles the output where possible. The antivirus application then opens the signature database and processes each signature in turn, comparing it (whether a simple CRC/MD5 signature or a generic, more complex signature based on regular expression matching) against the file or its decompiled output. If a signature matches, a detection occurs.
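The scanning loop described above can be sketched roughly as follows. This is an illustrative outline only: the `Signature` record, the `scan` function, the detection names and the database entries are all hypothetical, and the decompiled output is assumed to be available as plain text.

```python
import hashlib
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Signature:
    name: str      # hypothetical detection name
    kind: str      # "md5" for a hash signature, "regex" for a generic one
    pattern: str   # MD5 hex digest or regular expression source


def scan(file_bytes: bytes, decompiled: str, database: List[Signature]) -> Optional[str]:
    """Return the name of the first matching signature, or None."""
    file_md5 = hashlib.md5(file_bytes).hexdigest()
    for sig in database:  # signatures are processed one by one
        if sig.kind == "md5" and sig.pattern == file_md5:
            return sig.name
        if sig.kind == "regex" and re.search(sig.pattern, decompiled):
            return sig.name
    return None


# Hypothetical database content, for illustration only.
known_bad = b"known malicious file bytes"
database = [
    Signature("Example.Hash.A", "md5", hashlib.md5(known_bad).hexdigest()),
    Signature("Example.Generic.B", "regex", r'sendTextMessage\(\s*"\d{4,6}"'),
]

hit_by_hash = scan(known_bad, "", database)
hit_by_regex = scan(b"other bytes",
                    'sms.sendTextMessage("7132", null, body, null, null);',
                    database)
no_hit = scan(b"other bytes", 'view.setText("hello");', database)
```

In this sketch the hash signature is checked against the file itself and the regular expression against the decompiled text, mirroring the two kinds of check described in the paragraph.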
To detect newly created malware applications, it is important to keep the signature database up to date on the client side.
These methods create a string pattern from the extracted, decompiled output of an application that has already been identified as malicious. Thus, detecting a piece of malware initially depends on a previous encounter with the application, during which it already performed its malicious behaviour undetected. For this reason, the main problem of current signature-based antivirus protection is that it first allows malicious behaviour to happen, and only then develops a signature for that behaviour and tries to detect further occurrences on other computers/mobile devices.
In addition, signatures created from static properties of a malware application are weak and cannot cope with even small changes to the target file. Hash-based signatures such as CRC and/or MD5 are created for a single file and can detect only that file. They cannot match a range of malicious applications that exhibit similar behaviour, called a malware family, because a hash-based signature is the hash output of one application only and thus does not match other, similar files in the same family. Moreover, if the same application is compiled twice and even one bit differs between the two builds, their hashes will differ too, so a signature created for the first build will not detect the second.
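The sensitivity of hash-based signatures to a single changed bit can be illustrated with a short Python sketch. The byte strings below are made-up stand-ins for two builds of the same application.

```python
import hashlib

# Hypothetical stand-ins for two builds of the same application.
original = b"example application bytes"
rebuilt = bytearray(original)
rebuilt[0] ^= 0x01  # flip a single bit in the first byte
rebuilt = bytes(rebuilt)

# Number of bits that differ between the two inputs.
differing_bits = sum(bin(a ^ b).count("1") for a, b in zip(original, rebuilt))

md5_original = hashlib.md5(original).hexdigest()
md5_rebuilt = hashlib.md5(rebuilt).hexdigest()

# A signature built from the first build no longer matches the second.
signature_still_matches = md5_original == md5_rebuilt
print(differing_bits, signature_still_matches)
```

Although the inputs differ in only one bit, the two digests are unrelated, so the signature derived from the first build is useless against the second.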
Generic signatures aim to overcome this problem by identifying distinctive parts of the decompiled/disassembled code of the malware application. Those parts are expressed as a regular expression so that the signature matches all files that contain the same code parts. In this way, not just a single malicious application but a whole collection of malware exhibiting similar behaviour can be detected.
However, the generic signature method has certain disadvantages. One is that it cannot track the application at runtime: only one check is performed at a specific time (usually during installation), and if the malware application passes that check, it may still show malicious behaviour at runtime. Moreover, to hide from antivirus applications, malicious parties generally use advanced techniques such as encrypting sensitive data, changing method/class names, etc. In such cases the antivirus application cannot detect the malicious application.
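A small Python sketch shows why such hiding techniques defeat a static check. Base64 encoding is used here merely as a simple stand-in for the encryption mentioned above, and the signature, the host name `premium-sms.example` and the `decode` helper in the sample code are hypothetical.

```python
import base64
import re

# Hypothetical generic signature looking for a suspicious host name.
signature = re.compile(r"premium-sms\.example")

# Decompiled output of a variant that stores the URL in plain text.
plain = 'url = "http://premium-sms.example/pay"'

# Decompiled output of a variant that stores the URL encoded and
# only decodes it at runtime.
encoded_literal = base64.b64encode(b"http://premium-sms.example/pay").decode()
obfuscated = f'url = decode("{encoded_literal}")'

detected_plain = signature.search(plain) is not None
detected_obfuscated = signature.search(obfuscated) is not None

# At runtime the original value is recovered, but the one-time
# static check has already been passed.
runtime_value = base64.b64decode(encoded_literal)
print(detected_plain, detected_obfuscated, runtime_value)
```

The static check matches only the plain variant; the encoded variant passes it and still reaches the same URL once it runs.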
Another disadvantage is that a generic signature may produce false-positive detections. To find as many members of a malware family as possible, the generic signature has to be defined broadly, covering any feature likely to appear in different members of the family. Unfortunately, a signature defined to cover such a wide range of properties may also match clean applications, which is a false-positive detection. The antivirus application then makes a wrong decision that not only misleads the user but also wrongly blames the clean application and obstructs its installation (or raises a warning about it), a result that is highly undesirable for antivirus products.
The third problem with generic signatures is their limited ability to catch different variants of the same malware family. Because a broad definition generates many false-positive results, malware analysts are obliged to define signatures precisely, and the resulting signature may be unable to detect new variants of the same malware.
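The trade-off between broad and precise generic signatures can be made concrete with a toy example. The three code samples, their labels and both regular expressions below are invented for illustration and do not describe any real malware family.

```python
import re

# (name, is_malicious, decompiled code) for three hypothetical applications.
samples = [
    ("variant_1", True, 'sendTextMessage("7132", null, "SUB", null, null)'),
    ("variant_2", True, 'sendTextMessage("81001", null, "SUB", null, null)'),
    ("clean_messenger", False,
     "sendTextMessage(userNumber, null, userText, null, null)"),
]

broad = re.compile(r"sendTextMessage\(")          # matches any SMS sending call
precise = re.compile(r'sendTextMessage\("7132"')  # matches one known number only


def evaluate(signature):
    """Return (false positives, missed malware) for a signature."""
    false_positives = [name for name, malicious, code in samples
                       if not malicious and signature.search(code)]
    missed = [name for name, malicious, code in samples
              if malicious and not signature.search(code)]
    return false_positives, missed


broad_result = evaluate(broad)
precise_result = evaluate(precise)
print(broad_result)
print(precise_result)
```

In this toy data set the broad signature flags the clean messenger, while the precise signature avoids the false positive but misses the second variant.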
Other dynamic, behaviour-based malware detection systems perform detection only in a virtual environment and do not provide a “signature based on runtime dynamic behaviours” that can also be used on the mobile device itself to detect and prevent malware.
Thus, there is a need for a new method that provides more effective detection and prevention of malware applications, thereby creating a trusted online experience.