The term “malware” is short for malicious software and is used to refer to any software designed to infiltrate or damage a computer system without the owner's informed consent. Malware can include viruses, worms, Trojan horses, rootkits, adware, spyware and any other malicious and unwanted software. Many computing devices, such as desktop personal computers (PCs), laptops, personal digital assistants (PDAs) and mobile phones, can be at risk from malware. Computer systems running the Windows™ operating system are particularly at risk from malware, but all operating systems will be at some risk. Examples of other operating systems that could be at risk are Mac OS™, Linux™, Android™, iOS™, Windows Mobile™, and BlackBerry OS™.
The current threat landscape shows a continuous increase in the number of malicious applications that threaten the security of internet users. Combating these malicious applications is challenging because malware authors are able to produce unique variations of their creations, which are then distributed to victims through a variety of infection paths. These variations make it difficult for anti-malware products to detect the malware simply by recognising it from an earlier version stored in a database.
Anti-malware products traditionally rely on local scanning technologies or on full-file hash network queries. Local technologies are installed on the equipment of a user and include signature-based detection and heuristic detection. Signature-based detection is based on searching for known patterns of data within executable code, while heuristic detection is based on searching for mutations or refinements of known malware. Both signature-based detection and heuristic detection rely on database updates for their knowledge of known malware, and the database must be kept up to date at all times to maximise the efficiency of these methods. The degree to which the signature or heuristic detection schemes are generic affects the ability of the anti-malware product to protect the user equipment.
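By way of illustration only, signature-based detection as described above can be sketched as a search for known byte patterns within the data being scanned. The signature names, the byte patterns and the function `scan_bytes` below are hypothetical and are not taken from any real anti-malware product; a practical scanner would use a far larger, regularly updated database and a multi-pattern matching algorithm.

```python
# Hypothetical local signature database: detection name -> known byte pattern.
# The names and patterns are illustrative values, not real malware signatures.
SIGNATURES = {
    "Example.Dropper.A": bytes.fromhex("deadbeef4d5a9000"),
    "Example.Worm.B": b"\x90\x90\xeb\xfe\xcc",
}


def scan_bytes(data, signatures=SIGNATURES):
    """Return the sorted names of all signatures whose pattern occurs in data."""
    return sorted(
        name for name, pattern in signatures.items() if pattern in data
    )


# A buffer that embeds one known pattern, and a buffer that embeds none.
infected = b"header" + bytes.fromhex("deadbeef4d5a9000") + b"trailer"
clean = b"an ordinary file with no known pattern"

infected_hits = scan_bytes(infected)
clean_hits = scan_bytes(clean)
```

In this sketch a detection is only possible if the pattern is already present in the local database, which is why such schemes depend on frequent database updates.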
Network-based full-file hash queries first apply a transform, such as a cryptographic hash function, to a malware program in order to obtain a hash value, which forms an effectively unique representation of the malware program. The hash value is then used to identify the malware in the network by comparing the value against values in a database. Examples of transforms for generating hash values are MD5 and SHA-1. With this technique, the cryptographic hash of each scanned item is used to determine whether the item has been seen before and, if so, its reputation. This approach does not require a local database, but it cannot provide a generic level of protection: by the very nature of cryptographic hash functions, even a very minor change in the malware will result in a completely different hash value.
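The lack of generality of a full-file hash query can be illustrated with a minimal sketch, assuming SHA-1 as the transform and an in-memory dictionary standing in for the network database. The names `full_file_hash`, `lookup_reputation` and `reputation_db`, and the sample byte strings, are hypothetical.

```python
import hashlib


def full_file_hash(data):
    """Return the SHA-1 digest of the whole item as a hexadecimal string."""
    return hashlib.sha1(data).hexdigest()


def lookup_reputation(data, database):
    """Return the stored reputation for the item, or 'unknown' if never seen."""
    return database.get(full_file_hash(data), "unknown")


# Illustrative stand-ins for a known malware sample and a one-byte variant of it.
original = b"MZ" + b"\x00" * 64 + b"payload-v1"
variant = original[:-1] + b"2"

# Hypothetical network-side database containing only the original sample.
reputation_db = {full_file_hash(original): "malicious"}

original_verdict = lookup_reputation(original, reputation_db)
variant_verdict = lookup_reputation(variant, reputation_db)
```

Although the two items differ in a single byte, their digests are unrelated, so the variant is reported as unknown even though the original is known to be malicious.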
Current anti-malware products rely mostly on a combination of local scanning techniques and network-based hash look-ups. Each of these mechanisms has its own problem: the former is heavily dependent on its database to provide a level of generic protection, while the latter lacks generality.