1. Field of the Invention
This invention relates to the field of data processing systems. More particularly, this invention relates to the detection of malware, such as, for example, computer viruses, worms, Trojans and the like, within computer programs.
2. Description of the Prior Art
It is known to provide malware detection systems that examine the code of a computer program to identify characteristics corresponding to known items of malware. These characteristics can be considered to be signatures of the items of malware. Common approaches utilise binary search strings to look for these characteristics and checksums to detect the alteration of known computer programs.
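By way of illustration only, the two known approaches mentioned above (binary search strings and checksums) might be sketched as follows. This is a minimal sketch and not a description of any particular product: the names SIGNATURES, KNOWN_GOOD, scan_for_signatures and is_altered, the example byte pattern and the choice of SHA-256 as the checksum are all assumptions made for the example.

```python
import hashlib

# Hypothetical signature database: detection name -> binary search string
# taken to be characteristic of a known item of malware (made-up bytes).
SIGNATURES = {
    "Example.Trojan.A": bytes.fromhex("deadbeef90904142"),
}

# Hypothetical table of checksums of known computer programs,
# keyed by program name and filled in when a program is first trusted.
KNOWN_GOOD = {}


def scan_for_signatures(code: bytes) -> list:
    """Return the names of all signatures whose search string occurs in code."""
    return sorted(name for name, pattern in SIGNATURES.items() if pattern in code)


def is_altered(name: str, code: bytes) -> bool:
    """Return True if a known program no longer matches its recorded checksum."""
    expected = KNOWN_GOOD.get(name)
    if expected is None:
        return False  # no checksum recorded, so alteration cannot be judged
    return hashlib.sha256(code).hexdigest() != expected
```

In such a sketch, the search-string match succeeds only when the exact byte sequence is present in the program being examined, and the checksum comparison reports only that a known program has changed, not what the change is.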
The known techniques are not well suited to generically detecting programs written in high level languages, such as C or Visual Basic. A problem with programs written in such high level languages is that if they are recompiled with other compilers or compiler options, or if the source code is changed in a relatively minor manner, then the binary search strings needed to detect them are significantly altered. This alteration means that a signature developed to detect a particular variant of an item of malware written in a high level language will often fail to detect a minor variant thereof. As an example, if the source code for a Trojan is available on the Internet, then many dozens of variants of the Trojan often appear which re-use some or all of the source code that has been made publicly available. Whilst the different items of malware so produced from the same source code have functional similarities, it is difficult with known techniques to develop a signature capable of detecting such variants.
The present invention addresses the problem of generically detecting groups of programs produced from the same source code.