Computer viruses may generally have family characteristics. A given computer virus may evolve into one or more variant viruses, and a variant may be created specifically to counter the detection performed by certain antivirus software. Such a virus and the viruses derived from it may share family characteristics. Antivirus software may classify viruses into families according to these characteristics, extracting a common feature shared by all viruses of a family as the criterion for virus determination, so that all of the viruses in the family can be identified by a single record in the virus library. Accurate clustering of virus families may therefore substantially enhance the efficiency with which antivirus software identifies and purges viruses, and may further reduce the size of the virus library.
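By way of illustration only, the idea of identifying a whole family by one record may be sketched as follows. The sketch assumes that each virus sample has already been reduced to a set of feature strings; the feature names, the family name, and the functions `extract_family_signature` and `identify_family` are hypothetical and are not taken from any particular antivirus software. The common feature of a family is modeled here simply as the set intersection of the features of its known samples, which is one possible choice among many.

```python
from typing import Dict, FrozenSet, Iterable, Optional


def extract_family_signature(samples: Iterable[FrozenSet[str]]) -> FrozenSet[str]:
    """Return the features shared by every known sample of one family."""
    sample_list = list(samples)
    if not sample_list:
        raise ValueError("a family needs at least one sample")
    return frozenset.intersection(*sample_list)


def identify_family(
    features: FrozenSet[str], library: Dict[str, FrozenSet[str]]
) -> Optional[str]:
    """Return the first family whose signature is contained in `features`."""
    for family, signature in library.items():
        # An empty signature would match everything, so it is skipped.
        if signature and signature <= features:
            return family
    return None


# Hypothetical feature sets for two known samples of one family.
known_samples = [
    frozenset({"writes_run_key", "drops_file_in_system_dir", "opens_port"}),
    frozenset({"writes_run_key", "drops_file_in_system_dir", "hides_process"}),
]

# The virus library holds a single record for the whole family.
virus_library = {"family_a": extract_family_signature(known_samples)}

# A previously unseen variant that still carries the common feature.
new_variant = frozenset(
    {"writes_run_key", "drops_file_in_system_dir", "encrypts_documents"}
)
unrelated_file = frozenset({"opens_port"})

print(identify_family(new_variant, virus_library))
print(identify_family(unrelated_file, virus_library))
```

In this sketch the unseen variant is matched by the single family record because it retains the shared features, whereas the unrelated file is not matched. How well such a record generalizes depends on how accurately the samples were clustered into families in the first place.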
Existing cluster classification of viruses may be performed manually, in which case virus files are run dynamically and the dynamic behavior features of the viruses are manually recorded and analyzed. The behavior features of the viruses may include the sequence in which application programming interfaces (APIs) of the system are called, modification of the system registry, and creation of a file at a sensitive location. Manual cluster classification may then be performed by taking such dynamic behavior features into account.
Performing cluster classification of viruses manually may have the following disadvantages. Substantial human resources may be needed for virus analysis, and the virus analysts typically must be very experienced. Further, such cluster classification may require dynamically running or executing the virus files in order to observe their dynamic behavior features. This may not only increase the consumption of time and computer resources, but may also increase the risk of virus infection of the local computer during dynamic execution of the virus files.