The present invention generally relates to classification of software, including malware and unwanted software. More particularly, the present invention relates to identification of software based on identification of certain characteristics (hereinafter called “genes”) and matching such genes against certain created classifications defined as groupings of genes.
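By way of illustration only, matching genes against classifications defined as groupings of genes might be sketched as follows. The gene names, the classification names, and the rule that a classification matches when every one of its genes is present are assumptions made for this sketch; they are not taken from the description above.

```python
# Hypothetical classifications: each is a named grouping of genes.
# All names here are invented for illustration.
CLASSIFICATIONS = {
    "example-keylogger-family": frozenset(
        {"hooks_keyboard", "writes_hidden_file", "sends_data_over_network"}
    ),
    "example-dialer": frozenset({"opens_modem", "dials_premium_number"}),
}


def classify(genes, classifications=CLASSIFICATIONS):
    """Return the names of classifications whose genes are all present.

    Assumed matching rule: a classification matches when its grouping
    of genes is a subset of the genes identified in the software.
    """
    found = set(genes)
    return sorted(
        name for name, required in classifications.items() if required <= found
    )


if __name__ == "__main__":
    sample = {
        "hooks_keyboard",
        "writes_hidden_file",
        "sends_data_over_network",
        "packs_executable",  # extra genes do not prevent a match
    }
    print(classify(sample))
```

Under this assumed rule, software exhibiting additional genes beyond those in a grouping still matches, whereas software missing any one gene of the grouping does not.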
Malware is a general categorization of computer contaminants, including, for example, computer viruses, worms, Trojan horses, spyware and/or adware. Unlike defective software, which has a legitimate purpose but contains errors, malware is written to infiltrate or damage a computer system and/or other software. Malware may also steal sensitive information, such as passwords. Some malware programs install a key logger, which records the user's keystrokes when the user enters a password, credit card number, or other useful information.
Malware includes viruses and worms, which spread to infect other executable software and/or computers locally and/or over a network, for example. By inserting a copy of itself into the machine code instructions in these executables, a virus causes itself to be run whenever the program is run or the disk is booted.
Additionally, Microsoft Word® and similar programs include flexible macro systems receptive to macro viruses that infect documents and templates, rather than applications, through executable macro code.
Worms, unlike viruses, typically do not insert themselves into other programs but, rather, exploit security holes in network server programs and start themselves running as a separate process. Worms typically scan a network for computers with vulnerable network services, break into those computers, and replicate themselves.
Another type of malware is a Trojan horse or Trojan. Generally, a Trojan horse is an executable program that conceals a harmful or malicious payload. The payload may take effect immediately and can lead to many undesirable effects, such as deleting all the user's files, or the payload may install further harmful software into the user's system. Trojan horses known as droppers are used to start off a worm outbreak by injecting the worm into users' local networks.
Spyware programs are produced for the purpose of gathering information about computer users.
Additionally, systems may become infected with unwanted software. Unwanted software is defined as being software that is installed or used without the system owner's permission. Although unwanted software is not malicious, it can either affect performance of client machines or potentially introduce security risks and related legal risks into an organization. Such unwanted software may include adware, dialers, remote administration tools and hacking tools.
Traditional malware protection techniques are based around anti-virus vendors creating signatures for known malware and products that scan systems searching for those specific signatures.
With this approach, an identification or definition of malware and/or unwanted software is released once a lab has seen and analyzed a sample of such software. This can mean that some users may be infected before the definitions have been released. Thus, systems and methods providing detection of unknown malware and/or unwanted software to help prevent users from being infected before a definition is released would be highly desirable.
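By way of illustration only, the signature-based approach described above might be sketched as follows. The signature name and byte pattern are invented for this sketch, and matching is simplified to an exact byte-sequence search; commercial scanners are considerably more elaborate.

```python
# Hypothetical signature database: name -> byte pattern taken from a
# previously analyzed sample. The pattern is invented for illustration.
SIGNATURES = {
    "Example.Malware.A": bytes.fromhex("deadbeef0badf00d"),
}


def scan(data: bytes, signatures=SIGNATURES):
    """Return the names of all signatures whose pattern occurs in data."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)


if __name__ == "__main__":
    known = b"\x00" + bytes.fromhex("deadbeef0badf00d") + b"\x01"
    # A variant differing in one byte of the pattern, for which no
    # definition has yet been released.
    variant = b"\x00" + bytes.fromhex("deadbeef0badf00e") + b"\x01"
    print(scan(known))    # the known sample is detected
    print(scan(variant))  # the variant is not detected
```

The sketch shows why a sample for which no signature exists passes the scan undetected until a definition covering it is released.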
The volume of malware has increased dramatically (for example, 140 or more Brazilian banking Trojans per day). Multiple variants of the same malware threat are relentlessly created and rapidly distributed, with the aim of defeating traditional signature-based virus protection.
Some anti-virus software uses heuristics to attempt to identify unknown viruses. Heuristic techniques look at various properties of a file and not necessarily at the functionality of the program. This leads to high false-positive rates.
Other, behavior-based technologies rely on running malware and attempting to stop execution if malicious behavior is observed. Because the malware is allowed to execute, it may already have caused damage before it is blocked. Additionally, behavior-based technology often requires extensive user interaction to authorize false positives.
The network security threats faced by enterprises today are much more complex than 20 years ago. The exponential growth in malware is compounded by its speed of propagation and the complexity of blended threats, changing the nature of the risks. The behavior of network users is also changing rapidly. There is a need for systems and methods for proactively classifying software before malware or unwanted software causes damage.