Field of Invention
The present disclosure relates generally to Internet security. More particularly, the present disclosure relates to determining whether an Internet domain name is indicative of a malicious Internet domain by measuring the typicality of the Internet domain name relative to known typical domain names.
Description of the Related Art
When a user accesses a malicious Internet domain using a device such as a computer, the security of the device may be compromised. When the security of a device is breached as a result of accessing a malicious Internet domain, confidential information may be compromised, and the device may be rendered non-operational.
Some domain names are “botnet” domain names, and are typically characterized by a random combination of characters. As will be appreciated by those of skill in the art, a botnet includes more than one computer, i.e., more than one “bot,” that is controlled by a malicious party. Botnets propagate malicious software across a network, e.g., the Internet. If a botnet domain name can be identified as being a malicious domain name, i.e., a domain name associated with a malicious domain, before a user accesses the malicious domain using a device in the user's possession, the user may effectively be saved from compromising the security of that device. Methods used to identify botnet domain names, or otherwise detect botnet domains, generally detect the botnet domain names directly using bigrams, i.e., pairs of adjacent characters.
Some methods that identify botnet names using bigrams construct a statistical model of bigrams that are likely to appear in botnet names. Typically, such bigrams are combinations of letters that are strange from a linguistic standpoint. This approach is limited because there are far more possible strange combinations of letters than there are natural combinations, so a model of strange combinations is difficult to make comprehensive. Methods used to identify botnet domain names are also generally not effective in identifying other types of malicious or evil domain names. For example, such methods generally do not effectively identify domain names which include misspellings of common words or randomly combined dictionary words. Domain names which include suggestive words, e.g., words relating to pornography, and domain names which are designed to lure unsuspecting users, e.g., “antivirus-protection-download,” are likewise generally not identified as being malicious by the methods used to identify botnet domain names.
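By way of illustration only, the following minimal Python sketch shows how character bigrams may be extracted from a domain name and scored against bigram counts gathered from known typical names. The function names, the add-one smoothing, the 37-character alphabet, and the five-name training list are assumptions made for this example and are not taken from the present disclosure or from any particular prior method.

```python
import math
from collections import Counter


def bigrams(name):
    """Return the character bigrams of a name, e.g. 'abc' -> ['ab', 'bc']."""
    return [name[i:i + 2] for i in range(len(name) - 1)]


def train_bigram_counts(names):
    """Count how often each bigram occurs across a list of names."""
    counts = Counter()
    for name in names:
        counts.update(bigrams(name.lower()))
    return counts


def average_log_likelihood(name, counts, alphabet_size=37):
    """Average add-one-smoothed log-probability of the bigrams in `name`.

    alphabet_size=37 assumes the characters a-z, 0-9, and the hyphen
    (an assumption of this sketch). Higher scores mean the name looks
    more like the training names.
    """
    grams = bigrams(name.lower())
    if not grams:
        return 0.0
    total = sum(counts.values())
    vocab = alphabet_size ** 2  # number of possible bigrams
    log_probs = [math.log((counts[g] + 1) / (total + vocab)) for g in grams]
    return sum(log_probs) / len(log_probs)


# Hypothetical training list of typical names (illustrative only).
TYPICAL = ["google", "weather", "newspaper", "shopping", "university"]
counts = train_bigram_counts(TYPICAL)

score_typical = average_log_likelihood("news", counts)
score_random = average_log_likelihood("xqzvkj", counts)
print(score_typical > score_random)
```

In this sketch, a name whose bigrams were seen in the typical list scores higher than a random-looking name whose bigrams were never seen. A practical model would presumably be trained on a far larger set of names.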