In computer networking, domain names help identify locations where certain information or services reside on a public or private network. Domain names are typically formed according to the rules and procedures of the DNS and can be used for various naming and addressing purposes. In general, a domain name can be resolved to an Internet Protocol (IP) resource, such as a personal computer, a server hosting website pages, or a website page or service communicated via the Internet. Thus, the DNS can translate domain names (such as “www.example.com”) into the corresponding IP addresses (such as “123.4.56.78”) needed to establish Transmission Control Protocol/Internet Protocol (TCP/IP) communication over the Internet.
Traditionally, DNS servers resolve domain names (i.e., translate them to IP addresses) upon receiving DNS queries associated with those domain names. When a DNS server receives a query from a client, the DNS server first checks whether it can answer the query authoritatively based on local information available to the DNS server. If the queried domain name matches a corresponding resource record in its local zone data, the DNS server answers authoritatively. If no local record exists for the queried domain name, the DNS server checks whether it can resolve the domain name using locally cached information from historical data, that is, answers to earlier queries. If a match is found, the DNS server answers based on the historical data. If no match for the queried domain name is found at the DNS server level, the query process can continue with assistance from other DNS servers.
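The lookup order described above can be illustrated with a minimal sketch. It is not an implementation of any real DNS server: the class name `SketchDnsServer`, the fields `zone`, `cache`, and `upstream`, and the sample addresses are all hypothetical, and details such as record types, time-to-live expiry, and recursion are deliberately omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class SketchDnsServer:
    """Hypothetical, simplified server that maps names to IPv4 strings."""

    # Records the server itself is responsible for (assumed name: zone).
    zone: Dict[str, str] = field(default_factory=dict)
    # Answers remembered from earlier queries (the "historical data").
    cache: Dict[str, str] = field(default_factory=dict)
    # Stand-in for "other DNS servers": a callable returning an address or None.
    upstream: Optional[Callable[[str], Optional[str]]] = None

    def resolve(self, name: str) -> Tuple[Optional[str], str]:
        """Return (address, source), trying local data before asking upstream."""
        key = name.rstrip(".").lower()  # domain names are case-insensitive

        # Step 1: answer authoritatively from local records.
        if key in self.zone:
            return self.zone[key], "authoritative"

        # Step 2: answer from locally cached historical data.
        if key in self.cache:
            return self.cache[key], "cached"

        # Step 3: continue the query with help from other servers.
        if self.upstream is not None:
            address = self.upstream(key)
            if address is not None:
                self.cache[key] = address  # remember for later queries
                return address, "forwarded"

        return None, "unresolved"


# Example setup with made-up records; a dict's .get stands in for a remote server.
server = SketchDnsServer(
    zone={"www.example.com": "123.4.56.78"},
    upstream={"mail.example.org": "203.0.113.7"}.get,
)
```

In this sketch, a first query for a name known only to the upstream stand-in is reported as "forwarded", and a repeat of the same query is then served from the cache, which mirrors the fallback order in the paragraph above.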
One of the important tasks for Internet Service Providers (ISPs), malware protection providers, marketing providers, and many other entities is identifying malicious activities, such as web-based security threats or botnets, at early stages. Malicious code authors use a variety of methods to prevent authorities and users from identifying the sources of security threats. These methods range from adaptive computer coding techniques to functions that change command and control (C&C) server locations using different infected computers. Certain malware operations may be difficult to detect, while the costs of continuously maintaining security measures, such as honeypots, and related infrastructure are high. Thus, the fleeting and evolving nature of various web-based security threats calls for new methods of identifying malicious domain names.