The Internet enables a user of a client computer system to identify and communicate with millions of other computer systems located around the world. A client computer system can identify each of these other computer systems using a unique numeric identifier for that computer called an “IP address.” When a communication is sent from a client computer system to a destination computer system, the client computer system typically specifies the IP address of the destination computer system in order to facilitate the routing of the communication to the destination computer system. For example, when a request for a World Wide Web page (“Web page”) is sent from a client computer system to a Web server computer system (“Web server”) from which that Web page can be obtained, the client computer system typically includes the IP address of the Web server.
In order to make the identification of destination computer systems more mnemonic, a Domain Name System (DNS) has been developed that translates a unique alphanumeric name for a destination computer system into the IP address for that computer. The alphanumeric name is called a “domain name.” For example, referring to FIG. 9, the domain name for a hypothetical computer system operated by IBM Corporation may be “comp23.IBM.com”. Using domain names, a user attempting to communicate with this computer system could specify a destination of “comp23.IBM.com” rather than the particular IP address of the computer system (e.g., 198.81.209.25).
A user can also request a particular resource (e.g., a Web page or a file) that is available from a server computer by specifying a unique Uniform Resource Identifier (“URI”), such as a Uniform Resource Locator (“URL”), for that resource. A URL includes a protocol to be used in accessing the resource (e.g., “http:” for the HyperText Transfer Protocol (“HTTP”)), the domain name or IP address of the server that provides the resource (e.g., “comp23.IBM.com”), and optionally a path to the resource (e.g., “/help/HelpPage.html”). Thus “http://comp23.IBM.com/help/HelpPage.html” is one example of a URL. In response to a user specifying such a URL, the comp23.IBM.com server would typically return a copy of the “HelpPage.html” file to the user.
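The three URL components described above can be illustrated with a minimal sketch using the Python standard library. The function name describe_url is hypothetical and chosen only for illustration. Note that the standard parser returns the host name in lower case, which is harmless because domain names are not case-sensitive.

```python
from urllib.parse import urlsplit

def describe_url(url):
    """Split a URL into the protocol, server, and path parts described above."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,   # e.g. "http"
        "server": parts.hostname,   # domain name or IP address, lower-cased
        "path": parts.path,         # optional path to the resource
    }

info = describe_url("http://comp23.IBM.com/help/HelpPage.html")
print(info)
```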
In addition to making the identification of destination computer systems more mnemonic, domain names introduce a useful layer of indirection between the name used to identify a destination computer system and the IP address of that computer system. Using this layer of indirection, the operator of a particular computer system can initially associate a particular domain name with a first computer system by specifying that the domain name corresponds to the IP address of the first computer system. At a later time (e.g., if the first computer system breaks or must be replaced), its operator can “transfer” the domain name to a second computer system by then specifying that the domain name corresponds to the IP address of the second computer system.
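The layer of indirection can be sketched as a simple lookup table in which a domain name is re-bound to a new address. This is only a toy model: the class NameTable is hypothetical, real DNS data is distributed across many name servers, and the second address below (198.51.100.7) is an assumed value taken from a range reserved for documentation.

```python
class NameTable:
    """Toy name-to-address table; not a model of the distributed DNS database."""

    def __init__(self):
        self._records = {}

    def bind(self, domain_name, ip_address):
        # Domain names are not case-sensitive, so store a normalized key.
        self._records[domain_name.lower()] = ip_address

    def resolve(self, domain_name):
        return self._records.get(domain_name.lower())

table = NameTable()
table.bind("comp23.IBM.com", "198.81.209.25")   # first computer system
before = table.resolve("comp23.IBM.com")

# "Transfer" the name: same domain name, address of a second computer system.
table.bind("comp23.IBM.com", "198.51.100.7")    # hypothetical address
after = table.resolve("comp23.IBM.com")
```

Users who address the system by domain name are unaffected by the change, because only the table entry is updated.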
The domain names in DNS are structured in a hierarchical, distributed database that facilitates grouping related domain names and computers and ensuring the uniqueness of different domain names. In particular, as mentioned above, a particular domain name such as “IBM.com” may identify a specific host computer. However, the hierarchical nature of DNS also allows a domain name such as “IBM.com” to represent a domain including multiple other domain names each identifying computers (also referred to as “hosts”), either in addition to or instead of identifying a specific computer.
FIG. 9 illustrates a hypothetical portion of the DNS database 900 in which the node representing the IBM.com domain name 910 is the root node in an IBM.com domain 950 that includes 7 other nodes each representing other domain names. Each of these domain names in the IBM.com domain can be, but does not have to be, under the control of a single entity (e.g., IBM Corporation). FIG. 9 also includes a WebHostingCompany.com domain 955 that includes a single domain name.
As illustrated, the DNS database can be represented with a hierarchical tree structure, and the full domain name for a given node in the tree can be determined by concatenating the name of each node along the path from the given node to the root node 901, with the names separated by periods. Thus, the 8 nodes in the IBM.com domain represent the domain names IBM.com 910, foo.IBM.com 912, foo.foo.IBM.com 918, bar.foo.IBM.com 920, bar.IBM.com 914, comp23.IBM.com 916, abc.comp23.IBM.com 922, and cde.comp23.IBM.com 924. Other “.com” domain names outside the IBM.com domain are also illustrated in FIG. 9, including the second-level domain names BCD-Corp.com 932, WebHostingCompany.com 934, 1-800-555-1212.com 942 and 123456.com 944, and the lower-level domain names 123.123456.com 946 and 456.123456.com 948. In addition to the “.com” top-level domain (“TLD”), other TLDs are also illustrated including the “.cc” geographical TLD and the “.gov”, “.edu” and “.mil” organizational TLDs. Illustrated domain names under these other TLDs include Stanford.edu 936, Berkeley.edu 938, and RegistrarCompany.cc 940.
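The concatenation rule described above can be sketched as a walk from a node up to the root of the tree. The Node class and its attribute names are hypothetical; the root node is modeled with an empty label so that it contributes nothing to the resulting name.

```python
class Node:
    """One node of a DNS-style name tree; the root has no parent."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent

    def full_name(self):
        # Collect labels from this node up to, but not including, the root,
        # then join them with periods.
        labels = []
        node = self
        while node.parent is not None:
            labels.append(node.label)
            node = node.parent
        return ".".join(labels)

root = Node("")                 # corresponds to root node 901
com = Node("com", root)
ibm = Node("IBM", com)          # IBM.com 910
comp23 = Node("comp23", ibm)    # comp23.IBM.com 916
abc = Node("abc", comp23)       # abc.comp23.IBM.com 922

print(abc.full_name())
```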
New domain names can be defined (or “registered”) by various domain name registrars. In particular, a company that serves as a registrar for a TLD can assist customers in registering new domain names for that TLD and can perform the necessary actions so that the technical DNS information for those domain names is stored in a manner accessible to name servers for that TLD. Registrars often maintain a second-level domain name within the TLD (e.g., a hypothetical Registrar Company that acts as a registrar for the “.cc” TLD could maintain the RegistrarCompany.cc domain name 940), and provide an interactive Website at their domain name from which customers can register new domain names. A registrar will typically charge a customer a fee for registering a new domain name.
For the “.com”, “.net” and “.org” TLDs, a large number of registrars currently exist, and a single shared registry (“the Registry”) under the control of a third-party administrator stores information identifying the authoritative name servers for the second-level domain names in those TLDs. Other TLDs may have only a single registrar, and if so that registrar could maintain a registry for all the second-level domains in that TLD by merely storing the appropriate DNS information for each domain name that the registrar registers. In other situations, multiple registrars may exist for a TLD, but one of the registrars may serve as a primary registrar that maintains a registry for each of the second-level domains in that TLD. If so, the secondary or affiliate registrars for that TLD supply the appropriate DNS information for the domain names that they register to the primary registrar. Thus, the manner in which the DNS information for a TLD is obtained and stored is affected by the registrars for that TLD.
Users of the aforementioned DNS generally do not communicate directly with a Root DNS Server. Instead, as illustrated in FIG. 10, DNS resolution takes place transparently in application programs such as web browsers and other Internet applications at the PC level. When an application makes a request that requires a domain name lookup, the application sends a resolution request to the DNS resolver in the local operating system, which in turn handles the communications required.
The DNS resolver often has a cache containing recent lookups. If the cache can provide the answer to the request, the resolver will return the value in the cache to the program making the request. If the cache does not contain the answer (or the information has expired), the resolver will typically send the request through a series of network devices to one or more designated DNS servers. In the case of most home users, the Internet Service Provider (ISP) to which the machine connects will supply this DNS server. In any event, the name server thus queried will follow the process outlined above until it successfully finds a result or determines that none is available. It then returns any results to the DNS resolver, which caches the result for future use and passes it back to the software that initiated the request.
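The cache behavior described above can be sketched as follows. This is a simplified illustration, not the resolver of any particular operating system: the class CachingResolver, the upstream callable (which stands in for the designated DNS server), and the 60-second lifetime are all assumptions. The clock is injected so that expiry can be demonstrated without waiting.

```python
import time

class CachingResolver:
    """Answer from a local cache while an entry is fresh; otherwise ask upstream."""

    def __init__(self, upstream, clock=time.monotonic):
        self._upstream = upstream   # callable: name -> (address, lifetime_seconds)
        self._clock = clock
        self._cache = {}            # name -> (address, expiry_time)

    def resolve(self, name):
        key = name.lower()
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]                      # cache hit, not yet expired
        address, lifetime = self._upstream(key)  # miss or expired: ask upstream
        self._cache[key] = (address, now + lifetime)
        return address

calls = []                          # records every request that reaches upstream

def fake_upstream(name):
    calls.append(name)
    return "198.81.209.25", 60      # assumed lifetime of 60 seconds

now = [0.0]                         # manually advanced fake clock
resolver = CachingResolver(fake_upstream, clock=lambda: now[0])
first = resolver.resolve("comp23.IBM.com")    # miss: goes upstream
second = resolver.resolve("comp23.ibm.com")   # hit: served from the cache
now[0] = 61.0
third = resolver.resolve("comp23.IBM.com")    # expired: goes upstream again
```

In this run only the first and third requests reach the upstream server; the second is answered entirely from the cache.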
In the case of a domain that is not registered, a corresponding domain resolution request will need to traverse to the level of an Authoritative Root DNS Server. The Root DNS Server will reply with an authoritative response of “non-existent domain”. Requests to resolve such non-existent domains are retained in an external repository. FIG. 8 illustrates an exemplary set of logs of requests to resolve such domains. NXDomain (or NXD) is a term used for an Internet domain name that cannot be resolved using the DNS implementation, owing either to the domain name not yet being registered or to a server problem. NXDOMAIN is described in RFC 1035 (Domain names - implementation and specification) and also in RFC 2308, both of which are incorporated herein by reference in their entireties.
FIG. 10 is a network diagram illustrating interconnected network devices and Domain Name System information. Root DNS Servers 1001 and 1002 and NXDomain Log Servers 1004 and 1005 interface with server 1003 to service requests from Upstream Provider DNS Server 1006. DNS Server 1006 is accessed by ISP DNS Server 1007 to service requests initiated by, for example, PC 1009 running a suitable client or a malicious Internet Bot connecting through Router 1008.
Because domain name resolution provided by DNS is essential to operation of the Internet and email, continual availability, operation and functioning of the system are critical. Unfortunately, not all network traffic is legitimate; a substantial amount of malicious traffic passes through the Internet at all times. Such malicious DNS traffic can lead to various crimes and can exhaust a considerable amount of network bandwidth and resources. Therefore, consideration must be given to possible scenarios that might impair DNS. Threats to the operation of the network may come in several forms, including Internet bots as disclosed in U.S. Patent Publication No. US 2008/0025328 of Alberts (“Alberts”), the disclosure of which is incorporated herein by reference in its entirety. Alberts discloses enabling an end-user using an IP based network to on-line select and communicate with another end-user without revealing their identity. The selection of an end-user is performed by an Internet bot that is capable of accessing a profile list such that, during a phase in which information is transferred between both end-users, the identity of at least one end-user is not known to the other end-user because information is first transferred to the Internet bot and then from the Internet bot to the other end-user. Another scenario is described in U.S. Patent Publication No. US 2008/0155694 of Kwon et al. (“Kwon”), the disclosure of which is also incorporated herein by reference in its entirety. Kwon discloses a method for dealing with attacks of malicious BOTs, a BOT being software that performs or controls a predetermined operation in response to a specific event or a specific command, implemented as script code having various functions including a remote function for specific objects. When a malicious BOT attacks a specific network or system, it generates more data than the target network or system has capacity to handle, so as to disable the normal service.
Kwon discloses addressing malicious BOTs by detecting and analyzing a domain name receiving excessive DNS queries to judge the infection of a malicious BOT, registering the corresponding domain name as a normal or abnormal management target, and redirecting an abnormal DNS query for the abnormal management target to a redirection processing and response system.
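The general idea of flagging domain names that receive excessive queries and redirecting later queries for them can be sketched as follows. This is not Kwon's disclosed implementation: the function names, the fixed threshold, and the sample domain name bot-c2.example are all hypothetical, and a practical system would analyze a flagged name further before treating it as abnormal.

```python
from collections import Counter

def find_abnormal_domains(queried_names, threshold):
    """Return the set of names queried more than `threshold` times."""
    counts = Counter(name.lower() for name in queried_names)
    return {name for name, count in counts.items() if count > threshold}

def route_query(name, abnormal_domains):
    """Decide whether a query is resolved normally or sent to redirection."""
    return "redirect" if name.lower() in abnormal_domains else "resolve"

# Hypothetical query log: one name is queried far more often than the other.
query_log = ["bot-c2.example"] * 5 + ["comp23.IBM.com"] * 2
abnormal = find_abnormal_domains(query_log, threshold=3)

print(route_query("bot-c2.example", abnormal))
print(route_query("comp23.IBM.com", abnormal))
```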