The present invention relates generally to estimating the size of computer networks and, more particularly, to estimating and forecasting growth in the number of Internet hosts registered in the Internet Domain Name System (DNS).
Every computer with a permanent connection to the Internet is identified by an Internet Protocol (IP) address. As shown in FIG. 1, examples of computers with IP addresses are host computer 10, server 20, workstation 30 (connected to Internet Backbone 50 through LAN 35), and Internet Service Provider (ISP) 40. An IP address comprises four parts, e.g., "a.b.c.d," which corresponds to the address of host 10 in FIG. 1. Each part is an integer between 0 and 255, so that there are 2^8 = 256 possible values for each of the four parts, and each part can be represented using 8 bits. Therefore, there are 2^32 possible IP addresses. Reading from left to right, the parts become increasingly machine-specific. For example, the first part specifies a geographic region, the second part specifies a service provider or organization such as a university, the third part specifies a group of computers, and the fourth part specifies the machine itself.
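By way of illustration only, the arithmetic above can be sketched in Python. The function name ip_to_int and the constant TOTAL_ADDRESSES are hypothetical names chosen for this sketch; they are not part of any described system. The sketch packs the four 8-bit parts of an address into a single 32-bit integer, which is why the address space holds 2^32 values.

```python
def ip_to_int(address: str) -> int:
    """Pack a dotted four-part IP address into one 32-bit integer."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError("an IP address must have exactly four parts")
    value = 0
    for part in parts:
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError("each part must be an integer between 0 and 255")
        # Shift the accumulated value left by 8 bits and append this part.
        value = (value << 8) | octet
    return value


# Four parts of 8 bits each give 256**4 == 2**32 possible addresses.
TOTAL_ADDRESSES = 256 ** 4
```

Under this representation the lowest address, 0.0.0.0, maps to 0 and the highest, 255.255.255.255, maps to 2^32 - 1.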
In addition to host computers, other pieces of computer equipment have their own IP addresses, e.g., server 20 has address e.f.g.h, and workstation 30 has address i.j.k.l. Individual computers such as PCs, however, do not have permanent IP addresses. They only receive an IP address temporarily by connecting to an ISP such as ISP 40, which has an IP address for each modem in its modem bank.
Every computer on the Internet also has an alphanumeric name, referred to as a domain name, and the Domain Name System (DNS) contains, among other information, mappings between IP addresses and domain names. The DNS is a distributed database held by systems running name server software. There is a hierarchy of DNS servers, with servers at the lowest level containing name-to-address mappings for a group of hosts, and servers at increasingly higher levels containing data for larger groups of hosts. At the top level are root name servers that hold all data for the top-level domains, e.g., ".com," ".org," and geographical domains such as ".uk" and ".jp." Whenever a user on a local computer enters a domain name, the local computer contacts a DNS server, possibly on the local computer itself. If the first DNS server cannot resolve the domain name by finding its IP address, the DNS server contacts an authoritative server higher up the hierarchy. That server, in turn, contacts an even higher-level DNS server if it cannot resolve the domain name.
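The escalation described above, in which a server that cannot resolve a name passes the query to a higher-level server, can be illustrated with a deliberately simplified Python sketch. The class NameServer, its fields, and the example names and addresses are all hypothetical and exist only for this illustration; the sketch models the hierarchy as described here and does not reproduce the actual DNS protocol.

```python
from typing import Dict, Optional


class NameServer:
    """Toy name server holding a few name-to-address mappings."""

    def __init__(self, records: Dict[str, str],
                 parent: Optional["NameServer"] = None) -> None:
        self.records = records  # name -> IP address known to this server
        self.parent = parent    # next server up the hierarchy, if any

    def resolve(self, name: str) -> Optional[str]:
        """Return the address for name, asking higher servers as needed."""
        server = self
        while server is not None:
            address = server.records.get(name.lower())
            if address is not None:
                return address
            # This server cannot resolve the name; move up one level.
            server = server.parent
        return None


# Hypothetical three-level hierarchy using reserved example names.
root = NameServer({"www.example.org": "192.0.2.30"})
regional = NameServer({"mail.example.com": "192.0.2.20"}, parent=root)
local = NameServer({"host.example.com": "192.0.2.10"}, parent=regional)
```

In this sketch, a query to local for a name it does not hold is passed to regional and then to root, and the result is None if no server in the chain holds the name.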
For reverse mapping from IP addresses to domain names, there is a pseudo-domain called IN-ADDR.ARPA. This domain contains exactly one PTR (pointer) record for each IP address. Because the highest order of significance in the naming system is on the right, the notation for addresses is reversed in IN-ADDR.ARPA. For example, the IP address 120.76.108.14 would have a reverse domain entry of 14.108.76.120.IN-ADDR.ARPA.
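The reversal of the address notation can be shown with a short Python sketch. The function name reverse_domain_entry is a hypothetical name used only for this illustration; the sketch reverses the order of the four parts and appends the IN-ADDR.ARPA suffix.

```python
def reverse_domain_entry(address: str) -> str:
    """Build the IN-ADDR.ARPA name for a dotted four-part IP address."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError("an IP address must have exactly four parts")
    for part in parts:
        if not part.isdigit() or not 0 <= int(part) <= 255:
            raise ValueError("each part must be an integer between 0 and 255")
    # The most significant label is on the right in domain names,
    # so the order of the address parts is reversed.
    return ".".join(reversed(parts)) + ".IN-ADDR.ARPA"
```

Applied to the example above, reverse_domain_entry("120.76.108.14") yields "14.108.76.120.IN-ADDR.ARPA".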
The size of the Internet can be determined based on only one directly measurable quantity: the number of computers registered in DNS. One method for determining the number of computers registered is to perform an exhaustive count using DNS zone transfers. In a DNS zone transfer, a DNS server requests a download of information from another server in the same zone, i.e., the same portion of the domain space. Usually a server requests a download from a server that is higher in the hierarchy. By requesting DNS zone transfers throughout the Internet, one server can actually count the number of computers registered. One problem with this approach, however, is that as the Internet grows, exhaustive surveys take longer and longer to perform, so that results for one month may not be available until at least the following month. A second problem with this approach is that zone transfers put a heavy load on servers and are considered to be intrusive. As a result, many servers have banned zone transfers, causing estimates of Internet size based on DNS zone transfers to become less and less accurate.
Furthermore, current methodologies for estimating the size of the Internet provide only historical measurements. They do not provide a forecast of future Internet growth.
It is desirable, therefore, to provide a method for accurately estimating the number of registered hosts on the Internet. It is also desirable to provide a method for forecasting the growth in the number of registered hosts on the Internet. It is also desirable to provide a method for estimating and forecasting the size of the Internet by arbitrary segment, e.g., based on the top level or second level domain name. It is even more desirable to provide a method that quickly and accurately estimates current Internet size and does not underestimate the number of hosts.