Over the Internet, a user can access any resource, such as a Hypertext Markup Language (HTML) file, that is available on any host by entering a Uniform Resource Locator (URL) in an Internet browser. For the convenience of users, a resource provider may list all of its resources at one site (the resource delivery site) but store the resources at many other sites (the resource providing sites). When a user requests a resource from the resource delivery site, the resource delivery site generally checks whether it holds a valid cached copy of the resource. If the cached copy is valid, the resource delivery site delivers it to the user. Otherwise, the resource delivery site obtains a fresh copy of the resource from the corresponding resource providing site and updates the cached copy. A cached copy becomes invalid when it is no longer authorized by the resource providing site; for example, a cached copy is invalid if its age has exceeded an interval specified by the resource providing site. The advantage of caching is that the next request for the same resource can be served faster, provided that the cached copy at the resource delivery site is still valid. Because a cached copy may become invalid before the next request is received, it is desirable to identify frequently used (hot) Internet sites, so that the resource delivery site can obtain valid cached copies of resources from those Internet sites in advance. In the following, “Internet site name” and “resource name” are used interchangeably because a URL usually specifies both the Internet site name and the resource name. Also, “resource delivery site” and “resource providing site” are used to refer to the hosts at the respective sites.
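The request handling described above can be illustrated with a minimal sketch. It assumes that validity is determined solely by an age interval supplied by the resource providing site. The names `CachedCopy`, `DeliverySite`, `fetch`, and `max_age` are hypothetical and chosen only for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class CachedCopy:
    body: bytes        # the cached resource
    fetched_at: float  # clock reading when the copy was obtained
    max_age: float     # validity interval (seconds) set by the providing site


class DeliverySite:
    """Hypothetical resource delivery site with an age-based cache."""

    def __init__(self,
                 fetch: Callable[[str], Tuple[bytes, float]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        # fetch(url) stands in for a request to the resource providing site;
        # it is assumed to return (body, max_age).
        self._fetch = fetch
        self._clock = clock
        self._cache: Dict[str, CachedCopy] = {}

    def is_valid(self, copy: CachedCopy) -> bool:
        # A copy is invalid once its age exceeds the specified interval.
        return self._clock() - copy.fetched_at <= copy.max_age

    def deliver(self, url: str) -> bytes:
        copy = self._cache.get(url)
        if copy is not None and self.is_valid(copy):
            return copy.body  # fast path: valid cached copy
        # Slow path: obtain a fresh copy and update the cache.
        body, max_age = self._fetch(url)
        self._cache[url] = CachedCopy(body, self._clock(), max_age)
        return body
```

In this sketch, obtaining copies in advance for hot sites would amount to calling `deliver` for their URLs before a user request arrives, so that later requests take the fast path.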
One way of identifying frequently used Internet sites is to list all Internet site names received during a given period and count the number of times each Internet site name has been received. The Internet site names whose counts exceed a threshold are then selected as the most frequently used Internet sites. However, the list may be long and thus require a large amount of memory space and computing power. Furthermore, if a new Internet site name is received after the memory has already been exhausted, the new Internet site name is usually dropped, even if that Internet site would be the most frequently used.
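The counting approach and its memory limitation can be sketched as follows. This is an illustrative sketch only. The function name `hot_sites` and the parameters `threshold` and `max_entries` are hypothetical, and `max_entries` models exhausted memory as a simple cap on the number of distinct names tracked.

```python
from collections import Counter
from typing import Iterable, List, Optional


def hot_sites(names: Iterable[str],
              threshold: int,
              max_entries: Optional[int] = None) -> List[str]:
    """Return the site names whose count exceeds `threshold`.

    If `max_entries` is given, it caps the number of distinct names
    that can be tracked; names arriving after the cap is reached are
    dropped, mirroring the drawback described above.
    """
    counts: Counter = Counter()
    for name in names:
        if name in counts:
            counts[name] += 1
        elif max_entries is None or len(counts) < max_entries:
            counts[name] = 1
        # Otherwise the table is full and the new name is dropped.
    return [name for name, count in counts.items() if count > threshold]


log = ["a.example", "b.example", "a.example",
       "c.example", "c.example", "c.example"]
print(hot_sites(log, threshold=1))                 # ['a.example', 'c.example']
print(hot_sites(log, threshold=1, max_entries=2))  # ['a.example']
```

In the second call, `c.example` is the most frequently received name, yet it is never counted because it first arrives after the table is full.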