This invention relates to a method of grouping or clustering clients, servers and/or other entities within a network to optimize and expedite the flow, transfer, redirection and/or redistribution of data and information within the network, and more particularly, to a method for fast network-aware or on-line clustering which uses a radix-encoded trie process to perform longest prefix matching on one or more client and/or server network IP addresses in order to assign the clients and/or servers to proper clusters.
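As an illustration of longest prefix matching, the following minimal sketch assigns client IPv4 addresses to clusters keyed by the longest matching network prefix. It uses a plain binary trie rather than the radix-encoded trie referred to above, and the class name `PrefixTrie`, the function `cluster_clients` and the example prefixes and addresses are all hypothetical, chosen only for illustration.

```python
import ipaddress


class PrefixTrie:
    """Plain binary trie over IPv4 address bits (a simplification,
    not the radix-encoded trie named in the text)."""

    def __init__(self):
        self.root = {}

    def insert(self, prefix):
        """Store a CIDR prefix such as '12.65.128.0/19'."""
        net = ipaddress.ip_network(prefix)
        # Keep only the significant leading bits of the network address.
        bits = format(int(net.network_address), "032b")[:net.prefixlen]
        node = self.root
        for b in bits:
            node = node.setdefault(b, {})
        node["prefix"] = prefix

    def longest_match(self, address):
        """Return the longest stored prefix covering the address, or None."""
        bits = format(int(ipaddress.ip_address(address)), "032b")
        node = self.root
        best = node.get("prefix")
        for b in bits:
            node = node.get(b)
            if node is None:
                break
            # Remember the deepest prefix seen so far on this path.
            best = node.get("prefix", best)
        return best


def cluster_clients(trie, addresses):
    """Group addresses by their longest matching prefix (hypothetical helper)."""
    clusters = {}
    for addr in addresses:
        clusters.setdefault(trie.longest_match(addr), []).append(addr)
    return clusters


# Hypothetical prefix table and client addresses.
trie = PrefixTrie()
for p in ["12.0.0.0/8", "12.65.128.0/19", "24.48.0.0/13"]:
    trie.insert(p)

clusters = cluster_clients(
    trie, ["12.65.147.94", "12.1.2.3", "24.48.3.87", "192.0.2.1"]
)
```

In this sketch an address that matches both a short and a long prefix is placed in the cluster of the longer, more specific prefix, and an address that matches no prefix is grouped under `None`.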
Servers, such as proxy servers, cache servers, content distribution servers, mirror servers and other related servers, are typically used to speed access to data and reduce response time for network client requests in a network, such as the World Wide Web. Generally, these network clients issue requests for information, such as Hypertext Transfer Protocol (HTTP) requests for one or more Web pages. These requests are then handled directly or indirectly by these servers, such as proxy servers, cache servers, content distribution servers and mirror servers, to expedite the accessing and transfer of the requested information.
Generally, these servers either act as intermediaries or as transfer or redirection points for client requests in the network. For example, in operation, a proxy server receives a request for an Internet service (such as a Web page request) from a user. If the request passes filtering requirements, the proxy server looks in its local cache of previously downloaded Web pages. If the server finds the page, the page is returned to the user without needing to forward the request to, for example, a World Wide Web server on the Internet. If the page is not in the cache, the proxy server, acting as a client on behalf of the user, requests the page from the server out on the Internet. When the page is returned, the proxy server relates it to the original request and forwards it on to the client user.
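The proxy behavior just described can be sketched as follows. This is a minimal illustration only: the function `handle_request`, its parameters and the example URL are hypothetical, the cache is modeled as an in-memory dictionary, and storing a newly fetched page in the cache is an assumption suggested by the reference to previously downloaded pages.

```python
def handle_request(url, cache, fetch_from_origin, is_allowed=lambda url: True):
    """Serve a page request the way the proxy described above would.

    cache             -- dict mapping URL to page content (hypothetical model)
    fetch_from_origin -- callable that retrieves a page from the origin server
    is_allowed        -- filtering requirement; by default every request passes
    """
    if not is_allowed(url):
        return None  # request fails the filtering requirements
    if url in cache:
        return cache[url]  # served locally, no request forwarded
    # Cache miss: the proxy acts as a client on behalf of the user.
    page = fetch_from_origin(url)
    cache[url] = page  # assumption: keep the page for later requests
    return page


# Usage with a stand-in origin server that records each fetch.
origin_fetches = []

def fake_origin(url):
    origin_fetches.append(url)
    return "<html>content of " + url + "</html>"

cache = {}
first = handle_request("http://example.com/index.html", cache, fake_origin)
second = handle_request("http://example.com/index.html", cache, fake_origin)
```

Here the second request for the same page is answered from the cache, so the origin server is contacted only once.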
The strategic placement of proxies in the network can benefit greatly from clustering together network client users who are from the same network, so that a proxy server can adequately and efficiently serve each respective client cluster. Mis-characterizing clients as being in the same network may result in a proxy server being placed such that it serves these clients impractically and inefficiently, resulting in degraded performance in the network.
In the case of, for example, a cache or a content distribution server, the user's HTTP request at an originating server is typically re-routed away from the originating server and on to a cache server “closer” to the user. Generally, the cache server determines what content in the request exists in the cache, serves that content, and retrieves any non-cached content from the originating server. Any new content may also be cached locally in the cache server.
Similar to the strategic placement of proxies, the placement of cache servers, content distribution “boxes” or servers and related mirror servers can best be made by accurately clustering clients together in the network. Performance in the network may thus be improved by accurately and properly clustering multiple network clients together in related client clusters. The servers, whether they are cache servers, content distribution servers and/or mirror servers, can then efficiently service these client clusters.
Knowledge of these network clusters, such as the identification of certain “busy” clusters from which a certain level of network traffic originates, can be used in a variety of different applications. For example, a busy Web site may want to provide tailored responses and/or Quality of Service differentiation based on the origin of requests to the Web site. Web sites and/or servers may also be able to dynamically perform automatic user request re-direction where needed in the network based on clustering information. However, such information needs to be captured on an efficient, expedited and real-time basis, without any undue lag time being experienced by the Web site requester.
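One simple way to identify such “busy” clusters, once each request has been assigned to a cluster, is to count requests per cluster and keep those at or above a chosen level. The sketch below is illustrative only: the function `busy_clusters`, the threshold parameter `min_requests` and the cluster labels are hypothetical.

```python
from collections import Counter


def busy_clusters(request_clusters, min_requests):
    """Return {cluster: request count} for clusters that issued at least
    min_requests requests. request_clusters holds one cluster label per
    observed request (hypothetical input format)."""
    counts = Counter(request_clusters)
    return {c: n for c, n in counts.items() if n >= min_requests}


# Hypothetical request log, already mapped to cluster labels.
log = ["A", "A", "B", "A", "C", "C"]
busy = busy_clusters(log, min_requests=2)
```

With this input, clusters A and C meet the threshold while cluster B, with a single request, does not.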
Accordingly, it would be desirable to have a method for accurately clustering clients, servers and other entities within a network together to guide placement of proxies, cache servers, content distribution servers and mirror servers within the network. It would also be desirable to have a method for fast on-line clustering which may be used in applications such as content distribution, proxy positioning, server replication and network management.