The World Wide Web (WWW) is a multimedia information retrieval system on the Internet. It is the most common way to transfer data over the Internet; other means include FTP (File Transfer Protocol) and Gopher. On the web, clients carry out transactions with servers through HTTP (Hypertext Transfer Protocol), a well-known application protocol. This protocol allows clients to use standard HTML (Hypertext Markup Language) pages to access all kinds of files (text, image, sound, video, etc.). HTML provides the fundamental file format and enables developers to define links that point to other server sites. On the Internet, a URL (Uniform Resource Locator) identifies the address of a particular server, or even a network path on that server, using a special syntax.
A typical URL consists of “http://” followed by www.yourcompany.com/path, where “www.yourcompany.com” is the host server name and “path” is the directory in which the page can be found. A name server translates the host name in a URL into an IP address. On the Internet, this service is provided by DNS (Domain Name System) servers. The process by which a web client asks DNS to translate a host name into an IP address is called resolution. In TCP/IP, the name server translates the host name into a list of one or more IP addresses, and this list is sent back to the client that is about to issue the HTTP request. Each IP address locates a server, and this server processes the request sent by the web client through a web browser.
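The resolution step described above can be sketched as follows. This is a minimal illustration rather than part of the invention: the function names `split_url` and `resolve` and the `lookup` parameter are assumptions introduced here, and the sketch relies only on Python's standard `urllib.parse` and `socket` modules.

```python
import socket
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into (scheme, host name, path)."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path


def resolve(host, port=80, lookup=socket.getaddrinfo):
    """Return the distinct IP addresses that the name service reports for host.

    `lookup` defaults to the system resolver; it is a parameter only so that
    the function can be exercised without network access (an assumption of
    this sketch).
    """
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in lookup(
        host, port, type=socket.SOCK_STREAM
    ):
        ip = sockaddr[0]  # first element is the address for IPv4 and IPv6
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Calling `resolve(split_url("http://www.yourcompany.com/path")[1])` would return the list of addresses for the placeholder host used in the text, provided that such a host actually exists in DNS.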
The WWW adopts HTML and follows a client/server architecture. The client side of the HTTP service is a web browser, which can send all kinds of requests to the server and display the HTML files returned by the server on the screen.
With thousands of companies, universities, and government organizations posting their own home pages on the Internet, the Internet has become a very valuable information resource. Even a new user with only a little practice can visit millions of pages and thousands of newsgroups. Internet access and its related markets are also developing quickly.
In order to provide high-performance service and support more concurrent users, some large companies set up several mirror servers. These servers are deployed in different regions or even different countries. Each server has its own unique network path (URL) but provides the same service functions.
However, the deployment of these servers is usually decided by experience and does not reflect the real access pattern. If the regions are not selected wisely, the costs caused by overload will inevitably increase.
Even worse, most users choose a site from a list of mirror sites at random. Another common approach is to select the nearest mirror. However, because network conditions are complicated, there is no guarantee that the nearest mirror is the fastest one.
For example, a user who wants to download certain software from the Internet is given a list of server sites. Each server in this list, such as www.download.com, www.microsoft.com and www.linux.org, can provide the software. In most cases, the user wants to select the fastest one, from which he or she can get the software in the minimum time. Unfortunately, most users are not network specialists and do not have adequate network tools, so most of them select a site at random. Other users select the nearest site by location, assuming that the nearest site has the shortest response time. However, the network speed to a site is determined by the workload of the server, the topology of the network, and other factors, not by distance alone. Because users cannot take the real load of a server into account before making their selection, different mirrors end up with different loads, and the workload is not well balanced among the servers. At worst, a user who selects the nearest server when that server is already heavily loaded may spend far longer downloading the software than expected.
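The difference between blind selection and selection based on measurement can be illustrated with a short sketch. It is not the method of the invention: the function names are hypothetical, and the time needed to open a TCP connection is used here only as a rough, assumed proxy for how responsive a mirror currently is.

```python
import random
import socket
import time


def connect_time(host, port=80, timeout=3.0):
    """Return the seconds needed to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


def pick_random(mirrors):
    """Blind selection: every mirror is equally likely, whatever its load."""
    return random.choice(mirrors)


def pick_fastest(mirrors, probe=connect_time):
    """Measured selection: probe each mirror and keep the quickest responder.

    `probe` is a parameter so that other measurements (or recorded timings)
    can be substituted; mirrors whose probe returns None are skipped.
    """
    timings = {}
    for mirror in mirrors:
        elapsed = probe(mirror)
        if elapsed is not None:
            timings[mirror] = elapsed
    if not timings:
        return None  # no mirror answered
    return min(timings, key=timings.get)
```

With the site names from the example, `pick_fastest(["www.download.com", "www.microsoft.com", "www.linux.org"])` would return whichever host answered the probe first at that moment, whereas `pick_random` ignores the current state of the servers entirely.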
As a consequence of experience-based deployment and blind selection, the load among mirror sites is not balanced, and the overall performance of the Internet decreases. For these reasons, balancing the load among the mirrors is a critical problem.
Current load balancing methods deal only with the LAN, and all of them work only on the server side. To keep the balancing transparent to end users, these methods must be devised carefully. Because of these limitations, methods designed for the LAN cannot be applied to the Internet directly or easily.
The first object of this invention is to provide a method that balances the load among mirrors with the active participation of clients. The method requires only a few modifications to the clients.
The second object of this invention is to provide an apparatus that balances the servers' load and can be easily installed on clients.