1. Field of the Invention
The present invention relates generally to data networks. More specifically, the present invention relates to protocols for transferring electronic information across data networks.
2. Background
The ability to effectively transfer electronic information over a data network is often highly dependent on the amount of resources available to the host that provides the information. For example, in the area of Internet content delivery, the ability of a network server to provide data (such as files, e-mail and streaming media) to a large number of clients can be affected by the resource limitations of the host on which the server resides. Such resource limitations typically include, but are not limited to, network bandwidth, host processing power or CPU, available memory, file and/or socket descriptors, and disk input/output bandwidth. As the number of clients requesting files or other content from a particular server increases, one or more of these resources may eventually become exhausted.
It has been observed that host machines that provide content over the Internet often experience short but critical "peak" usage periods after new content is released, during which a large number of users simultaneously attempt to download content from the host. Often, there is no way to make enough resources available to the host to service the numerous data requests received during those peak periods. Accordingly, it is desirable that the host be able to deal with resource exhaustion in a manner that is as pleasing as possible to network users requesting content.
Conventional network servers typically exhibit one or more of the following behaviors as their resources near exhaustion: (1) additional clients are completely denied access, either with a standard error message/protocol or simply through the inability to complete a transaction; (2) additional clients are served, but the performance for all clients accessing the host degrades, often creating a "snowball" effect whereby performance eventually reaches zero for all clients as resources are completely exhausted; and/or (3) the host is shut down or "crashes" because a key resource is exhausted. These behaviors are undesirable for client users as well as for hosts, and often impair the ability of any user to receive content during peak demand periods. This impairment is exacerbated by the fact that conventional clients, such as conventional Web browsers, are programmed to continually retry their requests, which simply puts additional strain on the network server and can lead to further resource exhaustion.
Additionally, as host resources are exhausted, the allocation of resources for servicing client requests by conventional servers essentially becomes random. In particular, when a host's resources are exhausted, clients are denied access to the host. As mentioned above, when clients are denied access to the host, they will often continually retransmit their requests. As host resources become available, a conventional server will provide them to the first client request that is received, irrespective of how long the client that made the request has been attempting to access the host. As a result, resource allocation becomes random and clients have no guarantee when they will be able to complete their request. In fact, clients accessing the host for the first time may receive requested data prior to clients that have been requesting service for a longer period of time. This randomness can lead to frustration on the part of users requesting content.
Finally, conventional network servers are typically incapable of determining the number of users that have attempted to access the host for content but have failed due to resource exhaustion. Although the exhibition of one or more of the above-described behaviors may inform a system administrator that the resources of the host are at or near exhaustion, those behaviors do not provide a system administrator with any indication of just how many clients are trying but failing to access the host at any given point in time. Such information may be useful for gauging necessary host resource levels and performing load balancing functions between multiple network servers.
What is desired, then, is a system and method for transferring data over a network that permits a host or system administrator to monitor and limit host resources such that numerous client data requests may be serviced without completely exhausting one or more host resources. The desired system and method should also handle resource limitations in a way that is as pleasing as possible to users that are waiting for requested data. Furthermore, the desired system and method should permit a host or system administrator to determine the number of clients that are currently waiting for data from the host.
The present invention is directed to a queuing system, method and computer program product for transferring data over a network. In embodiments of the present invention, a host receives a request for data, such as a file of digital information, from a client and determines if sufficient resources are available to service the request. If sufficient resources are available to service the request, the host provides the data to the client. However, if sufficient resources are not available to service the request, the host sends a message regarding a queue to the client, receives a request to enter the queue from the client, places the client in the queue in response to receiving the request to enter the queue, and provides the requested data to the client when the client reaches the front of the queue and sufficient resources are available to service the request.
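By way of illustration only, the request, queue message, queue entry, and service sequence described above might be sketched as follows. The class name `Host`, the method names, the message strings "DATA" and "QUEUE_AVAILABLE", and the use of a count of active downloads as the measure of available resources are all hypothetical choices made for this sketch and are not taken from the specification.

```python
from collections import deque


class Host:
    """Hypothetical host that queues clients when no download slot is free."""

    def __init__(self, max_active):
        self.max_active = max_active  # assumed resource limit (active downloads)
        self.active = set()           # clients currently being served
        self.queue = deque()          # waiting clients, front of queue on the left

    def request_data(self, client_id):
        """Serve the request if resources allow, else send a queue message."""
        # Assumption: a new client is not served ahead of clients already waiting.
        if len(self.active) < self.max_active and not self.queue:
            self.active.add(client_id)
            return "DATA"
        # The client is only told about the queue; it must ask to enter it.
        return "QUEUE_AVAILABLE"

    def enter_queue(self, client_id):
        """Place the client in the queue and return its 1-based position."""
        if client_id not in self.queue:
            self.queue.append(client_id)
        return self.queue.index(client_id) + 1

    def finish(self, client_id):
        """Release a slot and serve the client at the front of the queue, if any."""
        self.active.discard(client_id)
        if self.queue and len(self.active) < self.max_active:
            next_client = self.queue.popleft()
            self.active.add(next_client)
            return next_client
        return None
```

In this sketch, a client that receives "QUEUE_AVAILABLE" would call `enter_queue`, and is served in arrival order as earlier downloads complete and `finish` is called.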
In further embodiments of the present invention, a host or a system administrator ascertains the availability of at least one host resource, such as network bandwidth, processing power, memory, file descriptors, socket descriptors, or disk input/output bandwidth, and sets a resource limit based on the availability of the at least one host resource. The host then determines if sufficient resources are available to service a client data request by determining if the resource limit has been reached. In embodiments, the resource limit is a limit on the total number of client connections for downloading data.
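As a rough illustration of setting a resource limit from resource availability, the following sketch derives a connection limit from the scarcest of two resources. The function names, the per-client bandwidth and descriptor figures, and the choice of taking the minimum across resources are assumptions made for this example; the specification does not prescribe a particular formula.

```python
def connection_limit(available_bandwidth_kbps, per_client_kbps,
                     available_fds, fds_per_client=1):
    """Return the number of simultaneous downloads the scarcest resource allows.

    All parameters are hypothetical inputs that a host or system
    administrator would measure or estimate.
    """
    by_bandwidth = available_bandwidth_kbps // per_client_kbps
    by_descriptors = available_fds // fds_per_client
    return max(0, min(by_bandwidth, by_descriptors))


def has_capacity(active_connections, limit):
    """Sufficient resources exist only while the limit has not been reached."""
    return active_connections < limit
```

Under these assumptions, a host with 10,000 kbps of bandwidth, 500 kbps budgeted per client, and 64 free descriptors would be limited by bandwidth, while the same host with only 8 free descriptors would be limited by descriptors.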
In still further embodiments of the present invention, the host provides queue information to the client requesting data, either before the client has entered the queue or as part of periodic messages to the client while the client waits in the queue. The queue information may include a length of the queue, an anticipated or current position of the client in the queue, and/or an estimated wait time in the queue.
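The queue information described above might, for example, be assembled as in the following sketch. The `QueueInfo` record, the `queue_info` function, and in particular the wait-time estimate (position multiplied by an average download time and divided by the number of download slots) are hypothetical and deliberately crude; the specification does not state how the estimate is computed.

```python
from dataclasses import dataclass


@dataclass
class QueueInfo:
    queue_length: int        # number of clients currently waiting
    position: int            # current or anticipated 1-based position
    estimated_wait_s: float  # rough wait estimate in seconds


def queue_info(queue, client_id, avg_download_s, slots):
    """Build queue information for a client.

    `queue` is a list of waiting client ids with the front at index 0.
    If the client has not yet entered the queue, the anticipated position
    (the end of the queue) is reported instead of a current position.
    """
    if client_id in queue:
        position = queue.index(client_id) + 1
    else:
        position = len(queue) + 1
    # Assumed estimate: clients are served `slots` at a time, and each
    # download takes about `avg_download_s` seconds.
    estimated_wait_s = position * avg_download_s / slots
    return QueueInfo(len(queue), position, estimated_wait_s)
```

The same function could serve both cases named above: it could be called once before the client enters the queue, and again for each periodic message sent while the client waits.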
The invention is advantageous in that it permits a host or system administrator to monitor and limit host resources such that numerous client data requests may be serviced without completely exhausting one or more host resources.
The invention is also advantageous in that it permits host resource limitations to be handled in a way that is as pleasing as possible to users that are waiting for requested data.
The invention is further advantageous in that it permits a host or system administrator to accurately determine the number of clients that are currently waiting for data from the host.
Additional features and advantages of the invention will be set forth in the description that follows, and in part will be apparent from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the system and method particularly pointed out in the written description and claims hereof as well as the appended drawings.