The present invention relates to the field of computer network servers. More particularly, the present invention relates to a method for determining and balancing loads among a plurality of servers.
In a conventional cluster of chat servers, the load presented by users connecting to new and existing channels is unpredictable. Consequently, some of the chat servers of the cluster may become overloaded, while other chat servers may be significantly underutilized.
What is needed is a technique for predicting the load presented by new users connecting to new and existing channels of a server cluster and using the predicted load for balancing and distributing users among the servers of the server cluster.
The present invention provides a technique for predicting the load presented by new users connecting to new and existing channels of a server cluster and using the predicted load for balancing and distributing users among the servers of the server cluster. The present invention uses a measure of past and current load patterns to assign channels in a balanced manner among the servers of a chat server cluster.
The advantages of the present invention are provided by a method for determining a load distribution for a plurality of servers. A total user count at periodic time-slices is received from each server of a plurality of servers. A load gradient is then calculated for a predetermined interval of time for each server of the plurality of servers. A present load distribution is determined for the predetermined interval of time for each respective server of the plurality of servers based on the total user count received from each server. A future load distribution is determined for each respective channel resource for each respective server based on the total user count for each server and the respective load gradient. Lastly, new channel resources are distributed among the plurality of servers based on the determined future load distribution for each server.
According to the invention, the predetermined interval of time is a sliding window of time having a predetermined number of timeslots each having a predetermined timeslot interval. Preferably, the sliding window of time spans about 60 seconds and the predetermined timeslot interval is about 5 seconds. Additionally, a load gradient for a server is based on a difference between the total number of users connected to the server at the end of the predetermined interval of time and the total number of users connected to the server at the beginning of the predetermined interval of time.
When a new channel resource is created, the new channel is assigned to a selected server of the plurality of servers based on the load distribution associated with each respective server of the plurality of servers. The channel resource is assigned an initial estimated weight that is credited to the current server at the current timeslot.
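By way of illustration only, the method summarized above may be sketched as follows. The sketch is not the claimed implementation. The names `ServerLoad` and `assign_channel` are hypothetical. The extrapolation of the future load as the current user count plus the load gradient, the selection of the server having the lowest predicted load, and the default initial weight of 10 users are all assumptions made for the example. Only the 60 second window, the 5 second timeslot interval, and the end-minus-beginning definition of the load gradient are taken from the description.

```python
from collections import deque
from dataclasses import dataclass, field

WINDOW_SECONDS = 60   # preferred sliding-window span from the description
SLOT_SECONDS = 5      # preferred timeslot interval from the description
SLOTS = WINDOW_SECONDS // SLOT_SECONDS  # 12 timeslots per window


@dataclass
class ServerLoad:
    """Per-server sliding window of total user counts (hypothetical helper)."""
    name: str
    counts: deque = field(default_factory=lambda: deque(maxlen=SLOTS))

    def report(self, total_users: int) -> None:
        # One total user count per timeslot; the oldest slot drops off
        # automatically once the window is full.
        self.counts.append(total_users)

    def current(self) -> int:
        return self.counts[-1] if self.counts else 0

    def gradient(self) -> int:
        # Users at the end of the window minus users at its beginning.
        if len(self.counts) < 2:
            return 0
        return self.counts[-1] - self.counts[0]

    def predicted(self) -> int:
        # Assumption: linear extrapolation, clamped at zero.
        return max(0, self.current() + self.gradient())

    def credit(self, weight: int) -> None:
        # Credit an estimated weight to this server at the current timeslot.
        if self.counts:
            self.counts[-1] += weight
        else:
            self.counts.append(weight)


def assign_channel(servers, initial_weight=10):
    """Assign a new channel to the server with the lowest predicted load.

    Assumption: both the selection rule and the default weight are
    illustrative choices, not values taken from the description.
    """
    target = min(servers, key=lambda s: s.predicted())
    target.credit(initial_weight)
    return target
```

In use, each server would call `report` once per timeslot, and `assign_channel` would be invoked whenever a new channel resource is created. Crediting the initial estimated weight immediately means that a burst of channel creations within a single timeslot is not all directed to the same server before its next user count arrives.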