This invention relates generally to communication systems, and more particularly to the scheduling of job requests to be serviced in a multiple-channel, point-to-point system.
The boom in the Internet and the rise of new network technologies have focused attention on designing faster and more efficient data networks. A key component of the data network is the data server. Data servers are the engines that store and feed content to diverse clients over the network media. Data servers can take many forms, such as infestations, wireless gateways, web servers or specialized servers such as traffic or weather information servers. Increasingly, data servers employ a multiple-channel, point-to-point model.
For example, FIG. 1 illustrates a typical multiple-channel communication system of the prior art, which satisfies job requests in a point-to-point, or unicast, fashion. In a point-to-point system, each client""s request for data is required to be individually satisfied, even if more than one client makes a job request for the same data item. In this regard, a point-to-point system is different from a broadcast system, which is used to advantage by satisfying all of the pending job requests for a data item with a single broadcast of the data item. One of many examples of a point-to-point system is an Internet server which provides web page data in response to requests from different Internet users.
In FIG. 1, devices 30a through 30w make requests for data which are delivered to central server 10 by any method. Specifically, device 30a makes a job request for data item a, device 30b makes a job request for data item b, and device 30w makes a job request for data item w. Central server 10 receives the job requests and retrieves the requested data from either an internal database or an external database (as will be shown below). Central server 10 then transmits the job request to one of a plurality of local channel servers, designated as 40-1 through 40-k. Each local channel server has a corresponding data channel, designated as 44-1 through 44-k. When the job request is received by one of the local channel servers, the local channel server then services the job request by transmitting the requested data to the requesting device via its corresponding channel.
Additionally, a scheduling method is employed by the system. The scheduling system determines the order in which the job requests are serviced. In the prior art, one method which is typically employed for this purpose is a centralized scheduling method. According to a centralized scheduling method, the central server has a corresponding queue and employs a scheduling algorithm in order to determine which of the jobs pending in its queue are sent to the local channel servers to be serviced. However, a centralized scheduling arrangement has a high overhead. Specifically, a large amount of memory is required to store the pending job requests in the central server""s single queue, and the computational complexity of scheduling all of these jobs simultaneously requires very high processing power.
Another method of the prior art which is typically employed for this purpose, and which imposes fewer overhead costs, is a localized scheduling method. According to a localized scheduling method, each local channel server has a corresponding queue, designated as 42-1 through 42-k. When a job request is received by one of the local channel servers, that local channel server stores the job request in its corresponding queue along with other pending job requests which it has received from central server 10 but which have not yet been serviced. Central server 10 determines which local channel server to dispatch the job request to by employing a load balancing algorithm. Although there are various definitions of xe2x80x9cloadxe2x80x9d, generally a central server which employs a load balancing algorithm assigns each new job to the local channel server that has the fewest pending bytes in its queue waiting to be serviced. In another prior art scheme, the central server employs a load balancing algorithm and assigns each new job to the local channel server that has the fewest number of jobs in its queue waiting to be serviced.
Additionally, according to a localized scheduling scheme, each local channel server also employs a scheduling algorithm in order to determine which of the jobs pending in its corresponding queue are to be serviced first. One scheduling algorithm which is typically employed by the local channel servers is a first-in first-out scheme. However, this arrangement may be unsatisfactory, specifically with widely heterogeneous data requests, because a large job request which arrives early may prevent small, later-arriving job requests from being serviced until after the large job request has been completely serviced (heterogeneous data requests are data requests of varying sizes).
Another scheduling algorithm which is typically employed by the local channel servers is the xe2x80x9cShortest Remaining Processing Timexe2x80x9d (hereinafter referred to as xe2x80x9cSRPTxe2x80x9d) algorithm. The SRPT algorithm produces a schedule which minimizes the time it takes to process all of the uncompleted jobs in a queue when there is a single processor that schedules a queue of pending jobs for service via a single channel. The SRPT algorithm is typically employed when jobs arrive in a continuous stream, and is based upon a sum-flow metric. The relevant parameter of the sum-flow metric is the time a job spends in the system. The SRPT algorithm employs the sum-flow metric by summing the time that all jobs spend in the system, and schedules the pending jobs so as to minimize this summed amount of time.
However, the SRPT algorithm has the drawback that it leads to starvation. Starvation occurs when some job request to the server is delayed to an unbounded extent. For instance, starvation may occur when the servicing of a pending job request is continually delayed because the SRPT algorithm determines that incoming job requests are serviced prior to the pending job request. Although the SRPT algorithm can be desirable in some circumstances, the fact that specific job requests are delayed to an unbounded extent is unfair to the person who made the request which is delayed. Furthermore, the fact that some job requests are delayed to an unbounded extent prevents the owner of the server system from being able to make a quality-of-service guarantee to each user that the schedule will be responsive to each job and avoid starvation of any job.
Although the above-referenced scheduling methods have attempted to address data server performance issues, there is an ever increasing need for improved scheduling methods that provide satisfactory performance. Therefore, there exists a need for a system and method for optimally scheduling the transmission of data items in multiple-channel, point-to-point communication system.
In accordance with one embodiment, a system and method is disclosed for scheduling the servicing of job requests in a localized point-to-point communication system having a central server providing job requests to a plurality of local channel servers. In a first step, the method receives a new job request at the central server. A performance of each channel is measured based upon a predetermined performance metric, and the job request is dispatched to one of the channels for servicing thereby dependent upon the performance of each of the channels. In this context, a performance metric is a quantitative measure of data flow through each channel.
In one embodiment, referred to as the minimum flow algorithm, the job request is dispatched by the central server to the local channel server having the lowest current average flow time. Flow is a measurement of the amount of time required to service a job. Thus, flow is equal to the difference between an arrival time of a job request and a completion time of a job request. When a job request is completed by a local channel server, the server calculates the flow for that job, then calculates an average flow for all of the jobs that the server has completed so far.
In another embodiment, referred to as the minimum stretch algorithm, the job request is dispatched by the central server to the channel having the lowest current maximum stretch value. A stretch value is a ratio equal to an amount of time required to service a job request while also serving other uncompleted job requests, divided by an amount of time required to service the same job request if no other job requests were required to be serviced. In other words, a stretch value provides an indication of the delay experienced by each job request to complete, when the server processes many job requests concurrently. Each local channel server maintains a record of its current maximum stretch value in order to schedule the job requests pending in its queue.
Additionally, each local channel server also employs a scheduling algorithm in order to determine which of the jobs pending in its corresponding queue are to be serviced first. According to one embodiment, an SRPT scheduling algorithm is employed by a local channel server, which sums the time that all jobs spend in the system and schedules the pending jobs so as to minimize this summed amount of time. According to another embodiment, a MAX scheduling algorithm is employed by a local channel server, which schedules the pending jobs inits queue so as to minimize the stretch experienced by each job request serviced thereby.
Both the minimum stretch and the minimum flow algorithm of the present invention outperform the prior art load balancing algorithm with respect to the average stretch and the average response time experienced by the job requests in the system.