The invention relates generally to traffic regulation in a communications network. More particularly, the invention relates to regulating the flow of traffic to message processors by adaptive rate control based on message queuing delay.
Telecommunication networks typically operate in real time. Operating in real time imposes strict requirements on the permissible boundaries of end-to-end message delays. However, as the network traffic load increases, so do the delays at network nodes. These nodal delays can result in network traffic slowing to the point that messages can no longer be transmitted across the network without exceeding the permissible boundaries for message delays. When this occurs, messages are discarded and must be retransmitted. Unfortunately, unless the load decreases or the message handling capability of the network increases, the retransmitted messages simply add to the load on the already overloaded network with potentially catastrophic effects.
This catastrophe is readily averted by reducing the rate at which messages enter the network at the source. In the public switched telephone network, for example, a caller will hear a busy signal or a recorded message stating all lines are busy and requesting that the call be placed at a later time. Accordingly, the network load is regulated at the source by restricting access to the network.
However, regulating access to the network is often impractical and is difficult, if not impossible, to effectively coordinate with the processing capabilities of various network elements. But the load on a particular network element can be regulated more flexibly and with greater responsiveness. Accordingly, overload is prevented or reduced by regulating the load at network elements (i.e., message processors). A typical procedure for regulating a message processor load is fixing the maximum number of messages that can be queued at the processor by establishing a finite input buffer. Any incoming messages received after the input buffer fills will be discarded, but any accepted messages will be processed before timing out.
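By way of illustration only, the finite input buffer described above may be sketched as follows. The class name, method names, and the capacity value are hypothetical and do not appear in the disclosure; the sketch merely shows a queue that discards any message arriving after the buffer fills.

```python
from collections import deque


class FixedThresholdQueue:
    """Hypothetical finite input buffer with a fixed message threshold."""

    def __init__(self, capacity):
        self.capacity = capacity      # maximum number of queued messages
        self.messages = deque()
        self.discarded = 0            # count of messages turned away

    def offer(self, message):
        """Queue the message if room remains; otherwise discard it."""
        if len(self.messages) >= self.capacity:
            self.discarded += 1
            return False
        self.messages.append(message)
        return True

    def take(self):
        """Remove and return the oldest queued message, or None if empty."""
        return self.messages.popleft() if self.messages else None
```

Because the capacity is fixed when the queue is constructed, such a buffer cannot follow changes in the rate at which the processor drains it.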
Network management overload controls sent from a message processor to a source controller are typically used to prevent messages from being timed out. An overloaded message processor will send a control message to the source controller that regulates the load on that message processor, requesting a reduction in the offered load. The source controller can then divert the incoming messages to another message processor or, in turn, send a message upstream requesting a reduction in the offered load applied to the source controller. In circumstances where the message processor service rate is static, a fixed buffer threshold can generally provide acceptable performance.
However, variable service rates may exist in a network element for a variety of reasons. For example, if the network element is a multiprocessor system, then the service rate may vary as processors are brought in and out of operation or processors and other components are upgraded (or downgraded). Alternatively, a single network element may, for example, provide several services with different processing rates, resulting in varying message processing times.
Unfortunately, a dynamically varying service rate requires a concurrently varying buffer threshold to avoid having a buffer which is either too long (in which case accepted messages will time out before being processed) or too short (in which case messages will be turned away which could have been accepted and processed). Either result is generally considered undesirable in an efficiently operating network. Therefore, a predetermined buffer threshold is not acceptable in a network element that has variable service rates, and there is a need for a flexible system capable of responding to dynamic variations.
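The dependence of a suitable buffer threshold on the service rate can be worked through numerically. The following sketch assumes a simple deterministic approximation in which a message queued behind n other messages waits n divided by the service rate; the function names, the 0.5 second timeout, and the service rates are hypothetical values chosen for illustration.

```python
import math


def max_queue_length(service_rate, timeout):
    """Largest queue length whose worst-case wait stays within the timeout.

    service_rate is in messages per second and timeout is in seconds
    (deterministic approximation, assumed for illustration).
    """
    return math.floor(service_rate * timeout)


def worst_case_wait(queue_length, service_rate):
    """Approximate wait, in seconds, for the last message in the queue."""
    return queue_length / service_rate


timeout = 0.5                                     # hypothetical timeout
threshold = max_queue_length(100.0, timeout)      # sized for 100 messages/s
degraded_wait = worst_case_wait(threshold, 40.0)  # rate later falls to 40
```

Under these assumed figures, a threshold sized for 100 messages per second holds 50 messages, and if the service rate falls to 40 messages per second the last of those messages waits about 1.25 seconds, well beyond the 0.5 second timeout. The threshold appropriate to the reduced rate would be 20 messages.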
In view of the foregoing, there is a need for a system which can accommodate stochastic time-varying service rates and traffic loads while providing for efficient resource utilization. Such a system maintains the message queuing delay for each processor at or below the threshold value at which a message will time out and be discarded. If the message queuing delay is too far below the threshold value then, because of normal fluctuations in the offered load, the queue may become empty, in which case the processor is not fully utilized, possibly causing other message processors in the network to be overloaded or requiring augmentation of the network with additional message processors that would not be needed if every processor were optimally queued. On the other hand, if the message queuing delay is too close to the threshold value, then random variations in message processing will result in accepted messages timing out because the actual delay will exceed the threshold.
The present invention achieves this by dynamically varying the number of messages queued for the message processor rather than relying upon a single buffer threshold as in prior art systems. Buffer overflow is not considered to be a significant constraint in the present invention, as a physically large buffer is contemplated. Use of large buffers is considered highly practicable in view of the marked trends toward lower cost and higher density in commercially available memory systems. In fact, at least with respect to contemporary processing speeds, the buffer may be regarded as essentially infinite insofar as the message holding capacity of a commercially practicable system far exceeds the limits imposed by the message timeout boundaries typically encountered in telecommunications and data processing networks.
Although buffer overflow is not considered a significant constraint, keeping enough messages queued so that the message processor is neither idled (i.e., no messages queued) nor overloaded (i.e., so many messages queued that messages time out before being processed) is considered a significant constraint. Accordingly, the present invention maintains the average queuing delay at a specified value by varying the controlled load applied to the message processor. When the average queuing delay drops below the specified value, the controlled load is increased accordingly, resulting in more messages being queued and processed. Similarly, when the average queuing delay rises above the specified value, the controlled load is decreased, resulting in a decrease in the average queuing delay.
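The direction of the adjustment described above may be illustrated by the following minimal sketch. The proportional update rule, the gain, and the minimum load are assumptions introduced solely for illustration; the text above specifies only that the controlled load is increased when the average queuing delay is below the specified value and decreased when it is above.

```python
def adjust_controlled_load(current_load, average_delay, target_delay,
                           gain=0.5, min_load=1.0):
    """Return an updated controlled load in messages per second.

    Assumed rule (not taken from the disclosure): scale the load by the
    relative difference between the target delay and the measured average
    delay. gain and min_load are hypothetical tuning parameters.
    """
    error = (target_delay - average_delay) / target_delay
    new_load = current_load * (1.0 + gain * error)
    # Never reduce the controlled load below a small positive floor.
    return max(min_load, new_load)


# Delay below the target: admit more traffic.
raised = adjust_controlled_load(100.0, average_delay=0.1, target_delay=0.2)
# Delay above the target: admit less traffic.
lowered = adjust_controlled_load(100.0, average_delay=0.3, target_delay=0.2)
```

A proportional rule of this kind makes larger corrections when the measured delay is far from the specified value and smaller corrections as it approaches that value; other update rules would satisfy the same description.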
By dynamically varying the number of messages queued in response to the average message queuing delay rather than simply queuing a predetermined number of messages as in the prior art, the present invention is able to dynamically respond to changes in either the service rate (i.e., the rate at which messages are serviced by the processor) or the load rate (i.e., the rate at which messages are queued). As discussed below, the average message delay may be determined in several different ways. Once the average message delay is determined, it can be used to detect and control overload in a network element, advantageously reducing congestion and improving processor utilization in the network.
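One possible way of tracking an average message delay is sketched below using an exponentially weighted moving average of the delay observed for each message as it leaves the queue. This is an assumption offered for illustration and is not necessarily one of the methods of the disclosure; the class name, the smoothing factor alpha, and the timestamps are hypothetical.

```python
class DelayAverager:
    """Hypothetical running average of per-message queuing delay."""

    def __init__(self, alpha=0.2):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in the interval (0, 1]")
        self.alpha = alpha      # weight given to the newest observation
        self.average = None     # no observations recorded yet

    def record(self, enqueue_time, dequeue_time):
        """Fold one message's queuing delay into the running average."""
        delay = dequeue_time - enqueue_time
        if self.average is None:
            self.average = delay
        else:
            self.average += self.alpha * (delay - self.average)
        return self.average
```

A larger alpha makes the average respond more quickly to a change in service rate or load at the cost of greater sensitivity to random variation in individual message delays.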
In a preferred embodiment of the present invention, the messages being processed are database queries such as those processed in a telecommunications network. For example, these database queries are of the type processed by directory information databases in a typical telecommunications network switching system whenever a call is processed. These database queries are related to the service(s) provided, and may, therefore, involve differing processing times depending on the specific service and the database query.
Referring to FIG. 1, a portion of a network switching system such as may be used in a public switched telephone network is illustrated. A connecting network 1 provides a plurality of links between a plurality of access nodes 2-2′ and one or more database nodes 3. Each access node 2 may be, for example, a local exchange carrier (LEC) which itself interfaces with a plurality of telephone terminals (not shown). When a telephone call is placed, a database 3 is queried to identify the processing required for each call in accordance with the dialed telephone number and/or the originating number. A response must be received from the database 3 before the call can be processed.
However, because of the variety of services offered and other factors, the processing time for each call is variable. Accordingly, it is not possible to precisely forecast the maximum rate of database query responses based on a fixed parameter and thereby limit the number of calls in the system to prevent calls from timing out when the database 3 cannot respond to a query in the allotted time period. Moreover, databases are frequently replicated throughout the telephone system network in order to increase the number of calls which can be processed concurrently. For example, if a single database server can respond to 7500 queries in a second and there are 6 replicated database servers in the system, then the system has a processing capacity of 45,000 queries per second, assuming a fixed processing time for each query. However, the number of fully functional database servers may vary dynamically (e.g., a database server could fail or performance could be degraded) or the processing time for each query may be variable. Under these circumstances, it becomes necessary to regulate the number of incoming calls to prevent calls from timing out without being properly connected. As discussed below, a variety of distinct control methods may be utilized to regulate the number of incoming calls received from one or more of the access nodes or network inputs. These control methods may be advantageously combined with the message queuing delay technique for detecting processor overload to maintain network efficiency in response to varying load and/or varying processing rates.
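The capacity arithmetic in the foregoing example, and its sensitivity to the number of functioning servers, can be checked with the short sketch below. The function name and the failure and degradation scenarios are hypothetical; only the figures of 7500 queries per second and 6 replicated servers are taken from the example above.

```python
def system_capacity(per_server_rates):
    """Total queries per second across all currently operating servers."""
    return sum(per_server_rates)


# Six replicated servers, each answering 7500 queries per second.
nominal = system_capacity([7500] * 6)

# Hypothetical scenario: one server has failed.
one_failed = system_capacity([7500] * 5)

# Hypothetical scenario: one server is degraded to half its nominal rate.
one_degraded = system_capacity([7500] * 5 + [3750])
```

With all six servers operating the total is 45,000 queries per second; under the assumed scenarios it falls to 37,500 with one server failed and to 41,250 with one server at half rate, which is why a limit computed from the nominal capacity does not remain valid.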