Within telecommunication networks it is desirable to separate out the signalling level from the bearer or traffic level in order to facilitate interoperability of components supplied by different vendors as well as the interoperability of different network technologies. To this end, the Internet Engineering Task Force (IETF) has specified the protocol known as H.248 (H.248 v2: draft-ietf-megaco-h248v2-04.txt) which is a signalling protocol used between an access node (or Media Gateway, MG) and a controller node (or Media Gateway Controller, MGC). H.248 is used amongst other things for controlling the setup of a call.
Telecommunication networks must be engineered to withstand not only normal load conditions, but also expected and unexpected overload conditions. Such overloads may occur, for example, as a result of a media-initiated televoting process or a disaster such as an earthquake. Overloads are typically handled by rejecting call set-up requests at an appropriate network node. However, this in itself may not be sufficient, as even the offering of set-up requests to a node may cause that node to fail (and restart).
Consider a Next Generation Network (NGN) comprising a Media Gateway Controller (MGC). In a typical network architecture, the MGC is likely to be connected to thousands of Multi Service Access Nodes (MSANs) that provide interfaces between the Public Switched Telephone Network (PSTN) and IP networks. It can be envisaged that a sufficiently large step change in the load offered to the MGC is likely to cause the MGC to become grossly overloaded, to a level where its own internal overload protection mechanism may not provide satisfactory protection.
The ETSI draft [ETSI ES 2XX XXX V<0.0.5> (2006-06), Access Gateway—Media Gateway Controller Rate Based Overload Control] known as “ETSI_NR” proposes an overload control mechanism between the MGCs and the MSANs to protect the MGCs from becoming overloaded during mass call events of the kind described above. ETSI_NR proposes employing an overflow restrictor at the Access Gateways (AGWs), here the MSANs, to throttle originating (PSTN) call attempts towards the MGCs. A so-called LoadLevel supervision function is implemented in the MGC which periodically measures its load state. If the LoadLevel reaches a critical value, the MGC initiates an originating call restriction mechanism at the MSANs. During periods of overload, the MGC periodically calculates a GlobalLeakRate based on the current LoadLevel. This GlobalLeakRate is then distributed among the AGWs based on their associated predefined weights (wi). The parameter wi may be set according to the number of “lines” terminated by a MSAN. A new leak rate value (notrat = wi × GlobalLeakRate) is sent to each gateway in a subsequent H.248 MODIFY command. The initial value of the GlobalLeakRate, which is used when the overload is first detected at the MGC, is a configuration parameter called InitGlobalLeakRate. It is set to a sufficiently low value to immediately relieve congestion at the MGC, and the calculated GlobalLeakRate is expected to adapt upwards gradually to ensure high utilization of the MGC.
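The weighted distribution of the GlobalLeakRate can be illustrated with a minimal Python sketch. The function names and the line counts are hypothetical, and normalising the weights so that they sum to 1 is an assumption made here for illustration; the draft itself is only said to allow wi to be set according to the number of lines terminated.

```python
def weights_from_lines(lines_per_gateway):
    """Derive weights wi from the number of lines each gateway terminates.

    Assumption: weights are normalised so that they sum to 1.
    """
    total = sum(lines_per_gateway.values())
    return {name: lines / total for name, lines in lines_per_gateway.items()}


def distribute_leak_rate(global_leak_rate, weights):
    """Return the per-gateway leak rate notrat = wi * GlobalLeakRate."""
    return {name: w * global_leak_rate for name, w in weights.items()}


# Hypothetical example: three gateways, one terminating twice as many lines.
lines = {"MSAN1": 1000, "MSAN2": 1000, "MSAN3": 2000}
weights = weights_from_lines(lines)
notrat = distribute_leak_rate(100.0, weights)  # GlobalLeakRate of 100 calls/s
print(notrat)
```

In a real deployment each notrat value would be carried to its gateway in an H.248 MODIFY command; that signalling step is not modelled here.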
In the event of an overload at the MGC, a MSAN will set its leak rate for incoming call attempts to the rate received from the MGC. The MSAN maintains a buffer of fixed size for receiving and queuing incoming call attempts. According to ETSI_NR, the MSAN will drain call attempts from the front of the buffer at the assigned leak rate. If there is no room for an incoming call attempt at the back of the buffer, that attempt is rejected by the MSAN.
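The buffering behaviour at the MSAN can be sketched as follows. This is an illustrative model only: the class name, the credit-based drain timing and the handling of an empty buffer are assumptions, not details taken from ETSI_NR.

```python
from collections import deque


class LeakRateRestrictor:
    """Fixed-size FIFO buffer drained at an assigned leak rate (calls per second)."""

    def __init__(self, buffer_size, leak_rate):
        self.buffer = deque()
        self.buffer_size = buffer_size
        self.leak_rate = leak_rate
        self._credit = 0.0  # fractional calls accumulated towards the next release

    def offer(self, call):
        """Queue a call attempt at the back; reject it if the buffer is full."""
        if len(self.buffer) >= self.buffer_size:
            return False
        self.buffer.append(call)
        return True

    def drain(self, elapsed_seconds):
        """Release queued calls from the front of the buffer at the leak rate."""
        self._credit += self.leak_rate * elapsed_seconds
        released = []
        while self._credit >= 1.0 and self.buffer:
            released.append(self.buffer.popleft())
            self._credit -= 1.0
        if not self.buffer:
            # Assumption: an idle restrictor does not bank capacity for a later burst.
            self._credit = min(self._credit, 1.0)
        return released


restrictor = LeakRateRestrictor(buffer_size=3, leak_rate=2.0)
accepted = [restrictor.offer(call_id) for call_id in range(5)]
forwarded = restrictor.drain(1.0)  # one second elapses at 2 calls/s
print(accepted, forwarded)
```

With a buffer of three and five simultaneous attempts, the last two attempts find no room and are rejected, while the queued attempts leave in arrival order at the assigned rate.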
A problem with the ETSI_NR approach is that imbalances in the distribution of incoming call attempts at the MSANs may render the overload control process ineffective. Consider for example a first group of MSANs which are receiving a very high volume of call attempts and a second group of MSANs which are receiving only a very low volume of call attempts. An overload condition at the MGC will be due to the first group of MSANs. The response of the MGC will be to apply the same GlobalLeakRate to all MSANs, temporarily removing the overload. If the traffic demand suddenly increases in the area served by the second group of MSANs, the MSANs of the second group are immediately allowed to offer calls to the MGC at the rate previously set by the MGC (i.e. notrat). Assuming that the call volumes handled by the first group of MSANs have not fallen significantly, the MGC will return to an overload condition.
This scenario is illustrated in FIG. 1, where the MGC is assumed to be a Telephony Server (TeS) node and there are four MSANs connected to it. Each MSAN has an equal weighting wi as each terminates the same number of subscriber lines. When the group of nodes designated “Group A” (comprising MSAN 1 and MSAN 2) offers a higher calling rate than the capacity of the TeS, the TeS will detect an overload, set the GlobalLeakRate to the InitGlobalLeakRate, and send ¼ of this GlobalLeakRate to each of the four MSANs. The TeS gradually increases the GlobalLeakRate to increase its utilization, and it continues this process until the total incoming rate from the MSANs reaches C, the processing capacity of the TeS. Assuming that the MSANs of “Group B” are offering only a very low volume of call attempts, this point is not reached until the GlobalLeakRate is close to 2C, at which point the leak rate (notrat) allocated to each MSAN is almost C/2. If the MSANs of Group B then start to receive a higher volume of call attempts, these will be passed on to the TeS, since these nodes are also allocated a leak rate of almost C/2, resulting in a return to the overload condition.
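The FIG. 1 scenario can be reproduced numerically with a short Python sketch. The capacity, the offered rates, the initial leak rate and the fixed additive adaptation step are all illustrative assumptions; ETSI_NR's actual adaptation rule is not modelled.

```python
def admitted_total(global_leak_rate, weights, offered):
    """Total rate reaching the TeS: each MSAN forwards at most its notrat."""
    return sum(min(o, w * global_leak_rate) for o, w in zip(offered, weights))


C = 100.0                             # TeS processing capacity (calls/s), assumed
weights = [0.25, 0.25, 0.25, 0.25]    # four equally weighted MSANs
offered = [200.0, 200.0, 1.0, 1.0]    # Group A overloaded, Group B nearly idle

G = 40.0      # stands in for InitGlobalLeakRate (assumed value)
step = 1.0    # assumed additive adaptation step
# Raise the GlobalLeakRate while the admitted load stays within capacity.
while admitted_total(G + step, weights, offered) <= C:
    G += step

notrat = weights[0] * G
print(G, notrat)    # GlobalLeakRate close to 2C, per-MSAN rate close to C/2

# Group B now surges: every MSAN may forward up to its notrat.
surge = [200.0, 200.0, 200.0, 200.0]
print(admitted_total(G, weights, surge))    # close to 2C, far above capacity
```

With these numbers the GlobalLeakRate settles at 196, so each MSAN holds a leak rate of 49, and a surge in Group B lets 196 calls per second reach a TeS that can process only 100.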
Another potential problem with the ETSI_NR process relates to the speed at which it can adapt to changes in the offered rate. Assume that all of the AGWs (e.g. the MSANs in FIG. 1) are equally weighted, and that a group ‘A’ contains m nodes whilst a group ‘B’ contains n nodes, so that each weight is set to wi = 1/(m+n). Assume again that the call processing capacity of the MGC is ‘C’. If the overload condition (i.e. with the AGWs of group A providing most of the load) persists for long enough, the GlobalLeakRate (G) of the MGC may settle at an adaptation point of approximately Ga = C*(m+n)/m. Each AGW that is offering calls to the MGC will then receive a leak rate of Ga*wi = C/m. The higher the ratio (m+n)/m >= 1, the higher the adaptation point Ga will be relative to the call processing capacity C. In the case of a severely focussed load, for example with almost all of the load being supplied by only 10% of the AGWs, the GlobalLeakRate could rise to ten times the call processing capacity. The very large potential range of GlobalLeakRate values will make it difficult for the MGC to adapt smoothly to changes in the load offered to it.
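The relationship between the adaptation point and the degree of load focussing can be checked with a few lines of Python. The function name and the sample values of C, m and n are hypothetical; the formula is the one given in the text for equally weighted gateways.

```python
def adaptation_point(capacity, m, n):
    """GlobalLeakRate at which m loaded gateways out of m + n together offer `capacity`."""
    return capacity * (m + n) / m


C = 100.0    # call processing capacity of the MGC, assumed value
for m, n in [(2, 2), (10, 90), (1, 99)]:
    ga = adaptation_point(C, m, n)
    per_gateway = ga / (m + n)    # Ga * wi with wi = 1/(m+n), which equals C/m
    print(f"m={m:3d} n={n:3d}  Ga={ga:8.1f}  Ga/C={ga / C:6.1f}  notrat={per_gateway:6.1f}")
```

The case m=10, n=90 corresponds to 10% of the gateways supplying the load, for which Ga is ten times C.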
In addition, the ETSI_NR proposal fails to tackle the problem of how to properly terminate the overload control process. Since call admission control is not performed on the MGC, the MGC does not know, when calculating the leak rate, whether the MSANs are still restricting traffic or the overload event has ceased. ETSI_NR suggests simply using a timer, ‘TerminationPendingTime’. The timer is started when the measured LoadLevel at the MGC falls below the GoalLoadLevel. If the measured LoadLevel does not go above the GoalLoadLevel while the timer is running, the control process will be switched off upon its expiry. However, a LoadLevel below the GoalLoadLevel does not necessarily mean that the overload condition has ceased, as it is possible that the control process is over-restricting the flow of call attempts through the MSANs. If the control process terminates whilst the leak rate is still adapting upwards and the overload condition remains, the MGC will soon be overloaded again. The control process will be switched back on with the GlobalLeakRate set to InitGlobalLeakRate, possibly leading to an on-off oscillation of the control process and under-utilization of the MGC.
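The timer-based termination rule, and the on-off behaviour it can produce, can be sketched as a small state machine. The class name, the `critical_load_level` parameter name, the numeric thresholds and the treatment of a LoadLevel exactly equal to the GoalLoadLevel are assumptions made for illustration.

```python
class TerminationSupervisor:
    """Tracks whether overload control is active, using a termination timer."""

    def __init__(self, critical_load_level, goal_load_level, termination_pending_time):
        self.critical_load_level = critical_load_level
        self.goal_load_level = goal_load_level
        self.termination_pending_time = termination_pending_time
        self.active = False
        self.timer_start = None  # time at which the termination timer was started

    def on_measurement(self, now, load_level):
        """Process one periodic LoadLevel measurement; return True while control is active."""
        if not self.active:
            if load_level >= self.critical_load_level:
                self.active = True       # overload detected: control switched on
                self.timer_start = None
            return self.active
        if load_level > self.goal_load_level:
            self.timer_start = None      # load above the goal again: cancel the timer
        elif self.timer_start is None:
            self.timer_start = now       # load at or below the goal: start the timer
        elif now - self.timer_start >= self.termination_pending_time:
            self.active = False          # timer expired: control switched off
            self.timer_start = None
        return self.active


supervisor = TerminationSupervisor(
    critical_load_level=0.9, goal_load_level=0.8, termination_pending_time=3
)
# The load stays low only because the MSANs are still restricting traffic;
# once control is switched off the overload returns immediately.
samples = [(0, 0.95), (1, 0.7), (2, 0.7), (3, 0.7), (4, 0.7), (5, 0.95)]
states = [supervisor.on_measurement(t, load) for t, load in samples]
print(states)
```

In this trace the control switches off at t=4 and back on at t=5, which is the oscillation described above.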
An alternative to ETSI_NR is proposed in the ETSI draft [ETSI ES 2XX XXX V<0.0.1> (2005-mm), Access Gateway—Media Gateway Controller Overload Control] and is referred to as ETSI_NB. ETSI_NB proposes that a percentage based restriction should be used to decrease the intensity of originating call attempts at each MSAN when the MGC is in overload. The LoadLevel supervision function works the same way as in the case of ETSI_NR, but instead of calculating GlobalLeakRate, it calculates a percentage restriction factor and sends this to the MSANs. A MSAN allows through to the MGC a proportion of the incoming call attempts in line with the percentage restriction factor received from the MGC.
The main disadvantage of the ETSI_NB control process is that a step change in the level of call attempts arriving at a MSAN, when the MSAN is already applying a percentage restriction factor, will still result in a step change in the call attempts forwarded to the MGC, possibly returning the MGC to overload. Indeed, ETSI_NR was introduced to overcome the dependence of the overload control process of ETSI_NB upon external changes.
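The difference between the two schemes under a step change in offered load can be shown with a simple rate model. The function names and numbers are hypothetical, and the restriction factor is assumed here to denote the percentage of call attempts that are blocked.

```python
def forwarded_percentage(offered_rate, restriction_percent):
    """ETSI_NB-style control: forward a fixed proportion of the offered attempts."""
    return offered_rate * (1.0 - restriction_percent / 100.0)


def forwarded_leak_rate(offered_rate, leak_rate):
    """ETSI_NR-style control: forward at most the assigned leak rate."""
    return min(offered_rate, leak_rate)


before, after = 100.0, 400.0    # offered calls/s at one MSAN, before and after a step

# Percentage restriction: the step is passed on to the MGC, scaled by the factor.
print(forwarded_percentage(before, 50), forwarded_percentage(after, 50))

# Leak-rate restriction: the forwarded rate stays capped at the assigned value.
print(forwarded_leak_rate(before, 50.0), forwarded_leak_rate(after, 50.0))
```

Under the percentage scheme a fourfold rise in offered load produces a fourfold rise in the load forwarded to the MGC, whereas the leak-rate scheme holds the forwarded rate at its cap.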