1. Field of the Invention
This invention relates to a method of controlling a communications network which includes at least one local exchange connected by routes to one or more parent trunk exchanges each being one of a plurality of trunk exchanges interconnected by trunk routes. It is particularly concerned with the detection of local exchange failures in such networks.
2. Related Art
Commonly a local exchange is associated with a home exchange, through which incoming calls to the local exchange are routed, and a security exchange, through which outward calls from the local exchange are routed, in normal operation. The security exchange is so called because it can also be used to route incoming calls to the local exchange if the home exchange fails. The security and home exchanges are collectively referred to as the parent exchanges of the local exchange. Exchanges other than the parent exchanges are referred to as far end trunk exchanges of the local exchange.
Near real-time network traffic management (NTM) is an essential component of network management if optimal traffic performance in terms of call throughput is to be ensured. To give an indication of the volume of traffic which may be involved, BT's trunk network in the United Kingdom currently handles approximately six million call attempts per hour during busy periods, which is equivalent to approximately 1,700 call attempts per second. Given such a volume of traffic it is essential that any network difficulties are detected and controlled as quickly as possible. For example, difficulties are often encountered by network traffic managers due to abnormal traffic patterns which can be caused by events such as phone-ins, tele-votes and public holidays (for example Christmas Day and New Year's Eve/Day). In all these cases traffic in the network varies widely from the normal level, sometimes quite spectacularly, and the network must be controlled to maintain the best overall network performance.
With the introduction of digital switches such as System X it is possible to monitor closely the performance of each exchange and the routes between them and to the subscribers. BT's Network Traffic Management System (NTMS) currently receives statistics on upwards of 37,000 routes from 490 exchanges in the UK every five minutes. This measurement period was chosen to be long enough to obtain a statistically reliable measurement of the network performance whilst being short enough to allow effective real-time control of the network.
The information received by the NTMS is processed to provide CCITT recommended parameters. For instance, these include the Percentage Overflow (OFL) and All Circuits Engaged (ACE) parameters. The parameter values are then compared with thresholds to determine if any difficulties exist on the monitored network elements.
Usually the first indication of a network problem is when an `exception` is displayed on a wall-board, or on a graphical interface at an individual manager's workstation, at a Traffic Management Centre. Exceptions are those parameter values, calculated from network element measurements, which deviate sufficiently from a predetermined threshold for that value. The exceptions are ranked in a priority order with the top 20 displayed. However, because the network traffic managers set the thresholds on a percentage basis and at a value which ensures that all potential difficulties are captured, an exception does not necessarily indicate a difficulty. This results in exceptions being displayed that are occasionally spurious or insignificant. The exceptions therefore need to be examined in more detail to determine if a real difficulty exists and whether it warrants any action. To help in this activity several information sources are currently used by the network traffic managers.
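By way of illustration only, the threshold comparison and ranking described above might be sketched as follows. The route names, the sample figures, the threshold value and the calculation of Percentage Overflow as overflowed attempts divided by offered attempts are assumptions made for this sketch and are not taken from the NTMS; only the limit of 20 displayed exceptions comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class RouteStats:
    route: str      # hypothetical route identifier
    bids: int       # call attempts offered to the route in the period
    overflows: int  # attempts that found no free circuit


def percentage_overflow(stats: RouteStats) -> float:
    """Assumed definition: overflowed attempts as a percentage of bids."""
    return 100.0 * stats.overflows / stats.bids if stats.bids else 0.0


def top_exceptions(all_stats, threshold_pct, limit=20):
    """Return (route, OFL) pairs above the threshold, worst first."""
    exceptions = [
        (s.route, percentage_overflow(s))
        for s in all_stats
        if percentage_overflow(s) > threshold_pct
    ]
    exceptions.sort(key=lambda item: item[1], reverse=True)
    return exceptions[:limit]


# Hypothetical five-minute measurements for three routes.
sample = [
    RouteStats("A-B", bids=1000, overflows=50),
    RouteStats("A-C", bids=200, overflows=120),
    RouteStats("B-C", bids=500, overflows=0),
]
ranked = top_exceptions(sample, threshold_pct=10.0)
```

With a 10 per cent threshold only route A-C is reported. Lowering the threshold to 1 per cent also reports A-B, which illustrates how a cautious threshold can produce exceptions of little significance.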
The NTMS provides near real-time surveillance and monitoring of the network's status and performance. It provides the network traffic managers with information to enable them to take prompt action to control the flow of traffic to ensure the maximum utilization of the network in all situations. The NTMS allows network traffic managers to look at the raw statistics as well as derived generic parameters and to compare traffic patterns over the last few measurement periods to isolate any trends.
An On-Line Traffic Information System (OTIS) takes the measurement statistics from the NTMS and processes them to provide summarised historical data for daily and weekly traffic patterns. This system allows the network traffic managers to examine historical traffic patterns to detect any radical shifts in traffic.
A data management system provides the network traffic managers with an up-to-date copy of the routing tables at all trunk exchanges. This information is used to check the routes to which calls can be routed, which controls are in force and the routing algorithms being used.
There is also a broadcast speaker facility which connects the world-wide network management centre to all the regional centres.
Once a potential difficulty has been detected, acknowledged and analyzed, it is characterised and a decision is made whether to control it using the available range of expansive and restrictive controls, which respectively allow alternative traffic paths through the network or restrict or block call attempts to particular areas. The situation must then be monitored to ensure the controls are having the desired effect and that they are removed as soon as the problem has been dealt with effectively.
One class of exception associated with telecommunications networks is the failure of a local exchange.
Although local exchange failures occur relatively frequently they rarely result in a problem that requires intervention from the network traffic managers. This is because of the unit's built-in self-correcting mechanisms.
For example, if a problem occurs at a System X exchange there are a number of stages it will go through to try and recover. These are:
a) Process Rollback--this is a software routine and service is not affected. A Rollback shows on the NTMS as an exchange alarm;
b) Restart--the exchange automatically restarts and service is affected;
c) System Initialisation--the software is initialised from a backing store; and
d) Manual Reload--part or all of the system is reloaded manually.
When a unit is in trouble it will first try four or five Rollbacks and only if these are unsuccessful in curing the problem will it perform an automatic Restart. If a Restart occurs this can be detected from NTMS statistics.
Normally a Restart is sufficient to return the unit to a fully working condition. However, sometimes when the unit returns it still does not perform correctly so it needs to be monitored to ensure that it is handling calls correctly. The last two stages, c) and d), occur only rarely when a Restart fails.
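By way of illustration only, the escalation through stages a) to d) might be sketched as follows. The function and parameter names are hypothetical, the limit of five Rollbacks is an assumption taken from the "four or five" mentioned above, and the `attempt` callback merely stands in for whatever the exchange actually does at each stage.

```python
from enum import Enum


class Stage(Enum):
    ROLLBACK = "Process Rollback"
    RESTART = "Restart"
    INITIALISATION = "System Initialisation"
    MANUAL_RELOAD = "Manual Reload"


MAX_ROLLBACKS = 5  # assumed; the description says "four or five"


def recover(attempt, max_rollbacks=MAX_ROLLBACKS):
    """Escalate through the stages until attempt(stage) reports a cure.

    Returns the list of stages tried, in order.
    """
    tried = []
    # Rollbacks are repeated first because they do not affect service.
    for _ in range(max_rollbacks):
        tried.append(Stage.ROLLBACK)
        if attempt(Stage.ROLLBACK):
            return tried
    # Only then are the service-affecting stages tried, once each.
    for stage in (Stage.RESTART, Stage.INITIALISATION, Stage.MANUAL_RELOAD):
        tried.append(stage)
        if attempt(stage):
            return tried
    return tried


# Hypothetical fault that only a Restart cures.
history = recover(lambda stage: stage is Stage.RESTART)
```

In this example `history` holds five Rollbacks followed by a single Restart, corresponding to the normal case in which stages c) and d) are never reached.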
In the majority of cases no action is therefore necessary. When action is necessary, a control such as route gapping may be used, but it is present practice to apply route gapping only if the exchange is likely to be isolated for another five minutes and the calling levels are high.
Controls available might comprise not only route gapping but also other forms of call gapping, and code blocking. Route gapping however affects all calls down a particular route. Call gapping and code blocking can be applied to be more destination specific.
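By way of illustration only, the difference in scope between route gapping and destination-specific call gapping might be sketched as follows. The sketch assumes one common form of gapping in which at most one controlled call is admitted per gap interval. That form, the four-digit destination code, the five-second interval and all names are assumptions made for this example and are not taken from any particular exchange.

```python
class GapControl:
    """Admit at most one controlled call per gap interval (assumed form)."""

    def __init__(self, interval_s, key_of, controlled):
        self.interval_s = interval_s
        self.key_of = key_of          # extracts the route or code from a call
        self.controlled = controlled  # keys to which the gap applies
        self._next_allowed = {}

    def admit(self, call):
        key = self.key_of(call)
        if key not in self.controlled:
            return True  # calls outside the control are unaffected
        if call["t"] >= self._next_allowed.get(key, float("-inf")):
            self._next_allowed[key] = call["t"] + self.interval_s
            return True
        return False


# Hypothetical calls on one route, "t" being the arrival time in seconds.
calls = [
    {"route": "R1", "dialled": "01215550001", "t": 0.0},
    {"route": "R1", "dialled": "01315550002", "t": 1.0},
    {"route": "R1", "dialled": "01215550003", "t": 2.0},
    {"route": "R1", "dialled": "01215550004", "t": 6.0},
]

# Route gapping: every call down route R1 is subject to the gap.
route_gap = GapControl(5.0, lambda c: c["route"], {"R1"})
route_result = [route_gap.admit(c) for c in calls]

# Call gapping: only calls to the assumed destination code 0121 are gapped.
code_gap = GapControl(5.0, lambda c: c["dialled"][:4], {"0121"})
code_result = [code_gap.admit(c) for c in calls]
```

Here the route gap rejects the second and third calls regardless of destination, whereas the destination-specific gap rejects only the third call and lets the call to the uncontrolled code 0131 through.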
When an exchange is in difficulty the first function it stops is the production of performance statistics. (In System X exchanges these are known as MSS statistics, from the Management Statistics Subsystem.) In some cases this means the statistics from the affected exchange are all zero even when it is in fact handling calls correctly. In such cases it is therefore necessary to monitor the network to determine local exchange failures other than by looking at the parameters issued by the local exchanges. To do this successfully it is necessary to monitor selected parameters which change their value in a manner distinctive of such a local exchange failure.
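By way of illustration only, the ambiguity of all-zero statistics might be sketched as follows. The exchange names and counter names are hypothetical. The sketch does no more than list the exchanges whose own reports cannot be relied upon, and does not itself detect a failure.

```python
def all_zero(counters):
    """True when an exchange reported counters and every one is zero."""
    return bool(counters) and all(value == 0 for value in counters.values())


def exchanges_needing_corroboration(reports):
    """List exchanges whose own statistics are uninformative.

    An all-zero report may mean either that the exchange has failed or
    that it has only stopped producing statistics, so these exchanges
    have to be assessed from measurements taken elsewhere in the network.
    """
    return sorted(name for name, counters in reports.items() if all_zero(counters))


# Hypothetical reports for one five-minute measurement period.
reports = {
    "LX1": {"bids": 0, "answers": 0},
    "LX2": {"bids": 420, "answers": 390},
}
suspects = exchanges_needing_corroboration(reports)
```

In this example only LX1 is listed, and the listing alone does not say whether LX1 is still handling calls.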