There are various scenarios in which the end-to-end performance of services delivered across a network may be important, and various ways in which good or poor end-to-end performance may manifest itself and be monitored or perceived by end-users. Various factors may affect end-to-end performance, including the performance of an end-user's own local network or a computing device therein, but unsatisfactory end-to-end performance in respect of a service is often caused by links on the route that carries the service across a network, the choice of which is generally not under the control of the end-user.
If the links making up a route currently being used for a service cannot sustain the service-level agreement (SLA) of the service, end-users of the service are likely to perceive that they have experienced poor performance. Poor performance can manifest itself, for example, as drop-outs in a video service, as poor voice-quality in a Voice-over-Internet Protocol (VoIP) call, as slow response behaviour in an application running on a remote server, etc. Measuring the SLA parameters (e.g. delay, jitter, packet loss, etc.) of a service may not be sufficient to get a good idea about the end-to-end performance. For example, video content with rapid scene changes generally requires a higher bit-rate to achieve the same quality of perception as video content with little movement. Small fluctuations in Quality of Service (QoS) may therefore not be noticeable to end-users in terms of Quality of Experience (QoE), meaning that a service could still continue on a route that occasionally shows degraded performance. In some scenarios, SLA parameters may be too lenient for good QoE, possibly having been selected mainly out of cost considerations. Real-time collection of per-link performance metrics from the network might be a challenge. In other scenarios, many services may be transmitted over one link, the theoretical link capacity of which should be able to sustain all of the services, but fluctuating data rates may result in them competing with each other for capacity at certain times, leading to poor end-to-end performance for some or all of them.
One or more links on a route may thus be insufficient for delivering good end-to-end performance of a service. In some scenarios, a network operator may have access to network performance metrics per link in real-time, and may use these, but such real-time “per link” network performance metrics are not always available, and even if they are, they might be spurious or poorly understood.
If routing is done according to standard protocols based simply on minimising hop-count, it may not be possible to pick links that can guarantee an SLA.
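By way of illustration only, the following minimal sketch shows how a route chosen purely by hop-count may exceed a delay budget that a longer route would meet. The topology, the per-link delays and the function names `fewest_hops_path` and `path_delay_ms` are hypothetical and are not taken from any of the documents discussed herein.

```python
from collections import deque


def fewest_hops_path(graph, src, dst):
    """Breadth-first search over {node: {neighbour: delay_ms}}.

    Only the number of hops is minimised; link delays are ignored.
    """
    parents = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk back through the parent pointers to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph.get(node, {}):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


def path_delay_ms(graph, path):
    """Sum the (assumed) per-link delays along a path."""
    return sum(graph[u][v] for u, v in zip(path, path[1:]))


# Hypothetical topology: the two-hop route is slow, the three-hop route is fast.
graph = {
    "A": {"B": 120.0, "C": 20.0},
    "B": {"D": 90.0},
    "C": {"E": 20.0},
    "E": {"D": 20.0},
    "D": {},
}

chosen = fewest_hops_path(graph, "A", "D")
print(chosen, path_delay_ms(graph, chosen))
print(["A", "C", "E", "D"], path_delay_ms(graph, ["A", "C", "E", "D"]))
```

With an assumed delay budget of, say, 100 ms, the hop-count route A-B-D would breach the budget while the longer route A-C-E-D would satisfy it.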
A paper entitled “QoE Content Distribution Network for Cloud Architecture” by Hai Anh Tran, Abdelhamid Mellouk and Said Hoceini (First International Symposium on Network Cloud Computing and Applications (NCCA), November 2011) relates to “cloud” services and their increasing use in the provision of network services. Due to the high bandwidth requirements of cloud services, use may be made of Content Distribution Networks (CDNs), which may support high request volume and improve network quality using a mechanism based on replication of information among multiple servers. The paper proposes a Content Distribution Network Cloud Architecture, which is based not just on Quality of Service criteria (such as round trip time, network hops, loss rate, etc.) but also on the Quality of Experience that represents end-users' perception and satisfaction. It describes how QoE scores may be used in combination with QoS parameters to compute a link score or link cost that can be used in a routing function, describing how QoE values can be sent back along the route data packets have travelled and how link scores may be updated via a known method called “Q-Learning”.
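The general idea of feeding a QoE value back along a route and updating link scores with a Q-Learning rule may be sketched as follows. This is a generic textbook Q-Learning update and is not the formulation used in the cited paper; the reward blend, the weights, the learning rate `alpha`, the discount factor `gamma` and the function names are all assumptions made for illustration.

```python
from collections import defaultdict


def link_reward(qoe_mos, delay_ms, w_qoe=0.7, max_delay_ms=200.0):
    """Blend a QoE score (assumed MOS scale, 1 to 5) with a QoS delay.

    Both terms are normalised to [0, 1]; the weighting is hypothetical.
    """
    qoe_term = (qoe_mos - 1.0) / 4.0
    qos_term = max(0.0, 1.0 - delay_ms / max_delay_ms)
    return w_qoe * qoe_term + (1.0 - w_qoe) * qos_term


def update_link_scores(q, route, reward, alpha=0.5, gamma=0.9):
    """Apply Q(u,v) += alpha * (r + gamma * max_w Q(v,w) - Q(u,v)).

    The feedback is propagated from the destination back towards the
    source, mirroring a QoE value being sent back along the route.
    """
    hops = list(zip(route, route[1:]))
    for u, v in reversed(hops):
        # Best known score over the links leaving the next node, if any.
        onward = [score for (src, _), score in q.items() if src == v]
        best_next = max(onward, default=0.0)
        q[(u, v)] += alpha * (reward + gamma * best_next - q[(u, v)])
    return q


scores = defaultdict(float)  # link (u, v) -> learned score
update_link_scores(scores, ["A", "B", "C"], reward=link_reward(4.2, 60.0))
print(dict(scores))
```

A routing function could then prefer, at each node, the outgoing link with the highest learned score, although how scores are actually used for routing in the cited paper is not reproduced here.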
Past techniques for predicting link failures and link QoS degradation have generally required real-time link performance metrics, such as those from a Management Information Base (MIB) of routers, to determine these weak links.
International application WO 2012/085498 relates to communications network management, and in particular to a communications network which is divided into a plurality of segments, each segment comprising one or more routers and one or more communications links that connect the routers. QoS thresholds can be defined for each of the segments, and if it is predicted that one of these thresholds is to be breached in one of the segments, for example due to a communications link or a router being overloaded, then a segment management module associated with that segment can re-route the traffic.
International application WO 2012/085519 also relates to communications network management and to a communications network which is divided into a plurality of segments, each segment comprising one or more routers and one or more communications links that connect the routers. In this, each segment also comprises a segment management module. Each of the segment management modules reports to a supervisory management module (of which the network may have more than one). If a segment management module predicts that a QoS threshold will be breached, it may re-route a data flow within that segment. If such a re-route is not possible, a request may be sent to the appropriate supervisory management module to initiate a re-routing to a further segment.
International application WO 2011/117570498 relates to a technique for network routing adaptation based on failure prediction, and in particular to a system that predicts network events and triggers a pre-emptive response. It aims to predict network link failures and to create a change in the network before the failure actually happens by instigating policy-based adjustment of routing parameters. An example implementation operates in two phases. In the first phase, the historical operation of a network is observed, to determine observed relationships between link or cluster failures that have occurred and subsequent failures of different links or clusters. From these observed relationships, failure rules can be derived that are then applied to control routing in the network during a second, control phase. In the control phase, the derived failure rules are applied such that if a link or cluster failure occurs, prior knowledge of what additional links may fail in the next time period is obtained from the rules, and remedial action can be taken such as routing data traffic away from the links that are predicted to fail.
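A two-phase approach of this general kind may be sketched as follows. The sketch is illustrative only and does not reproduce the method of the cited application: the event format, the time window, the confidence threshold and the function names `derive_failure_rules` and `links_to_avoid` are assumptions.

```python
from collections import Counter


def derive_failure_rules(events, window=60.0, min_confidence=0.5):
    """Observation phase: learn 'if A fails, B tends to fail soon after'.

    events is a list of (timestamp, link_id) failure records. A rule
    A -> B is kept when B failed within `window` time units of a failure
    of A in at least `min_confidence` of A's observed failures.
    """
    events = sorted(events)
    failures = Counter(link for _, link in events)
    followed_by = Counter()
    for i, (t_a, a) in enumerate(events):
        seen = set()  # count each follower once per failure of A
        for t_b, b in events[i + 1:]:
            if t_b - t_a > window:
                break
            if b != a and b not in seen:
                seen.add(b)
                followed_by[(a, b)] += 1
    rules = {}
    for (a, b), count in followed_by.items():
        confidence = count / failures[a]
        if confidence >= min_confidence:
            rules.setdefault(a, {})[b] = confidence
    return rules


def links_to_avoid(rules, failed_link):
    """Control phase: links predicted to fail after `failed_link` fails."""
    return set(rules.get(failed_link, {}))


history = [(0, "L1"), (10, "L2"), (100, "L1"), (120, "L2"),
           (300, "L1"), (500, "L3")]
rules = derive_failure_rules(history)
print(links_to_avoid(rules, "L1"))
```

In this hypothetical history, link L2 fails shortly after two of the three failures of L1, so a subsequent failure of L1 would mark L2 as a link to route traffic away from.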
In relation to techniques such as the above that predict link failures and link QoS degradation, such that traffic can be re-routed around underperforming links or such that session admission decisions can be made, it has generally been assumed that MIB parameters are available in real-time to the decision-making unit. Such parameters may not be available, however, or may be difficult to obtain and/or keep “concurrent” (i.e. updated appropriately by all instances). For instance, reported metrics can be out-of-date and/or not synchronised with reports from neighbouring routers (due to possible randomness in report generation as well as lags in polling or delays/errors incurred in the network while transmitting such traps to the decision-making unit). Collecting such parameters might also generate management traffic loads from all intermediate routers that the operator might find undesirable.
Referring to other citations, United States patent application US2008/080376 (“Adhikari”) relates to techniques for determining locations or other causes associated with performance problems or other conditions in a network, which may be used in network monitoring and analysis systems for the monitoring and analysis of Voice over Internet Protocol (VoIP) communications, multimedia communications or other types of network traffic.
US2006/274760 (“Loher”) relates to techniques for monitoring packet quality over an IP-based network by identifying sets of nodes and deriving the existence of the links between these nodes. The combination of these nodes and links logically makes up network paths. Quality measurements performed across a network path can then be attributed to links, nodes, routes, networks, and other components of a communication network.
Japanese patent application JP2007221424 (“NEC”) relates to techniques for measuring communication quality in which end-to-end performances are measured on a number of routes and the minimum value of these is attributed to a link shared by the paths.
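The attribution of a minimum end-to-end value to a shared link may be illustrated with the following minimal sketch. It is not taken from the cited application: the data layout, the choice of delay as the measured quantity and the function name `attribute_min_to_shared_links` are assumptions made for illustration.

```python
def attribute_min_to_shared_links(route_measurements):
    """Attribute to each shared link the minimum end-to-end value
    measured over the routes that traverse it.

    route_measurements is a list of (links_on_route, measured_value).
    Only links appearing on at least two routes are reported.
    """
    minimum = {}
    route_count = {}
    for links, value in route_measurements:
        for link in set(links):
            route_count[link] = route_count.get(link, 0) + 1
            minimum[link] = min(value, minimum.get(link, value))
    return {link: minimum[link]
            for link, count in route_count.items() if count >= 2}


# Hypothetical end-to-end delay measurements (ms) on three routes.
measurements = [
    (["a", "s", "b"], 40.0),
    (["c", "s", "d"], 25.0),
    (["e", "s", "b"], 30.0),
]
print(attribute_min_to_shared_links(measurements))
```

If the measured quantity is an additive one such as delay, the minimum over the routes sharing a link is an upper bound on that link's own contribution, which is one reason such an attribution may be of use.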