VoIP Services
VoIP (“Voice over IP”, with IP denoting the Internet Protocol) networks are packet-switched phone networks. In contrast to their circuit-switched predecessors (e.g. the PSTN), the control plane (signaling information: who calls whom) may use a different path through the network than the media plane (media: the call content). The media plane is sometimes also referred to as the user plane. VoIP services can be considered to consist of a signaling plane and a media plane. On the signaling plane, various protocols describe the communication session (call) flow in terms of the involved parties, intermediary VoIP entities (e.g. VoIP proxies, routers) and the characteristics of the VoIP service features. The media plane typically carries the media information (i.e. audio and/or video data) between the involved parties. Neither the media plane nor the signaling plane alone is sufficient to implement and provide a VoIP service. On the signaling plane, protocols like SIP (see IETF RFC 3261, “SIP: Session Initiation Protocol”, available at http://wwwdotietfdotorg) or ITU-T recommendation H.323 (see H.323, “Packet-based multimedia communications systems”, Edition 7, 2009, available at http://wwwdotitudotint) are commonly used, whereas protocols like RTP (Real-time Transport Protocol, see IETF RFC 3550, “RTP: A Transport Protocol for Real-Time Applications”, available at http://www.ietf.org), MSRP (see IETF RFC 4975, “The Message Session Relay Protocol (MSRP)”, available at http://www.ietf.org) or ITU-T recommendation T.38 (see T.38, “Procedures for real-time Group 3 facsimile communication over IP networks”, Edition 5 (2007) or Edition 6 (2010), available at http://wwwdotitudotint) may be present on the media plane.
In contrast to the traditional PSTN (Public Switched Telephone Network), the two planes may run on different infrastructure, use different protocols and even take different routes through a network.
Quality Monitoring of Communication Sessions in Packet-Switched Networks
Network operators offering communication sessions, e.g. VoIP calls, need to monitor the service they offer. Typically this is done on the control plane, as monitoring the media plane is generally considered more complex. For both monitoring methods a multitude of key performance indicators (KPIs) is available. The answer seizure ratio (ASR), for instance, is one of the most basic control plane KPIs. By comparing the number of call attempts with the number of successful calls, it gives an indication of network performance and availability.
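As an illustration of the ASR comparison, the following minimal sketch expresses the ratio as the percentage of call attempts that were answered. The function name and the sample counts are hypothetical and chosen only for this example.

```python
def answer_seizure_ratio(attempts: int, answered: int) -> float:
    """Return the ASR as the percentage of call attempts that were answered."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    if not 0 <= answered <= attempts:
        raise ValueError("answered must lie between 0 and attempts")
    return 100.0 * answered / attempts


# Hypothetical counters collected on the control plane for one interval.
asr = answer_seizure_ratio(attempts=2000, answered=1240)
print(f"ASR: {asr:.1f}%")
```

A falling ASR indicates that fewer call attempts result in an answered call, but it says nothing about the quality of the calls that were set up.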
User plane KPIs describe aspects of the conversational quality of a call. For example, a KPI may reflect the IP transport characteristics of the user plane by means of determining a mean opinion score (MOS) value or the rate of standard conformance violations determined by performing media payload in-depth analysis. Analysis of the actual media payload is computationally expensive and may be considered a breach of the confidentiality of the conversation.
To deduce media quality-related information from the control plane (i.e. without actually looking at the user plane), a simple assumption is made: if the call quality is unacceptable, the parties will hang up the call. This call hangup can be monitored on the control plane. However, the reason for the call termination cannot be determined on the control plane, since one of the involved parties simply hung up the call, which is also what happens in calls with acceptable media quality. Providers may use the average call duration (ACD) KPI, also known as mean holding time (MHT) or average length of call (ALOC). If the ACD KPI value drops below a certain threshold, the service provider may investigate the reason. However, the average call duration is not solely dependent on media quality. Other influencing factors include time of day, call destination or control plane issues.
Pure control plane solutions fail to identify unacceptable media quality as the reason for call termination, as there is no means to signal quality-related hangup causes. Features such as the Q.850 Reason code (see IETF RFC 3326, “The Reason Header Field for the Session Initiation Protocol (SIP)”, available at http://wwwdotietfdotorg) are only available during the call setup. The network is not able to distinguish between a call where a party hangs up because the conversation is over and a call in which a party hangs up because of bad quality.
Some operators try to compensate for this by simply looking at the average duration of the calls (ACD). If the average call duration decreases below an artificial threshold, the operator may start troubleshooting in order to locate problematic calls and fix potential network issues. The problem with the ACD calculation is that it requires numerous calls over a longer period of time, for instance to calculate the average call duration of calls within a 15 minute interval. This means that any deviation may only be visible after 15 minutes. Legitimate calls which just happen to be very short have a negative impact on the ACD calculation and may trigger a false alarm. Announcements with less than 10 seconds of call hold time are a good example of this category of calls.
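The false-alarm effect can be sketched as follows. The example groups calls into 15 minute intervals, computes the ACD per interval and flags intervals below a threshold. The data layout, the function names and the 120 second threshold are assumptions made for illustration only.

```python
from collections import defaultdict

INTERVAL_S = 15 * 60  # 15 minute aggregation interval, as in the text


def acd_per_interval(calls, interval_s=INTERVAL_S):
    """Group (end_time_s, duration_s) records by interval and average them."""
    buckets = defaultdict(list)
    for end_time_s, duration_s in calls:
        buckets[int(end_time_s // interval_s)].append(duration_s)
    return {idx: sum(d) / len(d) for idx, d in sorted(buckets.items())}


def intervals_below(acd_by_interval, threshold_s):
    """Return the indices of intervals whose ACD is below the threshold."""
    return [idx for idx, acd in acd_by_interval.items() if acd < threshold_s]


# Hypothetical call records: (end time in seconds, duration in seconds).
calls = [
    (100, 180), (400, 200), (800, 160),              # interval 0: ordinary calls
    (1000, 170), (1200, 8), (1500, 6), (1700, 190),  # interval 1: two short announcements
]
acd = acd_per_interval(calls)
alarms = intervals_below(acd, threshold_s=120)
print(acd)     # {0: 180.0, 1: 93.5}
print(alarms)  # [1]
```

In this sketch the two short announcement calls pull the ACD of the second interval below the threshold, although none of the calls suffered from bad media quality. The result for an interval is also only available once that interval has ended.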
More sophisticated solutions may take the user plane into account. A MOS value may be delivered by the endpoint through an RTCP report. A passive midpoint monitoring solution may determine a MOS value for the media streams of a given call. Active monitoring solutions which conduct artificial calls may also provide information on quality problems of the media plane, though such information need not be valid for individual real calls made by actual service users.
MOS values provided by either endpoints or passive midpoint solutions have to fulfill certain conditions. MOS values provided by endpoints may not be trustworthy, as end customers have an incentive to forge reports and artificially report bad voice quality so that they do not have to pay for their calls. Moreover, MOS values provided by both endpoints and passive midpoint solutions need to have a high granularity in order to be useful for this application. An average MOS value over several minutes would not show a significant impact if only the last few seconds have been impaired. Even a three minute call with a MOS score of 4.41 would only degrade to a MOS score of 4.22 if the last 10 seconds were inaudible (MOS score of 1 assumed). An average MOS score of 4.22 would still be considered very good and would fall in the highest MOS class specified by Recommendation ITU-T G.107, “The E-model: a computational model for use in transmission planning”, April 2009 (available at http://wwwdotitudotint).
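The arithmetic behind this example can be reproduced with a time-weighted average, assuming the three minute call consists of 170 seconds at a MOS of 4.41 followed by 10 seconds at a MOS of 1. The function name and the segment representation are hypothetical.

```python
def time_weighted_mos(segments):
    """Average MOS over (duration_s, mos) segments, weighted by duration."""
    total_s = sum(duration_s for duration_s, _ in segments)
    if total_s <= 0:
        raise ValueError("total duration must be positive")
    return sum(duration_s * mos for duration_s, mos in segments) / total_s


# Three minute call: 170 s of good audio, last 10 s inaudible (MOS 1 assumed).
call = [(170, 4.41), (10, 1.0)]
average = time_weighted_mos(call)
print(round(average, 2))  # 4.22
```

The per-call average hides the impaired final seconds almost completely, which is why a finer granularity, for example one value per short time slice, is needed for this application.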
Active Vs. Passive Monitoring of Communication Sessions in Packet-Switched Networks
VoIP monitoring solutions available in the market fall into two main categories: active monitoring systems and passive monitoring systems. These fundamentally different monitoring approaches have their respective applications, but are not interchangeable.
Active monitoring solutions basically work with a sender and a receiver device, with the network infrastructure to be tested in between. That makes active testing much easier, since the information transmitted by the sender device is known. The receiving device can easily compare the received signal with the known sender signal and analyze any differences. Any impairments observed must have been introduced by the network, since the sender and receiver devices can be assumed to be ideal, standards-compliant and free of errors.
Active testing is usually a good choice when pretesting of network infrastructure has to be done, i.e. when no other active media stream transmitters are deployed in the network yet. Another application for active testing is to perform basic availability testing as well as one-way delay measurements. Active testing can be used to automatically detect the availability of specific segments of a complex network infrastructure, or to test the availability and response times of major infrastructure equipment dedicated to media transmission.
Since network impairments are of a transient nature and because active testing does not consider the end user equipment used to transmit and receive media streams, active testing is not an option when it comes to 24/7 monitoring of VoIP services at call service providers. This is where passive monitoring solutions play a major role.
Passive monitoring of media streams, both to assess their quality and to determine whether problems originate on the sender or the network part of the transmission path, is different, as neither the sending nor the receiving endpoints are under the control of the monitoring system. All quality evaluation performed by passive solutions must be derived from the packets passing the tapping point to which the monitoring probe is attached. A passive monitoring system is decoupled from the network under test and only receives copies of packets. Passive systems do not actively take part in the traffic carried over the tested network. No additional traffic is introduced, which provides an undistorted view of the transmission performance of the monitored network infrastructure.
Drawbacks of passive monitoring solutions are that they have no information about the sender or receiver of the media streams or about their error correction capabilities. Also, quality information is not collected end-to-end but is provided as if the receiver were located at the tapping point to which the passive monitoring probe is attached. The last point can, however, also become an advantage if multiple passive probes are installed along the transmission path, because in that case more fine-grained information becomes available and it becomes easier to locate the network segments introducing impairments.
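How several probes can narrow down an impaired segment may be sketched as follows. The sketch assumes that each probe reports one MOS value for the same media stream and that the probes are listed in path order. The probe names, the MOS readings and the 0.5 drop threshold are hypothetical.

```python
def locate_impaired_segments(probe_readings, min_drop=0.5):
    """Return (upstream, downstream, drop) for adjacent probes with a MOS drop.

    probe_readings is a list of (probe_name, mos) ordered along the
    transmission path from sender to receiver.
    """
    suspects = []
    for (up_name, up_mos), (down_name, down_mos) in zip(
        probe_readings, probe_readings[1:]
    ):
        drop = up_mos - down_mos
        if drop >= min_drop:
            suspects.append((up_name, down_name, drop))
    return suspects


# Hypothetical MOS values seen at four tapping points for one media stream.
readings = [("probe-A", 4.3), ("probe-B", 4.3), ("probe-C", 3.1), ("probe-D", 3.0)]
suspects = locate_impaired_segments(readings)
for upstream, downstream, drop in suspects:
    print(f"MOS drops by {drop:.1f} between {upstream} and {downstream}")
```

With a single probe only the quality at that tapping point would be known. With the four probes of this sketch the degradation can be attributed to the segment between probe-B and probe-C.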