The present invention relates generally to communication systems such as telephone systems and, more particularly, to performance measurements, as for example voice clarity measurements, in such systems, and even more particularly to measurement synchronization between two stations in such systems, and specifically in voice over packet systems.
Telephone companies have expended great efforts over many years to improve the quality of the voice communications that they have traditionally carried. Telephone systems operated by these companies are often referred to as public switched telephone networks (PSTNs). While not perfect, voice quality in modern telephone systems has been improved by optimizing various system components for the dynamic range of the human voice and the rhythms of human conversation to the point that they can provide high quality service. High quality voice traffic does not require a large bandwidth but does require timely transmission.
Unlike PSTNs, however, networks which transmit data in discrete packets, such as those that use the Internet Protocol (IP), were developed to support non-real-time applications, such as file transfers and e-mail. These applications feature communication traffic that is bursty and typically requires much higher bandwidths than voice traffic does but is not as sensitive to delays and delay variations as voice traffic is. In addition, such network applications can compensate for packet loss by re-transmitting any lost packets, and the reception of data packets out of order does not present significant problems in data reconstruction.
Recent developments in communication systems have resulted in combining the traffic historically carried separately by telephone and data networks. The service provided by such systems is referred to as Voice over Packet (VoP). The more popular VoP systems utilize the Internet Protocol (IP) and are commonly referred to as Voice over IP (VoIP) systems. VoP technologies have made maintaining voice quality at high levels more complex by compressing the voice signal and transmitting it in discrete packets. With voice traffic there is the need for timely packet delivery, often in networks that were not originally designed for these conditions. Transmission conditions that pose little threat to non-real-time data traffic can introduce severe problems to real-time packetized voice traffic. These conditions include real-time message delivery, gateway processes, packet loss, packet delay, and the utilization of nonlinear codecs.
Newer PSTN networks use digital-voice transmission for greater efficiency in their backbones. Digitizing analog voice signals often affects voice clarity. The VoIP gateway interconnects the PSTN with the IP network using voice and signaling schemes.
Voice quality as perceived by the user is subjective, but typically his perception of quality includes three key parameters: (1) signal clarity, (2) transmission delays, and (3) signal echoes. While the impact on the user is subjective in nature, objective measurement techniques for each of these parameters have been developed. The clarity of a voice signal is generally described by how accurately the received signal reproduces that which was sent. Signal fidelity, lack of distortion, and intelligibility are key elements in the description of its clarity. Delay is the time that it takes to transmit a voice signal from the speaker to the listener. Echo is the sound of the speaker's voice that he hears returning to him. Delay and echo can be annoyances and distractions to the user. Any delays in transmission and any echoes should be imperceptible to him. A lack of clarity can also degrade the ability of the user to obtain information from the interchange and heighten the level of his frustration.
Packet loss is not uncommon in IP networks. As the network, or even some of its links, becomes congested, router buffers fill and start to drop packets. Another cause of packet loss is route changes due to inoperative network links. An effect similar to packet loss occurs when a packet experiences a large delay in the network and arrives too late for use in reconstructing the voice signal. In the case of real-time voice information, packets must arrive within a relatively narrow time window to be useful in reconstructing the voice signal. Re-transmissions in the case of voice may add extensive delay to the reconstruction and cause clipping or unintelligible speech.
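The effect of the arrival window can be illustrated with a minimal sketch. The 20 ms packet interval, the 60 ms playout delay, and the function name usable_packets are assumptions chosen for illustration only and are not taken from any particular system. A packet that is lost, or that arrives after its playout deadline, is equally unavailable for reconstructing the voice signal.

```python
def usable_packets(arrivals_ms, send_interval_ms=20, playout_delay_ms=60):
    """Return the sequence numbers of packets that arrive in time for playout.

    arrivals_ms[i] is the arrival time of packet i in milliseconds, or None
    if the packet was lost. Packet i is assumed to be sent at
    i * send_interval_ms and played out playout_delay_ms after it was sent.
    """
    usable = []
    for seq, arrival in enumerate(arrivals_ms):
        deadline = seq * send_interval_ms + playout_delay_ms
        # A lost packet and a late packet have the same effect on the listener.
        if arrival is not None and arrival <= deadline:
            usable.append(seq)
    return usable


# Packet 2 is lost in the network; packet 3 arrives after its 120 ms deadline.
arrivals = [30, 55, None, 150, 110]
print(usable_packets(arrivals))  # [0, 1, 4]
```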
Voice transmissions in a VoP system are coded and decoded via a codec. A speech codec is a device which transforms analog voice into digital bit streams and vice versa. The term codec is a shortened form of coder/decoder. Some speech codecs also use compression techniques which remove less important parts of the signal in order to reduce the bandwidth required for the transmission. In other words, many codecs compress voice signals by preserving only those parts of the voice signal that are perceptually important.
The signal can experience delays from the time it takes for the system or network to digitize, form data packets, transmit, route, and buffer a voice signal. These delays can interfere with normal conversations.
Since users have become accustomed to PSTN levels of voice quality and compare the voice quality of other services to that typically obtained from a PSTN, for VoP services to be acceptable they must maintain this level of quality. Voice quality is now an important differentiating factor for VoP (voice-over-packet) networks and equipment. Consequently, measuring voice quality in a relatively inexpensive, reliable, and objective way has become very important.
One industry standard, objective method for measuring clarity in VoP networks is the perceptual speech-quality measurement (PSQM). PSQM evaluates the quality of voice signals in the same way that codecs encode and decode voice signals. PSQM evaluates whether a voice signal is distorted enough for a human to find it annoying and distracting. It compares a clean voice sample with a distorted version using a complex weighting method that takes into account perceptually important elements, such as the physiology of the human ear and cognitive factors related to what human listeners are likely to notice. PSQM uses an algorithm to provide a relative score that indicates just how different the distorted signal is from the original from the human listener's perspective. This distortion score corresponds closely to how a statistically large number of human listeners would react in the same test situation.
Another important method for measuring perceived clarity is the PAMS (perceptual analysis-measurement system). PAMS uses a perceptual model similar to that of PSQM and provides a repeatable, objective means of measuring perceived voice quality. PAMS uses a different but effective signal-processing model and produces different types of scores.
One difficulty in performing either the PSQM or PAMS test is the synchronization of the original and received messages. Typically the user must press the start button on the receiving station prior to pressing the start button on the transmitting station. The receiving station then must record for a period of time longer than that of the message that was sent. During the analysis phase which follows the recording phase, the recorded message is compared to the original message. To obtain a meaningful measurement these two signals must be correlated in time, i.e., the recorded file must be scanned to locate the PSQM/PAMS signal. This correlation can be very expensive in terms of computational resources consumed. In addition, the requirement of activating the recording by the receiving station prior to that of the transmitting station makes automatic measurements difficult.
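The cost of scanning the recorded file can be seen in a minimal sketch of an exhaustive time-domain search. The function name locate, the sample counts, and the silent padding around the signal are illustrative assumptions; an actual tester may use a different search. Every candidate offset is scored against the whole reference, so the work grows with the product of the recording length and the reference length.

```python
import random


def locate(reference, recording):
    """Return the offset in `recording` at which `reference` matches best.

    Every candidate offset is scored with a full dot product, so the cost is
    roughly len(recording) * len(reference) multiplications.
    """
    best_offset, best_score = 0, float("-inf")
    for offset in range(len(recording) - len(reference) + 1):
        score = sum(r * recording[offset + i] for i, r in enumerate(reference))
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset


# Illustrative data: a 200-sample reference buried in a longer recording that
# was started early and stopped late.
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(200)]
recording = [0.0] * 350 + reference + [0.0] * 450
print(locate(reference, recording))  # 350
```

The longer the recording has to run to be sure of capturing the signal, the more offsets must be examined, which is why loosely synchronized stations make the analysis phase slow.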
The disadvantages of this synchronization scheme include (1) it puts the burden on the user to synchronize the test, (2) it cannot realistically be scheduled to run at preselected times, (3) PSQM signal correlation will be very slow, perhaps taking on the order of 1-2 minutes per test, (4) the level of synchronization obtained is inconsistent and operator dependent, and (5) PSQM/PAMS trend measurements are not possible.
An alternate method is to use a separate network from that of the communication channel under test for the synchronization signals. However, these separate links are often not available to the user. Transmissions such as these are referred to as out-of-band transmissions, while transmissions within the same network are referred to as being in band.
Thus, there exists a need for a synchronization scheme which (1) permits the transmitting and receiving stations to be activated in any order, (2) provides for accurate synchronization between the transmitted and recorded signals, (3) can be automatically activated at preselected times, and (4) does not require operator activation with its inherent timing inaccuracies.
In representative embodiments, methods for synchronizing measurements in communication systems are disclosed. Recent developments in communication systems have resulted in combining the traffic historically carried separately by telephone and data networks. The service provided by such systems is referred to as Voice over Packet (VoP) with the more popular version using the Internet Protocol (IP) commonly referred to as Voice over IP (VoIP). VoP technologies have made maintaining voice quality at high levels more complex by compressing the digitized voice signal and transmitting it in discrete packets. With voice traffic there is the need for timely packet delivery, often in networks that were not originally designed for these conditions. Digitizing analog voice signals often affects voice clarity. Clarity is also affected if packets arrive out of order or are lost.
Voice over Packet systems are advantageous in that they can carry more traffic over the same number of lines than was possible in traditional telephone systems. Further, it is no longer necessary to dedicate specific lines in the network backbone for each connection. Traffic can now take any of many routes through the network and many conversations may share the typically large bandwidth lines in the network backbone. The advantage of such systems is the more efficient transport of information. The communications that take place in these systems are less noisy than conventional all analogue systems due to the fact that digital rather than analogue data are being transported across the network backbone. Disadvantages include uncertain delays since the messages sent back and forth can take different routes at different times and, therefore, can experience different delays at different times. Once established, however, a route through the network tends to remain the same unless something catastrophic occurs, as for example the failure of a system router or other critical system component.
Objective tests for voice quality are available but are difficult to synchronize between stations. These tests involve the transmission of a test signal from a first communication station to another. Commercially available voice quality testers (VQT) are placed at each communication station. A first one transmits the test signal while a second one records it as it arrives at its location. The recorded signal is then compared to a copy of the message originally transmitted. Aligning these two signals in time can be difficult and time consuming. In methods disclosed in the present patent document, pseudo-random analogue signals which emulate white noise are created and used as synchronization signals which enable this synchronization more precisely than previous methods. These signals are relatively unaffected by the codecs commonly used in communication systems for signal compression.
Difficulties arise in the determination of the exact point in the recorded signal to begin comparison with the copy of the original test signal. In order to make this determination, the two signals must be examined in a time and resource consuming process. Methods for more efficient synchronization of these two signals, i.e., the signal received and recorded by the recorder at the second communication station and the copy of the original test signal maintained by the second voice quality tester, are disclosed herein.
Prior to initiating the steps leading to the measurement of voice quality, both testers must be attached to the two communication stations and turned on. A recorder is activated at the second communication station and the test signal is transmitted from the first communication station at times relative to each other based upon synchronization signals passing between them.
A first synchronization signal is transmitted by the first voice quality tester. The first synchronization signal is received by the second voice quality tester. Then a second synchronization signal is transmitted by the second voice quality tester. The second synchronization signal is in turn received by the first voice quality tester.
Relative to the time that the second synchronization signal is received at the first communication station, a test signal is transmitted by the first voice quality tester. The recorder is placed in record mode relative to the time that the first synchronization signal is received at the second communication station. This time is set early enough to ensure that it is in record mode prior to the arrival of the test signal. The recorder is left in record mode long enough to ensure that it records all of the test signal.
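The sequence of events described above can be laid out as a simple timeline. The following is a minimal sketch only: the function name plan_test, the gap and length values, and the assumption of an equal one-way delay in both directions are all hypothetical and are not specified in this document. Times are in seconds, measured from the moment the first tester begins sending the first synchronization signal.

```python
def plan_test(one_way_delay, sync_len=0.5, test_len=8.0,
              reply_gap=0.1, send_gap=0.2, record_gap=0.05, record_len=12.0):
    """Compute an illustrative timeline for one synchronized measurement.

    All default values are assumptions chosen for illustration.
    """
    sync1_rx = one_way_delay + sync_len              # sync 1 fully received at station 2
    record_on = sync1_rx + record_gap                # recorder started relative to sync 1
    sync2_tx = sync1_rx + reply_gap                  # station 2 replies with sync 2
    sync2_rx = sync2_tx + sync_len + one_way_delay   # sync 2 fully received at station 1
    test_tx = sync2_rx + send_gap                    # test signal sent relative to sync 2
    test_rx_start = test_tx + one_way_delay          # test signal begins arriving at station 2
    test_rx_end = test_rx_start + test_len
    record_off = record_on + record_len
    return {
        "record_on": record_on,
        "test_rx_start": test_rx_start,
        "test_rx_end": test_rx_end,
        "record_off": record_off,
        # The recorder must be running before the test signal arrives and
        # must keep running until all of it has been received.
        "covers": record_on < test_rx_start and test_rx_end < record_off,
    }


print(plan_test(0.15)["covers"])                  # True
print(plan_test(0.15, record_len=5.0)["covers"])  # False: recorder stops too early
```

Because the recorder start is tied to the first synchronization signal rather than to an operator pressing a button, the portion of the recording that precedes the test signal is bounded by the handshake timing, which narrows the range that must be searched during analysis.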
Alternative embodiments apply successive repetitions of the first and second synchronization signals. Repeating these synchronization signals provides the opportunity for the system to measure the time delays in the system and adjust recorder initiation accordingly and to make appropriate adjustments for jitter, etc. so that the best quality signal can be obtained. The repeated synchronization signals should be of various durations that differ from previous signals such that if a signal is missed or if an echo from an earlier signal is received and is strong enough to be mistaken as a synchronization signal, the system will detect this situation and restart the test. A reasonable choice is to generate a second pair of first and second synchronization signals. Timing of the placement of the recorder in record mode and the transmission of the test signal is then relative to the second pair of first and second synchronization signals. The choice as to the number of repeated synchronization signals is a trade-off between more precisely identifying the time delays involved in the transmission of messages between first and second communication stations on the one hand and excessive test times on the other.
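The benefit of giving each repetition a distinct duration can be shown with a minimal sketch. The durations, the tolerance, and the function name sync_sequence_ok are illustrative assumptions. A tester that knows the expected durations, in order, can reject a sequence in which a signal was missed or in which an echo of an earlier signal was taken for the next one.

```python
def sync_sequence_ok(expected_durations, observed_durations, tolerance=0.05):
    """Return True if the observed bursts match the expected durations in order.

    Durations are in seconds. A False result means the test should be
    restarted, as for a missed signal or an echo of an earlier signal.
    """
    if len(observed_durations) != len(expected_durations):
        return False
    return all(abs(observed - expected) <= tolerance
               for observed, expected in zip(observed_durations, expected_durations))


expected = [0.30, 0.50]                            # second signal is deliberately longer
print(sync_sequence_ok(expected, [0.31, 0.49]))    # True: both signals as expected
print(sync_sequence_ok(expected, [0.30, 0.30]))    # False: echo of the first signal
print(sync_sequence_ok(expected, [0.50]))          # False: a signal was missed
```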
Waveforms for synchronization signals other than the pseudo-random waveform chosen are possible. However, it is relatively easy to confirm that a pseudo-random waveform has been received by measuring its intensity. A constant signal level over any arbitrary period of time is expected for the pseudo-random waveform. In representative embodiments, the signal is examined for different time periods in order to confirm that the same signal level is obtained for both time periods. In addition, codecs do not distort pseudo-random signals as they would pure sine waves such as would be found in, for example, the signaling tones typically found in telephone systems, i.e., the dual tone multi-frequency (DTMF) tones. Typically any distortion which would be added to the pseudo-random waveform would not change the waveform. The "white noise" into the system would be received as substantially unchanged "white noise".
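The level check described above can be sketched as a comparison of the root-mean-square (RMS) level measured over two time periods of different length. This is an illustrative sketch only: the window lengths, the 20 percent tolerance, the use of uniformly distributed samples as a stand-in for the pseudo-random waveform, and the function names are assumptions.

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def same_level(samples, short_len=400, long_len=1600, tolerance=0.2):
    """Return True if two consecutive periods of different length have
    approximately the same non-zero RMS level."""
    short = rms(samples[:short_len])
    long_ = rms(samples[short_len:short_len + long_len])
    reference = max(short, long_)
    if reference == 0.0:
        return False  # silence is not a synchronization signal
    return abs(short - long_) <= tolerance * reference


rng = random.Random(1)
noise = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
truncated = noise[:400] + [0.0] * 1600   # burst that ends too early

print(same_level(noise))      # True
print(same_level(truncated))  # False
```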
While the pseudo-random signal generated appears to be random, it is in fact a completely predetermined waveform. As such, correlation down to the bit level could be obtained between the received test signal and the copy of the test signal. This degree of precision, while available, is typically not required for applications such as that described herein.
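That a pseudo-random waveform is fully predetermined can be illustrated with a seeded generator. In this minimal sketch the seed value, the amplitude, the use of Python's random module, and the function name sync_waveform are assumptions for illustration and do not represent the generator used in any particular tester. Two stations that share the seed produce identical sample sequences, which is what makes sample-level comparison possible in principle.

```python
import random


def sync_waveform(seed, n_samples, amplitude=0.5):
    """Generate a noise-like but fully reproducible sequence of samples."""
    rng = random.Random(seed)   # private generator, so the global state is untouched
    return [amplitude * rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


sent = sync_waveform(seed=42, n_samples=100)
local_copy = sync_waveform(seed=42, n_samples=100)
other = sync_waveform(seed=43, n_samples=100)

print(sent == local_copy)  # True: the same seed reproduces the same waveform
print(sent == other)       # False: a different seed gives a different waveform
```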
Other aspects and advantages of the present invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the invention.