In traditional circuit-switched telephony, such as the public switched telephone network (PSTN), ringback is an audible signal or tone, usually mimicking the sound of a phone ringing. Typically, ringback is heard by a caller after the caller dials a callee, but before the callee answers the call. Ringback is meant to indicate progress to the caller, even though the ringback tone may not be synchronized with the ringing, or any other alert, that is presented to the callee.
In recent years, personalized ringback has become popular. This feature allows a PSTN subscriber to specify an audio selection for ringback, so that anyone calling the subscriber will hear the audio selection rather than the default ringback. Furthermore, the PSTN subscriber may choose to specify that different audio selections be played to different potential callers. These audio selections can take any form, but are usually based on audio clips or files of content such as popular music, speeches, or sound effects.
Personalized ringback tones can be applied to both wireline communication devices, such as desktop phones, and wireless communication devices, such as cell phones. Regardless of the type of device, in these traditional systems, the ringback tone is generated by the network and then streamed or otherwise transmitted in-band to the caller's device at the time of call setup.
The PSTN was designed to transport voice signals efficiently. Most of the spectral energy in a human voice signal lies below 4000 Hz, so PSTN equipment typically filters out the components of an input signal above 4000 Hz. The human ear, however, can perceive audio signals of up to about 20,000 Hz. Therefore, an audio signal that has passed through PSTN filtering may exhibit lower quality than the same signal before filtering. For example, some people sound different over the PSTN than they do in person because a portion of the energy in their voice signal is not present for listeners to perceive.
Additionally, the use of Internet telephony, such as voice over Internet Protocol (VoIP), may further impair the quality of voice signals. VoIP codecs are designed to compress voice signals into low-bitrate streams. For example, the International Telecommunication Union (ITU) G.723.1 codec may compress a 64 kbps input signal to an output signal of less than 8 kbps. To do so, these codecs may use psychoacoustic models that reduce or eliminate spectral components of the input signal that are less likely to be produced by a human vocal tract and less likely to be audible to a human listener. In the process, these codecs potentially discard even more spectral energy from the input signal than PSTN filtering does.
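The scale of this compression can be illustrated with a short calculation. The following Python sketch assumes the 64 kbps input is conventional telephone-grade PCM (8000 samples per second at 8 bits per sample) and uses the two G.723.1 operating rates of 5.3 kbps and 6.3 kbps; the function and variable names are hypothetical and chosen only for illustration.

```python
def compression_ratio(input_kbps: float, output_kbps: float) -> float:
    """Return how many times smaller the output bitrate is than the input."""
    if output_kbps <= 0:
        raise ValueError("output bitrate must be positive")
    return input_kbps / output_kbps


# Assumption: the 64 kbps input is 8 kHz, 8-bit PCM.
SAMPLE_RATE_HZ = 8000
BITS_PER_SAMPLE = 8
PCM_KBPS = SAMPLE_RATE_HZ * BITS_PER_SAMPLE / 1000  # 64.0

# The two operating rates of G.723.1, both below 8 kbps.
G7231_RATES_KBPS = (5.3, 6.3)

ratios = {rate: compression_ratio(PCM_KBPS, rate) for rate in G7231_RATES_KBPS}

if __name__ == "__main__":
    for rate, ratio in ratios.items():
        print(f"{PCM_KBPS:.0f} kbps -> {rate} kbps: about {ratio:.1f}:1")
```

Under these assumptions the codec keeps only about a tenth to a twelfth of the input bits, which suggests why content that does not fit a voice model, such as music, tends to suffer.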
As a result of this lossy PSTN filtering and VoIP compression, audio signals subjected to one or both of these procedures may exhibit lower perceived quality than the same audio signals played out without filtering or compression. Furthermore, audio signals of music tend to have more spectral energy above 4000 Hz than audio signals of voice and, when compressed by a voice codec, tend to sound poor. So, even if a high-quality audio file is chosen as personalized ringback, if it is transmitted in-band over the PSTN or a VoIP bearer, the caller will likely notice that the resulting audio signal is of very low quality.
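The difference between voice and music under a 4000 Hz cutoff can be sketched numerically. The Python example below models the PSTN filter as an ideal low-pass filter, which is a simplification, and the two spectra are invented illustrative values rather than measurements. The function name is likewise hypothetical.

```python
def retained_energy_fraction(components, cutoff_hz=4000.0):
    """Fraction of total energy that survives an ideal low-pass filter.

    components: iterable of (frequency_hz, energy) pairs.
    """
    components = list(components)
    total = sum(energy for _, energy in components)
    if total == 0:
        return 0.0
    kept = sum(energy for freq, energy in components if freq <= cutoff_hz)
    return kept / total


# Hypothetical spectra (frequency in Hz, energy in arbitrary units).
# Voice: energy concentrated below 4000 Hz.
voice = [(200, 5.0), (1000, 3.0), (3000, 1.5), (6000, 0.5)]
# Music: a larger share of energy above 4000 Hz.
music = [(200, 3.0), (1000, 2.0), (3000, 1.0), (6000, 2.0), (12000, 2.0)]

if __name__ == "__main__":
    print(f"voice retained: {retained_energy_fraction(voice):.0%}")
    print(f"music retained: {retained_energy_fraction(music):.0%}")
```

With these made-up numbers, the voice spectrum keeps nearly all of its energy while the music spectrum loses a substantial share, which is consistent with the observation that music suffers more than speech when carried in-band.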