A substitute ring back tone (RBT) providing service, which is one of the multimedia services provided to a communication terminal, is a service that transmits a predetermined audio signal, selected by a user of a subscriber terminal, to a caller terminal via a communication network when a call connection is requested from the caller terminal to the subscriber terminal, or when the call connection is requested from the subscriber terminal. As an example, when the subscriber terminal requests a call connection to a called terminal, the substitute RBT providing service transmits the audio signal ‘I love you’ to the subscriber terminal until a second user of the called terminal answers the call, such as by going off hook. Likewise, when the caller terminal requests a call connection to the subscriber terminal, the substitute RBT providing service transmits the audio signal ‘I love you’ to the caller terminal until the user of the subscriber terminal answers the call, such as by going off hook.
Generally, an audio signal that is transmitted over a communication network to the caller terminal or the called terminal is encoded using a speech codec, i.e. a linear predictive coding (LPC) based codec, rather than a codec dedicated to audio signals. However, when the audio signal is encoded using the LPC based speech codec, the audio signal played by the caller terminal or the subscriber terminal is distorted in comparison to the original audio signal, and a comfort noise occurs during playback of the audio signal, for the following reasons.
The speech signal is encoded by a speech codec at a low bit rate before transmission, since the bandwidth of a speech channel used on a mobile communication network is comparatively narrower than that of a wired telephone, which has a bandwidth of approximately 64 kbps. Generally, the speech codecs used in mobile terminals are LPC-based compression methods. The LPC-based speech compression methods compress a speech signal of a user efficiently at low and intermediate bit rates, since they use a model optimized for the vocalization structure of the user. However, when an audio signal other than speech is encoded, a deterioration of sound quality may occur. The reasons are as follows:
(1) A formant frequency and a pitch period, i.e. the most important parameters for speech compression in the LPC based speech codec, may not be appropriately extracted from an audio signal. A pitch, a parameter corresponding to a fundamental frequency, is generated by the periodic vibration of the vocal cords. In the case of the speech signal, the pitch lies in a frequency band from approximately 50 Hz to 500 Hz. Conversely, in the case of the audio signal, a pitch may lie in a wider frequency band than that of the speech signal. Also, whereas a single pitch exists in the speech signal, a number of pitches may exist in the audio signal.
(2) Spectra of the audio signal are complex compared to spectra of the speech signal. In the case of the speech signal, which has simple spectra, the residual signal remaining after the parameters are extracted may be modeled comparatively fully. This is not the case for the audio signal.
(3) The use of voice activity detection (VAD) and discontinuous transmission (DTX) in the LPC based speech codec may be another reason. When the user communicates on a terminal, the DTX is used in intervals without speech signals so that speech signals are not transmitted, since, according to statistics, speech signals are transmitted/received during no more than 50% of the entire duration of an actual call. Consequently, frequency efficiency can be improved, since power consumption in the subscriber terminal can be reduced and the overall level of interference on the air interface can be reduced. Whether the DTX operates is determined by the VAD.
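Reason (1) above can be illustrated with a minimal sketch. It is not the pitch extractor of any particular speech codec; it assumes a plain autocorrelation search restricted to the approximately 50 Hz to 500 Hz band mentioned above, an 8 kHz sampling rate, and synthetic test tones. The names `tone` and `estimate_pitch` are hypothetical helpers introduced only for this illustration.

```python
import math

SAMPLE_RATE = 8000  # Hz; assumed sampling rate for this sketch


def tone(freqs, n=800, sample_rate=SAMPLE_RATE):
    """Sum of unit-amplitude sinusoids at the given frequencies (Hz)."""
    return [
        sum(math.sin(2 * math.pi * f * i / sample_rate) for f in freqs)
        for i in range(n)
    ]


def estimate_pitch(signal, sample_rate=SAMPLE_RATE, f_min=50.0, f_max=500.0):
    """Pick the lag with the largest autocorrelation inside the lag range
    that corresponds to f_min..f_max, and return the matching frequency."""
    lag_min = int(sample_rate / f_max)  # shortest period searched
    lag_max = int(sample_rate / f_min)  # longest period searched
    n = len(signal)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        val = sum(signal[i] * signal[i + lag] for i in range(n - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag


if __name__ == "__main__":
    # Speech-like input: 200 Hz fundamental plus one harmonic.
    print(estimate_pitch(tone([200, 400])))
    # Audio-like input: a 1000 Hz tone, outside the assumed speech band.
    print(estimate_pitch(tone([1000])))
```

With these synthetic inputs the speech-like signal yields its true fundamental, while the 1000 Hz tone is forced into the search band and reported as a sub-multiple of its actual frequency. This is one way a pitch parameter can fail to be "appropriately extracted" from an audio signal.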
To describe operations of the DTX and the VAD, the VAD analyzes at least one parameter that the speech codec extracts from the audio signal, and determines, based on the at least one parameter, whether the audio signal is in a speech interval or a speechless interval. 1) As a result of the determination, when the audio signal is in the speech interval, the DTX transmits the extracted at least one parameter to a demodulator in a receiver, and the subscriber terminal plays the audio signal based on the parameter.
Also, 2) as a result of the determination, when the audio signal is in the speechless interval, the DTX generates a minimum parameter to transmit to the demodulator in the receiver, and the subscriber terminal plays a comfort noise according to the parameter. As described above, when the speech codec determines a normal audio signal to be a speechless interval, the subscriber terminal may play the comfort noise in place of the audio signal transmitted in an RBT interval. Consequently, even though a number of mobile communication service providers provide the substitute RBT providing service, the corresponding audio signal is inaudible or distorted when transmitted to the subscriber terminal, due to the deterioration of sound quality in the substitute RBT play interval.
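The VAD/DTX behavior described above can be sketched as follows. This is a simplified illustration, not the VAD of any actual speech codec: it assumes the "at least one parameter" is the frame energy alone, an arbitrary threshold of -40 dB relative to full scale, and 20 ms frames at 8 kHz. All function names and packet labels are hypothetical.

```python
import math

FRAME_LEN = 160  # samples; assumed 20 ms frame at 8 kHz


def frame_energy_db(frame):
    """Mean-square energy of a frame in dB relative to full scale (1.0)."""
    energy = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(energy + 1e-12)  # offset avoids log10(0)


def vad_is_speech(frame, threshold_db=-40.0):
    """Assumed energy-only VAD: frames above the threshold count as speech."""
    return frame_energy_db(frame) > threshold_db


def dtx_encode(frames, threshold_db=-40.0):
    """Send full parameters for speech frames, a minimum parameter otherwise."""
    packets = []
    for frame in frames:
        if vad_is_speech(frame, threshold_db):
            # The raw frame stands in for the full set of codec parameters.
            packets.append(("full", frame))
        else:
            # Only a noise level is sent for a speechless interval.
            packets.append(("minimal", frame_energy_db(frame)))
    return packets


def receiver_play(packets):
    """Receiver plays the audio for full packets, comfort noise otherwise."""
    return ["audio" if kind == "full" else "comfort_noise" for kind, _ in packets]


def sine_frame(amplitude, freq=440.0, sample_rate=8000):
    """Hypothetical test frame: one sinusoid at the given amplitude."""
    return [amplitude * math.sin(2 * math.pi * freq * i / sample_rate)
            for i in range(FRAME_LEN)]


if __name__ == "__main__":
    loud_passage = sine_frame(0.5)     # clearly above the threshold
    quiet_passage = sine_frame(0.001)  # a soft musical passage
    print(receiver_play(dtx_encode([loud_passage, quiet_passage])))
```

In this sketch the quiet passage is a legitimate part of the audio signal, yet it falls below the assumed threshold, is classified as a speechless interval, and is replaced by comfort noise at the receiver.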
In order to prevent the cut-off phenomenon that occurs in the receiver when the audio signal is determined to be a speechless signal and the comfort noise is played instead of the audio signal, there are two methods: 1) changing the codec of the base station and the terminal, and 2) transmitting the audio signal via a data network. However, both methods are problematic in that they require changes to a great number of existing systems that are already built, and costs may increase.
Thus, for every application in which a predetermined audio signal is transmitted via the communication network, including the example of transmitting the substitute RBT to the subscriber terminal, a new method is required that enables a speech codec to determine an interval to be a speech interval when a specific audio signal is encoded for transmission via the communication network.