The following acronyms are used throughout this description. They are listed in TABLE 1 below for ease of reference.
Digital communications systems, such as digital cellular telephony systems, are often used to transmit voice. Due to the limited bandwidth of these systems, speech is typically encoded to a low bit rate using a speech encoder. Various methods are in use for such speech coding. Within modern digital cellular telephony, most of these methods are based upon Code Excited Linear Prediction (CELP) or some variant thereof. Such speech codecs are standardized and in use for all of the major digital telephony standards including GSM/EDGE, PDC, TDMA, CDMA, and WCDMA.
The present invention is described within the context of GSM. Within this standard, there are currently four standardized speech codecs, three of which are fielded and in common use. The original speech codec is known as the full-rate (FR) codec. It was followed by the half-rate (HR) speech codec, which required only half of the bandwidth of the FR codec, thereby allowing cellular operators to support twice as many users within the same frequency allocation. Next came the Enhanced Full Rate (EFR) speech codec, which required the same net bit rate (after channel coding) as the original FR codec but offered much improved speech quality.
The GSM standard recently introduced the AMR speech codec. This speech codec will also be used in forthcoming EDGE and 3GPP cellular systems. A similar ACELP-based adaptable speech codec known as the EVRC has been standardized for IS-95 (narrowband) CDMA.
The present invention relates to the Adaptive Multi-Rate (AMR) speech codec. In broad terms, the invention improves the audio quality perceived within an AMR enabled receiver. More particularly, the invention serves to prevent two specific problems that can occur when an AMR enabled receiver is entering or exiting DTX mode. The first problem is that the link may enter DTX but the receiver may not recognize this state change. The result is that random data may be processed by the speech decoder during the DTX period leading to audible artifacts such as clicks and pops. The second problem is that a link in the DTX state may return to active voice but the AMR enabled receiver may not recognize this. The result is that the receiver is muted despite the active state of the link.
One embodiment of the present invention comprises a method of determining whether a receiver in active (non-DTX) mode should remain in active (non-DTX) mode or switch to inactive (DTX) mode. A received AMR frame in active (non-DTX) mode is subjected to a RATSCCH marker comparison. If the results of the RATSCCH marker comparison exceed a RATSCCH marker threshold, the received AMR frame is processed as a RATSCCH message. Otherwise, the received AMR frame is subjected to a SID_FIRST marker comparison. If the results of the SID_FIRST marker comparison exceed a SID_FIRST threshold, then the received AMR frame is processed as a SID_FIRST frame and the receiver is switched to DTX mode. Otherwise, the received AMR frame is subjected to a SID_UPDATE marker comparison. If the results of the SID_UPDATE marker comparison exceed a SID_UPDATE threshold, then the received AMR frame is processed as a SID_UPDATE frame and the receiver is switched to DTX mode. Otherwise, the received AMR frame is processed as a voice frame in active (non-DTX) mode.
The SID_UPDATE threshold is determined by channel decoding the received AMR frame as a voice frame and performing a CRC test on the channel decoded AMR frame. If the CRC test passes, a badFrameCounter variable is set to zero; otherwise, the badFrameCounter variable is incremented by one. The SID_UPDATE threshold is then set according to the value of the badFrameCounter variable.
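The active-mode decision sequence described above can be sketched as follows. This is only an illustrative sketch: the marker comparison results are assumed to arrive as precomputed numeric scores, all threshold values are hypothetical placeholders, and the mapping from badFrameCounter to a SID_UPDATE threshold (relaxing the threshold after several consecutive CRC failures) is an assumption, since the description states only that the threshold is set according to the counter. Leaving the mode unchanged for a RATSCCH message is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class ActiveModeClassifier:
    """Frame classification while the receiver is in active (non-DTX) mode."""

    # All numeric values below are hypothetical placeholders.
    ratscch_threshold: float = 0.8
    sid_first_threshold: float = 0.8
    sid_update_strict: float = 0.9   # used while voice frames decode cleanly
    sid_update_relaxed: float = 0.7  # used after a run of CRC failures
    bad_frame_limit: int = 3
    bad_frame_counter: int = 0

    def sid_update_threshold(self, crc_ok: bool) -> float:
        """Update badFrameCounter from the voice-decode CRC and pick a threshold."""
        if crc_ok:
            self.bad_frame_counter = 0
        else:
            self.bad_frame_counter += 1
        # Assumed mapping: relax the threshold once several frames in a row
        # have failed the CRC, on the guess that the link has entered DTX.
        if self.bad_frame_counter >= self.bad_frame_limit:
            return self.sid_update_relaxed
        return self.sid_update_strict

    def classify(self, ratscch_score, sid_first_score, sid_update_score, crc_ok):
        """Return (frame_type, next_mode) for one received frame."""
        if ratscch_score > self.ratscch_threshold:
            return "RATSCCH", "ACTIVE"  # assumed: mode is left unchanged
        if sid_first_score > self.sid_first_threshold:
            return "SID_FIRST", "DTX"
        if sid_update_score > self.sid_update_threshold(crc_ok):
            return "SID_UPDATE", "DTX"
        return "VOICE", "ACTIVE"
```

With these placeholder values, a frame whose SID_UPDATE score is 0.75 is treated as voice while CRC tests pass, but is accepted as SID_UPDATE once three consecutive CRC failures have relaxed the threshold.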
Another embodiment of the present invention comprises a method of determining whether an AMR enabled receiver in inactive (DTX) mode should remain in inactive (DTX) mode or switch to active (non-DTX) mode. A received AMR frame in inactive (DTX) mode is subjected to an ONSET frame comparison. If the results of the ONSET frame comparison exceed a threshold, then the received AMR frame is processed as an ONSET frame and the receiver is switched to active (non-DTX) mode. Otherwise, the received AMR frame is subjected to a SID_UPDATE marker comparison. If the results of the SID_UPDATE marker comparison exceed a threshold, then the received AMR frame is processed as a SID_UPDATE frame and the receiver remains in inactive (DTX) mode. Otherwise, it is determined whether the received AMR frame is a voice frame, and if so, the receiver is switched to active (non-DTX) mode, and if not, the received AMR frame is classified as a NO_DATA frame and the receiver remains in inactive (DTX) mode.
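The DTX-mode decision sequence above can be sketched in the same spirit. The function name, the numeric scores, and the default threshold values are hypothetical; the voice-frame determination is passed in as a callable so that any of the alternative voice-detection processes can be substituted.

```python
from typing import Callable, Tuple


def classify_in_dtx(onset_score: float,
                    sid_update_score: float,
                    is_voice: Callable[[], bool],
                    onset_threshold: float = 0.8,        # hypothetical value
                    sid_update_threshold: float = 0.8    # hypothetical value
                    ) -> Tuple[str, str]:
    """Return (frame_type, next_mode) for a frame received in DTX mode."""
    if onset_score > onset_threshold:
        return "ONSET", "ACTIVE"
    if sid_update_score > sid_update_threshold:
        return "SID_UPDATE", "DTX"
    # The voice test runs only when neither marker comparison succeeded.
    if is_voice():
        return "VOICE", "ACTIVE"
    return "NO_DATA", "DTX"
```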
There are several alternative processes for determining whether the received AMR frame is a voice frame. One method comprises channel decoding the received AMR frame as a voice frame and performing a CRC test on the channel decoded AMR frame. If the CRC test fails, then the received AMR frame is classified as a NO_DATA frame. If the CRC test passes, then a goodFrameCount variable is incremented by one. The goodFrameCount variable is compared against a threshold value and if the goodFrameCount variable exceeds the threshold value, then the received AMR frame is classified as a voice frame. Otherwise the received AMR frame is classified as a NO_DATA frame.
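A minimal sketch of the goodFrameCount test follows. The class name and the default threshold are hypothetical. The description does not say whether goodFrameCount is reset after a CRC failure or after a mode switch, so the sketch performs no reset; a practical implementation would likely need one.

```python
class GoodFrameDetector:
    """Voice-frame test based on a count of CRC passes."""

    def __init__(self, threshold: int = 1):  # hypothetical default
        self.threshold = threshold
        self.good_frame_count = 0

    def classify(self, crc_ok: bool) -> str:
        """Classify one frame from the CRC result of its voice-frame decode."""
        if not crc_ok:
            return "NO_DATA"
        self.good_frame_count += 1
        # Only declare voice once the count has exceeded the threshold.
        if self.good_frame_count > self.threshold:
            return "VOICE"
        return "NO_DATA"
```

Requiring more than one CRC pass guards against a single noise frame passing the CRC by chance during a DTX period.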
Another method comprises determining if the received AMR frame is a SID_FIRST frame, and if so, setting a framesSinceSID variable to zero and considering the received AMR frame as NO_DATA for purposes of speech decoding. Otherwise, it is determined if the received AMR frame is a SID_UPDATE frame, and if so, the framesSinceSID variable is set to zero and the received AMR frame is considered as NO_DATA for purposes of speech decoding. If the received AMR frame is neither a SID_FIRST nor a SID_UPDATE frame, then the framesSinceSID variable is incremented by one. Next, it is determined whether the framesSinceSID variable exceeds a threshold, and if not, the received AMR frame is classified as NO_DATA. Otherwise, the received AMR frame is channel decoded as a voice frame and a CRC test is performed on the channel decoded AMR frame. If it passes, the received AMR frame is classified as a voice frame; otherwise, the received AMR frame is classified as a NO_DATA frame.
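The framesSinceSID method can be sketched as below. The class name and default threshold are hypothetical, and the SID detection results are assumed to be supplied by the caller. The channel decode and CRC test are passed in as a callable so that, as in the description, decoding is attempted only after the counter has exceeded the threshold.

```python
from typing import Callable


class SidHoldoffDetector:
    """Voice-frame test that waits some frames after the last SID frame."""

    def __init__(self, threshold: int = 2):  # hypothetical default
        self.threshold = threshold
        self.frames_since_sid = 0

    def classify(self, is_sid_first: bool, is_sid_update: bool,
                 decode_and_crc: Callable[[], bool]) -> str:
        """Classify one frame for purposes of speech decoding."""
        if is_sid_first or is_sid_update:
            # Any SID frame restarts the hold-off period.
            self.frames_since_sid = 0
            return "NO_DATA"
        self.frames_since_sid += 1
        if self.frames_since_sid <= self.threshold:
            return "NO_DATA"
        # Hold-off expired: decode as voice and trust the CRC result.
        return "VOICE" if decode_and_crc() else "NO_DATA"
```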
Yet another method of determining whether the received AMR frame is a voice frame comprises channel decoding the received AMR frame as a voice frame and performing a CRC test on the channel decoded AMR frame. If it fails, the received AMR frame is classified as a NO_DATA frame. If it passes the CRC test, then it is subjected to a Viterbi metric threshold test. If it passes the Viterbi metric threshold test, the received AMR frame is classified as a voice frame, otherwise the received AMR frame is classified as a NO_DATA frame.
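As a rough sketch, the CRC test followed by a Viterbi metric check might look like this. The function name and default threshold are hypothetical, and the sketch assumes a metric normalized so that larger values indicate a more reliable decode; the description does not specify the sense or scale of the metric.

```python
def classify_with_viterbi_check(crc_ok: bool,
                                viterbi_metric: float,
                                metric_threshold: float = 0.5  # hypothetical
                                ) -> str:
    """Classify a frame that has already been channel decoded as voice."""
    if not crc_ok:
        return "NO_DATA"
    # Assumed convention: a metric below the threshold means the decoder
    # output is unreliable even though the CRC happened to pass.
    if viterbi_metric < metric_threshold:
        return "NO_DATA"
    return "VOICE"
```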
Still another method of determining whether the received AMR frame is a voice frame comprises performing a carrier-to-interference (C/I) metric threshold test on the received AMR frame. If it fails the C/I metric threshold test, the received AMR frame is classified as a NO_DATA frame, otherwise, the received AMR frame is channel decoded as a voice frame and subjected to a CRC test. If it passes the CRC test, the received AMR frame is classified as a voice frame, otherwise the received AMR frame is classified as a NO_DATA frame.
Still another method comprises performing an inband bit correlation metric threshold test on the received AMR frame. If it fails the inband bit correlation metric threshold test, the received AMR frame is classified as a NO_DATA frame. Otherwise, the received AMR frame is channel decoded and a CRC test is performed. If it passes the CRC test, the received AMR frame is classified as a voice frame, otherwise the received AMR frame is classified as a NO_DATA frame.
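The C/I method and the inband bit correlation method share one structure: a metric test gates the channel decode, so that frames failing the metric are never decoded. A minimal sketch of that shared structure follows. The function name is hypothetical, and the sketch assumes that a larger metric value is better for either metric.

```python
from typing import Callable


def classify_with_pre_gate(metric: float,
                           metric_threshold: float,
                           decode_and_crc: Callable[[], bool]) -> str:
    """Gate the voice decode on a C/I or inband bit correlation metric."""
    # Assumed convention: the frame passes the test when metric >= threshold.
    if metric < metric_threshold:
        return "NO_DATA"  # decode is skipped entirely
    return "VOICE" if decode_and_crc() else "NO_DATA"
```

Unlike the Viterbi metric check, which needs the output of the channel decoder, this ordering lets the receiver skip the decode for frames that are clearly not voice.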
It should be noted that the term “receiver” as used herein refers to the receiving portion of a cellular transceiving device. Cellular transceiving devices include both mobile terminals and base stations. A mobile terminal must be in communication with a base station in order to place or receive a call. There are numerous protocols, standards, and speech codecs that can be used for wireless communication between a mobile terminal and a base station.
While the present invention is described herein in the context of a mobile terminal, the term “mobile terminal” may include a cellular radiotelephone with or without a multi-line display; a Personal Communications System (PCS) terminal that may combine a cellular telephone with data processing, facsimile and data communications capabilities; a Personal Digital Assistant (PDA) that can include a radiotelephone, pager, Internet/intranet access, Web browser, organizer, calendar and/or a global positioning system (GPS) receiver; and a conventional laptop and/or palmtop receiver or other computer system that includes a display for a GUI. Mobile terminals may also be referred to as “pervasive computing” devices.