The present invention generally relates to wireless digital communication using voice encoding and, more particularly, to a system and method for detecting bad data frames in reception of voice transmissions.
Global System for Mobile communications (GSM) is a mobile telecommunication system with GSM networks operational in several countries around the world. The GSM system operating at 900 mega-Hertz (MHz), and its sibling systems operating at 1.8 giga-Hertz (GHz) (called DCS1800) and 1.9 GHz (called GSM1900 or PCS1900, and operating in North America), are specified according to a GSM standard published by the European Telecommunication Standards Institute (ETSI). A basic telecommunications service supported by GSM is telephony. As with other types of communications, speech is digitally encoded and transmitted through the GSM network as a digital stream.
FIG. 1 shows the general architecture of a generic GSM network 100. GSM network 100 is composed of several functional entities, whose functions and interfaces are specified, for example, according to standards published by ETSI. GSM network 100 can be divided into three subsystems. The mobile station or subsystem 102 is carried by the subscriber, for example, a person using the mobile station 102 as a telephone. A base station subsystem (BSS) 104 controls a radio link 106 with the mobile station 102. A network subsystem 108, the main part of which is a mobile services switching center (MSC) 110, performs the switching of calls between mobile users, such as a person using the mobile station 102, and between mobile users and fixed network users. The MSC 110 also handles the mobility management operations. The mobile station 102 and the base station subsystem 104 communicate across a Um interface 112, also known as an air interface or radio channel 112. The base station subsystem 104 communicates with the mobile services switching center 110 across an A interface 114.
The mobile station 102 includes mobile equipment 116 (the terminal) and a smart card called the subscriber identity module (SIM) 118. The SIM 118 provides personal mobility so that the user can have access to subscribed services irrespective of a specific terminal, i.e. mobile equipment 116. By inserting the SIM 118 card into another GSM terminal (not shown), the user is able to receive calls at that terminal, make calls from that terminal, and receive other subscribed services.
The base station subsystem 104 includes two parts, a base transceiver station (BTS) 120 and a base station controller (BSC) 122. One or more base transceiver stations 120 communicate with a base station controller 122 across a standardized Abis interface 124, allowing (as in the rest of the system) operation between components made by different suppliers. The base transceiver station 120 houses the radio transceivers that define a cell and handles the radio-link protocols with the mobile stations, for example, mobile station 102. In a large urban area, there will potentially be a large number of base transceiver stations 120 deployed. A base station controller 122 manages the radio resources for one or more base transceiver stations 120. The base station controller 122 handles radio-channel setup, frequency hopping, and handovers, for example. The base station controller 122 is the connection between the mobile station 102 and the MSC 110.
The central component of network subsystem 108 is the MSC 110. The MSC 110 acts like a normal switching node of a public switched telephone network (PSTN) or an integrated services digital network (ISDN), and additionally provides functionality needed to handle a mobile subscriber, such as registration, authentication, location updating, handovers, and call routing to a roaming subscriber. These services are provided in conjunction with several functional entities, which together form the network subsystem 108. The MSC 110 provides connection to the fixed networks 126 (such as the PSTN, ISDN, packet switched public data networks (PSPDN), and circuit switched public data networks (CSPDN)). Signaling between functional entities in the network subsystem 108—such as MSC 110 and a home location register (HLR) 128—uses Signaling System Number 7 (SS7), used for trunk signaling in ISDN and widely used in current public networks. For example, the HLR 128 and a visitor location register (VLR) 130, together with MSC 110, provide the call-routing and roaming capabilities of GSM. An equipment identity register (EIR) 132 is a database that contains a list of all valid mobile equipment on the network. An authentication center (AuC) 134 is a protected database that stores a copy of the secret key stored in each subscriber's SIM 118 card, which is used for authentication and encryption over a radio channel 112.
GSM uses a combination of Time-Division Multiple Access (TDMA) and Frequency-Division Multiple Access (FDMA) to share the limited bandwidth of the radio channel 112 among multiple users, i.e., for communication between multiple mobile stations 102 and a base transceiver station 120. FDMA employed by GSM involves the division by frequency of the maximum 25 mega-Hertz (MHz) bandwidth of the radio channel 112 into 124 carrier frequencies spaced 200 kilo-Hertz (kHz) apart. One or more carrier frequencies are assigned to each base transceiver station 120. Each of these carrier frequencies is then divided in time, using a TDMA scheme such as a TDMA scheme 200 shown in FIG. 2.
FIG. 2 illustrates the timing structure of exemplary TDMA scheme 200 used by GSM. The fundamental unit of time in the TDMA scheme 200 is called a burst period (BP), illustrated by burst periods 202 in FIG. 2. A burst period 202 lasts 15/26 milliseconds (ms), or approximately 0.577 ms. Eight burst periods 202 are grouped into a TDMA frame 204 (120/26 ms, or approximately 4.615 ms). The TDMA frame 204 forms the basic unit for the definition of logical channels. For example, a channel may be defined as one of the burst periods 202 per TDMA frame 204. Channels are defined by the number and position of their corresponding burst periods 202.
A traffic channel (TCH) 206 is used to carry speech and data traffic. The traffic channels 206 are defined using a 26-frame multi-frame 208, which is a group of 26 TDMA frames 204. The length of a 26-frame multi-frame 208 is 120 ms, which is how the length of a burst period 202 is defined (120 ms divided by 26 TDMA frames divided by 8 burst periods per frame). Out of the 26 TDMA frames 204, twenty-four of them are used for traffic channels 206, one is used for a slow associated control channel (SACCH) 210, and one is currently an unused TDMA frame 212, as seen in FIG. 2. Traffic channels 206 for the uplink and downlink are separated in time by 3 burst periods, so that the mobile station 102 does not have to transmit and receive simultaneously, thus simplifying the electronics.
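The timing relationships described above can be checked with exact arithmetic. The following is a minimal sketch that uses only the figures stated in the text; the variable names are illustrative and are not part of the GSM standard.

```python
from fractions import Fraction

# Figures taken from the text; names are illustrative only.
MULTIFRAME_MS = Fraction(120)   # length of one 26-frame multi-frame, in ms
FRAMES_PER_MULTIFRAME = 26      # TDMA frames per multi-frame
BURSTS_PER_FRAME = 8            # burst periods per TDMA frame

# One TDMA frame lasts 120/26 ms; one burst period lasts 1/8 of that.
frame_ms = MULTIFRAME_MS / FRAMES_PER_MULTIFRAME
burst_ms = frame_ms / BURSTS_PER_FRAME

# Of the 26 frames: 24 traffic, 1 SACCH, 1 unused.
traffic_frames = FRAMES_PER_MULTIFRAME - 1 - 1

print(f"TDMA frame:   {float(frame_ms):.3f} ms")
print(f"Burst period: {float(burst_ms):.3f} ms")
print(f"Traffic frames per multi-frame: {traffic_frames}")
```

Using `Fraction` keeps the 15/26 ms burst period exact, so the rounded values 0.577 ms and 4.615 ms follow directly from the 120 ms multi-frame length.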
There are four different types of bursts, i.e., data occupying a burst period 202, that are used for transmission in GSM. A normal burst, such as burst 214, is used to carry data and most signaling. The normal burst 214 has a total length of 156.25 bits, made up of two 57 bit information blocks, or data bits 216; a 26 bit training sequence 218 used for equalization; one stealing bit 220 for each information block, i.e., data bits 216, used for the fast associated control channel (FACCH); three tail bits 222 at each end; and an 8.25 bit guard sequence 224, as shown in FIG. 2. The 156.25 bits of normal burst 214 are transmitted in 0.577 ms, giving a gross bit rate of 270.833 kbps.
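As a check on the burst layout, the field widths listed above can be summed and divided by the burst period. This is a sketch only: the dictionary keys are hypothetical labels, and the order of the entries is not meant to assert the on-air field order.

```python
from fractions import Fraction

# Field widths of a normal burst in bits, as listed in the text.
# Keys are hypothetical labels chosen for this sketch.
NORMAL_BURST_FIELDS = {
    "tail_bits_start": Fraction(3),
    "data_bits_first": Fraction(57),
    "stealing_bit_first": Fraction(1),
    "training_sequence": Fraction(26),
    "stealing_bit_second": Fraction(1),
    "data_bits_second": Fraction(57),
    "tail_bits_end": Fraction(3),
    "guard_sequence": Fraction(33, 4),  # 8.25 bits
}

total_bits = sum(NORMAL_BURST_FIELDS.values())

# Burst period of 15/26 ms; bits per millisecond equals kbit/s.
burst_period_ms = Fraction(15, 26)
gross_rate_kbps = total_bits / burst_period_ms

print(f"Total burst length: {float(total_bits)} bits")
print(f"Gross bit rate: {float(gross_rate_kbps):.3f} kbps")
```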
Because GSM is a digital system, speech, which is inherently analog, has to be digitized. The voice encoder/decoder (also referred to as “codec” or “vocoder”) of GSM employs a digitization technique called regular pulse excited linear predictive coding (RPE-LPC) with a long term predictor loop. Basically, information from previous speech samples, which does not change very quickly, is used to predict the current speech sample. Speech is divided into 20 millisecond samples, each of which is encoded by the vocoder as 260 bits, giving a total bit rate of 13 kbps, called Full-Rate speech coding.
Because of natural and man-made electromagnetic interference, the encoded speech or data signal transmitted over the air interface, radio channel 112, must be protected from errors. GSM uses convolutional encoding and block interleaving to achieve this protection. The algorithms used differ for speech and for data. For speech, the vocoder produces a 260 bit block, called a speech frame, for every 20 ms speech sample. Some bits of this 260 bit block, or speech frame, are more important for perceived speech quality than others. The bits are thus divided into three classes:
    Class Ia (50 bits): most sensitive to bit errors;
    Class Ib (132 bits): moderately sensitive to bit errors;
    Class II (78 bits): least sensitive to bit errors.
Class Ia bits have a 3 bit cyclic redundancy code (CRC) added for error detection. If an error is detected, the speech frame containing the 260 bit block is judged too damaged to be comprehensible, referred to as a “bad speech frame” or “bad frame”, and is discarded. The discarded speech frame is replaced by an attenuated version of the previous correctly received speech frame. The 53 Class Ia bits, together with the 132 Class Ib bits and a 4 bit tail sequence (a total of 189 bits), are input into a ½ rate convolutional encoder of constraint length 4. Each input bit is encoded as two output bits, based on a combination of the previous 4 input bits. The convolutional encoder thus outputs 378 bits, to which are added the 78 remaining Class II bits, which are unprotected. Thus every 20 ms speech sample is encoded as 456 bits, giving a bit rate of 22.8 kbps.
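The bit counts in the channel coding chain described above can be tallied as follows. This is a bookkeeping sketch only; it does not implement the convolutional encoder, and the constant names are illustrative.

```python
# Bit counts taken from the text; names are illustrative only.
CLASS_IA_BITS = 50    # most sensitive, protected by the 3-bit CRC
CLASS_IB_BITS = 132   # moderately sensitive
CLASS_II_BITS = 78    # least sensitive, sent unprotected
CRC_BITS = 3
TAIL_BITS = 4
FRAME_MS = 20         # duration of one speech sample

# Vocoder output per speech frame.
speech_bits = CLASS_IA_BITS + CLASS_IB_BITS + CLASS_II_BITS

# Input to the rate-1/2 convolutional encoder: Class Ia + CRC, Class Ib, tail.
encoder_in_bits = CLASS_IA_BITS + CRC_BITS + CLASS_IB_BITS + TAIL_BITS

# Rate 1/2: every input bit yields two output bits.
encoder_out_bits = 2 * encoder_in_bits

# Class II bits are appended without protection.
coded_bits = encoder_out_bits + CLASS_II_BITS

# Bits per millisecond equals kbit/s.
source_rate_kbps = speech_bits / FRAME_MS
coded_rate_kbps = coded_bits / FRAME_MS

print(speech_bits, encoder_in_bits, encoder_out_bits, coded_bits)
print(source_rate_kbps, coded_rate_kbps)
```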
To further protect against the burst errors common to the air interface, radio channel 112, each speech sample is interleaved. The 456 bits of each coded speech frame (the 378 convolutional encoder output bits plus the 78 unprotected Class II bits) are divided into 8 blocks of 57 bits, and these blocks are transmitted in eight consecutive time-slot bursts 214, i.e., burst periods 202. Since each time-slot burst 214 can carry two 57-bit information blocks of data bits 216, each burst 214 carries traffic from two different speech samples.
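The way one coded frame spreads over eight bursts, while each burst carries blocks from two frames, can be sketched as below. This is a simplified illustration under stated assumptions: the contiguous split into blocks and the rule for placing blocks in bursts are chosen for clarity and are not taken from the text, which gives only the block count and size. The function names `split_into_blocks` and `map_to_bursts` are hypothetical.

```python
BLOCKS_PER_FRAME = 8
BLOCK_BITS = 57


def split_into_blocks(coded_frame):
    """Split one 456-bit coded frame into 8 blocks of 57 bits.

    Assumption: a plain contiguous split, used here only for illustration.
    """
    if len(coded_frame) != BLOCKS_PER_FRAME * BLOCK_BITS:
        raise ValueError("expected a 456-bit coded frame")
    return [coded_frame[i * BLOCK_BITS:(i + 1) * BLOCK_BITS]
            for i in range(BLOCKS_PER_FRAME)]


def map_to_bursts(frames):
    """Place the blocks of consecutive frames into two-block bursts.

    Assumption: frame n occupies bursts 4n .. 4n+7, using the first block
    position for its blocks 0-3 and the second position for blocks 4-7,
    so that neighbouring frames overlap by four bursts.
    Each burst entry is [first, second], holding (frame_index, block) or None.
    """
    bursts = [[None, None] for _ in range(4 * (len(frames) + 1))]
    for n, frame in enumerate(frames):
        for b, block in enumerate(split_into_blocks(frame)):
            position = 0 if b < 4 else 1
            bursts[4 * n + b][position] = (n, block)
    return bursts


# Three dummy coded frames; each bit is tagged with its frame number.
frames = [[n] * 456 for n in range(3)]
bursts = map_to_bursts(frames)

# Burst 4 carries a block of frame 1 and a block of frame 0.
print(bursts[4][0][0], bursts[4][1][0])
```

With this placement, losing a single burst damages at most 57 bits of any one speech frame, which is the point of spreading a frame across several bursts.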
Discontinuous transmission (DTX) is a method that takes advantage of the fact that a person speaks less than 40 percent of the time in normal conversation, by turning the transmitter off during silence periods, thereby conserving power. The most important component of DTX is voice activity detection. DTX voice activity detection distinguishes between voice and noise inputs despite the presence of background noise. If a voice signal is misinterpreted as noise, the transmitter is turned off and a very annoying effect called clipping is heard at the receiving end. If, on the other hand, noise is misinterpreted as a voice signal too often, the efficiency of DTX is dramatically decreased. In addition, when the transmitter is turned off, there is total silence heard at the receiving end, due to the digital nature of GSM. To assure the receiver that the connection is not dead, comfort noise is created at the receiving end by matching the characteristics of the transmitting end's background noise.
The GSM standard published by ETSI, for example, European Standard Telecommunication Series (ESTS) 11.10 Release 1999, at section 14.1 and, in particular, at sections 14.1.1.1 and 14.1.1.2, requires a mobile station, for example, mobile station 102, to meet a certain level of performance in DTX mode with regard to bad speech frames that are passed to the vocoder. If the phone, i.e., mobile station 102, provides multiple types of service, for example, GSM and PCS1900 in the same phone, the mobile station 102 is required to meet the performance criteria at each operational frequency, for example, the 900 MHz of GSM and the 1.9 GHz of PCS1900. The GSM standard refers to the performance that is tested as bad frame indication (BFI), and a bad speech frame that is passed to the vocoder is referred to as an “undetected bad frame”. The GSM standard requires, for example, that less than 0.041% of frames passed to the vocoder be undetected bad frames, which corresponds to less than one undetected bad frame per 60 seconds. For example, in DTX mode, frames that contain silence or only background noise may result in bad speech frames that should be prevented from being passed to the vocoder.
In a full rate speech traffic channel (TCH/F or TCH/FS), such as traffic channel 206, the fact that only 3 bits are used for the CRC error detection means that there is a significant probability that the CRC error detection will pass a bad frame to the vocoder as a correctly received speech frame, i.e., a good frame. For example, when the input to mobile station 102 is purely white noise or randomly modulated data, each bit of the 3-bit CRC has a 50-50 chance, or probability ½, of correctly matching the random data, so that the CRC will pass (½)^3 = 12.5% of the random data when none should pass. Because of various technical considerations, for example, that half of the frames may be interpreted as FACCH frames that fail the 40-bit CRC for FACCH, it may be assumed that only 6.25% of the random data will pass CRC, which is unacceptable when compared to the 0.041% required by the GSM standard.
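The 12.5% figure can be illustrated with a small Monte Carlo experiment on random bits. The sketch below is not the receiver of any actual mobile station: the generator polynomial x^3 + x + 1 is an assumption made for illustration, since the text states only that the CRC is 3 bits long, and the function names are hypothetical. Any 3-bit check gives the same 1-in-8 pass rate on uniformly random input.

```python
import random

# Assumed generator polynomial x^3 + x + 1 (illustrative choice).
CRC_POLY = 0b1011
CRC_LEN = 3


def crc3(bits):
    """Return the 3-bit remainder of bits * x^3 divided by CRC_POLY."""
    reg = 0
    # Shift in the message bits followed by three zero bits.
    for bit in list(bits) + [0] * CRC_LEN:
        reg = (reg << 1) | bit
        if reg & 0b1000:
            reg ^= CRC_POLY
    return reg


def random_frame_pass_rate(trials, seed=0):
    """Fraction of purely random frames whose 3 parity bits match by chance."""
    rng = random.Random(seed)
    passed = 0
    for _ in range(trials):
        class_ia = [rng.getrandbits(1) for _ in range(50)]  # random "Class Ia" bits
        received_parity = rng.getrandbits(CRC_LEN)          # random "CRC" bits
        if crc3(class_ia) == received_parity:
            passed += 1
    return passed / trials


rate = random_frame_pass_rate(20000)
print(f"Random frames passing the 3-bit CRC: {rate:.1%}")
print(f"After halving for FACCH-interpreted frames: {rate / 2:.1%}")
```

The measured rate should land close to 1/8, and half of that is close to the 6.25% figure cited above, both far above the 0.041% limit.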
As can be seen, there is a need for bad frame indication for GSM mobile stations that overcomes the limitations of the 3-bit CRC error detection for speech frames by detecting bad frames that have passed the CRC. There is also a need for bad frame indication for GSM mobile stations that prevents passing to the vocoder those frames for which there is no input signal but that still pass CRC. Moreover, there is a need for bad frame indication that fails a negligible number of good speech frames when a good signal is being received.