1. Field of the Invention
The present invention relates to speech coding in a cellular radiotelephone having answering machine and voice memo capability. More specifically, the present invention relates to a method for parameter-based speech compression and decompression, usable in a cellular radiotelephone for an answering machine and voice memo.
2. Description of Related Art
A Digital Telephone Answering Machine (DTAM) saves audio messages, sent by a remote caller and received by way of a base station, in a memory of a digital cellular radiotelephone when the radiotelephone is off or cannot receive signals for another reason. Conventionally, this service is provided by the cellular communication system base station, which stores messages in its computer's voice mail database. Therefore, messages are only available by calling the base station. A Voice Memo records local speech, allowing the owner of the radiotelephone to save his or her own messages for future use. It may also be used for recording local conversations.
Linear Predictive Coding (LPC) is used extensively in digital speech transmission, speech recognition, and speech synthesis systems that must operate at low bit rates. The efficiency of LPC arrangements results from the method used to encode the speech information. The LPC coding modules first sample an input speech message at a predetermined rate, and then partition the speech samples into a sequence of full-rate time frames 5 to 20 milliseconds in duration. The speech signal is quasi-stationary during such time intervals and may be characterized by a relatively simple vocal tract model specified by a relatively small number of parameters.
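The framing and analysis steps described above can be sketched as follows. This is a minimal illustration, not the method of any particular coder: the 8 kHz sampling rate, the 20 ms frame length, and the function names `split_frames`, `autocorrelation`, and `levinson_durbin` are assumptions chosen for the example, and the autocorrelation method with the Levinson-Durbin recursion is one common way to obtain linear prediction coefficients.

```python
def split_frames(samples, sample_rate_hz=8000, frame_ms=20):
    """Partition samples into consecutive full frames.

    Assumed defaults: 8 kHz sampling and 20 ms frames (160 samples each).
    A trailing partial frame is dropped.
    """
    frame_len = sample_rate_hz * frame_ms // 1000
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def autocorrelation(frame, max_lag):
    """Return r[0..max_lag], where r[k] = sum over n of frame[n] * frame[n - k]."""
    n = len(frame)
    return [sum(frame[i] * frame[i - lag] for i in range(lag, n))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Compute predictor coefficients from autocorrelation values r[0..order].

    The predictor is x[n] ~ sum over k of coeffs[k - 1] * x[n - k].
    Returns (coeffs, residual_error_energy).
    """
    coeffs = []
    error = float(r[0])
    for m in range(1, order + 1):
        if error <= 0.0:
            # Silent or degenerate frame: pad the remaining coefficients.
            coeffs += [0.0] * (order - len(coeffs))
            break
        acc = r[m] - sum(coeffs[j] * r[m - 1 - j] for j in range(m - 1))
        k = acc / error  # reflection coefficient for this order
        coeffs = [coeffs[j] - k * coeffs[m - 2 - j] for j in range(m - 1)] + [k]
        error *= (1.0 - k * k)
    return coeffs, error
```

In this sketch, each frame returned by `split_frames` would be passed through `autocorrelation` and then `levinson_durbin`, yielding the small parameter set that stands in for the frame's spectral envelope.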
During encoding, for each time frame a set of linear parameters is generated and saved in a parameter frame. The parameters are representative of the spectral content of the speech pattern. The encoded data thus consist of parameters which correspond to the shape of the user's vocal tract and its excitation. The bandwidth of the parameter set is substantially less than the bandwidth of the speech signals. Such parameters may be later applied to a linear filter of a decoder which models the human vocal tract, along with signals representative of the vocal tract excitation, to reconstruct a replica of the speech pattern.
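The decoder-side reconstruction described above can be illustrated with a short sketch of an all-pole synthesis filter driven by an excitation signal. The function name `lpc_synthesize` and the coefficient convention (the output sample equals the excitation plus a weighted sum of previous outputs) are assumptions made for this example; real coders add excitation codebooks, gain scaling, and post-filtering that are omitted here.

```python
def lpc_synthesize(excitation, coeffs):
    """Run an excitation sequence through an all-pole filter.

    Implements s[n] = e[n] + sum over k of coeffs[k - 1] * s[n - k],
    with samples before the start of the sequence taken as zero.
    """
    out = []
    for n, e in enumerate(excitation):
        predicted = sum(a * out[n - k]
                        for k, a in enumerate(coeffs, start=1)
                        if n - k >= 0)
        out.append(e + predicted)
    return out


# A single impulse through a first-order filter decays geometrically.
impulse_response = lpc_synthesize([1.0, 0.0, 0.0, 0.0], [0.5])
```

The filter coefficients play the role of the vocal tract model, while the excitation sequence plays the role of the signals representative of the vocal tract excitation.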
There are many different types of digital speech coders usable for wireless communication. Some of the coders are RPE-LTP (FR), ACELP (EFR), QCELP (CDMA), and VSELP (CDMA). Each cellular radiotelephone typically has several speech coders, which may be of different types. Analysis-by-Synthesis speech coders are the typical LPC coders used in cellular communication systems. All versions of the LPC speech coders share the same speech parameter frame format, which consists of an LPC frame followed by four subframes. The subframes save pitch and noise information about the speech sequence. In a 20 ms speech frame, each subframe typically contains a little less than 5 ms of speech.
During encoding, a speech encoder compresses a sequence of frames of a speech signal and stores the parameters of the speech signal in speech parameter frames, which are parts of speech records, to ensure better coding quality during decoding. Moreover, parameters stored in the record header identify the coder used during encoding, so that in systems that support multiple coders the decoding is performed with the same, known and saved, coder characteristics. For a GSM FR sequence of 260 bits, as used by an RPE-LTP coder, 50-76 coefficients are saved in the header of the speech parameter record. The choice of the coder type (such as GSM FR, EFR, and HR, or CDMA QCELP and EVRC) may also depend on the characteristics of a base station.
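As an illustration of a speech record whose header identifies the coder, the sketch below pairs a hypothetical `SpeechRecord` structure with a bit-rate calculation. The class, its field names, and the byte-padded frame layout are assumptions made for this example and are not taken from any coder specification; the only figures used are the 260-bit GSM FR frame mentioned above and a 20 ms frame duration.

```python
from dataclasses import dataclass, field
from typing import List


def bit_rate_bps(bits_per_frame: int, frame_ms: int) -> int:
    """Bit rate implied by a fixed-size parameter frame of the given duration."""
    return bits_per_frame * 1000 // frame_ms


@dataclass
class SpeechRecord:
    """Hypothetical record: a header naming the coder, then parameter frames."""
    coder_id: str          # e.g. "GSM-FR"; selects the matching decoder
    bits_per_frame: int    # size of one speech parameter frame
    frame_ms: int          # speech duration covered by one frame
    frames: List[bytes] = field(default_factory=list)

    def append_frame(self, payload: bytes) -> None:
        # Assumption: each frame is padded up to a whole number of bytes.
        expected = (self.bits_per_frame + 7) // 8
        if len(payload) != expected:
            raise ValueError(f"expected {expected} bytes, got {len(payload)}")
        self.frames.append(payload)

    def duration_ms(self) -> int:
        return len(self.frames) * self.frame_ms


# 260 bits every 20 ms corresponds to 13 Kbps.
gsm_fr_rate = bit_rate_bps(260, 20)

record = SpeechRecord(coder_id="GSM-FR", bits_per_frame=260, frame_ms=20)
for _ in range(50):
    record.append_frame(bytes(33))  # 50 placeholder frames, i.e. one second
```

Keeping the coder identity in the header is what allows a decoder that supports several coders to choose the right one before reading any parameter frame.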
A DTAM/Voice Memo for wireless radiotelephones differs from analog answering machines, which record messages on a magnetic tape, in that the DTAM/Voice Memo must store speech parameter frames in a memory chip. Because a typical memory chip of a cellular radiotelephone has to be small in size, the memory chip has limited storage capacity, presently up to 4 MB of RAM, and can only save short messages. Each recorded sample holds 2-3 seconds of speech. Because the typical bit rate is 13-16 Kbits/sec (Kbps), only a message or conversation shorter than 6 minutes could be saved in 4 MB of RAM.
The bit rate in a conventional DTAM/Voice Memo device has to be much higher than in speech coders used in regular telephones (8 to 13 Kbps). Even with a high rate of speech compression, two speech coders would have to run at the same time in a conventional DTAM/Voice Memo device, to separately implement the DTAM and Voice Memo utilities. This presents problems in terms of the limited duration of talk time which can be saved and higher cost due to the extra resources required.
It is desirable to reduce the number of code bits saved for each speech signal frame in order to provide greater economy of storage of messages in a DTAM/Voice Memo and, possibly, more economical usage of transmission facilities.