1. Field of the Invention
The present invention relates to vocoders, and more particularly to the representation of the fixed codebook response generated thereby.
2. Description of the Background Art
FIGS. 1 and 2 illustrate transmitting and receiving units of a code excited linear prediction (CELP) vocoder, in accordance with the background art. In FIG. 1, the transmitting unit is a first vocoder 1. The first vocoder 1 includes a linear predictive coding (LPC) filter 2. The LPC filter 2 is connected to a perceptual weighting filter 3 via a junction 4. The perceptual weighting filter 3 is connected to an error minimization filter 5. The error minimization filter 5 is connected to a first adaptive codebook 6 and a first fixed codebook 7. The first adaptive codebook 6 is connected to a first adaptive codebook gain unit 8. The first fixed codebook 7 is connected to a first fixed codebook gain unit 9. The outputs of the first adaptive codebook gain unit 8 and the first fixed codebook gain unit 9 are connected at a junction 10. The junction 10 is connected to the junction 4.
Generally, the first vocoder 1 sequentially analyzes time segments of a digital speech input. Each time segment is referred to as a signal frame. The vocoder 1 estimates parameters characterizing each signal frame. The parameters are represented by bit patterns, which are assembled into a bit frame. The bit frames can be transmitted more quickly, or stored in less memory, than the signal frames which they represent.
Now, with reference to FIG. 1, a general description of the operation of a known IS127 EVRC CDMA type coder (vocoder 1) will be given. For more detail on the operation of the vocoder 1, reference can be made to textbooks relating to digital speech coding. The vocoder 1 is a multi-rate vocoder and has a full rate of operation corresponding to 8 kilobits per second (kbps) and a half rate of operation corresponding to 4 kbps. The digital speech input is divided into signal frames of 20 msec. Each signal frame is further divided into first, second, and third sub-frames of approximately 6.6 msec.
When the vocoder 1 operates at full rate, a signal frame passes through the LPC filter 2, which extracts LPC parameters characterizing the entire signal frame and outputs the LPC parameters in the form of twenty-eight LPC bits. The signal frame leaves the LPC filter 2 and passes through the junction 4, the perceptual weighting filter 3, and the error minimization filter 5. The perceptual weighting filter 3 and the error minimization filter 5 do not extract parameter bits from the signal frame, but prepare it for later processing.
Next, the signal frame is received by the first adaptive codebook 6. The first adaptive codebook 6 estimates a pitch for the entire frame, and outputs seven ACB bits characterizing the pitch of the entire frame. Then, the first adaptive codebook gain unit 8 estimates an adaptive codebook gain of the first sub-frame, the second sub-frame, and the third sub-frame. Three ACBG bits estimate the adaptive codebook gain of the first sub-frame. Three more ACBG bits estimate the adaptive codebook gain of the second sub-frame. And, still three more ACBG bits estimate the adaptive codebook gain of the third sub-frame.
Next, the signal passes through the junction 10, the junction 4, the perceptual weighting filter 3, and the error minimization filter 5, and is received by the first fixed codebook 7. The first fixed codebook 7 estimates the random, unvoiced characteristics of the first sub-frame, the second sub-frame, and the third sub-frame. Thirty-five FCB bits represent the fixed codebook response for the first sub-frame. Thirty-five more FCB bits represent the fixed codebook response for the second sub-frame. And, still thirty-five more FCB bits represent the fixed codebook response for the third sub-frame.
Next, the first fixed codebook gain unit 9 estimates a fixed codebook gain of the first sub-frame, the second sub-frame, and the third sub-frame. Five FCBG bits estimate the fixed codebook gain of the first sub-frame. Five more FCBG bits estimate the fixed codebook gain of the second sub-frame. And, still five more FCBG bits estimate the fixed codebook gain of the third sub-frame.
At this point, all of the bit patterns (LPC, ACB, ACBG, FCB, FCBG) are assembled into the bit frame. The bit frame, representing the signal frame, is complete and can be transmitted to a second vocoder 11 for synthesis, or stored in a memory for later retrieval. The above process sequentially repeats itself for each signal frame of the digital speech input.
FIG. 2 illustrates a decoding section of the second vocoder 11 for synthesizing the bit frames. The second vocoder 11 includes a second adaptive codebook 12, a second fixed codebook 13, a second adaptive codebook gain unit 14, a second fixed codebook gain unit 15, and a synthesis filter 16. The second vocoder 11 receives the LPC bits, ACBG bits, ACB bits, FCB bits, and FCBG bits. These bits are used by the second vocoder 11 to reconstruct an estimate of the original signal frame, in a manner well known in the art.
The total number of bit positions within the bit frame allocated to the various parameters, as given above, relates to the vocoder 1 (IS127 EVRC CDMA coder) operating at a full rate of 8 kbps. To summarize, the bit frame would include: 28 LPC bits; 7 ACB bits; 3+3+3=9 ACBG bits; 35+35+35=105 FCB bits; and 5+5+5=15 FCBG bits. Therefore, the total number of bits in the bit frame would be 164 bits.
As mentioned above, the vocoder 1 is a multi-rate vocoder, and the half rate of the vocoder 1 is 4 kbps. When the vocoder 1 operates at the half rate, it is no longer possible to transmit bit frames having a size of one hundred and sixty-four bit positions, while still keeping up with an incoming digital speech input, in real time. Instead, the bit frame size must be reduced to approximately eighty bit positions.
When the vocoder 1 (IS127 EVRC CDMA coder) operates at its half rate (4 kbps), the bit positions are rationed in the following order: 22 LPC bits; 7 ACB bits; 3+3+3=9 ACBG bits; 10+10+10=30 FCB bits; and 4+4+4=12 FCBG bits. Therefore, the total number of bits in the bit frame would be 80 bits. It can be seen that the FCB bits suffer the predominant share of the bit frame's reduction in size.
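The full-rate and half-rate allocations listed above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the names FULL_RATE, HALF_RATE, frame_bits, bit_rate_bps and reduction_share are hypothetical, and the figure of fifty frames per second simply follows from the 20 msec signal frame.

```python
# Bit positions per bit frame, copied from the allocations given in the text.
# All names in this sketch are illustrative and not part of any standard.
FULL_RATE = {"LPC": 28, "ACB": 7, "ACBG": 3 * 3, "FCB": 3 * 35, "FCBG": 3 * 5}
HALF_RATE = {"LPC": 22, "ACB": 7, "ACBG": 3 * 3, "FCB": 3 * 10, "FCBG": 3 * 4}

FRAMES_PER_SECOND = 1000 // 20  # one bit frame per 20 msec signal frame


def frame_bits(allocation):
    """Total number of bit positions in one bit frame."""
    return sum(allocation.values())


def bit_rate_bps(allocation):
    """Bits per second needed to keep up with the speech input in real time."""
    return frame_bits(allocation) * FRAMES_PER_SECOND


def reduction_share(field):
    """Fraction of the full-rate to half-rate reduction borne by one parameter."""
    saved = FULL_RATE[field] - HALF_RATE[field]
    total_saved = frame_bits(FULL_RATE) - frame_bits(HALF_RATE)
    return saved / total_saved


print(frame_bits(FULL_RATE), frame_bits(HALF_RATE))
print(bit_rate_bps(FULL_RATE), bit_rate_bps(HALF_RATE))
print(round(reduction_share("FCB"), 3))
```

With these figures, 164 bit positions per frame correspond to 8200 bits per second, slightly above the nominal 8 kbps, and 80 bit positions correspond to 4000 bits per second. The fixed codebook accounts for 75 of the 84 bit positions removed between the two rates.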
Since the present invention concerns the fixed codebook, a brief summary of the operation of the fixed codebook computation in the vocoder 1 is in order. At the full rate (8 kbps), the one hundred and five bit positions allocated to the fixed codebook response for the frame allow eight estimation pulses to be placed in each of the three sub-frames. This is represented graphically in FIG. 3.
In FIG. 3, a first signal line 17 is illustrative of a second residual signal presented to the fixed codebook 7 for estimation. The first sub-frame 18 is divided into fifty-three sample points, the second sub-frame 19 is also divided into fifty-three sample points, and the third sub-frame 20 is divided into fifty-four sample points.
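The division of a signal frame into sub-frames of fifty-three, fifty-three and fifty-four sample points can be sketched as follows. This is a minimal illustration: the helper split_frame is hypothetical, and the 8 kHz sampling rate is an assumption that is not stated in the description, chosen because it makes one hundred and sixty sample points span 20 msec.

```python
SUBFRAME_LENGTHS = (53, 53, 54)  # sample points per sub-frame, from the text
ASSUMED_SAMPLE_RATE_HZ = 8000    # assumption: not stated in the description


def split_frame(frame):
    """Split one signal frame into its first, second and third sub-frames."""
    expected = sum(SUBFRAME_LENGTHS)
    if len(frame) != expected:
        raise ValueError("expected %d sample points" % expected)
    subframes, start = [], 0
    for length in SUBFRAME_LENGTHS:
        subframes.append(frame[start:start + length])
        start += length
    return subframes


frame = list(range(160))  # placeholder sample values
first, second, third = split_frame(frame)
frame_msec = 1000 * len(frame) / ASSUMED_SAMPLE_RATE_HZ
print(len(first), len(second), len(third), frame_msec)
```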
In order to best estimate the characteristics of the second residual signal on signal line 17, positive and/or negative pulses 21 are located at select ones of the sample points. For example, second signal line 22 illustrates the polarities and placements of the pulses 21, in estimating the second residual signal of first signal line 17. The placements and polarities are the data characterized by the FCB bits for each of the sub-frames 18, 19, 20. In other words, for each sub-frame, the fixed codebook 7 estimates the best placement of eight to ten pulses 21 to represent the second residual signal of the first signal line 17, and the FCB bits for that sub-frame identify the placements and polarities of the pulses 21.
When the second vocoder 11 receives the FCB bits, an envelope 23 can be mathematically constructed based upon the placement of the positive and negative pulses 21 in order to provide an estimate of the second residual signal of the first signal line 17. This is illustrated graphically on third signal line 24. Of course, the FCBG bits of each of the sub-frames would influence the amplitude of the peaks and valleys of the envelope 23 within the respective sub-frames, so that the amplitudes of the peaks and valleys of the envelope 23 match the average amplitude of the actual peaks and valleys within the second residual signal.
When the vocoder 1 operates at full rate (8 kbps), the one hundred and five bit positions within the bit frame, allocated to the fixed codebook response, can represent the positions and polarity of eight pulses per sub-frame, as illustrated by the second and third signal lines 22 and 24. When the vocoder 1 operates at half rate (4 kbps), the thirty bit positions within the frame, allocated to the fixed codebook response, can only represent the positions and polarity of three pulses per sub-frame.
A fourth signal line 25 illustrates the placement of the positive and negative pulses 21′ when the vocoder 1 operates at its half rate, and the envelope 23′ constructed mathematically in accordance with the placement of the pulses 21′. It can clearly be seen that the envelope 23′ developed during the half-rate of operation does not approximate the second residual signal of the first signal line 17 nearly as well as the envelope 23 developed when the vocoder 1 operates at its full rate.
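The effect of having fewer pulses per sub-frame can be illustrated numerically. The Python sketch below is a deliberate simplification and not the codebook search of the vocoder 1: it simply places pulses at the largest-magnitude sample points of a synthetic residual, applies a single least-squares gain in the role of the FCBG bits, and compares the squared error left by eight pulses and by three pulses. The function names and the synthetic residual are hypothetical.

```python
import math


def place_pulses(residual, num_pulses):
    """Pick the num_pulses largest-magnitude samples; return {position: polarity}."""
    ranked = sorted(range(len(residual)), key=lambda i: abs(residual[i]), reverse=True)
    return {i: (1 if residual[i] >= 0 else -1) for i in ranked[:num_pulses]}


def pulse_train(pulses, length, gain):
    """Rebuild a sparse signal from pulse positions, polarities and one gain."""
    return [gain * pulses.get(i, 0) for i in range(length)]


def fit_error(residual, num_pulses):
    """Squared error left after approximating the residual with num_pulses pulses."""
    pulses = place_pulses(residual, num_pulses)
    # Least-squares gain for unit pulses: mean magnitude of the chosen samples.
    gain = sum(abs(residual[i]) for i in pulses) / len(pulses)
    approx = pulse_train(pulses, len(residual), gain)
    return sum((r - a) ** 2 for r, a in zip(residual, approx))


# Synthetic 53-sample residual built from two sinusoids (illustrative data only).
residual = [math.sin(0.9 * n) + 0.5 * math.sin(2.3 * n + 1.0) for n in range(53)]
error_full = fit_error(residual, 8)  # eight pulses, as at the full rate
error_half = fit_error(residual, 3)  # three pulses, as at the half rate
print(error_full, error_half)
```

For this synthetic input the three-pulse approximation leaves the larger squared error, which mirrors the qualitative difference between the envelopes 23 and 23′. A real fixed codebook search evaluates candidates through the weighting and synthesis filters rather than on the raw residual.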
It has been observed that the first and second vocoders 1, 11 process digital speech with sufficient reproduction quality when a medium to high bit rate is used during transmission of the bit frames (e.g. 4.8 kbps to 16 kbps). However, when bit rates are below 4.8 kbps (such as the 4 kbps rate, corresponding to the half-rate), the quality of the synthesized speech suffers greatly. The poor quality is primarily due to the inaccurate representation of the fixed codebook response of the sub-frames, as illustrated by the fourth signal line 25 in FIG. 3.
The poor representation is the result of the limited number of bits (e.g. thirty bits) allocated within the bit frame to represent the fixed codebook response of all of the sub-frames. Since the bit frame size cannot be increased when the bit rate is low, there exists a need in the art for a vocoder, and method of operating a vocoder, which can more accurately represent a fixed codebook response of a signal frame, or sub-frames, while doing so with a limited number of bit positions within the bit frame.
A vocoder, in accordance with the present invention, includes a fixed codebook having a plurality of entries of pulse sequences for comparison to a residual signal of the signal frame or sub-frame. The entries of the fixed codebook are tailored to the signal frame or sub-frame being encoded. A noise signal is stored in a transmitting vocoder. During encoding, the noise signal is shaped by filtering dependent upon determined parameters which characterize the signal frame or sub-frame. The shaped noise signal is passed through a thresholding filter to arrive at a pulse sequence. The fixed codebook response is chosen as that portion (i.e. entry) of the pulse sequence which best matches the residual signal of the signal frame or sub-frame. The indexed location of that portion is designated as the fixed codebook bits which are included within the bit frame. An identical noise signal is also stored in a decoding vocoder. The same active filtering and threshold filtering are applied to the identical noise signal to arrive at a same pulse sequence. Therefore, the fixed codebook bits, of the bit frame, will index the proper portion of the pulse sequence which represents the fixed codebook response to be used during synthesis.
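The arrangement summarized above can be sketched in Python under several stated assumptions. The stored noise is modelled as a seeded pseudo-random sequence, the shaping filter is a one-pole filter whose single coefficient stands in for the determined frame parameters, the thresholding filter keeps samples above half the peak magnitude, and the index width of ten bits is borrowed from the ten FCB bits per sub-frame of the half-rate allocation. None of these choices, nor any of the names used, is specified in the description; the sketch only illustrates that an encoder and a decoder holding the same noise signal can exchange an index in place of explicit pulse positions.

```python
import random

NOISE_SEED = 1234                 # hypothetical: stands in for the stored noise signal
ENTRY_LENGTH = 53                 # sample points in one sub-frame
INDEX_BITS = 10                   # assumption: ten FCB bits per sub-frame
NUM_ENTRIES = 2 ** INDEX_BITS     # one entry per starting offset in the sequence
NOISE_LENGTH = NUM_ENTRIES + ENTRY_LENGTH - 1


def stored_noise():
    """Noise signal held identically by the transmitting and decoding vocoders."""
    rng = random.Random(NOISE_SEED)
    return [rng.gauss(0.0, 1.0) for _ in range(NOISE_LENGTH)]


def shape(noise, coeff):
    """One-pole filter y[n] = x[n] + coeff * y[n-1] (assumed shaping filter)."""
    out, prev = [], 0.0
    for x in noise:
        prev = x + coeff * prev
        out.append(prev)
    return out


def threshold(signal, fraction):
    """Turn samples at or above fraction * peak magnitude into +1/-1 pulses."""
    limit = fraction * max(abs(s) for s in signal)
    return [0 if abs(s) < limit else (1 if s > 0 else -1) for s in signal]


def pulse_sequence(coeff, fraction=0.5):
    """Pulse sequence derived from the stored noise for one set of parameters."""
    return threshold(shape(stored_noise(), coeff), fraction)


def encode(residual, coeff):
    """Return the index of the entry that best matches the residual."""
    seq = pulse_sequence(coeff)

    def score(index):
        entry = seq[index:index + ENTRY_LENGTH]
        energy = sum(p * p for p in entry)
        corr = sum(r * p for r, p in zip(residual, entry))
        # Normalized correlation; entries needing a negative gain are skipped.
        return corr * corr / energy if energy and corr > 0 else 0.0

    return max(range(NUM_ENTRIES), key=score)


def decode(index, coeff):
    """Regenerate the pulse sequence and look up the indexed entry."""
    seq = pulse_sequence(coeff)
    return seq[index:index + ENTRY_LENGTH]


coeff = 0.5                                # hypothetical frame parameter
target = decode(40, coeff)                 # an entry used as a test residual
residual = [0.7 * p for p in target]       # scaled copy of that entry
index = encode(residual, coeff)
print(index, decode(index, coeff) == target)
```

Only the index crosses from encoder to decoder in this sketch. Because the filter coefficient is derived from parameters that are already part of the bit frame, the decoder can rebuild the same pulse sequence without any additional bits.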