1. Field of the Invention
The present invention relates to ensuring the accuracy of transmitted or stored digital data of a multi-mode vocoder.
2. Description of the Related Art
Vocoders are known in the existing arts. Briefly, a vocoder processes a digital speech signal by sequentially breaking the digital speech signal into segments. Next, the vocoder derives various parameters relating to each segment, such as a pitch value, pitch gain, fixed codebook response, etc. The derived parameters are characterized by bit patterns, which are assembled into a frame. Each frame is representative of the original speech signal segment. The sequential frames are compressed, relative to the original segments, and therefore can be transmitted more quickly, or stored in less memory, than the original segments.
When the transmitted frames are received, or the stored frames are retrieved, another or the same vocoder must decompress the frames in order to reconstruct, or synthesize, a recognizable voice approximating the original digital speech signal. When decompressing a frame, it is important to determine whether a transmission or encoding error has occurred. If an error goes undetected, the quality of the synthesized speech relating to the erroneous frame will be impaired. If an error is detected, the frame can be ignored, or estimated relative to preceding and/or succeeding frames, thereby improving the overall quality of the reproduced voice.
FIG. 1 illustrates first and second vocoders in accordance with the background art. The first vocoder 1 includes a first pre-processing unit 2, a mode selector 3, a compression unit 4, a code builder 5, and a first post-processing unit 6. The second vocoder 7 includes a second pre-processing unit 8, a code analyzer 9, an estimation unit 10, a mode reader 11, a synthesizer 12, and a second post-processing unit 13.
With reference to FIG. 2, the first pre-processing unit 2 receives an input signal in step 14. The first pre-processing unit 2 conditions the input signal for later processing. For example, if the input signal is an analog speech signal, the first pre-processing unit 2 would convert the analog speech signal into a digital speech signal. Also, the first pre-processing unit 2 will divide the digital speech signal into a sequential series of signal segments.
In step 15, the mode selector 3 analyzes the signal segment and determines a type of the digital speech signal contained therein. For instance, the speech signal could be a voiced type of speech signal. An example of a voiced speech signal would be a vowel sound. In characterizing a vowel sound, certain tonal parameters, like pitch delay and pitch, are relatively important. Another type of speech signal would be an unvoiced speech signal. An example of an unvoiced speech signal would be an "s" sound, or any sound resembling noise or static. In characterizing an unvoiced sound, the pitch parameters are relatively unimportant; rather, parameters like a fixed codebook output are important. Of course, the mode selector 3 could determine other types of speech signals, and it is important to note that the mode of a digitized speech signal could change one hundred times a second.
In step 16, the compression unit 4 derives characteristic parameters relating to the signal segment. The compression unit 4 includes various components, such as an adaptive codebook, fixed codebook, impulse response unit, linear predictive coder, etc. The parameters obtained by the various components relate to attributes of the signal segment, such as pitch, pitch gain, fixed codebook output, etc. The compression unit 4 assigns bit patterns to characterize the derived parameters. It should be noted that steps 15 and 16 may occur in reverse order, or be interrelated. In other words, outcomes of step 16 may be the basis of the mode selection of step 15.
In step 17, the compression unit 4 assembles the bit patterns into a frame. A typical frame may consist of one hundred to two hundred bits, although it is envisioned that the frames could have any number of bits. FIG. 3 is illustrative of two sequential frames produced by the compression unit 4. The pitch is characterized by the bits residing in bit positions three through six of the frame and the pitch gain is characterized by the bits residing in positions ninety-five through ninety-nine of the frame. The non-illustrated bit positions would contain other information characterizing the speech signal segment. Of course, the positioning of the characterizing information within the frame and the number of bits allocated to each parameter can be varied.
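The frame layout described for FIG. 3 can be sketched in Python as follows. This is an illustration only: the 100-bit frame length, the most-significant-bit-first ordering, the helper names `pack_field` and `unpack_field`, and the sample parameter values are assumptions, not details taken from the figure.

```python
FRAME_LENGTH = 100  # assumed; the text only says frames typically hold 100 to 200 bits


def pack_field(frame_bits, start, width, value):
    """Write value into frame_bits[start:start+width], most significant bit first."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the field")
    for i in range(width):
        frame_bits[start + i] = (value >> (width - 1 - i)) & 1


def unpack_field(frame_bits, start, width):
    """Read a width-bit unsigned value starting at bit position start."""
    value = 0
    for bit in frame_bits[start:start + width]:
        value = (value << 1) | bit
    return value


frame = [0] * FRAME_LENGTH
pack_field(frame, 3, 4, 0b1011)    # pitch: bit positions 3 through 6 (sample value)
pack_field(frame, 95, 5, 0b10010)  # pitch gain: bit positions 95 through 99 (sample value)
```

Because the start position and width are plain arguments, the same helpers would cover a layout in which the position and bit count of each parameter differ.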
As illustrated in FIG. 1, the compression unit 4 receives the mode from the mode selector 3. Depending upon the mode, the compression unit 4 will allocate greater importance to the parameters which best characterize the mode's respective type of speech signal. For instance, if a voiced speech signal is processed, then more bits, and hence greater resolution, will be afforded to the pitch and pitch gain parameters. The additional bits used for the pitch and pitch gain parameters may be taken from the less important parameters of a voiced speech signal, such as the random parameters. If an unvoiced speech signal is processed, then more bits may be afforded to the fixed codebook output parameter, at the expense of the pitch and pitch gain parameters.
It would also be possible for the positioning of the various parameters within the frames to vary between the different modes. For instance, in the mode corresponding to a voiced speech signal, the pitch parameter would occupy the bit positions between four and fourteen, whereas in the mode corresponding to an unvoiced speech signal, the pitch parameter would occupy the bit positions between twenty and twenty-three.
FIG. 4 illustrates four modes of the first vocoder 1. Of course, the first vocoder 1 could have more than four modes. Each mode has a plurality of important bits, labeled "B", and a plurality of unimportant bits, labeled "b". An important bit "B" means that the data in the bit position relates to an important parameter for the particular mode, e.g. type of speech. For example, the bit positions representing pitch are important bit positions in the mode representing voiced speech signals. It can be seen that both the number and position of the important bits "B" will vary between the different modes. Typically, the number of important bits in a given mode will be between forty and one hundred bits, with the remaining bits being of reduced importance in the later reconstruction of the speech signal.
Referring to FIG. 2, in step 18, the code builder 5 builds a cyclic redundancy check (CRC) code based upon the potentially important bits within the frame. The CRC code would be one or more bits added to the frame, whose purpose is to ensure the accuracy of the potentially important bits in the frame. One example of a CRC coding formula would be the repetition of each of the potentially important bits within the frame. In this instance, the CRC code would be robust, i.e. would provide a high level of assurance that no error occurred in the important bits, but would require a large number of bits. Another example of a CRC coding formula would be a simple one-bit parity check of the potentially important bits. In this instance, the CRC code would require only one bit; however, the accuracy of the important bits might not be adequately ensured. A good compromise would be a CRC coding formula based upon a polynomial of the potentially important bits. Such a form of CRC coding is known in the art.
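The trade-off among the three coding formulas can be illustrated with a short Python sketch. The function names, the sample bit pattern, and the choice of an 8-bit code with generator polynomial x^8 + x^2 + x + 1 are assumptions made for illustration; the text does not specify a particular polynomial or code width.

```python
def repetition_code(bits):
    """Most robust, most costly: send a second copy of every protected bit."""
    return list(bits)


def parity_bit(bits):
    """Cheapest: a single bit equal to the sum of the protected bits modulo 2."""
    return sum(bits) % 2


def crc8(bits, poly_low=0x07):
    """Polynomial code: remainder of bits * x^8 divided by x^8 + x^2 + x + 1 over GF(2)."""
    reg = 0
    for bit in bits:
        feedback = ((reg >> 7) & 1) ^ bit
        reg = (reg << 1) & 0xFF
        if feedback:
            reg ^= poly_low
    return reg


protected = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0]  # sample protected bits (assumed)
corrupted = list(protected)
corrupted[2] ^= 1  # two bit errors within the same frame
corrupted[9] ^= 1

parity_misses = parity_bit(corrupted) == parity_bit(protected)
crc_catches = crc8(corrupted) != crc8(protected)
repetition_catches = repetition_code(protected) != corrupted
```

In this example the double error leaves the parity bit unchanged and so goes unnoticed by the one-bit check, while both the 8-bit polynomial code and the 11-bit repetition code expose it.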
As illustrated in FIG. 4, the different modes have differing numbers of actually important bits "B". Further, the locations of the important bits "B" vary between the different modes. Therefore, in order to assure that all potentially important data in a frame, regardless of the mode, is protected, the CRC coding formula is a master coding formula that protects every bit position of a frame which could potentially contain an important bit "B" in any of the various modes. For example, in FIG. 4, of the illustrated bits, bits 00, 01, 02, 03, 04, 06, 07, 09, 97, 98, and 99 could potentially contain an important bit "B", depending upon the mode. Of the illustrated bits, only bits 05, 08 and 10 are unimportant bits "b", regardless of the mode. Therefore, the CRC coding formula would involve bit positions 00, 01, 02, 03, 04, 06, 07, 09, 97, 98 and 99 to arrive at a CRC master code.
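The master set of protected positions is simply the union of the important positions of every mode. The sketch below illustrates this in Python. The per-mode sets in `MODE_IMPORTANT` are hypothetical, since FIG. 4 is not reproduced here; they were chosen only so that their union agrees with the bit positions listed in the text.

```python
# Hypothetical important-bit positions per mode (FIG. 4 itself is not reproduced).
MODE_IMPORTANT = {
    1: {0, 1, 2, 3, 97, 98},
    2: {2, 3, 4, 6, 7, 99},
    3: {0, 4, 7, 9, 98},
    4: {1, 6, 9, 97, 99},
}
ILLUSTRATED = list(range(11)) + [97, 98, 99]  # bit positions discussed in the text

# A position is protected if it is important in at least one mode.
master_positions = sorted(set().union(*MODE_IMPORTANT.values()))

# Positions that are unimportant in every mode stay outside the master formula.
never_important = [p for p in ILLUSTRATED if p not in master_positions]
```

With these assumed sets, `master_positions` contains the eleven positions 00, 01, 02, 03, 04, 06, 07, 09, 97, 98 and 99, and `never_important` contains 05, 08 and 10.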
Referring to FIG. 2, in step 19, the first post-processing unit 6 transmits the frame, which includes the CRC code. The sequentially transmitted frames, hundreds per second, are sent via a hardwired or wireless medium to the second vocoder 7. In step 20, the second pre-processing unit 8 receives the frame. In step 21, the code analyzer 9 intercepts the CRC code bits of the frame. In step 22, the code analyzer 9 determines if the bits within the various potentially important bit positions of the frame, after having the master coding formula applied thereto, match the CRC code. If no match occurs, the frame is erroneous and labeled "bad", and the process goes to step 23. If a match occurs, it is assumed that no error occurred, the frame is labeled "good", and the process proceeds to step 24.
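Steps 18, 21 and 22 can be sketched in Python as follows. The master positions are those listed for FIG. 4; the 100-bit payload, the 8-bit CRC with polynomial x^8 + x^2 + x + 1, the placement of the CRC bits at the end of the frame, and all function names are assumptions made for illustration.

```python
MASTER_POSITIONS = [0, 1, 2, 3, 4, 6, 7, 9, 97, 98, 99]  # from the FIG. 4 discussion
PAYLOAD_BITS = 100  # assumed payload length
CRC_BITS = 8        # assumed CRC width


def crc8(bits, poly_low=0x07):
    """Bitwise CRC with the assumed generator polynomial x^8 + x^2 + x + 1."""
    reg = 0
    for bit in bits:
        feedback = ((reg >> 7) & 1) ^ bit
        reg = (reg << 1) & 0xFF
        if feedback:
            reg ^= poly_low
    return reg


def to_bits(value, width):
    """Unsigned integer to a list of bits, most significant first."""
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]


def append_master_crc(payload):
    """Step 18: append the CRC of the potentially important bits to the frame."""
    code = crc8([payload[p] for p in MASTER_POSITIONS])
    return list(payload) + to_bits(code, CRC_BITS)


def label_frame(frame):
    """Steps 21 and 22: split off the CRC bits and compare with a recomputed CRC."""
    payload, received = frame[:PAYLOAD_BITS], frame[PAYLOAD_BITS:]
    expected = to_bits(crc8([payload[p] for p in MASTER_POSITIONS]), CRC_BITS)
    return "good" if received == expected else "bad"


payload = [0] * PAYLOAD_BITS
for position in (1, 6, 97):  # arbitrary sample content
    payload[position] = 1
frame = append_master_crc(payload)

damaged = list(frame)
damaged[3] ^= 1  # a transmission error in a protected bit position
```

Here `label_frame(frame)` yields "good" and `label_frame(damaged)` yields "bad", which would route the damaged frame to step 23.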
In step 23, the "bad" frame is replaced with an estimated frame by the estimation unit 10. The estimated frame will include estimations of the characterizing parameters contained in the frame. The location and resolution of the estimated characterizing parameters within the frame will be dictated by an estimation of the mode of the frame. The estimated frame could simply be identical to the previous frame (in which case the mode would be the same), or could be estimated based upon prior and/or future frames (in which case the mode of the frame could change). In any event, the estimated frame should result in the overall quality of the reproduced speech being improved, since the known erroneous frame will have been detected, removed and replaced. The estimated mode, as estimated in step 23, can be sent directly to the synthesizer 12, or encoded into the estimated frame to be read by the mode reader 11 in step 24.
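The simplest estimation strategy mentioned above, repeating the previous frame, can be sketched in Python as follows. The function name, the representation of the stream as (frame, label) pairs, and the `fallback_frame` used when a stream opens with a bad frame are assumptions; estimation from prior and future frames is not shown.

```python
def conceal_bad_frames(labelled_frames, fallback_frame):
    """Replace each frame labelled "bad" with a copy of the most recent "good" frame."""
    output = []
    last_good = fallback_frame  # used only if the stream starts with a bad frame
    for frame, label in labelled_frames:
        if label == "good":
            last_good = frame
        output.append(list(last_good))
    return output


stream = [([1, 0, 1], "good"), ([1, 1, 1], "bad"), ([0, 1, 0], "good")]
repaired = conceal_bad_frames(stream, fallback_frame=[0, 0, 0])
```

Because the replacement is a copy of an earlier frame, its mode is necessarily the same as that of the frame it copies.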
In step 24, the mode reader 11 determines the mode of the frame. The synthesizer 12 receives the mode from the mode reader 11. In step 25, based upon the mode, the synthesizer 12 synthesizes, or reconstructs, the digital speech signal segment from the characterizing parameters represented by the bit patterns within the frame, whether that frame is the original frame of step 20 or the estimated frame of step 23. In step 26, the second post-processing unit 13 sequentially outputs the synthesized digital speech signal segments.
The process, in accordance with the background art as detailed above, suffers from several disadvantages. First, the master coding formula used by the code builder 5 in step 18 causes erroneous frames to be detected more often than necessary by the code analyzer 9 in step 22. This occurs because the CRC master coding formula protects bits that are unimportant in any given mode. For example, in mode 04 of FIG. 4, the CRC master coding formula would incorporate bit position 07, even though bit position 07 is an unimportant bit "b". Therefore, if an error occurred in bit position 07, the CRC code check in step 22 would label the frame a "bad" frame, and the frame would be replaced in step 23. This is unfortunate since the frame, if synthesized, would have been sufficiently accurate, and most likely more accurate than any estimated frame constructed in step 23. Moreover, the construction of estimated frames in step 23 takes processing time and slows the rate at which data can be transmitted to the second vocoder 7. By reducing the number of erroneous frames detected, the data flow rate can be increased.
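The first drawback can be made concrete by listing, for each mode, the positions that the master formula protects even though the mode treats them as unimportant. The Python sketch below does this. The per-mode sets are hypothetical stand-ins for FIG. 4, chosen so that their union matches the master positions named in the text and so that bit position 07 is unimportant in mode 04.

```python
# Hypothetical important-bit positions per mode (stand-ins for FIG. 4).
MODE_IMPORTANT = {
    1: {0, 1, 2, 3, 97, 98},
    2: {2, 3, 4, 6, 7, 99},
    3: {0, 4, 7, 9, 98},
    4: {1, 6, 9, 97, 99},
}

master = set().union(*MODE_IMPORTANT.values())

# An error in any of these positions fails the master check in step 22,
# although the affected bit is unimportant for the frame's actual mode.
needlessly_protected = {
    mode: sorted(master - important)
    for mode, important in MODE_IMPORTANT.items()
}
```

With these assumed sets, six of the eleven master positions, among them bit position 07, are protected needlessly whenever a frame is in mode 04.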
A second drawback is that the CRC master coding formula is relatively less robust because it incorporates every bit position which could possibly include an important bit "B" under the various modes. The robustness of a CRC coding formula, i.e. its ability to detect an error in the data which it is protecting, is directly related to the number of bits in the CRC code and the number of bits that go into the CRC coding formula which produces the CRC code. Therefore, if it is possible to reduce the number of bits being protected, i.e. being used in the CRC coding formula, the robustness of the CRC code will be improved.
One object of the present invention is to provide a method of detecting errors in data received by a multi-mode vocoder, with the method including the steps of: receiving a transmission including data and an error code; reading the error code; and successively comparing the error code to portions of the data using a plurality of formulas, until at least one of the comparisons matches, meaning the data is error-free, or all of the comparisons fail, meaning the data is erroneous.
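A minimal Python sketch of this successive comparison follows. It assumes one formula per mode, each formed by applying the same 8-bit CRC (polynomial x^8 + x^2 + x + 1) to that mode's important bit positions. The position lists, the CRC choice, and the function names are hypothetical and are not taken from the source.

```python
# Hypothetical important-bit positions per mode; each list defines one formula.
MODE_IMPORTANT = {
    1: [0, 1, 2, 3, 97, 98],
    2: [2, 3, 4, 6, 7, 99],
    3: [0, 4, 7, 9, 98],
    4: [1, 6, 9, 97, 99],
}


def crc8(bits, poly_low=0x07):
    """Bitwise CRC with the assumed generator polynomial x^8 + x^2 + x + 1."""
    reg = 0
    for bit in bits:
        feedback = ((reg >> 7) & 1) ^ bit
        reg = (reg << 1) & 0xFF
        if feedback:
            reg ^= poly_low
    return reg


def is_error_free(payload, received_code):
    """Try each formula in turn; stop at the first match, fail if none matches."""
    return any(
        crc8([payload[p] for p in positions]) == received_code
        for positions in MODE_IMPORTANT.values()
    )


payload = [0] * 100
for position in (1, 6, 97):  # arbitrary sample content
    payload[position] = 1
code = crc8([payload[p] for p in MODE_IMPORTANT[4]])  # sender used the mode 4 formula

error_in_unimportant_bit = list(payload)
error_in_unimportant_bit[7] ^= 1  # position 07 is unimportant in mode 4
error_in_important_bit = list(payload)
error_in_important_bit[1] ^= 1    # position 01 is important in mode 4
```

In this example the clean payload and the payload with an error in position 07 are both accepted, while the payload with an error in position 01 matches none of the four formulas and is treated as erroneous.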
Another object of the present invention is to provide a method of detecting errors in data received by a multi-mode vocoder, with the method including the steps of: reading portions of the data identifying a mode and an error code; and comparing the error code to portions of the data using a formula dictated by the mode, wherein if the comparison matches the data is deemed error-free and otherwise the data is deemed erroneous.
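The mode-dictated check can be sketched in Python as follows. The frame layout (100 payload bits, then a 2-bit mode field, then an 8-bit error code), the per-mode position lists, the CRC polynomial x^8 + x^2 + x + 1, and all names are assumptions made for illustration.

```python
PAYLOAD_BITS = 100  # assumed
MODE_BITS = 2       # assumed width of the mode field

# Hypothetical important-bit positions per mode.
MODE_IMPORTANT = {
    1: [0, 1, 2, 3, 97, 98],
    2: [2, 3, 4, 6, 7, 99],
    3: [0, 4, 7, 9, 98],
    4: [1, 6, 9, 97, 99],
}


def crc8(bits, poly_low=0x07):
    """Bitwise CRC with the assumed generator polynomial x^8 + x^2 + x + 1."""
    reg = 0
    for bit in bits:
        feedback = ((reg >> 7) & 1) ^ bit
        reg = (reg << 1) & 0xFF
        if feedback:
            reg ^= poly_low
    return reg


def to_bits(value, width):
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]


def from_bits(bits):
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def check_frame(frame):
    """Read the mode and the error code, then apply only that mode's formula."""
    payload = frame[:PAYLOAD_BITS]
    mode = from_bits(frame[PAYLOAD_BITS:PAYLOAD_BITS + MODE_BITS]) + 1
    received_code = from_bits(frame[PAYLOAD_BITS + MODE_BITS:])
    expected = crc8([payload[p] for p in MODE_IMPORTANT[mode]])
    return received_code == expected


payload = [0] * PAYLOAD_BITS
for position in (1, 6, 97):  # arbitrary sample content
    payload[position] = 1
mode = 4
code = crc8([payload[p] for p in MODE_IMPORTANT[mode]])
frame = payload + to_bits(mode - 1, MODE_BITS) + to_bits(code, 8)
```

Only one comparison is made per frame, and an error confined to a position that is unimportant in the indicated mode does not cause the frame to be rejected.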
Yet another object of the present invention is to provide a method of forming data for transmission by a multi-mode vocoder, with the method including the steps of: analyzing an input signal of the multi-mode vocoder to determine a mode of the multi-mode vocoder; processing the input signal, in accordance with the mode, to form data; forming an error code by applying a formula to a portion of the data, with the formula being selected in accordance with the mode; and attaching the error code to the data.
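The last two steps of this method, forming a mode-selected error code and attaching it to the data, can be sketched in Python as follows. Mode selection and compression are assumed to have already produced `payload` and `mode`. The per-mode position lists, the 8-bit CRC with polynomial x^8 + x^2 + x + 1, and the function names are hypothetical.

```python
# Hypothetical important-bit positions per mode; each list selects one formula.
MODE_IMPORTANT = {
    1: [0, 1, 2, 3, 97, 98],
    2: [2, 3, 4, 6, 7, 99],
    3: [0, 4, 7, 9, 98],
    4: [1, 6, 9, 97, 99],
}


def crc8(bits, poly_low=0x07):
    """Bitwise CRC with the assumed generator polynomial x^8 + x^2 + x + 1."""
    reg = 0
    for bit in bits:
        feedback = ((reg >> 7) & 1) ^ bit
        reg = (reg << 1) & 0xFF
        if feedback:
            reg ^= poly_low
    return reg


def build_frame(payload, mode):
    """Apply the formula selected by the mode, then attach the code to the data."""
    code = crc8([payload[p] for p in MODE_IMPORTANT[mode]])
    return list(payload) + [(code >> (7 - i)) & 1 for i in range(8)]


payload = [0] * 100
for position in (1, 6, 97):  # arbitrary sample content
    payload[position] = 1

frame_mode_4 = build_frame(payload, mode=4)
frame_mode_1 = build_frame(payload, mode=1)
```

The same payload yields different attached codes under mode 4 and mode 1, because each mode feeds a different portion of the data into the formula.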