Linear Predictive Coding (LPC) of speech involves estimating the coefficients of a time-varying filter (henceforth called a "synthesis filter") and providing appropriate excitation (input) to that time-varying filter. The process is conventionally broken down into two steps known as encoding and decoding.
As shown in FIG. 1, in the encoding step, the original speech signal s is first filtered by pre-filter 10. The pre-filtered speech signal s.sub.p is then analyzed by LPC Analysis block 14 to compute the coefficients of the synthesis filter. Then, an LPC analysis filter 12 is formed, using the same coefficients as the synthesis filter but having an inverse structure. The pre-filtered speech signal s.sub.p is processed by analysis filter 12 to produce a residual output signal u called the "residue". Information about the filter coefficients and the residue is passed to a decoder (not shown) for use in the decoding step.
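By way of illustration only, the encoding step may be sketched as follows. The sketch assumes the autocorrelation method with the Levinson-Durbin recursion for the LPC analysis; the text above does not say which estimation method block 14 uses, and the function names are hypothetical. The pre-filter is omitted.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of one block of samples."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order] such that
    x[n] is approximated by sum_k a[k] * x[n-k].
    Assumed estimation method; not stated in the source text."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate block (e.g. silence): stop early
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def analysis_filter(x, coeffs):
    """Inverse of the synthesis filter: u[n] = x[n] - sum_k a[k] * x[n-k]."""
    residue = []
    for n in range(len(x)):
        pred = sum(c * x[n - k - 1] for k, c in enumerate(coeffs) if n - k - 1 >= 0)
        residue.append(x[n] - pred)
    return residue


# Illustrative input: the impulse response of a one-pole filter with pole 0.9.
s_p = [0.9 ** n for n in range(50)]
coeffs = levinson_durbin(autocorrelation(s_p, 1), 1)
u = analysis_filter(s_p, coeffs)
```

For this input the estimated coefficient lies very close to 0.9, and the residue is essentially a single unit impulse at the first sample, which is the excitation that generated the signal.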
In the decoding step, a synthesis filter is formed using the coefficients obtained from the encoder. An appropriate excitation signal is applied to the synthesis filter, based on the information about the residue obtained from the encoder. The synthesis filter outputs a synthetic speech signal, which is ideally the closest possible approximation to the original speech signal, s.
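The inverse relationship between the analysis filter and the synthesis filter may be illustrated with a minimal sketch. It assumes an all-pole synthesis filter with fixed coefficients over the block, and the function names and coefficient values are hypothetical. Here the exact residue is used as the excitation, so the output reproduces the input; an actual decoder would have only coded information about the residue.

```python
def analysis_filter(x, coeffs):
    """Encoder side: u[n] = x[n] - sum_k a[k] * x[n-k]."""
    residue = []
    for n in range(len(x)):
        pred = sum(c * x[n - k - 1] for k, c in enumerate(coeffs) if n - k - 1 >= 0)
        residue.append(x[n] - pred)
    return residue


def synthesis_filter(excitation, coeffs):
    """Decoder side: y[n] = e[n] + sum_k a[k] * y[n-k] (all-pole recursion)."""
    out = []
    for n, e in enumerate(excitation):
        # Only previously computed output samples are referenced here.
        pred = sum(c * out[n - k - 1] for k, c in enumerate(coeffs) if n - k - 1 >= 0)
        out.append(e + pred)
    return out


# Arbitrary illustrative values, not taken from the source.
coeffs = [0.5, -0.2]
speech = [1.0, 0.3, -0.7, 0.2, 0.9, -0.4]

residue = analysis_filter(speech, coeffs)
synthetic = synthesis_filter(residue, coeffs)
max_error = max(abs(a - b) for a, b in zip(speech, synthetic))
```

With the unmodified residue as excitation, `max_error` is at the level of floating-point rounding. Any difference between the synthetic and original speech in a real coder therefore comes from how the excitation is approximated.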
This invention pertains to the processing of unvoiced plosives in the residue (i.e. the process steps shown in blocks 20-28 enclosed within the dashed outline portions of FIG. 1). During unvoiced speech, plosives (or stops) in the residue are characterized by sudden variations in energy from one block of speech samples to the next. Prior art linear predictive speech coding techniques have achieved only poor representation of unvoiced plosives. In particular, prior art techniques typically represent unvoiced plosives by interpolating energy variations between relatively few samples spaced relatively far apart. This yields a gradual variation in energy, which does not accurately reflect unvoiced plosives' sudden energy variations. This invention achieves more accurate location and coding of unvoiced plosives in the residue. Information about the location of the start of the sudden energy variation (burst portion of the unvoiced plosive) in the residue is encoded. This enables the decoder to produce a synthetic excitation signal having sudden energy variations during unvoiced plosives, thereby improving the quality of the synthetic speech considerably.
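The notion of a sudden variation in energy from one block of residue samples to the next may be illustrated as follows. This is a sketch under stated assumptions, not the procedure of blocks 20-28: it assumes fixed-length, non-overlapping blocks and a simple energy-ratio test, and the names `block_energies`, `find_burst_block`, `ratio` and `floor` are hypothetical.

```python
def block_energies(residue, block_len):
    """Sum of squared samples for each complete, non-overlapping block."""
    return [
        sum(v * v for v in residue[i:i + block_len])
        for i in range(0, len(residue) - block_len + 1, block_len)
    ]


def find_burst_block(energies, ratio=4.0, floor=1e-9):
    """Index of the first block whose energy exceeds `ratio` times that of the
    preceding block, or None if there is no such jump.
    `ratio` and `floor` are assumed tuning values, not taken from the source."""
    for b in range(1, len(energies)):
        if energies[b] > ratio * max(energies[b - 1], floor):
            return b
    return None


# Illustrative residue: a quiet closure followed by an abrupt burst.
block_len = 10
residue = [0.01] * 40 + [1.0] * 20

energies = block_energies(residue, block_len)
burst_block = find_burst_block(energies)
burst_start = None if burst_block is None else burst_block * block_len
```

In this example the jump is found at block 4, i.e. sample 40, where the burst begins. Conveying such a location to the decoder lets the excitation energy change abruptly at that point instead of being interpolated gradually between widely spaced samples.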