1. Field of the Invention
The present invention relates to a speech decoding device, a speech encoding device, a speech decoding method, a speech encoding method, a speech decoding program, and a speech encoding program.
2. Description of the Related Art
Speech encoding, which compresses the amount of data of speech signals and audio signals to a few tenths of the original size, is an extremely important technique for the transmission and storage of signals. Widely used speech encoding techniques include code excited linear prediction (CELP), which encodes a signal in the time domain; transform coded excitation (TCX), which encodes a signal in the frequency domain; and “MPEG4 AAC”, standardized by “ISO/IEC MPEG”.
As a method for improving the performance of speech codecs and achieving high speech quality at a low bit rate, bandwidth extension techniques, in which a high frequency component is generated using a low frequency component of speech, have become widely used in recent years. One example of a bandwidth extension technique is spectral band replication (SBR), which is used in “MPEG4 AAC”.
In speech encoding, the temporal envelope shape of a decoded signal, which is obtained by decoding a code sequence produced by encoding an input signal, may differ greatly from the temporal envelope shape of the input signal, and such a difference may be perceived as distortion. Likewise, when a bandwidth extension technique is used, the high frequency component is generated from a signal obtained by encoding and decoding the low frequency component of a speech signal with a speech encoding technique such as those described above. The temporal envelope shape of the high frequency component may therefore also differ, and such a difference may be perceived as distortion.
The following method is known for solving this problem (see Patent Literature 1 below). Specifically, in order to generate a high frequency component, the high frequency component in an arbitrary time segment is divided into frequency bands. When the energy information for each frequency band is calculated and encoded, it is calculated and encoded for each of a plurality of time segments shorter than the aforementioned time segment. The bandwidth of each divided frequency band and the length of each short time segment can be set flexibly. A decoding device can therefore control the energy of the high frequency component for each short time segment in the time direction. That is, the decoding device can control the temporal envelope of the high frequency component for each short time segment.
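The per-band, per-short-segment energy calculation described above can be illustrated with a minimal sketch. It is not taken from Patent Literature 1 or from any codec specification. The function name `envelope_energies`, the border lists, and the toy frame are hypothetical, and the high frequency component is assumed to be available as subband samples indexed by time slot and subband.

```python
from typing import List, Sequence


def envelope_energies(
    subbands: Sequence[Sequence[complex]],
    time_borders: Sequence[int],
    band_borders: Sequence[int],
) -> List[List[float]]:
    """Mean energy for each (short time segment, frequency band) cell.

    subbands[t][k] is the sample of subband k at time slot t (assumed layout).
    Cell (i, j) covers time slots time_borders[i] .. time_borders[i+1]-1 and
    subbands band_borders[j] .. band_borders[j+1]-1, so segment lengths and
    bandwidths can be chosen freely by the caller.
    """
    grid = []
    for t0, t1 in zip(time_borders, time_borders[1:]):
        row = []
        for k0, k1 in zip(band_borders, band_borders[1:]):
            total = sum(
                abs(subbands[t][k]) ** 2
                for t in range(t0, t1)
                for k in range(k0, k1)
            )
            row.append(total / ((t1 - t0) * (k1 - k0)))
        grid.append(row)
    return grid


# Toy frame: 8 time slots x 4 subbands, with a short burst in slots 4 and 5.
frame = [[0.1] * 4 for _ in range(8)]
frame[4] = [1.0] * 4
frame[5] = [1.0] * 4

# One energy per band for the whole frame: the burst is averaged away.
coarse = envelope_energies(frame, [0, 8], [0, 2, 4])

# Three short segments per band: the burst is isolated in the middle segment.
fine = envelope_energies(frame, [0, 4, 6, 8], [0, 2, 4])
```

With the coarse grid, a decoder receives only one energy value per band for the frame and cannot reproduce the burst. With the fine grid, the middle segment carries a much larger energy than its neighbors, which is the information a decoder needs to shape the temporal envelope of the high frequency component.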