In general, audio signals include signals of various frequencies, the human audible frequency ranges from 20 Hz to 20 kHz, and human voices are present in a range of about 200 Hz to 3 kHz. An input audio signal may include components of a high-frequency zone higher than 7 kHz at which human voices are hardly present in addition to a band in which human voices are present. In this way, when a coding method suitable for a narrowband (up to about 4 kHz) is applied to wideband signals or super-wideband signals, there is a problem in that sound quality degrades.
With a recent increase in demands for video calls, video conferences, and the like, techniques of encoding/decoding audio signals, that is, speech signals, so as to be close to actual voices have increasingly attracted attention.
Frequency transform which is one of methods used to encode/decode a speech signal is a method of causing an encoder to frequency-transform a speech signal, transmitting transform coefficients to a decoder, and causing the decoder to inversely frequency-transform the transform coefficients to reconstruct the speech signal.
In the techniques of encoding/decoding a speech signal, a method of encoding predetermined signals in the frequency domain is considered to be superior, but a time delay may occur when transform for encoding a speech signal in the frequency domain is used.
Therefore, there is a need for a method which can prevent the time delay in encoding/decoding a signal and increase a processing rate.