Time stretching is a process of expanding or compressing only the time axis of an audio waveform without changing its pitch. Pitch shifting is a process of changing only the pitch without changing the time axis. The so-called vocoder method is a known audio waveform processing technique for performing time stretching and pitch shifting (refer to Patent Document 1, for instance). This method performs frequency analysis on an input audio waveform, compresses or expands the time axis for time stretching, and, for pitch shifting, scales the frequency of the output waveform and then adds the frequency components together.
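As a rough illustration of the resynthesis step described above, the following toy oscillator bank sums sinusoidal components while reading their amplitude envelopes on a rescaled time axis and multiplying their frequencies by a pitch ratio. It is a simplified sketch and not the vocoder of Patent Document 1: the frequency analysis stage is omitted, and the function name `resynthesize`, the `(envelope, freq_hz)` band format, and the parameters `stretch` and `pitch` are assumptions introduced for this example.

```python
import math

def resynthesize(bands, in_duration, sample_rate, stretch=1.0, pitch=1.0):
    """Sum sinusoidal bands after time stretching and pitch shifting.

    bands: list of (envelope, freq_hz) pairs, where envelope(t) returns the
           amplitude at input time t in seconds (hypothetical stand-in for
           the result of frequency analysis).
    stretch: output duration divided by input duration.
    pitch: output frequency divided by input frequency.
    """
    n_out = int(round(in_duration * stretch * sample_rate))
    out = []
    for n in range(n_out):
        t_out = n / sample_rate
        t_in = t_out / stretch  # time stretching: read the envelope on a rescaled axis
        sample = 0.0
        for envelope, freq_hz in bands:
            # pitch shifting: scale each component's frequency, then add the components
            sample += envelope(t_in) * math.sin(2.0 * math.pi * freq_hz * pitch * t_out)
        out.append(sample)
    return out

# One decaying 100 Hz component lasting 1 second (illustrative values only).
bands = [(lambda t: 1.0 - t, 100.0)]
stretched = resynthesize(bands, 1.0, 8000, stretch=2.0)  # twice as long, same pitch
shifted = resynthesize(bands, 1.0, 8000, pitch=2.0)      # same length, one octave up
```

In this sketch every oscillator restarts from zero phase, so it does not reproduce the phase behavior of an actual vocoder.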
In a conventional vocoder method, there is a large change in phase between the input audio waveform and the time-stretched and/or pitch-shifted waveform. FIGS. 7A and 7B show, as an example, the change in phase generated when a certain 2-channel stereo audio waveform is time-stretched. The horizontal axis of each graph represents time, and the vertical axis represents the phase of the frequency component. FIG. 7A shows the phase changes of components A and B, one for each of the two channels, in a certain frequency band obtained as a result of frequency analysis of the input audio waveform. FIG. 7B shows the phases of A1 and B1, corresponding to A and B, obtained when the waveform of FIG. 7A is time-compressed to ½ by the vocoder method. The time axis is scaled by ½, and the vertical axis representing the phase is also scaled by ½.
Here, attention is focused on time T before the stretch process and time T1 (=T/2) after the time compression. In the graph of FIG. 7A before the process, the phase difference between A and B at the time T is 2π, and hence the phase difference is 0 when expressed in the range −π to π. The components A and B continue with a phase difference of 0 after the time T. Because the phase values are halved by the time compression, the phase difference between A1 and B1 at the time T1 is π, and A1 and B1 continue with a phase difference of π after the time T1. Thus, the phase relation between A1 and B1 has clearly changed from that of A and B before the time compression.
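The phase arithmetic in this example can be checked numerically. The short sketch below wraps a phase difference into the range −π to π before and after the phase values are scaled by ½. The helper name `wrap_phase` and the concrete phase value chosen for B are assumptions made for illustration; only the 2π difference and the ½ scaling come from the description of FIGS. 7A and 7B.

```python
import math

def wrap_phase(phi):
    """Wrap a phase in radians into the interval [-pi, pi)."""
    return (phi + math.pi) % (2.0 * math.pi) - math.pi

# Hypothetical unwrapped phases at time T: A leads B by exactly 2*pi.
phase_b = 1.0
phase_a = phase_b + 2.0 * math.pi

# Before compression the wrapped difference is about 0 (in phase).
diff_before = wrap_phase(phase_a - phase_b)

# Time compression to 1/2 also scales the phase values by 1/2.
phase_a1 = 0.5 * phase_a
phase_b1 = 0.5 * phase_b

# After compression the wrapped difference has magnitude of about pi (opposite phase).
# Whether it is reported as +pi or -pi depends only on the wrapping convention.
diff_after = wrap_phase(phase_a1 - phase_b1)

print(abs(diff_before), abs(diff_after))
```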
As is evident from the above description, because the vocoder method expands and compresses the time axis, a lag or a lead of the phase occurs in proportion to the amount of expansion or compression. This also applies to pitch shifting. The amount of phase change differs among the frequency components obtained by the frequency analysis, and also differs among the channels in the case of stereo audio. For this reason, the listener perceives the result as unnatural due to, for example, mutual cancellation of sounds or a stereo sound that no longer feels natural. Therefore, high-quality time stretching and pitch shifting cannot be realized.
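The mutual cancellation mentioned above can be illustrated with two sinusoids of equal frequency and amplitude that are added together. The sketch below is only an assumed toy case: the function name `mix_peak`, the 440 Hz frequency, and the 48 kHz sample rate are illustrative choices and do not come from the source.

```python
import math

def mix_peak(phase_diff, freq_hz=440.0, sample_rate=48000, n_samples=480):
    """Return the peak absolute level of the sum of two equal sinusoids
    whose phases differ by phase_diff radians."""
    peak = 0.0
    for i in range(n_samples):
        t = i / sample_rate
        first = math.sin(2.0 * math.pi * freq_hz * t)
        second = math.sin(2.0 * math.pi * freq_hz * t + phase_diff)
        peak = max(peak, abs(first + second))
    return peak

in_phase = mix_peak(0.0)      # components reinforce each other
opposite = mix_peak(math.pi)  # components cancel almost completely
print(in_phase, opposite)
```

With a phase difference of 0 the two components reinforce each other, while with a difference of π their sum is close to silence, which is the kind of cancellation that a changed phase relation can cause.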
Techniques for improving the vocoder method to obtain better sound quality have also been proposed. For instance, Patent Document 1 discloses an audio waveform device that focuses on the pre-echo generated when band division is performed on an attack portion, in which the level of the audio waveform changes greatly, and resets the phase at the beginning of the pre-echo section.
Patent Document 1: Japanese Patent Application Laid-Open No. 2001-117595