1. Field of the Invention
The present invention relates to an apparatus, a computer program product, and a method for processing acoustical signals, by which time compression and time expansion of multichannel acoustical signals are executed.
2. Description of the Related Art
Conventionally, when the time length of an acoustical signal is changed, for example in speech-rate conversion, a desired companding ratio has been realized by extracting feature data such as a fundamental frequency from the input signal, and by inserting and deleting signal segments with an adaptive time width that is decided based on the extracted feature data. A typical time companding method is the “Pointer Interval Controlled OverLap and Add” (PICOLA) method described by MORITA Naotaka and ITAKURA Fumitada, “Time companding of voices, using an auto-correlation function”, Proc. of the Autumn Meeting of the Acoustical Society of Japan, 3-1-2, p. 149-150, October, 1986. In PICOLA, time companding is performed by extracting a fundamental frequency from an input signal, and by inserting and deleting waveforms corresponding to the obtained fundamental frequency. In Japanese Patent No. 3430968, a waveform is cut out at the position at which the waveforms in a crossfade interval are most similar to each other, and both ends of the cut waveform are connected for time companding processing. In both techniques, companding is executed based on feature data representing a similarity between two intervals that are separated in the time-base direction of the original signal, so that time-base compression and time-base expansion can be realized naturally without changing musical intervals.
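The pitch-synchronous deletion described above can be illustrated with a minimal sketch. It is not the implementation of PICOLA or of Japanese Patent No. 3430968; it assumes a mono signal held in a Python list, a normalized autocorrelation search as the pitch estimator, and a linear crossfade over one period. The names `estimate_period` and `compress_once`, the lag range, and the test signal are all hypothetical and chosen only for illustration.

```python
import math


def estimate_period(x, min_lag, max_lag):
    """Return the lag (in samples) with the highest normalized autocorrelation."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        n = len(x) - lag  # number of overlapping sample pairs at this lag
        num = sum(x[i] * x[i + lag] for i in range(n))
        den = math.sqrt(sum(x[i] ** 2 for i in range(n)) *
                        sum(x[i + lag] ** 2 for i in range(n)))
        score = num / den if den > 0.0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def compress_once(x, start, period):
    """Replace two adjacent periods with one crossfaded period (removes `period` samples)."""
    if start + 2 * period > len(x):
        raise ValueError("not enough samples after start for two periods")
    first = x[start:start + period]
    second = x[start + period:start + 2 * period]
    # Linear crossfade: fade the first period out while the second fades in.
    faded = [first[i] * (1 - i / period) + second[i] * (i / period)
             for i in range(period)]
    return x[:start] + faded + x[start + 2 * period:]


# Hypothetical test signal: a fundamental with a 50-sample period plus one harmonic.
signal = [math.sin(2 * math.pi * i / 50) + 0.5 * math.sin(4 * math.pi * i / 50)
          for i in range(400)]
period = estimate_period(signal, 20, 80)
shorter = compress_once(signal, 100, period)
print(period, len(signal), len(shorter))
```

Because the two crossfaded intervals are one pitch period apart, they are nearly identical for a periodic input, so the splice shortens the signal without altering the waveform shape. Expansion would insert a crossfaded period instead of removing one.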
Incidentally, when the acoustical signal to be processed is a multichannel signal such as a stereo signal or a 5.1-channel signal, and time-base companding is executed separately for each channel, feature data such as a fundamental frequency extracted from each channel are not necessarily the same as one another, so that the timing of insertion and deletion of waveforms differs from channel to channel. As a result, there has been a problem that a phase difference which is not included in the original signal arises between the processed signals, and listeners feel discomfort.
Therefore, in speech-rate conversion of a multichannel acoustical signal, synchronization between the channels is required in order to keep sound-source localization: a feature common to all channels (common pitch) is extracted, and waveforms are inserted and deleted based on that common feature. Conventional techniques by which a feature common to all channels (common pitch) is extracted and synchronization between the channels is secured as described above are, for example, those described in Japanese Patent No. 2905191 and Japanese Patent No. 3430974. According to these techniques, a feature (common pitch) is extracted from a signal obtained by combining (adding) all or a part of the multichannel acoustical signals. For example, when an input signal is a stereo signal, a feature common to all channels is extracted from the (L+R) signal obtained by combining (adding) the L channel and the R channel.
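The idea of extracting a common pitch from the added channels and then editing every channel at the same position can be sketched as follows. This is an illustrative sketch only, not the method of Japanese Patent No. 2905191 or No. 3430974. It assumes that the channels are equal-length Python lists, that the common pitch is the peak of the normalized autocorrelation of the summed signal, and that one period is deleted by a linear crossfade. The names `normalized_autocorr`, `common_period`, and `delete_period_all` are hypothetical.

```python
import math


def normalized_autocorr(x, lag):
    """Normalized autocorrelation of x at the given lag (0.0 for an all-zero signal)."""
    n = len(x) - lag
    num = sum(x[i] * x[i + lag] for i in range(n))
    den = math.sqrt(sum(x[i] ** 2 for i in range(n)) *
                    sum(x[i + lag] ** 2 for i in range(n)))
    return num / den if den > 0.0 else 0.0


def common_period(channels, min_lag, max_lag):
    """Estimate one pitch period from the sample-wise sum of all channels."""
    mixed = [sum(frame) for frame in zip(*channels)]  # (L+R) for a stereo input
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: normalized_autocorr(mixed, lag))


def delete_period_all(channels, start, period):
    """Delete one period at the same position in every channel via a linear crossfade."""
    out = []
    for ch in channels:
        faded = [ch[start + i] * (1 - i / period) +
                 ch[start + period + i] * (i / period)
                 for i in range(period)]
        out.append(ch[:start] + faded + ch[start + 2 * period:])
    return out


# Hypothetical stereo input: both channels share a 50-sample period but differ in shape.
left = [math.sin(2 * math.pi * i / 50) for i in range(400)]
right = [0.5 * math.sin(2 * math.pi * i / 50) +
         0.3 * math.sin(4 * math.pi * i / 50 + 1.0) for i in range(400)]
period = common_period([left, right], 20, 80)
new_left, new_right = delete_period_all([left, right], 100, period)
print(period, len(new_left), len(new_right))
```

Since the same start position and the same period are applied to every channel, the relative timing between the channels is unchanged by the edit, which is what keeps the sound-source localization intact.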
However, the method by which a feature common to all channels is extracted from a signal obtained by combining (adding) multichannel acoustical signals as described above has a problem that the feature (common pitch) cannot be accurately extracted when the signals include a sound whose left-channel component is out of phase with its right-channel component at the time the plurality of channel signals are combined (added). More particularly, when the L channel and the R channel of a stereo signal carry signals that are out of phase with each other and the two signals are combined (added) in the form of (L+R), the two signals cancel each other (the sum becomes 0 when the amplitudes are equal), and the feature (common pitch) cannot be accurately extracted.
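The cancellation can be seen in a small numerical sketch. It assumes the extreme case of exactly opposite phase and equal amplitude, with the R channel being the sample-wise negation of the L channel. The helper names `downmix`, `energy`, and `autocorr_at` and the test tone are hypothetical.

```python
import math


def downmix(left, right):
    """Sample-wise (L+R) sum of two equal-length channels."""
    return [l + r for l, r in zip(left, right)]


def energy(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)


def autocorr_at(x, lag):
    """Unnormalized autocorrelation of x at one lag."""
    return sum(x[i] * x[i + lag] for i in range(len(x) - lag))


# Hypothetical tone with a 50-sample period, placed in opposite phase on L and R.
left = [math.sin(2 * math.pi * i / 50) for i in range(400)]
right = [-v for v in left]
mixed = downmix(left, right)

print("energy of L:", energy(left))
print("energy of L+R:", energy(mixed))
print("autocorrelation of L at lag 50:", autocorr_at(left, 50))
print("autocorrelation of L+R at lag 50:", autocorr_at(mixed, 50))
```

Each channel on its own shows a clear autocorrelation peak at the pitch period, but the summed signal is identically zero, so an autocorrelation-based estimator has nothing to lock onto. With unequal amplitudes the cancellation is only partial, and the summed signal under-represents the out-of-phase sound relative to the other content.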