1. Field of the Invention
This invention relates to a time-axis compression/expansion method and apparatus for performing time-axis compression/expansion on original digital signals at a desired compression/expansion rate without changing the pitch of the original digital signals, and more particularly to a time-axis compression/expansion method and apparatus of this kind which is suitable for performing time-axis compression/expansion on a multitrack signal.
2. Prior Art
The time-axis compression/expansion technique for time-axis compressing or time-axis expanding a digital audio signal without changing the pitch of the same is utilized e.g. for so-called "time length adjustment" for adjusting a total recording time period over which the digital audio signal is to be recorded to a predetermined time period, tempo conversion in a karaoke apparatus or the like, and so forth. Conventionally, this kind of time-axis compression/expansion technique includes a cut-and-splice method (as disclosed e.g. in Japanese Laid-Open Patent Publication (Kokai) No. 10-282963), an overlap-add method based on pointer shift amount control (Morita and Itakura, "Expansion/Compression of Sound in Time Product by Using Overlap-Add Method Based on Point Shift Amount Control and Its Evaluation", Lectures at the Autumn Conference of the Acoustical Society of Japan Vol. 1-4-14, October, 1986), etc.
Time-axis compression/expansion processing by a general cut-and-splice method is performed such that waveform segments of an original audio signal are cut out without considering correlation between the waveform segments and then the cut-out waveform segments are spliced together to thereby effect compression/expansion based on a specified compression/expansion rate. According to this method, discontinuities can occur in spliced portions of the cut-out waveform segments, and therefore cross-fading is carried out to smooth the spliced portions of the cut-out waveform segments. The time interval of the waveform cutout is set to such a time period that the human ear cannot sense an echo or doubling of sounds, e.g. approximately 60 msec. Particularly, according to the method disclosed in Japanese Laid-Open Patent Publication (Kokai) No. 10-282963, the cutout length, i.e. the length of each cut-out waveform segment, is determined in synchronism with sound timing information. This method is distinguished from other conventional methods in that spliced portions appear at the same repetition period as that of the rhythm of the original waveform, so that tone changes at the spliced portions cannot be easily perceived.
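The general cut-and-splice processing described above may be sketched as follows. This is a minimal illustration only, not the method of the cited publication. The function names, the linear cross-fade, and the definition of `rate` as output length divided by input length are assumptions made for this example.

```python
def crossfade_splice(a, b, fade):
    """Join segment `a` to segment `b`, cross-fading over `fade` samples."""
    if fade <= 0 or fade > min(len(a), len(b)):
        raise ValueError("fade must be in 1..min(len(a), len(b))")
    head = a[:len(a) - fade]
    tail = b[fade:]
    mixed = []
    for i in range(fade):
        w = (i + 1) / (fade + 1)  # fade-in weight applied to `b`
        mixed.append((1.0 - w) * a[len(a) - fade + i] + w * b[i])
    return head + mixed + tail


def cut_and_splice(signal, rate, seg, fade):
    """Cut fixed-length segments without regard to the waveform and splice them.

    `rate` is assumed to mean output length / input length
    (below 1 compresses, above 1 expands).
    """
    hop_out = seg - fade                    # net samples each splice adds to the output
    hop_in = max(1, round(hop_out / rate))  # spacing of the cut-out positions in the input
    out = list(signal[:seg])
    pos = hop_in
    while pos + seg <= len(signal):
        out = crossfade_splice(out, list(signal[pos:pos + seg]), fade)
        pos += hop_in
    return out
```

Because the cut-out positions depend only on `seg`, `fade`, and `rate`, an attack may fall inside a cross-fade or be cut out twice, which is the source of the degradation discussed in the following paragraphs.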
On the other hand, the overlap-add method based on pointer shift amount control is performed such that two adjacent segments of the original audio signal most closely correlated in waveform and equal in length to each other are extracted, and the two signal segments are overlapped or added together. Then, the two original signal segments are replaced by a new signal segment obtained by the overlapping/addition, or the new signal segment is inserted between the two original signal segments, whereby the total time of the original audio signal is reduced or increased. This method enables smoother splicing of waveforms than the cut-and-splice method. Particularly, this method can achieve higher-quality time-axis compression/expansion of pitch-based sound source signals, such as voice signals and sound signals generated by monophonic musical instruments.
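One step of such overlap-add processing may be sketched as follows. The mean squared difference used as the similarity measure, the linear weighting, and all names are assumptions made for illustration. The cited paper may define these differently.

```python
import math


def best_segment_length(x, start, lmin, lmax):
    """Return the length L in lmin..lmax for which x[start:start+L] and
    x[start+L:start+2L] are most similar (smallest mean squared difference)."""
    best_len, best_err = None, float("inf")
    for length in range(lmin, lmax + 1):
        if start + 2 * length > len(x):
            break
        err = sum((x[start + i] - x[start + length + i]) ** 2
                  for i in range(length)) / length
        if err < best_err:
            best_len, best_err = length, err
    return best_len


def overlap_add(x, start, length, expand=False):
    """Overlap-add two adjacent segments of `length` samples starting at `start`.

    Compression replaces both segments by the mixed segment.
    Expansion inserts the mixed segment between them.
    """
    a = x[start:start + length]
    b = x[start + length:start + 2 * length]
    mixed = []
    for i in range(length):
        w = i / length  # ramps from 0 toward 1 across the segment
        if expand:
            mixed.append((1.0 - w) * b[i] + w * a[i])  # starts like b, ends like a
        else:
            mixed.append((1.0 - w) * a[i] + w * b[i])  # starts like a, ends like b
    if expand:
        return x[:start + length] + mixed + x[start + length:]
    return x[:start] + mixed + x[start + 2 * length:]


# Usage on a hypothetical test tone with a period of 50 samples.
tone = [math.sin(2 * math.pi * n / 50) for n in range(400)]
period = best_segment_length(tone, 0, 30, 70)
shorter = overlap_add(tone, 0, period)               # one period removed
longer = overlap_add(tone, 0, period, expand=True)   # one period added
```

In the expansion case the inserted segment begins like the second segment and ends like the first, so that it continues smoothly from the first segment and leads smoothly into the second.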
However, although the conventional general cut-and-splice method can provide at least a certain level of sound quality irrespective of the kind of signal to be processed, tone changes at the spliced portions of waveforms can be easily perceived depending on the cut-out positions, which are determined independently of the waveforms. Particularly in a rhythm sound source, very conspicuous sound quality degradation is likely to occur, such as repeated generation of a tone and deviation in rhythm. Further, in a multitrack sound source having a plurality of tracks including a vocal track, a piano track, and a rhythm track, if the individual tracks are separately time-axis expanded or compressed, differences in tone generation timing can occur between the tracks.
Further, according to the method disclosed in Japanese Laid-Open Patent Publication (Kokai) No. 10-282963, which carries out the cut-and-splice processing in synchronism with the rhythm of the original waveform, two attacks can be included in one waveform segment obtained by cutting out a waveform for time-axis expansion, which results in repeated generation of a tone, i.e. a tone is generated twice. On the other hand, the overlap-add method based on pointer shift amount control is considered to be free from such repeated generation of a tone in principle, since the time-axis compression/expansion is carried out by checking the time correlation between adjacent waveform segments. However, this method does not ensure that the relationship between attack positions is preserved before and after the time-axis compression or expansion, so that a deviation in rhythm is likely to occur.
It is an object of the present invention to provide a time-axis compression/expansion method and apparatus for multitrack signals, which is capable of performing time-axis compression/expansion on a multitrack signal in such an appropriate manner as to prevent a degradation in the sound quality of a sound generated through a multichannel reproduction or a sound generated through reproduction of a musical tone signal obtained by mix-down.
To attain the above object, according to a first aspect of the present invention, there is provided a time-axis compression/expansion method of time-axis compressing/expanding a multitrack sound source signal comprising a plurality of track sound source signals including a rhythm track sound source signal, comprising the steps of detecting positions of attacks of the rhythm track sound source signal of the plurality of track sound source signals, subjecting portions of the rhythm track sound source signal between the detected positions of attacks to a first time-axis compression/expansion process, and subjecting other track sound source signals of the plurality of track sound source signals than the rhythm track sound source signal to a second time-axis compression/expansion process, based on the detected positions of attacks.
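The attack position detecting step and the division of the rhythm track into portions between the detected positions may be sketched as follows. This passage does not specify how attacks are detected. The frame-energy rise detector below, its thresholds, and all names are therefore assumptions made purely for illustration.

```python
def detect_attacks(x, frame=64, ratio=4.0, floor=1e-4):
    """Return start indices of frames whose mean energy rises sharply.

    Assumed detector: a frame is treated as an attack when its energy exceeds
    `floor` and is more than `ratio` times the energy of the previous frame.
    """
    attacks = []
    prev = 0.0
    for start in range(0, len(x) - frame + 1, frame):
        energy = sum(s * s for s in x[start:start + frame]) / frame
        if energy > floor and energy > ratio * max(prev, floor):
            attacks.append(start)
        prev = energy
    return attacks


def portions_between_attacks(attacks, total_len):
    """Return (start, end) index pairs, each running from one attack to the next,
    with the last portion ending at the end of the signal."""
    bounds = list(attacks) + [total_len]
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]
```

Each returned portion would then be passed to the first time-axis compression/expansion process, while the same attack positions would serve as the timing reference for the second process applied to the other tracks.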
Preferably, the first time-axis compression/expansion process is carried out on portions of the rhythm sound source signal other than the detected positions of attacks and portions proximate thereto, so as to smoothly join opposite ends of each of the portions of the rhythm sound source signal that are time-axis compressed/expanded to portions of the rhythm sound source signal that are not time-axis compressed/expanded, and the second time-axis compression/expansion process is carried out on the other track sound source signals such that joined portions of each of the other track sound source signals that are time-axis compressed/expanded synchronize with the detected positions of attacks.
In a preferred embodiment of the first aspect, the first time-axis compression/expansion process comprises determining a segment length of two adjacent waveforms of the rhythm track sound source signal between the detected positions of attacks, which show highest similarity to each other, superposing two adjacent waveforms having a basic period determined by the segment length upon each other, and replacing the two adjacent waveforms by the resulting superposed waveform or inserting the resulting superposed waveform between the two adjacent waveforms.
To attain the above object, according to a second aspect of the present invention, there is provided a time-axis compression/expansion apparatus for time-axis compressing/expanding a multitrack sound source signal comprising a plurality of track sound source signals including a rhythm track sound source signal, comprising an attack position detecting device that detects positions of attacks of the rhythm track sound source signal of the plurality of track sound source signals, a first time-axis compression/expansion processing device that subjects portions of the rhythm track sound source signal between the detected positions of attacks to a first time-axis compression/expansion process, and a second time-axis compression/expansion processing device that subjects other track sound source signals of the plurality of track sound source signals than the rhythm track sound source signal to a second time-axis compression/expansion process, based on the detected positions of attacks.
To attain the above object, according to a third aspect of the present invention, there is provided a time-axis compression/expansion method of time-axis compressing/expanding a multitrack sound source signal comprising a plurality of track sound source signals including a rhythm track sound source signal, comprising the steps of detecting positions of attacks of the rhythm track sound source signal of the plurality of track sound source signals, and time-axis compressing/expanding portions of the rhythm track sound source signal between the detected positions of attacks at a predetermined designated compression/expansion ratio without changing a pitch thereof.
Preferably, the time-axis compression/expansion process is carried out on portions of the rhythm sound source signal other than the detected positions of attacks and portions proximate thereto, so as to smoothly join opposite ends of each of the portions of the rhythm sound source signal that are time-axis compressed/expanded to portions of the rhythm sound source signal that are not time-axis compressed/expanded.
In a preferred embodiment of the third aspect, the time-axis compressing/expanding step comprises determining a segment length of two adjacent waveforms of the rhythm track sound source signal between the detected positions of attacks, which show highest similarity to each other, superposing two adjacent waveforms having a basic period determined by the segment length upon each other, and replacing the two adjacent waveforms by the resulting superposed waveform or inserting the resulting superposed waveform between the two adjacent waveforms.
To attain the above object, according to a fourth aspect of the present invention, there is provided a storage medium storing a program which can be executed by a computer, for realizing a time-axis compression/expansion method of time-axis compressing/expanding a multitrack signal comprising a plurality of track sound source signals including a rhythm track sound source signal, the program comprising a module for detecting positions of attacks of the rhythm track sound source signal of the plurality of track sound source signals, a module for subjecting portions of the rhythm track sound source signal between the detected positions of attacks to a first time-axis compression/expansion process, and a module for subjecting other track sound source signals of the plurality of track sound source signals than the rhythm track sound source signal to a second time-axis compression/expansion process, based on the detected positions of attacks.
To attain the above object, according to a fifth aspect of the present invention, there is provided a storage medium storing a program which can be executed by a computer, for realizing a time-axis compression/expansion method of time-axis compressing/expanding a multitrack signal comprising a plurality of track sound source signals including a rhythm track sound source signal, the program comprising a module for detecting positions of attacks of the rhythm track sound source signal of the plurality of track sound source signals, and a module for time-axis compressing/expanding portions of the rhythm track sound source signal between the detected positions of attacks without changing a pitch thereof and at a predetermined designated compression/expansion rate.
According to the present invention, attack positions of a rhythm track sound source signal of a multitrack sound source signal are detected, and portions of the rhythm track sound source signal between the detected attack positions are subjected to time-axis compression or expansion. As a result, a change in the tone at a joint between waveforms joined together by a cross-fading process, for example, cannot be easily perceived, by virtue of the auditory masking effect that arises because the signal power at the attack positions of the rhythm track sound source signal is particularly large. Further, since the interval between the attack positions is also compressed or expanded at the compression or expansion rate, the relationship between the attack positions before the compression or expansion can be completely maintained even after the compression or expansion, thus providing a high-quality sound without any change in the tone being perceived, as distinct from the conventional cut-and-splice method. Moreover, since the track sound source signals of the multitrack sound source signal other than the rhythm track sound source signal are also subjected to time-axis compression/expansion based on the detected attack positions, high-quality sound reproduction can be achieved without the perceptible change in tone that time-axis compression/expansion conventionally causes in a sound generated through a multichannel reproduction or a sound generated through reproduction of a musical tone signal obtained by mix-down.
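The preservation of the relationship between attack positions can be illustrated with a short calculation. The sketch below assumes that `rate` means output length divided by input length and that positions are expressed in samples. The function names are hypothetical.

```python
def map_attack_positions(attacks, rate):
    """Scale each attack position by `rate` (assumed: output length / input length)."""
    return [round(p * rate) for p in attacks]


def interval_targets(attacks, rate):
    """Return (input_length, output_length) for each interval between attacks."""
    mapped = map_attack_positions(attacks, rate)
    return [(attacks[i + 1] - attacks[i], mapped[i + 1] - mapped[i])
            for i in range(len(attacks) - 1)]


# Hypothetical attack positions in samples, compressed to 80% of the original length.
attacks = [0, 22050, 44100, 55125]
targets = interval_targets(attacks, 0.8)
```

Rounding each mapped position, rather than each interval length, keeps rounding errors from accumulating along the track. If every track is joined at the same mapped positions, the tone generation timing stays aligned across tracks.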
The above and other objects, features, and advantages of the invention will become apparent from the following detailed description taken in conjunction with the accompanying drawings.