1. Field of the Invention
This invention relates to time-scale modification methods and apparatuses that compress or expand digital signals along the time axis in accordance with a desired time-scale modification factor, without changing the original pitch of the signals. In particular, this invention relates to time-scale modification of rhythm source signals.
This application is based on Patent Application No. Hei 11-126349 filed in Japan.
2. Description of the Related Art
Time-scale modification techniques compress or expand digital audio signals with respect to time without changing their pitch. These techniques are used in a variety of fields, for example in so-called "scale adjustment", in which the overall recording time of digital audio signals is adjusted to a prescribed time, and in "tempo modification" as used by Karaoke apparatuses. Various time-scale modification techniques have been proposed. For example, Japanese Unexamined Patent Publication No. Hei 10-282963 teaches a cut-and-splice method for time-scale modification processing. In addition, an example of a time-scale modification algorithm is taught by the paper entitled "Time-Scale Modification Algorithm for Speech by Use of Pointer Interval Control Overlap and Add (PICOLA) and Its Evaluation", written by Morita and Itakura, on pp. 149-150 of monographs 1-4-14 issued for the autumn meeting of Japan Acoustics Engineering Society in October of 1986.
In general, the cut-and-splice method is used for time-scale modification processing to perform compression or expansion on signal waveforms (or envelopes) in accordance with a designated time-scale modification factor (e.g., compression factor or expansion factor), as follows:
A waveform is cut into segments without regard to the correlation between them. The cut segments are then spliced together so that the overall length matches the designated time-scale modification factor. Discontinuities occur at the joints where the cut segments are spliced together. To reduce these discontinuities, a cross-fade process is applied at the joints so that adjacent segments connect smoothly. The intervals at which the waveform is cut into segments (referred to as "cut intervals") are set such that, given human auditory capabilities, listeners can hardly sense echoes or sound repetition. For example, the cut intervals are set at about 60 milliseconds. The aforementioned publication teaches an improved method in which the cut lengths of waveforms are determined in synchronization with speech timing information. As compared with general methods, that method is advantageous in that variations in sound quality at the joints of the spliced segments are relatively small, because the joints occur with the same rhythmic period as that of the original waveform.
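The cut-and-splice procedure described above can be illustrated with a minimal Python sketch. It is not the method of the cited publication: the function name `cut_and_splice`, its parameters, the linear cross-fade, and the choice to use only full-length segments are all assumptions made for illustration.

```python
def cut_and_splice(samples, factor, seg_len, fade_len):
    """Illustrative cut-and-splice time-scale modification (hypothetical helper).

    samples  : list of float sample values
    factor   : output duration / input duration (>1 expands, <1 compresses)
    seg_len  : cut interval in samples
    fade_len : number of samples cross-faded at each joint
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not 0 < fade_len < seg_len:
        raise ValueError("fade_len must satisfy 0 < fade_len < seg_len")

    hop_out = seg_len - fade_len   # net output samples gained per spliced segment
    hop_in = hop_out / factor      # distance between cut points in the input

    out = []
    k = 0
    while True:
        start = int(round(k * hop_in))
        segment = samples[start:start + seg_len]
        if len(segment) < seg_len:  # assumption: only full-length segments are used
            break
        if not out:
            out.extend(segment)
        else:
            # Linear cross-fade over the joint: the tail of the output fades out
            # while the head of the new segment fades in (weights sum to 1).
            base = len(out) - fade_len
            for i in range(fade_len):
                w = (i + 1) / (fade_len + 1)
                out[base + i] = (1.0 - w) * out[base + i] + w * segment[i]
            out.extend(segment[fade_len:])
        k += 1
    return out
```

As a usage example under an assumed sampling rate of 44.1 kHz, a 60 millisecond cut interval corresponds to `seg_len = int(0.060 * 44100)`, and `cut_and_splice(samples, 1.5, seg_len, seg_len // 8)` would lengthen the signal by roughly one half. Because the cut points ignore the waveform content, a segment may be cut in the middle of an attack, which is one source of the artifacts discussed for rhythm sources.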
According to the aforementioned PICOLA method, two segments are extracted from the waveform of an original audio signal. The two segments have the same length and adjoin each other on the waveform, and they are chosen so that the correlation between them is highest. The signals of those segments are overlap-added (duplicate addition) to produce a single signal, which is either substituted for the original two segments or inserted between them. Thus, it is possible to shorten or extend the overall duration of the waveform. This method is advantageous in that the connection between waveform segments is smoother than with the cut-and-splice method. In particular, this method enables high-quality time-scale modification of sound sources having a strong pitch, such as speech signals and musical tone signals of monophonic musical instruments.
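A single substitution or insertion step of the kind described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the PICOLA algorithm as published: the pointer interval control that achieves an arbitrary modification factor is omitted, normalized correlation is assumed as the similarity measure, and the names `best_segment_length`, `cross_fade`, and `picola_step` are hypothetical.

```python
import math


def best_segment_length(x, pos, min_len, max_len):
    """Return the length T for which x[pos:pos+T] and x[pos+T:pos+2T]
    have the highest normalized correlation, or None if no T fits."""
    best_len, best_score = None, float("-inf")
    for t in range(min_len, max_len + 1):
        if pos + 2 * t > len(x):
            break
        a = x[pos:pos + t]
        b = x[pos + t:pos + 2 * t]
        dot = sum(p * q for p, q in zip(a, b))
        energy = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
        score = dot / energy if energy > 0 else 0.0
        if score > best_score:
            best_len, best_score = t, score
    return best_len


def cross_fade(a, b):
    """Overlap-add two equal-length segments, fading linearly from a to b."""
    n = len(a)
    return [(1.0 - i / n) * a[i] + (i / n) * b[i] for i in range(n)]


def picola_step(x, pos, min_len, max_len, expand=False):
    """Shorten (or, with expand=True, lengthen) the list x by one segment length."""
    t = best_segment_length(x, pos, min_len, max_len)
    if t is None:
        return list(x)
    a = x[pos:pos + t]
    b = x[pos + t:pos + 2 * t]
    if expand:
        # Insert a segment fading from b to a between the two originals,
        # so that both new joints stay continuous.
        return x[:pos + t] + cross_fade(b, a) + x[pos + t:]
    # Replace the two segments by one segment fading from a to b.
    return x[:pos] + cross_fade(a, b) + x[pos + 2 * t:]
```

For a strictly periodic input, the selected length coincides with the period, so removing or inserting one segment leaves the waveform shape unchanged. The sketch also shows why attack positions are not controlled: the location and length of each modification depend only on waveform correlation.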
In general, the conventional cut-and-splice method has the merit that acceptable sound quality can be expected for many types of sound sources. In the case of rhythm sources, however, it suffers from noticeable deterioration of sound quality such as "double beat" and "disorder in rhythm". The aforementioned publication teaches a cut-and-splice method that operates in synchronization with the rhythm of the original waveform. Even so, in some cases a segment cut from the original waveform contains two attacks. When such segments are spliced together to expand the waveform in time, a double-beat phenomenon occurs. In contrast, the PICOLA method does not in principle cause such a double-beat phenomenon, because time-scale modification is performed on the basis of the time correlation of the waveform. However, the PICOLA method does nothing to compensate for the attack positions on the waveform reproduced by time-scale modification. As a result, rhythm deviation occurs easily.