1. Field of the Invention
The present invention generally relates to a pitch/tempo converting method and a pitch/tempo converting apparatus for concurrently converting the pitch and tempo of an audio signal such as a music tone signal and a voice signal.
2. Description of Related Art
A cut and splice method is known as a typical pitch conversion technique for changing the pitch of a music tone or a voice. For example, as shown in FIG. 9, to lower the pitch of an original audio signal Si, the speed (rate) at which the sample values of the original audio signal Si are read is decreased to obtain a converted audio signal So. To raise the pitch of the original audio signal Si, the reading speed is increased. Since the sample values are discrete digital data, a sample value B corresponding to the original sampling point in the converted audio signal So must be calculated from a shifted sample value A by means of linear interpolation or the like, as shown in FIG. 10.
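The reading-speed change with linear interpolation described above may be illustrated by the following minimal sketch. The function name `read_at_rate` and its parameters are hypothetical and are not taken from the figures; `rate` is assumed to be the number of input samples advanced per output sample, so that a rate below 1 lowers the pitch and a rate above 1 raises it.

```python
def read_at_rate(samples, rate, count):
    """Read up to `count` output values from `samples`.

    `rate` is the (assumed) read speed: input samples advanced per
    output sample. Values that fall between two input samples are
    obtained by linear interpolation.
    """
    out = []
    for n in range(count):
        pos = n * rate            # fractional read position in the input
        i = int(pos)              # index of the sample just before pos
        frac = pos - i            # distance past that sample, 0 <= frac < 1
        if i + 1 >= len(samples):
            break                 # stop when no neighbour is left to interpolate with
        out.append((1.0 - frac) * samples[i] + frac * samples[i + 1])
    return out


if __name__ == "__main__":
    ramp = [0.0, 2.0, 4.0, 6.0]
    print(read_at_rate(ramp, 0.5, 4))   # slower reading (lower pitch)
    print(read_at_rate(ramp, 2.0, 2))   # faster reading (higher pitch)
```
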
The calculated sample data is read out successively at the original sampling interval without change, so the tempo of the original audio signal Si also changes as a side effect of the pitch change. To prevent this, a frame having a predetermined length T is defined as one processing unit as shown in FIG. 9. When the reading speed conversion of a predetermined number of samples has been completed in one frame, the same processing is repeated from a sample point to which reading jumps in the original audio signal Si. Consequently, when the pitch is lowered with this frame method, a part of the original audio signal Si is truncated. When the pitch is raised, a part of the original audio signal Si is reproduced in duplication to fill the frame.
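The frame-based processing may be sketched as follows, under stated assumptions. The name `cut_and_splice`, the expression of the frame length T as a sample count `frame_len`, and the holding of the last sample at the end of the signal are illustrative choices and are not taken from FIG. 9. Each frame restarts reading at the frame boundary of the original signal, so the output keeps the original length while the read speed inside each frame sets the pitch.

```python
def cut_and_splice(samples, rate, frame_len):
    """Frame-based pitch change that keeps the overall duration.

    Within each frame the input is read at `rate` with linear
    interpolation; at the end of the frame the read position jumps to
    the start of the next frame in the original signal.
    """
    out = []
    start = 0
    while start + frame_len <= len(samples):
        for n in range(frame_len):
            pos = start + n * rate
            i = int(pos)
            frac = pos - i
            if i + 1 < len(samples):
                value = (1.0 - frac) * samples[i] + frac * samples[i + 1]
            else:
                value = samples[-1]   # assumption: hold the last sample at the end
            out.append(value)
        start += frame_len            # jump: tempo is preserved
    return out


if __name__ == "__main__":
    ramp = [float(i) for i in range(8)]
    # rate < 1: input samples 2 and 3 are never reached (truncated part)
    print(cut_and_splice(ramp, 0.5, 4))
    # rate > 1: input samples 4 and 6 are read in two frames (duplicated part)
    print(cut_and_splice(ramp, 2.0, 4))
```
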
At a junction portion between consecutive frames, a discontinuity occurs in the waveform of the audio signal as shown in FIG. 9. This junction portion is smoothed by cross-fading. In the cross-fading, the reading start point of a frame of a first channel CH1 is shifted from that of a frame of a second channel CH2 by 1/2 of the frame period T as shown in FIG. 11. The above-mentioned operations are executed on each channel to obtain two channel audio signals. The two channel audio signals are multiplied by cross-fading coefficients cg1 and cg2, respectively, as shown in FIG. 11. The results of these multiplication operations are added together to smooth the junction of the successive frames.
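The cross-fading of the two channels may be sketched as follows. The exact shape of the coefficients cg1 and cg2 in FIG. 11 is not reproduced here; this sketch assumes complementary triangular coefficients that sum to 1, with cg1 equal to 0 at the frame junctions of CH1 and cg2 equal to 0 at the frame junctions of CH2, which lie half a frame later. The function names are hypothetical.

```python
def crossfade_gains(n, frame_len):
    """Return (cg1, cg2) for output sample n.

    Assumed triangular shape: cg1 is 0 at CH1 frame junctions and 1 at
    the middle of a CH1 frame; cg2 = 1 - cg1, so cg2 is 0 at the CH2
    junctions, which are offset by half a frame.
    """
    phase = (n % frame_len) / frame_len      # position in the CH1 frame, 0..1
    cg1 = 1.0 - abs(2.0 * phase - 1.0)
    return cg1, 1.0 - cg1


def crossfade(ch1, ch2, frame_len):
    """Weight the two channel signals by cg1 and cg2 and add them."""
    out = []
    for n, (a, b) in enumerate(zip(ch1, ch2)):
        cg1, cg2 = crossfade_gains(n, frame_len)
        out.append(cg1 * a + cg2 * b)
    return out


if __name__ == "__main__":
    print([crossfade_gains(n, 8) for n in range(8)])
```

Because each channel is weighted by zero exactly where its own frames join, the discontinuity of one channel is masked by the continuous portion of the other.
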
Tempo conversion is conducted by changing the reproduction speed of a music tone or a voice. The conventional tempo conversion simply changes the read speed of the digital sample data of the audio signal. In this simple tempo conversion, the change of the read speed causes a variation of the pitch as a side effect. To prevent this variation, pitch conversion that cancels the pitch variation must be combined with the tempo conversion. In this case too, interpolation is executed to calculate the sample values after the pitch conversion.
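The relationship between the read speed and the resulting pitch variation may be illustrated numerically as follows. The function names are hypothetical, and the expression of the pitch variation in semitones assumes the equal-tempered scale, in which doubling the frequency corresponds to 12 semitones; neither is taken from the original text.

```python
import math


def pitch_variation_semitones(read_speed):
    """Pitch variation caused by reading at `read_speed` times the
    original speed, in equal-tempered semitones (assumed unit)."""
    return 12.0 * math.log2(read_speed)


def cancelling_pitch_ratio(read_speed):
    """Pitch conversion ratio that cancels the variation above."""
    return 1.0 / read_speed


if __name__ == "__main__":
    speed = 2.0   # double-speed reproduction
    print(pitch_variation_semitones(speed))   # upward shift of one octave
    print(cancelling_pitch_ratio(speed))      # pitch must be halved to cancel it
```
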
When the tempo conversion is executed and the pitch conversion is additionally executed, as with "quick reproduction+raised pitch," the pitch conversion is intended not only to correct the pitch variation due to the tempo conversion but also to actively raise the pitch. Therefore, conventionally, the pitch conversion and the tempo conversion are executed separately as shown in FIG. 12. As shown, in a pitch converting module, the read speeds of the two channels are modified based on the adjustive pitch conversion for correcting the pitch variation due to the tempo conversion and based on the net pitch conversion by a designated pitch (steps S21 and S22). Subsequently, interpolation is executed on each of the channels (steps S23 and S24), the outputs of which are then cross-faded with each other (step S25). In a tempo converting module, read speed change processing based on a designated tempo is executed on the pitch-converted data (step S26). Then, the interpolation is executed again on the resultant data (step S27).
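The conventional two-stage flow may be sketched as follows. This is a simplified illustration and not the processing of FIG. 12 itself: it uses a single channel and omits the cross-fading of step S25, and the names `lerp_read`, `conventional_convert`, `pitch_ratio` and `tempo_ratio` are hypothetical. It assumes that the read speed in the pitch converting module is the designated pitch ratio divided by the tempo ratio, so that the adjustive and net pitch conversions are combined in one read speed. The sketch shows that interpolation is applied twice.

```python
def lerp_read(samples, start, rate, count):
    """Read `count` values from `samples`, starting at `start` and
    advancing `rate` input samples per output sample, with linear
    interpolation. Positions past the end hold the last sample."""
    out = []
    last = len(samples) - 1
    for n in range(count):
        pos = min(start + n * rate, float(last))
        i = int(pos)
        frac = pos - i
        nxt = min(i + 1, last)
        out.append((1.0 - frac) * samples[i] + frac * samples[nxt])
    return out


def conventional_convert(samples, pitch_ratio, tempo_ratio, frame_len):
    # Pitch converting module (cf. steps S21 to S24; cross-fade omitted).
    # Assumed read speed: net pitch conversion combined with the
    # adjustive conversion that cancels the later tempo conversion.
    pitch_rate = pitch_ratio / tempo_ratio
    pitched = []
    for start in range(0, len(samples), frame_len):
        count = min(frame_len, len(samples) - start)
        pitched.extend(lerp_read(samples, start, pitch_rate, count))  # 1st interpolation

    # Tempo converting module (cf. steps S26 and S27).
    out_len = int(len(pitched) / tempo_ratio)
    return lerp_read(pitched, 0, tempo_ratio, out_len)                # 2nd interpolation


if __name__ == "__main__":
    ramp = [float(i) for i in range(16)]
    print(conventional_convert(ramp, 1.0, 2.0, 4))   # double tempo, pitch kept
```
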
In the conventional pitch/tempo conversion, the pitch conversion and the tempo conversion require separate interpolating operations. Performing interpolation twice necessarily deteriorates the waveform of the audio signal, thereby lowering the quality of the reproduced audio signal. In addition, the conventional pitch/tempo conversion changes the read speeds separately in the pitch conversion and the tempo conversion. This causes redundant operations of a similar type, thereby presenting a problem of increased processing load.