1. Field of the Invention
The present invention relates to a waveform reproduction apparatus with which a waveform that has been compressed or expanded in the direction of the temporal axis is reproduced.
2. Description of the Related Art
For some time, waveform reproduction apparatuses with which a waveform that has been compressed or expanded in the direction of the temporal axis is reproduced have been known. For these waveform reproduction apparatuses, a number of formats have been proposed. Here an explanation will be given first regarding a waveform reproduction apparatus that uses a cross-fade format.
FIG. 12 is an explanatory diagram of a cross-fade format in which a musical tone is compressed or expanded in the direction of the temporal axis.
In the waveform reproduction apparatus that uses the cross-fade format, the waveform data that express the waveform of the musical tone are stored in a RAM that is not shown in the diagram. When the waveform data that have been stored in the RAM are read out, either, as is shown in FIG. 12(a), the waveform data in a specified segment (known as the "opened segment") are skipped (jump read) so that the waveform is compressed or, as is shown in FIG. 12(b), the waveform data in a specified segment (known as the "repeated segment") are read out repeatedly so that the waveform is expanded. By this means, it is possible to compress or expand the waveform while restraining changes in the pitch and preserving the pitch of the musical tone. In addition, with the cross-fade format, cross-fade processing is carried out in the vicinity of the discontinuities at the junctions between a particular segment and the segments that adjoin that segment in order to suppress the noise that is generated there.
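The skipped and repeated read-out described above can be sketched as follows. This is only a minimal illustration, assuming the waveform data are held in a Python list; the function names `skip_segment` and `repeat_segment` and the segment boundaries are hypothetical and do not appear in the original description.

```python
def skip_segment(samples, start, end):
    """Compress: read out the data while skipping samples[start:end]."""
    return samples[:start] + samples[end:]


def repeat_segment(samples, start, end, times=2):
    """Expand: read out samples[start:end] `times` times in total."""
    return samples[:start] + samples[start:end] * times + samples[end:]


# Hypothetical waveform data; the sample values are simply their own indices.
wave = list(range(10))
compressed = skip_segment(wave, 3, 6)
expanded = repeat_segment(wave, 3, 6)
```

Because the samples that remain are read out unchanged, the local shape of the waveform, and therefore the pitch, is kept; only the junctions at the segment boundaries are discontinuous.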
Here, cross-fade processing means processing in which the amplitude of the waveform that has been read out up to that point (referred to as the head waveform) is gradually reduced while the amplitude of the waveform that has newly begun to be read out (referred to as the later waveform) is gradually increased, so that a smooth transition is made from the head waveform to the later waveform.
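As an illustration of cross-fade processing, the following sketch mixes a head waveform and a later waveform with complementary gains. A linear gain ramp is assumed here for simplicity; the original description does not specify the shape of the ramp, and the function name `cross_fade` is hypothetical.

```python
def cross_fade(head, later):
    """Linearly cross-fade two equal-length lists of samples."""
    n = len(head)
    if n != len(later):
        raise ValueError("head and later segments must have equal length")
    out = []
    for i in range(n):
        gain = (i + 1) / (n + 1)  # gain of the later waveform rises toward 1
        out.append((1.0 - gain) * head[i] + gain * later[i])
    return out


# Fading from a constant 1.0 to a constant 0.0 over four samples.
faded = cross_fade([1.0] * 4, [0.0] * 4)
```

Since the two gains always sum to one, a signal that is identical in both segments passes through the cross-fade unchanged.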
However, with this cross-fade format, since the waveform data that represent a continuous musical tone waveform are themselves directly skipped or read out repetitively, there is a problem in that, even though cross-fade processing is carried out, fluctuations or ripples are produced in the waveform that has been compressed or expanded due to such things as a shift in the phase.
In order to solve this problem, a waveform reproduction apparatus called a phase vocoder has been proposed. Below, an explanation will be given in order regarding this phase vocoder.
With the phase vocoder, the original waveform, which expresses the original musical tone prior to compression or expansion, is input. The phase vocoder divides the original waveform that has been input into a multiple number of frequency bands.
FIG. 13 is a diagram that shows the multiple number of frequency bands that have been divided by a phase vocoder.
The original waveform that has been input is divided into a multiple number (here, 100) of frequency bands (band 0, 1, . . . , k, . . . , p, . . . , 99) whose center frequencies ω0, ω1, . . . , ωk, . . . , ωp, . . . , ω99 are, respectively, the fundamental frequency and the integer multiples of the fundamental frequency that represent its harmonics (the second harmonic, the third harmonic, etc.). In addition, for the waveform component of each of the frequency bands that have been divided, this phase vocoder extracts frequency data, which represent the frequency that changes from moment to moment together with the passage of time (known as the instantaneous frequency), and amplitude data, which represent the amplitude that changes from moment to moment together with the passage of time. The frequency data and the amplitude data that have been extracted in this manner are stored in the memory.
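The extraction of amplitude data and instantaneous frequency data for one band can be sketched as follows. This is not the analysis circuit of any particular phase vocoder; it is a minimal illustration that assumes a heterodyne (frequency shift to 0 Hz) followed by a Hann-weighted average as the band filter. The function name `analyze_band` and all parameter values are hypothetical.

```python
import cmath
import math


def analyze_band(x, fs, fc, win, hop):
    """Estimate amplitude and instantaneous frequency (Hz) of the band at fc.

    x: samples, fs: sampling rate (Hz), fc: band center frequency (Hz),
    win: analysis window length, hop: step between frames (in samples).
    """
    # Shift the band center down to 0 Hz (heterodyne).
    base = [x[n] * cmath.exp(-2j * math.pi * fc * n / fs) for n in range(len(x))]
    # A Hann-weighted average serves as the band-limiting low-pass filter.
    w = [0.5 - 0.5 * math.cos(2 * math.pi * (i + 0.5) / win) for i in range(win)]
    wsum = sum(w)
    env = [sum(w[i] * base[n + i] for i in range(win)) / wsum
           for n in range(0, len(base) - win + 1, hop)]
    amp = [2.0 * abs(z) for z in env]
    freq = []
    for a, b in zip(env, env[1:]):
        dphi = cmath.phase(b * a.conjugate())  # phase advance over one hop
        freq.append(fc + dphi * fs / (2 * math.pi * hop))
    return amp, freq


# Hypothetical test tone: amplitude 0.5 at 442 Hz, analyzed in a band at 440 Hz.
fs = 8000
tone = [0.5 * math.cos(2 * math.pi * 442.0 * n / fs) for n in range(4000)]
amp, freq = analyze_band(tone, fs, fc=440.0, win=400, hop=40)
```

With these assumed settings the amplitude estimates come out close to 0.5 and the frequency estimates close to 442 Hz, that is, the band center frequency plus the deviation measured from the phase advance.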
At the time of the reproduction of the waveform, the temporal change rates are adjusted for the frequencies and amplitudes that are expressed by the frequency data and the amplitude data that have been extracted in each frequency band.
FIG. 14 is a schematic diagram that shows the aspects of the frequency and amplitude temporal change rates that have been adjusted by the phase vocoder.
In FIG. 14(a), the amplitude envelope and the frequency envelope that are expressed by the amplitude data and the frequency data that change together with the passage of time in a certain single frequency band are shown. The temporal change rates of the frequency and the amplitude are adjusted in accordance with the degree of expansion or compression: as is shown in FIG. 14(b), the amplitude data and the frequency data are interpolated and the envelopes are expanded or, as is shown in FIG. 14(c), the amplitude data and the frequency data are culled and the envelopes are compressed. After the amplitude envelope and the frequency envelope have been adjusted in this manner for each frequency band, a cosine wave is generated for each frequency band by an oscillator whose frequency can be finely adjusted, with the frequency being varied about the center frequency of the band in accordance with the frequency envelope together with the passage of time. The amplitude of each cosine wave is varied in accordance with the amplitude envelope together with the passage of time and the waveforms that have been reproduced in this manner for all of the frequency bands are combined. In this manner, a reproduced waveform in which the original waveform that has been input has been compressed or expanded in the direction of the temporal axis is obtained.
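The adjustment of the envelopes and the reproduction by oscillators can be sketched as follows. The sketch assumes per-sample envelopes, linear interpolation for both expansion and compression, and an oscillator whose phase starts at zero; the names `stretch_envelope` and `resynthesize` are hypothetical and are not taken from the original description.

```python
import math


def stretch_envelope(env, ratio):
    """Resample an envelope (at least two points) to about ratio times its length."""
    n_out = max(2, round((len(env) - 1) * ratio) + 1)
    out = []
    for i in range(n_out):
        pos = i * (len(env) - 1) / (n_out - 1)  # position in the source envelope
        j = min(int(pos), len(env) - 2)
        frac = pos - j
        out.append((1 - frac) * env[j] + frac * env[j + 1])
    return out


def resynthesize(bands, fs, ratio):
    """bands: list of (amplitude envelope, frequency envelope in Hz) pairs."""
    mix = None
    for amp_env, freq_env in bands:
        amp = stretch_envelope(amp_env, ratio)
        freq = stretch_envelope(freq_env, ratio)
        phase = 0.0  # assumption: every oscillator starts at phase zero
        y = []
        for a, f in zip(amp, freq):
            y.append(a * math.cos(phase))
            phase += 2 * math.pi * f / fs  # accumulate the instantaneous frequency
        mix = y if mix is None else [p + q for p, q in zip(mix, y)]
    return mix


# One hypothetical band: constant amplitude 1.0 at 100 Hz, expanded by a factor of 2.
doubled = stretch_envelope([0.0, 1.0, 2.0], 2.0)
out = resynthesize([([1.0] * 100, [100.0] * 100)], fs=1000, ratio=2.0)
```

The output is about twice as long as the input envelopes, while the oscillator still runs at 100 Hz, which is the sense in which the duration changes and the pitch does not.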
In the phase vocoder that has been discussed above, the original waveform is divided into a multiple number of frequency bands, the temporal change rates of the frequencies and the amplitudes that change together with the passage of time are adjusted for each of the frequency bands that have been divided, and the temporal changes of the frequencies and the amplitudes following adjustment are reproduced, so that a reproduced waveform in which the original waveform has been compressed or expanded in the direction of the temporal axis is obtained. Because of this, compared to the case in which the waveform data that express the original waveform are themselves directly skipped or read out repetitively, as in the waveform reproduction apparatus that uses the cross-fade format, noise and fluctuations due to such things as a shift in the phase are reduced.
However, in this phase vocoder, for waveforms with a long period, such as those of voices and brass instruments, and for the waveforms of chords, if the expansion and compression rate, which represents the proportion of compression or expansion, departs greatly from 1.0 in either the compression direction or the expansion direction, the harmonic relationships of the musical tones that are expressed by the waveforms that have been compressed or expanded in the direction of the temporal axis break down. A detailed explanation of this phenomenon will be given below.
In the explanation of the phase vocoder given above, for the purpose of a theoretical description, the original waveform that has been input was described as being divided, as is shown in FIG. 13, into a frequency band that contains only the fundamental frequency, a frequency band that contains only the frequency that is twice the fundamental frequency, and so on, so that each single frequency band contains only one of the multiple number of frequency components that comprise the original waveform. However, this kind of method of division requires division into an extremely large number of frequency bands, so that an extremely large circuit becomes necessary or the time needed for the operations becomes extremely long, and it is not practical. Therefore, a division of the frequency bands in which a multiple number of the frequency components that comprise the original waveform are contained in a single frequency band is considered here.
FIG. 15 is a diagram that shows a multiple number of frequency bands and FIG. 16 is a diagram that shows the original waveform, which has the form of a pulse stream, prior to the division into the multiple number of frequency bands that are shown in FIG. 15. In addition, FIG. 17 is a diagram that shows the waveform in a single frequency band from among the multiple number of frequency bands that are shown in FIG. 15.
Here, as is shown in FIG. 16, the original waveform that is input into the phase vocoder comprises a periodic pulse stream that has a comparatively long period. The number of band divisions that is shown in FIG. 15 is smaller than the number of band divisions that is shown in FIG. 13 and, consequently, the bandwidth of each individual frequency band is wide. Because of this, as is shown in FIG. 15, a single divided band such as band k contains a multiple number of adjoining harmonics, that is, a multiple number of frequencies that are integer multiples of the fundamental frequency that corresponds to the fundamental period. The waveform in this band k is the waveform that is shown by the solid line in FIG. 17 and, as is shown by the broken line that represents the envelope, is a waveform that is amplitude modulated at the fundamental period T.
FIG. 18 and FIG. 19 are diagrams that show the aspects of the waveform components in band k that is shown in FIG. 17 in which the temporal change rates are adjusted so that the amplitude and the frequency change slowly. In addition, FIG. 20 is a diagram that shows the waveforms in band k after the temporal change rates of the amplitude and the frequency have been adjusted so that they are slow.
The broken lines a and b that are shown in FIG. 18 and FIG. 19 are the envelopes prior to the adjustment of the temporal change rates of the amplitude and the frequency in band k. In adjusting the temporal change rates of the amplitude and the frequency in band k so that they are slow, the amplitude data and the frequency data at each sampling point of the envelopes that are shown by the broken lines a and b are interpolated uniformly in the direction of the temporal axis and the envelopes are expanded as is shown by the solid lines A and B. In this manner, the waveform that is shown in FIG. 20, in which the temporal change rates of the amplitude and the frequency of band k have been adjusted so that they are slow, is obtained. Here, the fundamental period T' of the waveform that is shown in FIG. 20 is longer than the fundamental period T of the waveform that is shown in FIG. 17. When these kinds of waveforms are reproduced for each band and combined to obtain a waveform that has been expanded in the direction of the temporal axis, there is a problem in that the harmonic relationships of the original waveform are lost and the sound quality of the musical tone is lowered. In order to avoid this, it is necessary that the original waveform that is input be divided, as is shown in FIG. 13, into many frequency bands whose center frequencies are the fundamental frequency and the frequencies that are integer multiples of the fundamental frequency. However, when the original waveform is divided into a large number of frequency bands in this manner, as was discussed before, the amount of processing in the phase vocoder swells, the processing time becomes longer, the size of the circuit increases and, consequently, as a practical matter, the realization of the system becomes difficult.
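The lengthening of the fundamental period from T to T' and the resulting loss of the harmonic relationships can be illustrated with a simplified case. The following derivation is only a sketch: it assumes that band k contains exactly two adjoining harmonics of equal amplitude, that the (signed) amplitude envelope is expanded by a factor α, and that the frequency within the band is left unchanged. The symbols n, ω0 and α are introduced here for illustration.

```latex
% Assumption: band k holds two equal-amplitude harmonics n\omega_0 and (n+1)\omega_0.
\begin{align*}
x_k(t) &= \cos(n\omega_0 t) + \cos\big((n+1)\omega_0 t\big)
        = 2\cos\!\Big(\frac{\omega_0 t}{2}\Big)\cos\!\Big(\big(n+\tfrac12\big)\omega_0 t\Big),
        & T &= \frac{2\pi}{\omega_0} \\
x_k'(t) &= 2\cos\!\Big(\frac{\omega_0 t}{2\alpha}\Big)\cos\!\Big(\big(n+\tfrac12\big)\omega_0 t\Big),
        & T' &= \alpha T \\
        &= \cos\!\Big(\big(n+\tfrac12-\tfrac{1}{2\alpha}\big)\omega_0 t\Big)
         + \cos\!\Big(\big(n+\tfrac12+\tfrac{1}{2\alpha}\big)\omega_0 t\Big)
\end{align*}
```

For α = 2, for example, the two components lie at (n + 1/4)ω0 and (n + 3/4)ω0, which are no longer integer multiples of the fundamental frequency ω0.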
In addition, with the phase vocoder of the past that has been described above, the reproduction of the original sound as it is (hereafter, referred to as "one-to-one reproduction") is frequently carried out. In that case, the temporal change rates of the frequency and the amplitude and the pitch data are adjusted, for each of the multiple number of frequency bands into which the original waveform has been divided, so that neither compression nor expansion in the direction of the temporal axis is carried out, and one-to-one reproduction can be carried out. However, the phase data are not taken into consideration. Because of this, with one-to-one reproduction, waveforms are reproduced that have phases that are different from the phase of the waveform that expresses the original sound and, consequently, there are problems such as the fact that the tone quality is degraded and the localization of the stereo sound image is lost.
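The effect of not taking the phase data into consideration can be illustrated with a single component. In this sketch, which uses a hypothetical helper `oscillator` and arbitrarily chosen values, a component is reproduced from its amplitude and frequency alone, so that the oscillator starts from phase zero instead of from the phase of the original.

```python
import math


def oscillator(amp, freq_hz, fs, n, phase0=0.0):
    """Generate n samples of a cosine wave that starts at phase0 (radians)."""
    return [amp * math.cos(phase0 + 2 * math.pi * freq_hz * i / fs) for i in range(n)]


fs = 8000
# Original component: 200 Hz with an initial phase of 90 degrees (assumed values).
original = oscillator(1.0, 200.0, fs, 80, phase0=math.pi / 2)
# One-to-one reproduction that uses only the amplitude and frequency data.
rebuilt = oscillator(1.0, 200.0, fs, 80)
# Reproduction that also restores the original initial phase.
restored = oscillator(1.0, 200.0, fs, 80, phase0=math.pi / 2)

max_err_no_phase = max(abs(a - b) for a, b in zip(original, rebuilt))
max_err_with_phase = max(abs(a - b) for a, b in zip(original, restored))
```

Although the amplitude and the frequency are identical, the waveform reproduced without the phase differs from the original by as much as about 1.41 times the amplitude, whereas the waveform reproduced with the phase coincides with it. When the left and right channels of a stereo signal are reproduced in this way, the phase difference between the channels is likewise not kept.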