The present invention relates generally to signal processing, and more particularly to techniques for synthesizing time-domain signals by use of non-overlapping inverse Fourier transforms.
Sinusoids are fundamental building blocks used in the synthesis of waveforms for speech, audio, music, and other applications. It is known that a particular time domain signal can be decomposed into a sum of sinusoids, with each sinusoid having a particular amplitude, frequency, and phase. In fact, a time-domain signal can be fully represented by its corresponding frequency-domain spectrum.
In sinusoidal modeling or additive synthesis of speech, audio, or music signal, it is often necessary to synthesize and sum a large number of sinusoids with time-varying amplitude, frequency, and phase parameters. For example, an accurate representation of a low piano note can require over 100 sinusoids. Several techniques currently exist for the synthesis of sinusoids, including) wavetable synthesis and synthesis using overlapping Fourier transforms.
Wavetable synthesis is a popular technique for synthesizing waveforms. A wavetable synthesizer typically stores samples of a limited number of representative waveforms in a read-only memory (ROM) that are later retrieved and manipulated to generate the desired waveform. For example, a music wavetable synthesizer implementing a piano may store a set of representative notes (i.e., eight notes out of eighty-plus possible notes the piano is capable of playing). To synthesize a desired note, one of the representative notes is retrieved from memory, shifted in pitch to match that of the desired note, and converted to a desired output format (e.g., an analog signal). As can be seen, the cost to implement a wavetable synthesizer can be very high when large numbers of sinusoids need to be synthesized. Further, the need to determine and store representative waveforms can limit the use of the wavetable synthesizer to specific applications. Wavetable synthesizer is further described in U.S. Pat. No. 5,809,342.
Synthesis using overlapping inverse Fourier transforms is another technique for synthesizing waveforms. In this technique, the signal to be synthesized is partitioned into overlapping frames, with each frame including a number of samples from preceding and succeeding frames. The overlapping attempts to minimize the amount of discontinuity at the frame boundary. The signal is then synthesized frame by frame. Each frame typically includes a number of sinusoids, with each sinusoid corresponding to a "peak" in the frequency domain. For each frame, a peak is synthesized in the frequency domain for each of the sinusoids. The peaks in the frame are added together and an inverse Fourier transform is calculated to generate a time-domain frame. Consecutive time-domain frames are synthesized in the above-described manner, overlapped with adjacent frames, and added together with these frames. This technique is further described in U.S. Pat. No. 5,401,897.
The use of inverse Fourier transforms that overlap results in additional cost and can generate artifacts that degrade performance. For example, for implementations having fifty percent overlapping, half of the samples in any particular frame is from the preceding frame and the remaining half of the samples is from the succeeding frame. Overlapping the frames thus results in more frames being calculated per second of output signal. Moreover, it has been noted that artifacts can occur in the overlapping regions whenever the frequency of the sinusoids changes from one frame to the next, which commonly occurs. The artifacts include undesirable amplitude modulation that arises from summing sinusoids from adjacent frames having similar, but different frequencies. To counter this undesirable modulation, sweeping sinusoids can be generated such that the frequency of these sinusoids varies linearly (i.e., instead of being constant) within a particular frame or exhibits two sweep rates within one frame. The generation of sweeping sinusoids can significantly complicate the synthesis process and typically requires additional computations.
Thus, techniques that efficiently synthesize time-domain signals with reduced complexity and minimal amounts of artifacts are highly desirable.