This invention relates to the field of sound synthesis, specifically synthesis of a wide range of perceptually convincing sounds using parameterized sound models.
Sound synthesis can be applied in a wide variety of systems including, for example, virtual reality (VR), multi-media, computer gaming, and the world wide web. Applications where sound synthesis can be particularly useful include, for example, training, data analysis and auralization, multi-media documentation and instruction, and dynamic sound generation for computer gaming and entertainment.
Pre-Recorded, Pre-Digitized Sounds
Most current VR, multimedia, gaming and software simulation systems utilize pre-recorded, pre-digitized sounds rather than synthesized sounds. Pre-digitized sounds are static and cannot be changed in response to user actions or to changes within a simulation environment. Obtaining an application-specific sound sequence can be difficult and can require sophisticated sound editing hardware and software. There can be a 2000:1 ratio of field time to usable digitized sound; in other words, 2000 hours of field and editing time can be required to obtain 1 hour of application-specific digitized sound. Creating an acoustically rich virtual environment requires thousands of sounds and variations of those sounds. Thus, obtaining the vast digitized sound library required for rich and compelling acoustic experiences is impractical.
Wavetable synthesis is a pre-digitized sound method that is commonly used in synthesizer keyboards and PC sound cards. See, e.g., Pohlmann, “The Shifting Soundscape,” PC Magazine, Jan. 6, 1998. Sounds are digitized and stored in computer memory. When an application requests a particular sound sample, the sound is processed, played back and looped. This method has the same shortcomings as those discussed for pre-digitized sound: sounds are not dynamic in nature, and it is costly to obtain large quantities of digitized sounds.
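The looped playback described above can be sketched as follows. This is a minimal illustration only, not any particular product's implementation: the function names, the 256-entry table, and the use of a single sine cycle as a stand-in for a digitized sample are all assumptions made for the example.

```python
import math

def make_sine_table(size=256):
    """One cycle of a sine wave, standing in for a stored digitized sample (assumption)."""
    return [math.sin(2 * math.pi * i / size) for i in range(size)]

def wavetable_play(table, freq_hz, sample_rate, num_samples):
    """Loop over a stored table; the read-position step sets the playback pitch."""
    step = freq_hz * len(table) / sample_rate  # table entries advanced per output sample
    phase = 0.0
    out = []
    for _ in range(num_samples):
        out.append(table[int(phase) % len(table)])  # nearest-lower entry, no interpolation
        phase = (phase + step) % len(table)         # wrap around to loop the table
    return out

table = make_sine_table()
samples = wavetable_play(table, freq_hz=440.0, sample_rate=44100, num_samples=1000)
```

The stored table never changes, so the only variation available at playback time is how the table is read out. This is consistent with the limitation noted above that pre-digitized sounds are not dynamic.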
The alternative to using pre-digitized sound is to synthesize sounds as needed. Unfortunately, there are no sound synthesis methods available which can provide flexible, real-time sound synthesis of a wide variety of sounds. Some existing sound synthesis methods can synthesize a narrow range of sounds, but are not extensible to synthesize a wide variety of sounds.
Physical Modeling Synthesis
Physical characteristics of objects involved can be modeled to synthesize sound. The disadvantage to this approach is that the resulting models are not generalizable to many different types of sounds. In addition, unless very complex physical models are used, perceptually convincing synthesis is not achieved.
Gaver developed a parameterized model based on a simple physical equation for impact, scraping, breaking and bouncing sounds. See, e.g., Gaver, “Using and Creating Auditory Icons,” in G. Kramer (Ed.) Auditory display: Sonification, audification, and auditory interfaces, Reading, Mass., Addison-Wesley, 1994, pp. 417-446. Gaver's method yielded parameterized models, but did not produce perceptually convincing sounds.
Others have created parameterized synthesis models for impacts based on the physical equations of the objects involved. Doel's approach produced sounds that were not perceptually convincing, and the method was not generalizable to a wide class of sounds. Doel and Pai, “Synthesis of Shape Dependent Sounds with Physical Modeling,” Proceedings of the 1996 International Conference on Auditory Displays, November 4-6, Palo Alto, Calif. Cook's approach yielded perceptually convincing sounds, but parameterization was difficult and the resulting models were not generalizable. Cook, “Physically Informed Sonic Modeling (PhISM): Synthesis of Percussive Sounds,” Computer Music Journal, 21(3), 1997, pp. 38-49.
The digital waveguide method has been used for developing physical models of string, wind and brass instruments and the human singing voice. See, e.g., Cook, “Speech and Synthesis Using Physical Models: Some History and Future Directions,” Greek Physical Modeling Conference, 1995; Smith, “Physical Modeling using Digital Waveguides,” Computer Music Journal, Vol. 16, No. 4, 1992. The models involved are specific to one type of instrument and are extremely complex. Excellent quality music synthesis is obtained and some high-end synthesizer keyboards have been based on this technique. However, the technique is not extensible to general sound synthesis.
Spectral Synthesis
Other researchers have investigated spectral synthesis using Fourier analysis or the Short-Time Fourier Transform (STFT). See, e.g., Freed, Rodet, and Depalle, “Synthesis and Control of Hundreds of Sinusoidal Partials on a Desktop Computer without Custom Hardware,” Proceedings of the International Conference on Signal Processing, Applications and Technology (ICSPAT), 1993; Serra, “Spectral Modeling Synthesis: a Sound Analysis/Synthesis System Based on a Deterministic plus Stochastic Decomposition,” Computer Music Journal, Vol. 14, No. 4, Winter 1990. Spectral synthesis starts with Fourier analysis of a base sound. Fourier methods, however, do not adequately model the time-varying nature of real-world signals. STFTs capture the frequency information for different blocks of time, but the time resolution is limited and fixed by the choice of window size. Furthermore, stochastic components of sounds are often lost with STFT techniques, which reduces the realistic quality of subsequently synthesized sounds.
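The fixed window-size trade-off noted above can be made concrete with a short calculation. The sketch below assumes a 44.1 kHz sample rate and a few illustrative window lengths; the function name is hypothetical and the figures are nominal block durations and bin spacings, ignoring window shape and overlap.

```python
def stft_resolution(window_size, sample_rate):
    """Nominal time span (ms) and frequency-bin spacing (Hz) of one STFT window."""
    time_res_ms = 1000.0 * window_size / sample_rate  # duration covered by one block
    freq_res_hz = sample_rate / window_size           # spacing between frequency bins
    return time_res_ms, freq_res_hz

# Assumed sample rate and window sizes, chosen only for illustration.
for n in (256, 1024, 4096):
    t_ms, f_hz = stft_resolution(n, 44100)
    print(f"window={n:5d}  time={t_ms:7.2f} ms  freq={f_hz:7.2f} Hz")
```

A 1024-sample window at this rate spans roughly 23 ms with bins about 43 Hz apart. Lengthening the window narrows the bins but widens the time span, and the choice applies to every block of the signal.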
Freed investigated additive synthesis of sound analyzed with Fourier transforms. Freed's approach required summation of thousands of sinusoids for producing a single synthesized sound. Generalizable, parameterized sound models were not attained.
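For orientation, additive synthesis in its generic form sums sinusoidal partials sample by sample. The sketch below is not Freed's implementation; the function name, the (frequency, amplitude, phase) tuple layout, and the three example partials are assumptions made for illustration.

```python
import math

def additive_synth(partials, sample_rate, num_samples):
    """Sum sinusoidal partials; each partial is (freq_hz, amplitude, phase_rad)."""
    out = []
    for n in range(num_samples):
        t = n / sample_rate
        out.append(sum(a * math.sin(2 * math.pi * f * t + p) for f, a, p in partials))
    return out

# Three hypothetical harmonic partials; a realistic sound may need far more.
partials = [(220.0, 1.0, 0.0), (440.0, 0.5, 0.0), (660.0, 0.25, 0.0)]
tone = additive_synth(partials, sample_rate=44100, num_samples=512)
```

The cost per output sample grows linearly with the number of partials, which is why a sound that needs thousands of sinusoids is expensive to produce this way.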
Serra used STFTs to analyze musical instrument sounds. To preserve the realistic nature, stochastic components were added back in during synthesis. Serra's approach is not easily parameterizable or extensible due to limitations of the STFT.
FM synthesis is another spectral approach which combines two or more sinusoidal waves to form more complex waveforms. The sounds synthesized with this method are “electronic” and artificial-sounding. This method does not synthesize perceptually convincing natural sounds.
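A minimal two-sinusoid FM sketch is given below, in which one sinusoid (the modulator) varies the phase of another (the carrier). The function name, parameter names, and the example frequencies and modulation index are assumptions for illustration; the text above does not specify a particular FM formulation.

```python
import math

def fm_synth(carrier_hz, modulator_hz, index, sample_rate, num_samples):
    """Simple FM: y(t) = sin(2*pi*fc*t + index * sin(2*pi*fm*t))."""
    out = []
    for n in range(num_samples):
        t = n / sample_rate
        modulation = index * math.sin(2 * math.pi * modulator_hz * t)
        out.append(math.sin(2 * math.pi * carrier_hz * t + modulation))
    return out

# Hypothetical settings: a larger index spreads more energy into sidebands.
fm_tone = fm_synth(carrier_hz=440.0, modulator_hz=110.0, index=2.0,
                   sample_rate=44100, num_samples=512)
```

With the index set to zero the output reduces to a plain sine wave at the carrier frequency; raising the index produces a more complex spectrum from the same two oscillators.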
Accordingly, there is a need for sound synthesis methods that create perceptually convincing sound models for both pitched and stochastic-based sounds, are generalizable to synthesize a broad class of sounds, and can synthesize sound variations in real time.
The present invention provides a sound synthesis method that can create perceptually convincing sound models that are generalizable to synthesize a broad class of sounds (both pitched and stochastic based sounds), and can synthesize sound variations in real-time. The present method uses wavelet decomposition and synthesis for creating dynamic, parameterized models. The method is based on the spectral properties of a sound and takes the stochastic components of the sound into consideration for creating perceptually convincing synthesized sounds. Wavelet analysis provides a time-based windowing technique with variable-sized windows. Stochastic components are maintained through the analysis process and can be manipulated during parameterization and reconstruction. The result is generalizable sound models and perceptually convincing sound synthesis.
A wavelet decomposition can be used to obtain a wavelet representation of a digitized sound. The wavelet representation can then be parameterized, for example by grouping related wavelet coefficients. The parameterized wavelet representation can then be manipulated to generate a desired synthesized sound. An inverse wavelet transform can construct the synthesized sound, having the desired characteristics, from the parameterized wavelet representation after parameter manipulation. The synthesized sound can then be communicated, for example by generating audio signals or by storing for later use.
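The sequence of steps above (decompose, parameterize, manipulate, inverse transform) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the method as claimed: the Haar wavelet is used only because it is the simplest to write out, grouping coefficients by decomposition level is just one possible grouping, and all function names and the test signal are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_decompose(signal, levels):
    """Haar wavelet decomposition (assumed wavelet choice).
    Returns (approximation, details), details[0] being the finest level.
    The signal length must be divisible by 2**levels."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        a, d = [], []
        for x, y in zip(approx[0::2], approx[1::2]):
            a.append((x + y) / SQRT2)  # smooth (low-pass) coefficient
            d.append((x - y) / SQRT2)  # detail (high-pass) coefficient
        details.append(d)
        approx = a
    return approx, details

def scale_level(details, level, gain):
    """Parameter manipulation: treat one level as a group and scale it by gain."""
    return [[c * gain for c in d] if i == level else list(d)
            for i, d in enumerate(details)]

def haar_reconstruct(approx, details):
    """Inverse Haar transform: rebuild the signal from coarsest to finest level."""
    signal = list(approx)
    for d in reversed(details):
        rebuilt = []
        for a, dd in zip(signal, d):
            rebuilt.append((a + dd) / SQRT2)
            rebuilt.append((a - dd) / SQRT2)
        signal = rebuilt
    return signal

# Hypothetical 64-sample "digitized sound" made of two sinusoids.
sound = [math.sin(2 * math.pi * 5 * n / 64) + 0.3 * math.sin(2 * math.pi * 20 * n / 64)
         for n in range(64)]
approx, details = haar_decompose(sound, levels=3)
roundtrip = haar_reconstruct(approx, details)                     # unmodified parameters
variant = haar_reconstruct(approx, scale_level(details, 0, 0.5))  # finest group halved
```

With the parameters left untouched, the inverse transform returns the original samples to within floating-point error. Changing the gain of one coefficient group yields a variation of the sound from the same stored representation.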
Advantages and novel features will become apparent to those skilled in the art upon examination of the following description or may be learned by practice of the invention. The objects and advantages of the invention may be realized and attained by means of the instrumentalities and combinations particularly pointed out in the appended claims.