This invention relates to alteration of frequency components of an input signal spectrum to reduce the perceptibility of a component of the input signal. One application includes changing noise in an audio signal.
Microphones such as those used in commercial broadcasting, motion picture filming and communications are often susceptible to unwanted sounds made by background objects such as air conditioning systems, propellers, engines, fans, computer disk drives or shutter mechanisms on motion picture cameras. The microphone converts these unwanted sounds into an undesirable portion of the electrical signal it produces. Undesirable signals can also be introduced by a variety of other sources, such as quantization noise and electromagnetic interference.
Techniques exist for reducing the undesirable portion of a signal (i.e. noise). However, such techniques often distort the desirable portion of the signal or add audible artifacts to it, detracting from the effect of the technique. Consequently, such techniques only produce satisfactory results when the unwanted component of the signal is well below the desired portion of the signal, or in other words, when the signal-to-noise ratio is relatively high.
Current noise reduction techniques include a spectral subtraction method in which a noise spectrum is subtracted from the spectrum of the input signal. In this technique the processing is done entirely on the spectral magnitude of the input signal and the phase of the input signal is left unchanged. While this technique can offer a perceived reduction of the unwanted signal, it has been found that as the signal-to-noise ratio decreases, audible artifacts tend to become more and more noticeable in the output signal. Such artifacts include musical noise; incomplete or variable cancellation of the noise, which can produce modulation of the noise floor; timbral effects and/or loss of frequency components of the signal; missing sounds; loss of low level signal (speech) components; phase distortions; time aliasing; and pre-echoes and post-echoes, which can result in temporal smearing. For input signals having a relatively high signal-to-noise ratio these artifacts may not be too disturbing and may go unnoticed by the listener. However, at lower signal-to-noise ratios these artifacts are often more perceptually disturbing than the original unwanted signal, thus negating the value of the technique.
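As a rough illustration of the conventional spectral subtraction described above, the following Python sketch subtracts a fixed noise magnitude spectrum from one frame and reuses the phase of the noisy input. The function name `spectral_subtract`, the use of NumPy's real FFT, and the clamping of negative magnitudes to zero are assumptions made for this example only.

```python
import numpy as np

def spectral_subtract(frame, noise_mag):
    """Subtract a noise magnitude spectrum from one frame of time samples.

    frame     : 1-D array of time samples (hypothetical input frame)
    noise_mag : noise magnitude per rfft bin, length len(frame) // 2 + 1
    """
    spectrum = np.fft.rfft(frame)
    magnitude = np.abs(spectrum)
    phase = np.angle(spectrum)  # phase of the noisy input is left unchanged
    # Assumption: magnitudes that would go negative are clamped to zero.
    cleaned = np.maximum(magnitude - noise_mag, 0.0)
    return np.fft.irfft(cleaned * np.exp(1j * phase), n=len(frame))
```

Because the same `noise_mag` is applied to every frame, frames in which the noise is weaker than this estimate lose more of their content than necessary.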
Each of these artifacts is a result of two underlying assumptions upon which spectral subtraction is based, namely that the spectral magnitude of the noise is equal to the expected value of the noise and that the phase of the signal can be approximated by the phase of the noisy signal.
In addition, spectral subtraction is done with a constant subtraction spectrum (or adjustment frequency spectrum) which typically consists of the average (expected) value of the unwanted signal as measured over a short interval of time. This can result in overprocessing of the signal which further leads to the artifacts listed above. This can be particularly limiting to noise reduction in signals in which the noise in the input signal changes over time, especially if the noise changes in a repetitive way, such as the noise produced by a shutter mechanism of a motion picture camera. In such a case, the periods of the input signal that have a lower noise level will be processed in the same manner as the periods where the noise level is high. As a result, the lower noise periods will be overprocessed and the artifacts will be more noticeable.
Thus there is a need to alter frequency components of an input signal spectrum to reduce the perceptibility of a component of the input signal, while minimizing artifact production, particularly where the input signal has an undesirable repetitive component.
In accordance with one aspect of the invention, the present invention addresses the above need by providing a method and apparatus for changing the frequency content of an input spectrum, which involves adjusting frequency components of the input spectrum in response to a time varying adjustment frequency spectrum to produce an output frequency spectrum including adjusted frequency components of the input spectrum.
By using a time varying adjustment frequency spectrum to adjust the frequency content of the input spectrum, optimum adjustment frequency spectra can be used during corresponding periods of the input signal such that only the required amount of signal processing is done to adjust the frequency components of the input signal, for each segment of the input signal. This is particularly useful where the undesired component is a repetitive noise, such as sounds caused by a shutter mechanism of a motion picture camera, for example. The noise spectrum for the period during which the shutter is actuated is different than the noise spectrum for the period during which the shutter is inactive.
During periods when the shutter is actuated, the noise component may have a relatively high loudness level and a relatively strong high frequency content. When the shutter is inactive, the noise spectrum may merely consist of noise from a motor in the motion picture camera, for example, and may therefore have a greatly reduced loudness level and a greater low frequency content. By selectively using one or the other of these spectra to adjust the spectrum of the input signal during the corresponding periods of the input signal, only those frequency components of the input signal that coincide with noise frequency components present during that period are processed. Thus, a time varying adjustment spectrum is effectively generated and used to adjust the input spectrum. The number of spectra used to produce such a time varying adjustment spectrum for any one signal will depend upon the number of segments, or frames, into which the input signal is divided. There is no limit on this number.
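The selection of one of several stored noise spectra per frame can be sketched as follows. This is a minimal Python illustration, not the claimed method: it assumes the noise cycle is synchronized with the framing so that frame 0 coincides with the first stored spectrum, and it uses simple magnitude subtraction as the adjustment. The name `adjust_with_cycle` and both parameters are hypothetical.

```python
import numpy as np

def adjust_with_cycle(frame_mags, cycle_spectra):
    """Adjust each frame with the noise spectrum for its position in the cycle.

    frame_mags    : array of shape (num_frames, num_bins), input magnitudes
    cycle_spectra : array of shape (period, num_bins), one noise magnitude
                    spectrum per segment of the repeating noise cycle
    """
    frame_mags = np.asarray(frame_mags, dtype=float)
    cycle_spectra = np.asarray(cycle_spectra, dtype=float)
    period = len(cycle_spectra)
    out = np.empty_like(frame_mags)
    for i, mag in enumerate(frame_mags):
        # Assumption: frame i falls in segment i % period of the noise cycle.
        adjustment = cycle_spectra[i % period]
        out[i] = np.maximum(mag - adjustment, 0.0)
    return out
```

With two stored spectra (for example, shutter actuated and shutter inactive), even-numbered frames are adjusted with the first spectrum and odd-numbered frames with the second.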
In addition, the method and apparatus involve dividing the frequency components of the input spectrum into a plurality of frequency bands and applying a separate time varying adjustment spectrum to each band. This allows the adjustment process to be matched to the time-frequency distribution of the noise, such that frequencies of the input signal in some bands may be adjusted more greatly than frequencies in other bands, while the time varying adjustment frequency spectrum provides the best adjustment spectrum for the instant period of the input signal. In other words, the processing is divided in time and frequency so as to match the processing to the particular characteristics of the noise. This allows the noise reduction processing to track the cyclical or repetitive characteristics of the noise. This inherently reduces all forms of audible artifacts since the average amount of processing applied to the input signal is minimized. At time intervals and frequencies where the noise is loudest, more aggressive processing is employed. Elsewhere, less aggressive processing is applied. In general, this allows processing to occur with the best signal-to-noise ratios in each frequency band, resulting in fewer artifacts being produced.
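The banded adjustment for a single frame can be sketched in Python as below. The band layout (`band_edges` as bin indices), the per-band adjustment values, and the use of magnitude subtraction are assumptions chosen for illustration. In a time varying arrangement, the caller would supply a different `band_adjustments` list for each frame.

```python
import numpy as np

def adjust_by_band(mag, band_edges, band_adjustments):
    """Apply a separate adjustment to each frequency band of one frame.

    mag              : 1-D array of input magnitudes, one per frequency bin
    band_edges       : bin indices delimiting the bands, e.g. [0, 2, 6]
    band_adjustments : one adjustment per band (a scalar, or an array as
                       wide as the band); hypothetical values for this frame
    """
    mag = np.asarray(mag, dtype=float)
    out = mag.copy()
    bands = zip(band_edges[:-1], band_edges[1:])
    for (lo, hi), adj in zip(bands, band_adjustments):
        # Each band is processed independently of the others.
        out[lo:hi] = np.maximum(mag[lo:hi] - adj, 0.0)
    return out
```

For example, `adjust_by_band([5, 5, 5, 5, 5, 5], [0, 2, 6], [1.0, 4.0])` processes the upper band more aggressively than the lower band.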
The method and apparatus may also include a perceptual model. The purpose of the perceptual model is to determine which portions of the unwanted signal are being masked by the desired signal and which are not being masked. Masking is the phenomenon that occurs in the human auditory system by which a signal which would otherwise be audible is rendered inaudible by the presence of another signal. By including a perceptual model in the processing, only the audible portion of the unwanted signal is removed, and thus the overall adjustment of frequencies applied to the input signal is further reduced. As a result, the artifacts that result from adjusting these frequencies are reduced. The perceptual model produces the adjustment frequency spectrum using an input spectrum derived from the input signal, and a reference signal, or reference spectrum. The reduction of artifacts may be achieved with or without the time varying adjustment frequency spectrum, or frequency banding described above.
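To make the role of masking concrete, the following Python sketch derives an adjustment spectrum that covers only the part of the reference (noise) spectrum that exceeds a masking threshold. It is a deliberately crude stand-in for a perceptual model: it assumes masking acts independently in each bin at a fixed offset below the estimated desired signal, whereas practical psychoacoustic models also account for spreading across neighbouring frequencies and the absolute threshold of hearing. The function name and the `masking_offset_db` parameter are hypothetical.

```python
import numpy as np

def audible_noise_adjustment(input_mag, reference_mag, masking_offset_db=12.0):
    """Return an adjustment spectrum covering only unmasked noise.

    input_mag     : magnitude spectrum derived from the input signal
    reference_mag : reference (noise) magnitude spectrum
    """
    input_mag = np.asarray(input_mag, dtype=float)
    reference_mag = np.asarray(reference_mag, dtype=float)
    # Assumption: the desired signal is estimated by plain subtraction.
    desired = np.maximum(input_mag - reference_mag, 0.0)
    # Assumption: noise more than masking_offset_db below the desired
    # signal in the same bin is treated as inaudible.
    threshold = desired * 10.0 ** (-masking_offset_db / 20.0)
    return np.maximum(reference_mag - threshold, 0.0)
```

Bins in which the desired signal is strong yield a zero adjustment, so the input is left untouched there. Bins containing only noise receive the full reference value.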
In accordance with another aspect of the invention, the present invention further addresses the above need by providing a method and apparatus for reducing the perceptibility of a component of an input signal, involving adjusting frequency components of a first analysis windowed frame of input time samples in response to an adjustment frequency spectrum, to produce an output frame of output time samples and synthesis windowing the output frame to produce a synthesis windowed frame of output time samples representing a time varying output signal having reduced perceptibility of the component.
In one embodiment, there is an overlapping synthesis window having boundaries and a zero-tending taper at the boundaries, which reduces artifacts at the edges of sample frames of time samples taken of the input signal. Preferably, the input samples are windowed by an overlapping analysis window and then a first time-to-frequency domain conversion is performed on the first analysis windowed frame of input samples to produce an input spectrum which can be adjusted by an adjustment processor. In this embodiment, the time-to-frequency domain conversion is done using a Discrete Fourier Transform (DFT) although it will be appreciated that other methods are possible, including other transforms or filter banks, for example.
The overlapping analysis window has the effect of dividing the input signal into overlapping frames for processing. The output signal is then windowed by the overlapping synthesis window prior to an overlap-and-add process. Preferably, the overlapping analysis and overlapping synthesis windows are chosen such that the combination of the overlapping analysis windows and the overlapping synthesis windows has no net effect on the signal. That is, the analysis and synthesis windows, as well as the amount of overlapping, are chosen such that, in the absence of any intermediate processing, the output signal is identical to the input signal. To accomplish this, the product of the analysis and synthesis windows, summed across overlapping frames, must equal unity. However, this may result in errors in the adjustment frequency spectrum generated when the method and apparatus include a perceptual model. If the first analysis windowed frame of samples is used as the input to the perceptual model, then the perceptual model may not have an accurate representation of the input spectrum since the spectrum reaching the perceptual model does not include the effects of the synthesis window. That is, the perceptual model would predict the adjustment frequency spectrum using data that has only been windowed by the analysis window. Therefore, a second overlapping analysis window is employed to produce a second analysis windowed frame of input samples for use by a perceptual model which operates on both the second analysis windowed frame of input samples and a reference frequency spectrum to produce the adjustment frequency spectrum.
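The unity condition on the analysis and synthesis windows can be checked with a short Python sketch. The window choice here is an assumption made for illustration: a square-root periodic Hann window is used for both analysis and synthesis with 50% overlap, so that the product of the two windows sums to one across overlapping frames. The names `sqrt_hann` and `overlap_add_passthrough` are hypothetical.

```python
import numpy as np

def sqrt_hann(n):
    """Square root of a periodic Hann window; tapers to zero at the start."""
    return np.sin(np.pi * np.arange(n) / n)

def overlap_add_passthrough(signal, frame_len=8):
    """Window, transform, inverse transform and overlap-add with no
    intermediate processing of the spectrum."""
    signal = np.asarray(signal, dtype=float)
    hop = frame_len // 2  # assumption: 50% overlap between frames
    window = sqrt_hann(frame_len)
    out = np.zeros(len(signal))
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window  # analysis window
        spectrum = np.fft.rfft(frame)                     # time to frequency
        # An adjustment processor would modify `spectrum` at this point.
        frame_out = np.fft.irfft(spectrum, n=frame_len)   # frequency to time
        out[start:start + frame_len] += frame_out * window  # synthesis window
    return out
```

Away from the first and last half frame, where only one frame contributes, the output of `overlap_add_passthrough` matches the input to within floating-point precision.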
Thus, an accurate input spectrum can be applied to the perceptual model, permitting the perceptual model to take the full effect of masking into account, thereby reducing the amount of processing required on some frequency components and reducing the potential for artifact creation.
The use of the overlapping analysis and overlapping synthesis windows with the perceptual model may be further combined with the above-described time varying reference frequency spectrum, in which case the perceptual model produces a time varying adjustment frequency spectrum for use by the adjustment processor to achieve the benefits of sub-framing described above. Furthermore, the input spectrum may be banded as described above to obtain the benefits of matching the time-frequency distribution of the unwanted component to the spectral adjustment process.