The invention relates to the now very popular field of karaoke entertaining. In karaoke a (usually amateur) singer performs live in front of an audience with background music. One of the challenges of this activity is to come up with the background music, i.e. get rid of the original singer""s voice to retain only the instruments so the amateur singer""s voice can replace that of the original singer. A very inexpensive (but somewhat unsophisticated) way in which this can be achieved consists of using a stereo recording and making the assumption (usually true) that the voice is panned in the center (i.e. that the voice was recorded in mono and added to the left and right channels with equal level). In that case the voice can be significantly reduced by subtracting the left channel from the right channel, resulting in a mono recording from which the voice is nearly absent (because stereo reverberation is usually added after the mix a faint reverberated version of the voice is left in the difference signal). There are several drawbacks to this technique:
1) The output signal is always monophonic. In other words it is not possible using this standard technique to recover a stereo signal from which the voice has been removed.
2) More often than not, other instruments are also panned in the center (bass guitar, bass drum, horns and so on), and the standard technique will also remove them, which is undesirable.
The standard method does not allow extracting or amplifying the voice in the original recording: it is sometimes very useful to be able to remove the background instruments from the original recording and retain only the voice (for example, to change the mixing level of the voice or to aid a pitch-extraction system targeted at the voice).
According to one aspect of the present invention, a phase-vocoder removes the voice or the background instruments from a stereo recording while retaining a stereo output signal. Furthermore, because of the frequency-domain nature of the phase-vocoder, it is possible to more effectively discriminate, based on their frequency contents, the voice from other instruments also panned in the center.
According to a further aspect of the invention, peak frequencies are determined where the magnitude of the frequency domain spectra is at a maximum.
According to another aspect of the invention, a difference spectra is derived from the frequency domain spectra of the left and right stereo channels at the peak frequencies. An attenuating gain factor for each peak frequency is then calculated which is a function of the magnitude of the difference spectra at the peak frequency. For frequencies of voice signals, or other signals panned to center, the magnitude of difference spectra will be much less than that of the left or right channels.
According to another aspect of the invention, a modified spectra is derived by multiplying the magnitude of the frequency domain spectra by the attenuating gain factor at each peak frequency. The magnitude of the modified spectra at frequencies for voice, or other signals panned to center, will be small.
According to another aspect of the invention, the attenuation gain is set to unity for frequency components outside the voice range so that non-voice music panned to center is not attenuated.
According to another aspect of the invention, regions of influence are defined about each peak frequency. The magnitude of the frequency spectra within each region of influence is multiplied by the gain factor for the peak frequency.
According to another aspect of the invention, frequencies of voice, or of other signals panned to center, are amplified by utilizing an amplifying gain factor inversely proportional to the magnitude of the gain factor at each peak frequency. For example, the amplifying gain factor can be set equal to the difference of one and the attenuating gain factor.
Other features and advantages of the invention will be apparent in view of the following detailed description and appended drawings.