1. Field of the Invention
The present invention relates to a convolution method and convolution device for convoluting into an audio signal a head-related transfer function (hereafter abbreviated to “HRTF”) for enabling a listener to hear a sound source situated in front or the like of the listener, during acoustic reproduction with an electric-acoustic unit such as an acoustic reproduction driver of headphones for example, which is disposed near the ears of the listener.
2. Description of the Related Art
In a case of the listener wearing the headphones on the head for example, and listening to acoustically reproduced signals with both ears, if the audio signals reproduced at the headphones are commonly-employed audio signals supplied to speakers disposed to the left and right in front of the listener, the so-called lateralization phenomenon, wherein the reproduced sound image stays within the head of the listener, occurs.
A technique called virtual sound image localization is disclosed in WO95/13690 Publication and Japanese Unexamined Patent Application Publication No. 03-214897, for example, as having solved this problem of the lateralization phenomenon. This virtual sound image localization enables the sound image to be reproduced (virtually localized in the relevant position) such that when reproduced with a headphone or the like, the sound image is reproduced as if there were a sound source, e.g., speakers in a predetermined perceived position, such as the left and right in front of the listener, and is realized as described below.
FIG. 30 is a diagram for describing a technique of virtual sound image localization in a case of reproducing two-channel stereo signals of left and right with two-channel stereo headphones, for example.
As shown in FIG. 30, at a position nearby both ears of the listener regarding which placement of two acoustic reproduction drivers such as two-channel stereo headphones for example (an example of an electro-acoustic conversion unit) is assumed, microphones (an example of an acousto-electric conversion unit) ML and MR are disposed, and also speakers SPL and SPR are disposed at positions at which virtual sound image localization is desired.
In a state where a dummy head 1 (alternatively, this may be a human, the listener himself/herself) is present, an acoustic reproduction of an impulse for example, is performed at one channel, the left channel speaker SPL for example, and the impulse emitted by that reproduction is picked up with each of the microphones ML and MR and an HRTF for the left channel is measured. In the case of this example, the HRTF is measured as an impulse response.
In this case, the impulse response serving as the left channel HRTF includes, as shown in FIG. 30, an impulse response HLd of the sound waves from the left channel speaker SPL picked up with the microphone ML (hereinafter, referred to as “impulse response of left primary component”), and an impulse response HLc of the sound waves from the left channel speaker SPL picked up with the microphone MR (hereinafter, referred to as “impulse response of left crosstalk component”).
Next, an acoustic reproduction of an impulse is performed at the right channel speaker SPR in the same way, and the impulse emitted by that reproduction is picked up with each of the microphones ML and MR and an HRTF for the right channel, i.e., the HRTF of the right channel, is measured as an impulse response.
In this case, the impulse response serving as the right channel HRTF includes an impulse response HRd of the sound waves from the right channel speaker SPR picked up with the microphone MR (hereinafter, referred to as “impulse response of right primary component”), and an impulse response HRc of the sound waves from the right channel speaker SPR picked up with the microphone ML (hereinafter, referred to as “impulse response of right crosstalk component”).
The impulse responses for the HRTF of the left channel and the HRTF of the right channel are convoluted, as they are, with the audio signals supplied to the acoustic reproduction drivers for the left and right channels of the headphones, respectively. That is to say, the impulse response of left primary component and impulse response of left crosstalk component, serving as the left channel HRTF obtained by measurement, are convoluted, as they are, with the left signal audio signals, and the impulse response of right primary component and impulse response of right crosstalk component, serving as the right channel HRTF obtained by measurement, are convoluted, as they are, with the right signal audio signals.
This enables sound image localization (virtual sound image localization) such that sound is perceived to be just as if it were being reproduced from speakers disposed to the left and right in front of the listener in the case or two-channel stereo audio of left and right for example, even though the acoustic reproduction is nearby the ears of the listener.
A case of two channels has been described above, but with a case of three or more channels, this can be performed in the same way by disposing speakers at the virtual sound image localization positions for each of the channels, reproducing impulses for example, measuring the HRTF for each channel, and convolute impulse responses of the HRTFs obtained by measurement as to the audio signals supplied to the drivers for the acoustic reproduction by the two channels, left and right, of the headphones.