1. Field of the Invention
The present invention relates to an audio signal processing device and an audio signal processing method that perform audio signal processing for enabling audio signals of two or more channels, such as those of a multi-channel surround scheme, to be acoustically reproduced by electro-acoustic transducing means for two channels arranged, for example, in a television device. More particularly, the present invention relates to a technique for allowing sound to be heard as if sound sources were present in previously supposed positions, such as positions in front of a listener, when audio signals are acoustically reproduced by electro-acoustic transducing means, such as left and right speakers arranged in a television device.
2. Description of the Related Art
For example, a technique called virtual sound localization is disclosed in Patent Literature 1 (WO95/13690) or Patent Literature 2 (Japanese Patent Laid-open Publication No. 03-214897).
Virtual sound localization allows sound that is reproduced, for example, by left and right speakers arranged in a television device to be heard as if sound sources, such as speakers, were present in previously supposed positions, such as left and right positions in front of a listener (that is, a sound image is virtually localized in those positions). Virtual sound localization is realized as follows.
FIG. 20 is a diagram illustrating a virtual sound localization technique in a case in which a left and right 2-channel stereo signal is reproduced, for example, by left and right speakers arranged in a television device.
For example, microphones ML and MR are installed in positions near both ears of a listener (measurement point positions), as shown in FIG. 20. Further, speakers SPL and SPR are arranged in positions where virtual sound localization is desired. Here, the speaker is one example of an electro-acoustic transducing unit and the microphone is one example of an acoustic-electric conversion unit.
In a state in which a dummy head 1 (or a person, i.e., a listener) is present, an impulse is first acoustically reproduced by the speaker SPL of one channel, e.g., a left channel. The impulse generated by the acoustic reproduction is picked up by the respective microphones ML and MR to measure a head-related transfer function for the left channel. In the case of this example, the head-related transfer function is measured as an impulse response.
In this case, the impulse response as the head-related transfer function for the left channel includes an impulse response HLd of a sound wave from the left channel speaker SPL picked up by the microphone ML (hereinafter, an impulse response of a left main component), and an impulse response HLc of a sound wave from the left channel speaker SPL picked up by the microphone MR (hereinafter, an impulse response of a left crosstalk component), as shown in FIG. 20.
Next, the impulse is similarly acoustically reproduced by the right channel speaker SPR, and the impulse generated by the reproduction is picked up by the microphones ML and MR. A head-related transfer function for the right channel, i.e., an impulse response for the right channel, is measured.
In this case, the impulse response as the head-related transfer function for the right channel includes an impulse response HRd of a sound wave from the right channel speaker SPR picked up by the microphone MR (hereinafter, referred to as an impulse response of a right main component), and an impulse response HRc of a sound wave from the right channel speaker SPR picked up by the microphone ML (hereinafter, referred to as an impulse response of a right crosstalk component).
The impulse responses of the head-related transfer functions for the left channel and the right channel obtained by the measurement are directly convolved with audio signals to be supplied to the left and right speakers arranged in the television device. That is, the audio signal of the left channel is directly convolved with the impulse response of the left main component and the impulse response of the left crosstalk component, which are the head-related transfer functions for the left channel obtained by the measurement. Likewise, the audio signal of the right channel is directly convolved with the impulse response of the right main component and the impulse response of the right crosstalk component, which are the head-related transfer functions for the right channel obtained by the measurement.
By doing so, for left and right 2-channel stereo sound, for example, the sound can be localized (virtual sound localization) as if acoustic reproduction were performed by left and right speakers installed in desired positions in front of the listener, even though the acoustic reproduction is actually performed by the left and right speakers arranged in the television device.
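The 2-channel processing described above can be illustrated with a minimal sketch. It assumes that the four measured impulse responses are available as sample arrays, and that each main component is routed to the same-side output while each crosstalk component is added to the opposite-side output, following the microphone that picked it up in FIG. 20. The function name virtualize_stereo and its parameter names are hypothetical and chosen only for illustration.

```python
import numpy as np

def virtualize_stereo(left, right, h_ld, h_lc, h_rd, h_rc):
    """Convolve a 2-channel signal with measured impulse responses.

    h_ld, h_lc: left main / left crosstalk responses (HLd, HLc).
    h_rd, h_rc: right main / right crosstalk responses (HRd, HRc).
    Assumption: both signals share one length and all four responses
    share one length, so the convolved terms can be summed directly.
    """
    # Left output: left main component plus right crosstalk component.
    out_left = np.convolve(left, h_ld) + np.convolve(right, h_rc)
    # Right output: right main component plus left crosstalk component.
    out_right = np.convolve(right, h_rd) + np.convolve(left, h_lc)
    return out_left, out_right
```

With a unit impulse on the left channel and silence on the right channel, the two outputs simply reproduce HLd and HLc, which is a convenient check that the routing matches the measurement arrangement.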
The case of 2 channels has been described above. For multiple channels, such as 3 or more channels, speakers are similarly arranged in the virtual sound localization positions of the respective channels, an impulse, for example, is reproduced, and head-related transfer functions for the respective channels are measured. The impulse responses of the head-related transfer functions obtained by the measurement may then be convolved with the audio signals to be supplied to the left and right speakers arranged in the television device.
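The extension to 3 or more channels can be sketched in the same way. The sketch assumes that, for each channel, one impulse response toward the left measurement point and one toward the right measurement point have been measured, and that the convolved results of all channels are summed into the two signals supplied to the left and right speakers. The function name virtualize_channels, the dictionary layout, and the channel labels are hypothetical.

```python
import numpy as np

def virtualize_channels(signals, responses):
    """Mix any number of channels down to two virtualized outputs.

    signals:   dict mapping a channel name to its sample sequence.
    responses: dict mapping the same name to a pair
               (response toward left ear, response toward right ear).
    Returns an array of shape (2, n) holding the left and right outputs.
    """
    # Output length is that of the longest full convolution result.
    n_out = max(
        len(signals[name]) + len(h) - 1
        for name in signals
        for h in responses[name]
    )
    out = np.zeros((2, n_out))
    for name, x in signals.items():
        for ear, h in enumerate(responses[name]):
            y = np.convolve(x, h)
            out[ear, : len(y)] += y  # accumulate this channel's contribution
    return out
```

For example, a call with signals such as {"C": ..., "SL": ...} and matching response pairs yields row 0 for the left speaker and row 1 for the right speaker.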
Meanwhile, recently, in acoustic reproduction involved in video reproduction of a digital versatile disc (DVD), a surround scheme for multiple channels, such as 5.1 channels or 7.1 channels, has been used.
For the case in which an audio signal of such a multi-channel surround scheme is acoustically reproduced by left and right speakers arranged in a television device, it has also been proposed to perform sound localization for each channel using the above-described virtual sound localization technique.