1. Field of the Invention
The present general inventive concept relates to a stereo sound system, and more particularly, to a stereo sound generation apparatus and method of generating virtual sound sources for two-channel audio signals while adjusting output gains and time delays for remaining channel audio input signals such that a natural stereo perception can be provided.
2. Description of the Related Art
Generally, an audio reproduction system can provide a surround sound effect, such as that of a 5.1 channel system, by using only two speakers.
A conventional stereo sound generation system for reproducing 5.1 channel audio through 2-channel speakers is described in WO 99/49574 (PCT/AU99/00002, filed 6 Jan. 1999, entitled, “AUDIO SIGNAL PROCESSING METHOD AND APPARATUS”).
FIG. 1 is a block diagram illustrating the conventional stereo sound generation system 1. Referring to FIG. 1, the conventional stereo sound generation system 1 includes a part that convolves an input signal with an impulse response, using a head related transfer function (HRTF) as a down-mixing technique to generate a 5.1-channel stereo feeling through 2-channel speakers, and a part that adds the convolved signals to two channels.
Referring to FIG. 1, 5.1 channel audio signals are input. The 5.1 channels include a left front channel 2, a right front channel, a center front channel, a left surround channel, a right surround channel, and a low frequency effect (LFE) channel. In relation to the left front channel 2, a corresponding left front impulse response function 4 is convolved with a left front signal 3. The left front impulse response function 4, which uses the HRTF, is the impulse response received by a left ear of a listener when an ideal spike is output from a left front channel speaker placed at an ideal position. The resulting output signal 7 is added to a left channel signal 10 for a headphone. Similarly, an impulse response function 5 corresponding to a right ear of the listener for a right channel speaker is convolved with the left front signal 3 in order to generate an output signal 9 to be added to a right channel signal 11.
Accordingly, the audio signals of the left front channel 2, the right front channel, the center front channel, the left surround channel, the right surround channel, and the LFE channel are each convolved with corresponding impulse responses, such that two signals, i.e., a left signal and a right signal, are generated for each channel. Then, the left signals of the six channels are added to each other and the right signals of the six channels are added to each other, such that 2-channel output signals are finally obtained.
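The convolve-and-sum down-mix described above can be illustrated with a minimal sketch. The function names (`convolve`, `downmix`), the channel labels, and the two-tap impulse responses are hypothetical placeholders chosen for illustration; they are not taken from WO 99/49574, and real HRTF impulse responses are much longer.

```python
from typing import Dict, List, Sequence, Tuple


def convolve(signal: Sequence[float], impulse: Sequence[float]) -> List[float]:
    """Direct-form linear convolution of a signal with an impulse response."""
    out = [0.0] * (len(signal) + len(impulse) - 1)
    for n, s in enumerate(signal):
        for k, h in enumerate(impulse):
            out[n + k] += s * h
    return out


def mix(parts: List[List[float]]) -> List[float]:
    """Add several signals sample by sample, padding shorter ones with zeros."""
    out = [0.0] * max(len(p) for p in parts)
    for part in parts:
        for i, value in enumerate(part):
            out[i] += value
    return out


def downmix(
    channels: Dict[str, Sequence[float]],
    impulse_responses: Dict[str, Tuple[Sequence[float], Sequence[float]]],
) -> Tuple[List[float], List[float]]:
    """Convolve each input channel with its (left-ear, right-ear) impulse
    responses, then sum all left results and all right results."""
    left_parts, right_parts = [], []
    for name, signal in channels.items():
        h_left, h_right = impulse_responses[name]
        left_parts.append(convolve(signal, h_left))
        right_parts.append(convolve(signal, h_right))
    return mix(left_parts), mix(right_parts)


# Toy example with two of the six channels and made-up impulse responses.
channels = {"left_front": [1.0, 0.5], "right_front": [0.0, 1.0]}
impulse_responses = {
    "left_front": ([1.0, 0.0], [0.0, 0.5]),
    "right_front": ([0.0, 0.5], [1.0, 0.0]),
}
left_out, right_out = downmix(channels, impulse_responses)
print(left_out)   # [1.0, 0.5, 0.5]
print(right_out)  # [0.0, 1.5, 0.25]
```

The same structure extends to all six channels by adding further entries to both dictionaries.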
When the 2-channel output signals are reproduced, the two actual speakers generate a stereo feeling as if virtual left front, right front, center, left surround, and right surround speakers were disposed around the listener.
However, according to the conventional stereo sound generation system 1 illustrated in FIG. 1, if a correlation between the left surround channel and the right surround channel is high, it is difficult to generate a sound image at a rear of the listener.
Here, a high correlation indicates that the sound characteristics of the two surround channels are almost the same. The reason why it is difficult to generate a sound image at the rear of the listener when the correlation is high is explained as follows.
A virtual sound source is formed using an HRTF, which characterizes an acoustic signal at the ears of the listener depending on the shapes of the head and the ears of the listener. With the HRTF, 3-dimensional audio can be perceived because the characteristics of complicated paths, such as diffraction at the surface of the listener's head and reflection by a pinna, vary with respect to an incident direction of sound, in addition to the simple path differences, such as an inter-aural level difference (ILD) and an inter-aural time difference (ITD).
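The two simple inter-aural cues named above can be illustrated with a short sketch. It assumes the ILD is estimated as an energy ratio in decibels and the ITD as the lag that maximizes the cross-correlation between the two ear signals; these estimators and the function names `ild_db` and `itd_samples` are illustrative assumptions, not definitions given in this description.

```python
import math
from typing import Sequence


def ild_db(left: Sequence[float], right: Sequence[float]) -> float:
    """Inter-aural level difference as a left/right energy ratio in dB."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    if energy_left == 0.0 or energy_right == 0.0:
        raise ValueError("both ear signals must have non-zero energy")
    return 10.0 * math.log10(energy_left / energy_right)


def itd_samples(left: Sequence[float], right: Sequence[float], max_lag: int) -> int:
    """Inter-aural time difference in samples: the lag that maximizes the
    cross-correlation. A positive value means the right-ear signal arrives
    later than the left-ear signal."""
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = sum(
            left[n] * right[n + lag]
            for n in range(len(left))
            if 0 <= n + lag < len(right)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Toy ear signals: the right ear hears the same pulse two samples later
# and at half the amplitude, as for a source on the listener's left.
left_ear = [0.0, 1.0, 0.0, 0.0, 0.0]
right_ear = [0.0, 0.0, 0.0, 0.5, 0.0]
print(itd_samples(left_ear, right_ear, max_lag=3))  # 2
print(round(ild_db(left_ear, right_ear), 2))        # 6.02
```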
However, although the HRTF enables easy distinction between left and right sound images on a horizontal plane, it is difficult to distinguish front and rear sound images due to a standard HRTF error. In order to distinguish the positions of front and rear sound images, accurate frequency characteristics of an actual user should be measured. Since a standard dummy head is typically used, front/rear confusion occurs due to a difference between the frequency characteristics of the dummy head and those of the actual user.
When the surround channels are used, the effect of the surround channels can be obtained only when sound images are positioned at a left rear and a right rear of the listener. When the correlation of the audio input signals of the left and right surround channels is high, the sound image is positioned at the center of the rear of the listener. Furthermore, due to the use of the standard dummy head, the front/rear confusion also occurs, and it is difficult to obtain the effect of the surround channels.
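How close the left and right surround signals are to each other can be quantified in several ways. The sketch below assumes a zero-lag normalized correlation coefficient; this measure and the function name `correlation` are illustrative assumptions, since the description above does not specify how the correlation is computed.

```python
import math
from typing import Sequence


def correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Zero-lag normalized correlation coefficient between two equal-length
    signals. Values near 1.0 mean the signals are almost the same."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("signals must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    numerator = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    denominator = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    if denominator == 0.0:
        return 0.0  # a constant signal carries no correlation information
    return numerator / denominator


# Toy surround signals (made-up sample values).
left_surround = [0.0, 1.0, 0.0, -1.0]
same_as_left = [0.0, 1.0, 0.0, -1.0]
shifted = [1.0, 0.0, -1.0, 0.0]

print(correlation(left_surround, same_as_left))  # 1.0 (highly correlated)
print(correlation(left_surround, shifted))       # 0.0 (uncorrelated)
```

In the first case the two surround signals are identical, which corresponds to the situation described above in which the sound image is positioned at the center of the rear of the listener.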