A conventional stereophonic system controls sound image localization using a plurality of (generally two) loudspeakers, giving the listener a realistic auditory sensation. Such a system usually places two laterally spaced loudspeakers in front of the listener, so that a sound image is localized between them; no sound image is localized outside the two loudspeakers. To localize a sound image outside the two loudspeakers, that is, elsewhere around the listener (for instance, a sound coming from behind the listener), the system sometimes includes rear loudspeakers in addition to the two loudspeakers in front of the listener.
Advances in technology for digitizing audio and in DSP (Digital Signal Processor) hardware have made a variety of signal processing practical. As a result, a system using only two loudspeakers in front of the listener can localize a sound image at any position around the listener, such as at the side or rear of the listener.
Conventional sound image localization apparatuses are disclosed in Japanese Patent Published Application Nos. Hei 3-270400 (1991) and Hei 4-273800 (1992). A typical conventional sound image localization apparatus is described below.
FIGS. 19(a) and 19(b) are diagrams for explaining sound image localization. FIG. 19(a) shows a sound image to be localized virtually. FIG. 19(b) shows a system using two loudspeakers. In this case, it is assumed that the positions of the virtually localized sound images and the positions of the two loudspeakers are left-right symmetrical with respect to the listener.
In the sound image localization apparatus, a direction of a virtual position is localized and crosstalk is canceled by signal processing using a head related transfer function indicating transfer characteristics of sound from a sound source to the listener's head or ear.
Here, in a case like that of FIG. 19(b), a crosstalk signal is a signal transferred from the left loudspeaker to the right ear, or from the right loudspeaker to the left ear. A signal is generated to cancel this crosstalk signal.
In the virtual environment achieved by this system, shown in FIG. 19(a), sound signals uL and uR are radiated from the positions of virtual sound images located laterally behind the listener. The symbols yL1 and yR1 denote the sound pressures at the left and right ears, respectively. Because of the left-right symmetry, the transfer of sound from the left virtual position to the left ear is the same as that from the right virtual position to the right ear. The head related transfer function describing this transfer characteristic is denoted by TM. The transfer of sound from the left virtual position to the right ear and that from the right virtual position to the left ear are likewise represented by the same head related transfer function TC. The relation between the sound pressures and these functions is given by
$$yL1 = TM \cdot uL + TC \cdot uR \tag{1-1}$$
and
$$yR1 = TC \cdot uL + TM \cdot uR \tag{1-2}$$
On the other hand, in the system shown in FIG. 19(b), left and right loudspeakers 1901a and 1901b radiate sound signals xL and xR, respectively. The sound pressures at the left and right ears of the listener are yL2 and yR2, respectively. Because the arrangement is left-right symmetrical, the transfer of sound from the left loudspeaker position to the left ear and that from the right loudspeaker position to the right ear are represented by the same head related transfer function SM. The transfer of sound from the left loudspeaker position to the right ear and that from the right loudspeaker position to the left ear are also represented by the same head related transfer function SC. The relation between these sound pressures and these functions is given by
$$yL2 = SM \cdot xL + SC \cdot xR \tag{2-1}$$
and
$$yR2 = SC \cdot xL + SM \cdot xR \tag{2-2}$$
In this system, to localize the sound images at the positions shown in FIG. 19(a) using the sound output from the loudspeakers 1901a and 1901b, the following equations must be satisfied:
$$yL1 = yL2 \tag{3-1}$$
and
$$yR1 = yR2 \tag{3-2}$$
Equations 3-1, 1-1, and 2-1 lead to the following equation 4-1, and equations 3-2, 1-2, and 2-2 lead to the following equation 4-2:
$$TM \cdot uL + TC \cdot uR = SM \cdot xL + SC \cdot xR \tag{4-1}$$
and
$$TC \cdot uL + TM \cdot uR = SC \cdot xL + SM \cdot xR \tag{4-2}$$
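Equations 4-1 and 4-2 form a 2x2 linear system in xL and xR. The following Python sketch solves that system exactly at a single frequency and then checks conditions 3-1 and 3-2. The function names and the complex numerical values are hypothetical, chosen only for illustration; they do not come from the apparatus described here.

```python
def ear_pressures(main, cross, left, right):
    """Apply a left-right symmetric 2x2 transfer; returns (left ear, right ear)."""
    return (main * left + cross * right, cross * left + main * right)


def solve_speaker_signals(TM, TC, SM, SC, uL, uR):
    """Solve equations 4-1 and 4-2 exactly for (xL, xR) by Cramer's rule."""
    yL, yR = ear_pressures(TM, TC, uL, uR)   # target pressures, eqs. 1-1 and 1-2
    det = SM * SM - SC * SC                  # determinant of [[SM, SC], [SC, SM]]
    if det == 0:
        raise ValueError("SM and SC make the system singular")
    xL = (SM * yL - SC * yR) / det
    xR = (SM * yR - SC * yL) / det
    return xL, xR


# Hypothetical single-frequency transfer values (assumptions, not measured HRTFs).
TM, TC = 0.8 + 0.1j, 0.3 - 0.2j
SM, SC = 1.0 + 0.0j, 0.2 + 0.1j
uL, uR = 1.0 + 0.0j, 0.5 - 0.5j

xL, xR = solve_speaker_signals(TM, TC, SM, SC, uL, uR)
yL1, yR1 = ear_pressures(TM, TC, uL, uR)     # virtual sources, FIG. 19(a)
yL2, yR2 = ear_pressures(SM, SC, xL, xR)     # real loudspeakers, FIG. 19(b)
```

With the exact solution, yL2 and yR2 reproduce yL1 and yR1 up to floating-point rounding, which is what conditions 3-1 and 3-2 require.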
The solution for xL and xR is obtained from equations 4-1 and 4-2. Assuming that, with the gain denoted by $\| \ast \|$,
$$\| (SC/SM)^{2} \| \ll 1 \tag{5}$$
xL and xR are approximated by
$$xL \approx (FM + FC \cdot FX) \cdot uL + (FC + FM \cdot FX) \cdot uR \tag{6-1}$$
and
$$xR \approx (FC + FM \cdot FX) \cdot uL + (FM + FC \cdot FX) \cdot uR \tag{6-2}$$
where
$$FM = TM/SM \tag{7-1}$$
$$FC = TC/SM \tag{7-2}$$
$$FX = -SC/SM \tag{7-3}$$
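The step from equations 4-1 and 4-2 to the approximations 6-1 and 6-2 can be filled in as follows. This is a sketch of the algebra using only the symbols already defined, not a quotation from the cited applications. Solving the 2x2 system for xL and dividing numerator and denominator by the square of SM gives

```latex
\begin{align*}
xL &= \frac{SM \cdot (TM \cdot uL + TC \cdot uR) - SC \cdot (TC \cdot uL + TM \cdot uR)}
           {(SM)^{2} - (SC)^{2}} \\[4pt]
   &= \frac{(FM + FC \cdot FX) \cdot uL + (FC + FM \cdot FX) \cdot uR}
           {1 - (SC/SM)^{2}}
\end{align*}
```

Under assumption 5 the denominator is approximately 1, which leaves expression 6-1. Exchanging the roles of uL and uR gives expression 6-2 in the same way.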
Using the above relations, a conventional sound image localization apparatus is constructed as shown in FIG. 18(a). The conventional sound image localization apparatus comprises a crosstalk canceling means 1801, direction localizing means 1802a and 1802b, and adders 1803a and 1803b. Sound signals are input through input terminals 1804a and 1804b. Signals resulting from subjecting the input sound signals to signal processing are output through output terminals 1805a and 1805b.
The direction localizing means 1802a and 1802b process the sound signals input through the input terminals 1804a and 1804b to generate signals indicating the directions of sound image positions, respectively. The adders 1803a and 1803b add input signals. The crosstalk canceling means 1801 removes a crosstalk component of an input signal.
FIG. 18(b) is a diagram illustrating a detailed structure of an example of the conventional sound image localization apparatus. The crosstalk canceling means 1801 shown in FIG. 18(a) comprises crosstalk canceling signal generating filters 1806a and 1806b, and adders 1803c and 1803d. The direction localizing means 1802a and 1802b shown in FIG. 18(a) comprise main-path filters 1807a and 1807b, and crosstalk-path filters 1808a and 1808b, respectively. The combination of the main-path filter and the crosstalk-path filter is sometimes called a direction localizing filter.
The prior art sound image localization apparatus generates the outputs xL and xR according to the expressions 6-1 and 6-2. A description will be given of how the sound image localization apparatus works.
Left and right input sound signals are input through the input terminals 1804a and 1804b, respectively. The first input sound signal input through the input terminal 1804a is input to the main-path filter 1807a and the crosstalk-path filter 1808a. The main-path filter 1807a multiplies the input signal by the coefficient shown in the equation 7-1. The crosstalk-path filter 1808a multiplies the input signal by the coefficient shown in the equation 7-2. The outputs of the main-path filter 1807a and the crosstalk-path filter 1808a are input to the adders 1803a and 1803b, respectively.
Similarly, the second input sound signal input through the input terminal 1804b is input to the main-path filter 1807b and the crosstalk-path filter 1808b, where the input signal is multiplied by the coefficients expressed by 7-1 and 7-2, respectively. The outputs of the main-path filter 1807b and the crosstalk-path filter 1808b are input to the adders 1803b and 1803a, respectively.
The adders 1803a and 1803b each add their input signals. The adder 1803a outputs the result of the addition to the adder 1803c and the crosstalk canceling signal generating filter 1806a. The crosstalk canceling signal generating filter 1806a multiplies the input signal by the coefficient represented by the equation 7-3 to produce a crosstalk canceling signal, and outputs the signal to the adder 1803d.
Similarly, the adder 1803b outputs a result of the addition to the adder 1803d and the crosstalk canceling signal generating filter 1806b. The crosstalk canceling signal generating filter 1806b multiplies the input signal by the coefficient represented by the equation 7-3 to produce a crosstalk canceling signal, and outputs the signal to the adder 1803c.
The adders 1803c and 1803d add the results of addition by the adders 1803a and 1803b, respectively, to the crosstalk canceling signals, whose phase is almost equal to the inverted phase of the result of the addition. Thus, the signals represented by the expressions 6-1 and 6-2, from which the crosstalk components are removed, are output through the output terminals 1805a and 1805b, respectively.
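The signal flow of FIG. 18(b) described above can be traced with a short Python sketch. As a simplifying assumption, each filter is reduced to a single complex gain at one frequency, whereas a real apparatus would use digital filters over the whole band. The function name and the numerical values are hypothetical.

```python
def feedforward_localizer(uL, uR, FM, FC, FX):
    """Signal flow of FIG. 18(b), with each filter reduced to one complex gain."""
    sum_a = FM * uL + FC * uR      # adder 1803a: outputs of 1807a and 1808b
    sum_b = FC * uL + FM * uR      # adder 1803b: outputs of 1808a and 1807b
    cancel_a = FX * sum_a          # filter 1806a, routed to adder 1803d
    cancel_b = FX * sum_b          # filter 1806b, routed to adder 1803c
    xL = sum_a + cancel_b          # adder 1803c, to output terminal 1805a
    xR = sum_b + cancel_a          # adder 1803d, to output terminal 1805b
    return xL, xR


# Hypothetical single-frequency transfer values (assumptions, not measured HRTFs).
TM, TC = 0.8 + 0.1j, 0.3 - 0.2j
SM, SC = 1.0 + 0.0j, 0.2 + 0.1j
uL, uR = 1.0 + 0.0j, 0.5 - 0.5j

FM, FC, FX = TM / SM, TC / SM, -SC / SM          # equations 7-1 to 7-3
xL, xR = feedforward_localizer(uL, uR, FM, FC, FX)

# Expressions 6-1 and 6-2 written out directly, for comparison.
xL_expected = (FM + FC * FX) * uL + (FC + FM * FX) * uR
xR_expected = (FC + FM * FX) * uL + (FM + FC * FX) * uR

# Left-ear pressure actually produced (eq. 2-1) versus its target (eq. 1-1).
yL2 = SM * xL + SC * xR
yL1 = TM * uL + TC * uR
relative_error = abs(yL2 - yL1) / abs(yL1)
```

The network output matches expressions 6-1 and 6-2. Because the crosstalk canceling signal is applied only once, the ear pressure misses its target by a relative error equal to the squared gain of FX (about 0.05 for these hypothetical values), which is the quantity that assumption 5 requires to be small.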
In the sound image localization apparatus having the structure shown in FIG. 18(b), the output of a crosstalk canceling signal generating filter on either channel (for example, 1806a) is output to the output side of the other channel (the adder 1803d on the side having the output terminal 1805b). This structure is called feedforward.
As described above, the conventional sound image localization apparatus can localize a sound image over a wide range by localization of a virtual sound image and compensation of a crosstalk component. However, when trying to realize the foregoing sound image localization apparatus by a computer system using a CPU and a DSP, the following several problems arise.
The first problem is that, because this feedforward type sound image localization apparatus outputs the crosstalk canceling signal to the output side of the whole apparatus, the canceling of crosstalk cannot be repeated, and the adverse effect of sound diffraction of low-frequency components becomes serious. It is therefore difficult to improve the low-frequency characteristics and thus the sound quality.
The second problem concerns the memory used for temporary storage in operational processing. The amount and performance of the memory in a computer system limit operational processing. The main constraints on memory are
(A) constraint on the amount of memory for storage of sound signal data,
(B) constraint on the amount of memory for storage of coefficients of a filter, and
(C) constraint on accessing time of a memory.
As to (A) and (B), when the amount of memory, expressed as a number of words, is small, the number of taps, which determines the order of a filter, is limited to an insufficient value, reducing the precision of operational processing.
Furthermore, when the amount of high-speed internal memory included in a computer system is limited and a relatively low-speed external memory (RAM) is added to secure the required precision of operational processing, the problem (C) arises. Because the operational processing that realizes the above-described digital filters for direction localization and crosstalk cancellation accesses memory frequently, simply adding external memory with a low access speed hardly relieves the constraint on the amount of memory.
The third problem relates to a controller included in a computer system, such as a DSP. The processing speed of the controller limits operational processing. When the processing speed is not sufficient, the order of a digital filter is limited, thereby reducing precision in operational processing.
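The memory and processing-speed constraints described above can be made concrete with a rough budget. The Python sketch below assumes direct-form FIR filters, one delay line per filter, one stored word per coefficient, and one multiply-accumulate per tap per output sample. These modeling choices, the function name, and the figures of 128 taps and 48 kHz are all assumptions made for illustration, not values stated for the conventional apparatus.

```python
def fir_budget(num_filters, distinct_coefficient_sets, taps, sample_rate_hz):
    """Rough memory and arithmetic budget for direct-form FIR filters.

    Assumptions (not from the source): one delay line of `taps` samples per
    filter, one stored word per coefficient, and one multiply-accumulate
    per tap per output sample.
    """
    return {
        "coefficient_words": distinct_coefficient_sets * taps,
        "signal_words": num_filters * taps,
        "macs_per_second": num_filters * taps * sample_rate_hz,
    }


# FIG. 18(b) has six filters (1806a/b, 1807a/b, 1808a/b). By the left-right
# symmetry they use three distinct coefficient sets (FM, FC, FX).
# 128 taps and 48 kHz are hypothetical figures.
budget = fir_budget(num_filters=6, distinct_coefficient_sets=3,
                    taps=128, sample_rate_hz=48_000)
```

Under this model every figure scales linearly with the tap count, so halving the number of taps halves the memory and arithmetic load at the cost of filter order, which is the trade-off between resources and precision noted above.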
The fourth problem is that it is difficult for the conventional sound image localization apparatus to deal with changes in the setting of the acoustic system that uses it. For example, when the loudspeakers in the acoustic system are rearranged so that the angle formed by the loudspeakers changes, the conventional sound image localization apparatus must modify all the parameters of the filter FX. Thus, to adapt to changes in the setting of the acoustic system, parameters must be held for each setting, and storing these parameters increases the amount of memory required.
As these problems indicate, the prior art sound image localization apparatus has difficulty in improving low-frequency characteristics. Furthermore, when implemented in a computer system, the apparatus requires a large amount of memory and high-speed processing, making it difficult to achieve both precise control of sound image localization and a reduction in the cost of the computer system.