1. Technical Field
The invention relates to a device and method for synthesizing a virtual sound source.
2. Discussion of Related Art
In stereophonic sound reproduction, the objective is to transmit a realistic sound image to the listener by means of two sound channels. In conventional stereo reproduction, the direction of incidence of the sound is determined by the amplitude and phase ratios of the sound signal on the different channels. Consequently, the direction from which the listener perceives the sound to be coming always lies between the loudspeakers or in the direction of one of them.
The conventional stereo effect achieved by two loudspeakers is limited, especially when the loudspeakers of the left and right channels are close to one another, as in a television set or a portable stereophonic radio cassette recorder, for example. When both loudspeakers are in almost the same direction with respect to the listener, the perceived sound directions differ only slightly.
The growth of multimedia applications that followed the increase in the computing capacity of personal computers has increased the need for sound reproduction more advanced than conventional stereo, capable of offering the listener a more realistic three-dimensional sound environment than before. A well-known method of expanding the capability of a sound reproduction system to represent sound direction is the use of several sound channels and loudspeakers, which is familiar from cinemas, for example.
Man perceives the direction of the incoming sound mainly by means of interaural time differences (ITD) and interaural level differences (ILD). In a two-channel sound reproduction system, it is in principle possible to simulate all the directions of the sound by changing the above mentioned factors. In this way, it is possible to create an impression that the sound comes from a direction outside the pair of loudspeakers.
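As a rough illustration of the two cues named above, the following Python sketch imposes an ITD and an ILD on a single-channel signal by delaying and attenuating one channel. The function name `apply_itd_ild`, its parameters, and the sign convention are assumptions made for this example only, and a plain delay and gain is not a complete localization model.

```python
def apply_itd_ild(mono, itd_samples, ild_db):
    """Return (left, right) with the right channel delayed and attenuated.

    itd_samples: hypothetical interaural time difference in whole samples (>= 0).
    ild_db: hypothetical interaural level difference in decibels (>= 0).
    A source on the listener's left reaches the left ear earlier and louder.
    """
    gain = 10.0 ** (-ild_db / 20.0)          # dB attenuation as a linear factor
    left = list(mono) + [0.0] * itd_samples  # near ear: unchanged, padded to equal length
    right = [0.0] * itd_samples + [gain * s for s in mono]  # far ear: later and quieter
    return left, right


left, right = apply_itd_ild([1.0, 0.5, 0.25], itd_samples=2, ild_db=6.0)
```

Both returned lists have the same length, so they can be written out directly as the two channels of a stereo signal.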
In order to create the desired ITDs and ILDs, so-called HRTF (Head Related Transfer Function) filters are used. HRTF filters are transfer functions, specified by measurement or calculation, which describe the filtering of a sound coming from a certain direction, caused mostly by the shape of the head and external ear. By means of HRTF filters, it is possible to create an artificial sound image of a virtual sound source in stereophonic loudspeaker reproduction, provided that crosstalk from each loudspeaker to the opposite ear is taken into account in the calculation.
FIG. 1 shows the known first filter system 10 for implementing a sound image based on at least one virtual sound source. The first filter system 10 consists of a first filter block 17, which contains four parallel filters 11, 12, 13 and 14, by means of which the signals Xl and Xr brought to the system are filtered in order to create a spatial effect, and two summing devices, 15 and 16. Both channels include two filters, one of which functions as an HRTF filter 11; 14, and the other as a crosstalk cancellation filter 12; 13.
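The signal flow of such a four-filter block can be sketched in Python as follows. The wiring is an assumption, since the text does not state which filter feeds which summing device: filters 11 and 14 are taken as the same-side paths and filters 12 and 13 as the cross paths. The helper names and the two-tap impulse responses are hypothetical.

```python
def convolve(x, h):
    """Direct-form FIR convolution of two sample lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def add(a, b):
    """Sample-wise sum, zero-padding the shorter list."""
    n = max(len(a), len(b))
    a = list(a) + [0.0] * (n - len(a))
    b = list(b) + [0.0] * (n - len(b))
    return [p + q for p, q in zip(a, b)]


def four_filter_block(xl, xr, h11, h12, h13, h14):
    """Assumed wiring: 11 and 14 are same-side paths, 12 and 13 are cross paths."""
    yl = add(convolve(xl, h11), convolve(xr, h13))  # summing device for the left output
    yr = add(convolve(xr, h14), convolve(xl, h12))  # summing device for the right output
    return yl, yr


# Hypothetical two-tap responses, chosen only to make the data flow visible.
yl, yr = four_filter_block([1.0, 0.0], [0.0, 0.0],
                           h11=[1.0, 0.0], h12=[0.0, -0.5],
                           h13=[0.0, -0.5], h14=[1.0, 0.0])
```

With an impulse on the left input only, the left output carries the same-side response and the right output carries the delayed, inverted cross-path response.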
If the sound sources are placed symmetrically around the listening position, a corresponding system can be implemented more efficiently by another filter arrangement 20 shown in FIG. 2. In this implementation, the filters 11, 12, 13 and 14 have been replaced by a first 24 and a second spatial filter 25, whereby the expansion can be implemented with only two filters. When the objective is to use a system in which the properties of the filters 24, 25 can be adjusted separately, the filters 24, 25 can be connected to a separate filter control circuit 28, by means of which the filtering of the signals can be changed in order to change the sound image.
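The text does not give the internal wiring of FIG. 2. One common way to reduce a symmetric four-filter structure to two filters is to filter the sum and difference of the channels, and the Python sketch below checks that this form gives the same output as the symmetric four-filter form. That the filters 24 and 25 operate in this way is an assumption, and all names and coefficient values are hypothetical. The two inputs are assumed to have equal length, as are the two impulse responses.

```python
def convolve(x, h):
    """Direct-form FIR convolution of two sample lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def direct_form(xl, xr, h_same, h_cross):
    """Symmetric four-filter form: both sides share h_same and h_cross."""
    yl = [a + b for a, b in zip(convolve(xl, h_same), convolve(xr, h_cross))]
    yr = [a + b for a, b in zip(convolve(xr, h_same), convolve(xl, h_cross))]
    return yl, yr


def sum_difference_form(xl, xr, h_same, h_cross):
    """Two-filter form: filter the sum and the difference signal once each."""
    h_sum = [a + b for a, b in zip(h_same, h_cross)]   # acts on xl + xr
    h_diff = [a - b for a, b in zip(h_same, h_cross)]  # acts on xl - xr
    s = convolve([a + b for a, b in zip(xl, xr)], h_sum)
    d = convolve([a - b for a, b in zip(xl, xr)], h_diff)
    yl = [0.5 * (a + b) for a, b in zip(s, d)]
    yr = [0.5 * (a - b) for a, b in zip(s, d)]
    return yl, yr


xl, xr = [1.0, 2.0, 0.0], [0.5, -1.0, 4.0]
h_same, h_cross = [1.0, 0.25], [-0.5, 0.125]
ref_l, ref_r = direct_form(xl, xr, h_same, h_cross)
out_l, out_r = sum_difference_form(xl, xr, h_same, h_cross)
```

The equivalence follows from expanding the products: half the sum of the two filtered signals reproduces the left output of the four-filter form, and half their difference reproduces the right output.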
A problem in the methods described above is constituted by the HRTF filters' complicated phase and frequency response properties. In stereophonic sound reproduction this is not a problem, because the desired spatial effect is achieved by these properties. If the signals being processed also contain monophonic signal components, the filters cause harmful distortions, because the hearing direction of the monophonic signal component need not be changed. In systems like this, the monophonic signal sounds colored. In principle, the distortion of the monophonic signal component could be corrected by adding one more filter stage to the system output, but this in turn would distort the desired spatial effect.
In this patent application, monophony means coherence between the signals of at least two channels. In a two-channel system, this means that coherence can be perceived in the signals of both channels. In a system with more channels, the monophony must be defined separately for each channel pair. Thus it is possible that the sound image contains multiple monophonic signals simultaneously.
Correspondingly, the stereophony of a signal means the portion of a signal of at least two channels between which there is no coherence. According to the above definition, it is possible that the signal consists partly of a monophonic and partly of a stereophonic signal.
FIG. 3 depicts a third filter arrangement 30 according to the patent application FI 962181, in which a third filter 31 has been added to the second filter block 21 according to FIG. 2, the delay properties of the third filter corresponding to those of the spatial filters 24 and 25. The second filter block 21, the third filter 31 added to it and the summing devices 36 and 37 together constitute the third filter block 34. In the solution according to the reference publication, sum and difference signals are calculated in the device 32 from the signals coming to the system. The strength of the sum signal obtained is changed with amplifiers 33. The signal after the amplifiers 33 is used as an approximation of the monophonic signal contained in the channels. This approximation of the monophonic signal is subtracted from the signals of both channels, whereby essentially only a stereophonic signal remains in each channel. After this, the stereophonic signal is led to the second filter block 21 in order to produce a spatial effect, and the monophonic signal is led via the third filter 31 past the second filter block 21 to be summed back to the signals coming from the outputs of the second filter block 21.
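The separation step described above, in which a scaled sum signal is subtracted from both channels, can be sketched in Python as follows. The function name `split_mono_stereo` and the gain value 0.5 are assumptions for illustration, and the subsequent filtering in the blocks 21 and 31 is omitted.

```python
def split_mono_stereo(xl, xr, gain):
    """Approximate the monophonic part by a scaled sum and remove it from each channel.

    gain stands for the fixed, preadjusted amplifier setting (hypothetical value below).
    """
    mono = [gain * (l + r) for l, r in zip(xl, xr)]  # scaled sum signal
    stereo_l = [l - m for l, m in zip(xl, mono)]     # left residual
    stereo_r = [r - m for r, m in zip(xr, mono)]     # right residual
    return mono, stereo_l, stereo_r


# Identical channels: with gain 0.5 the whole signal is treated as monophonic.
mono, sl, sr = split_mono_stereo([1.0, -2.0, 3.0], [1.0, -2.0, 3.0], gain=0.5)
# Opposite-phase channels: the sum vanishes, so nothing is removed.
mono2, sl2, sr2 = split_mono_stereo([1.0, -2.0], [-1.0, 2.0], gain=0.5)
```

For any other gain, or for channels that are only partly coherent, the residuals still contain some of the common component, because the gain is fixed in advance.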
The solution according to the patent specification FI-962181 does not entirely eliminate the coloration of the monophonic signal. In addition, this solution uses a preadjusted constant value to amplify the sum signal that approximates the monophonic signal, which assumes that the ratio of monophonic and stereophonic signals remains constant. In reality, the ratio between the stereophonic and monophonic signal components can vary considerably, in a typical music recording for example. In a system based on that solution, this causes incomplete filtering, which is perceived as discrepancies and errors in the sound image produced.
It is the objective of this invention to achieve a new method and device for synthesizing a virtual sound source, by which the problems of the prior art described above can be eliminated.
In a method according to a first aspect of the invention, a virtual sound source is synthesized in a system which includes at least a right and a left channel for transmitting signals, and a filter block containing at least one filter and amplifier, through which the signals are conducted, is connected to the channels.
According to the first aspect of the invention, the stereophony of the signals fed to the filter system is determined by means of a mono/stereo estimator. According to this estimation, amplification coefficients are specified for the signals received from each filter, on the basis of which coefficients the signals received from filters are amplified.
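A minimal Python sketch of amplifying the filter outputs by estimated coefficients is given below. The mapping from the estimate to the coefficients is not specified in the text; the complementary pair used here (one gain for a spatially filtered path, one for a monophonic path) is an assumption, as are the function names.

```python
def coefficients_from_estimate(stereo_weight):
    """Assumed mapping: complementary gains for the spatial path and the mono path."""
    w = min(1.0, max(0.0, stereo_weight))  # keep the estimate within 0..1
    return [w, 1.0 - w]


def amplify_and_sum(filter_outputs, coefficients):
    """Scale each filter output by its coefficient and sum the results sample by sample."""
    length = max(len(sig) for sig in filter_outputs)
    out = [0.0] * length
    for sig, g in zip(filter_outputs, coefficients):
        for n, s in enumerate(sig):
            out[n] += g * s
    return out


gains = coefficients_from_estimate(0.25)
mixed = amplify_and_sum([[1.0, 1.0], [2.0]], gains)
```

Because the coefficients are recomputed from each new estimate, the amplification ratio can follow the signal instead of being fixed in advance.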
In one embodiment of the method according to the invention, the stereophony of the signal is determined on the basis of the symmetry of the cross-correlation between the channels by means of a certain decision function. The decision function used can be e.g. a piecewise continuous function, such as a step or ramp function. If the signal of one channel is significantly stronger than that of the other one, in one embodiment of the invention the signal can be defined as stereophonic regardless of the value of the decision function.
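One possible reading of this embodiment is sketched below in Python. The text does not define the symmetry measure, so the comparison of the cross-correlation at positive and negative lags is an assumption, as are the ramp limits, the lag range, the level ratio, and all function names.

```python
def cross_correlation(x, y, lag):
    """r(lag) = sum over n of x[n] * y[n + lag]; samples outside the block are ignored."""
    return sum(x[n] * y[n + lag] for n in range(len(x)) if 0 <= n + lag < len(y))


def symmetry(xl, xr, max_lag):
    """1.0 when r(k) equals r(-k) for every lag up to max_lag, 0.0 when fully one-sided."""
    diff = 0.0
    total = 0.0
    for k in range(1, max_lag + 1):
        pos = cross_correlation(xl, xr, k)
        neg = cross_correlation(xl, xr, -k)
        diff += abs(pos - neg)
        total += abs(pos) + abs(neg)
    return 1.0 if total == 0.0 else 1.0 - diff / total


def ramp(value, low, high):
    """Ramp decision function: 0 below low, 1 above high, linear in between."""
    return min(1.0, max(0.0, (value - low) / (high - low)))


def mono_weight(xl, xr, max_lag=3, low=0.5, high=0.9, level_ratio=16.0):
    """Return 1.0 for a monophonic block and 0.0 for a stereophonic one."""
    el = sum(s * s for s in xl)  # energy of the left block
    er = sum(s * s for s in xr)  # energy of the right block
    if max(el, er) > level_ratio * min(el, er):
        return 0.0  # one channel much stronger: stereophonic regardless of the ramp
    return ramp(symmetry(xl, xr, max_lag), low, high)


same = mono_weight([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
shifted = mono_weight([1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0])
louder = mono_weight([10.0, 20.0, 30.0, 40.0], [1.0, 2.0, 3.0, 4.0])
```

Identical channels give a symmetric cross-correlation and are classified as monophonic. The one-sided correlation of the second pair is classified as stereophonic, and the third pair is forced to stereophonic by the level check even though its correlation is symmetric.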
In another embodiment of the method according to the invention, the sum signal of the channels that approximates to the monophonic part of the signal is conducted through a separate filter.
In yet another embodiment of the method according to the invention, the virtual location of the monophonic virtual sound source is moved off the central axis of the pair of loudspeakers.
In still another embodiment of the method according to the invention, the signal is led from the filter block before the filters to a separate filter block in order to produce early virtual room reflections, whereafter the filtered signals are summed to the signals after the filters of the original filter block. The separate filter block can contain, for example, at least a delay circuit for producing a time difference to the early room reflection to be synthesized, an equalization filter for filtering the signal in the desired frequency band, and a spatial filter for producing a spatial effect. In addition, the intensity of the signal filtered in a separate filter block can be advantageously changed according to the reflection strength coefficients estimated in the mono/stereo estimator, for example.
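The separate filter block for one early reflection can be sketched in Python as a chain of delay, equalization, spatial filtering and scaling. The one-pole low-pass used as the equalization filter, the FIR spatial filter, and all names and parameter values are assumptions for illustration only.

```python
def delay(x, n):
    """Delay the signal by n whole samples."""
    return [0.0] * n + list(x)


def one_pole_lowpass(x, a):
    """Hypothetical equalization filter: y[n] = (1 - a) * x[n] + a * y[n - 1]."""
    y, prev = [], 0.0
    for s in x:
        prev = (1.0 - a) * s + a * prev
        y.append(prev)
    return y


def convolve(x, h):
    """Direct-form FIR convolution, standing in for the spatial filter."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def early_reflection(x, delay_samples, lp_coeff, spatial_fir, strength):
    """Delay, equalize and spatially filter x, then scale by the reflection strength."""
    shaped = convolve(one_pole_lowpass(delay(x, delay_samples), lp_coeff), spatial_fir)
    return [strength * s for s in shaped]


def add(a, b):
    """Sample-wise sum, zero-padding the shorter list."""
    n = max(len(a), len(b))
    a = list(a) + [0.0] * (n - len(a))
    b = list(b) + [0.0] * (n - len(b))
    return [p + q for p, q in zip(a, b)]


direct = [1.0, 0.0, 0.0]  # stands for the output of the original filter block
reflection = early_reflection([1.0], delay_samples=2, lp_coeff=0.0,
                              spatial_fir=[1.0], strength=0.5)
output = add(direct, reflection)
```

In this example the reflection appears two samples after the direct sound at half its amplitude; a strength coefficient supplied by the mono/stereo estimator would replace the fixed value 0.5.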
The device according to the second aspect of the invention includes at least a right and a left channel, to which at least one filter and amplifier are connected.
The device according to the second aspect of the invention comprises means for determining the stereophony of the signal, means for specifying the amplification coefficient of a signal received from at least one amplifier, and means for controlling at least one amplifier in accordance with the specified amplification coefficient.
In one embodiment of the device according to the invention, at least some of the means are the same.
In another embodiment of the device according to the invention, the device comprises means for simulating early room reflections in the sound image.
The invention helps to achieve a better sound image than the prior art, because the discrepancies and errors caused by a less than optimum amplification ratio can be eliminated in cases in which the ratio of monophonic and stereophonic signals varies.
In addition, the method provides a way of implementing early room reflections, which enables the creation of a more realistic spatial effect.