Audio processing has advanced in many ways. In particular, surround systems have become more and more important. However, most music recordings are still encoded and transmitted as a stereo signal and not as a multi-channel signal. As surround systems comprise a plurality of loudspeakers, e.g. four or five, it has been the subject of many studies what signals to provide to which one of the loudspeakers when only two input signals are available. Providing the first input signal unaltered to a first group of loudspeakers and the second input signal unaltered to a second group would of course be one solution. The listener, however, would not get the impression of real-life surround sound, but would instead hear the same sound from different speakers.
Moreover, consider a surround system comprising five loudspeakers including a center speaker. To provide the user with a real-life sound experience, sounds that in reality originate from a location in front of the listener should be reproduced by the front speakers and not by the left and right surround loudspeakers behind the listener. Therefore, audio signals should be available which do not comprise such sound portions.
Furthermore, listeners desiring to experience real-life surround sound also expect high-quality audio sound from the left and right surround loudspeakers. Providing both surround speakers with the same signal is not a desired solution. Sounds that originate from the left of the listener's location should not be reproduced by the right surround speaker and vice versa.
However, as already mentioned, most music recordings are still encoded as stereo signals. A lot of stereo music productions employ amplitude panning. Sound sources sk are recorded and are subsequently panned by applying weighting masks ak such that, in a stereo system, they appear to originate from a particular position between a left loudspeaker receiving a left stereo channel xL of a stereo input signal and a right loudspeaker receiving a right stereo channel xR of the stereo input signal. Moreover, such recordings comprise ambient signal portions n1, n2, originating, e.g., from room reverberation. Ambient signal portions appear in both channels, but do not relate to a particular sound source. Therefore, the left xL and the right xR channel of a stereo input signal may comprise:
$$x_L = \sum_k s_k + n_1$$

$$x_R = \sum_k a_k \cdot s_k + n_2$$

xL: left stereo signal
xR: right stereo signal
ak: panning factor of sound source k
sk: signal of sound source k
n1, n2: ambient signal portions
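The signal model above may be illustrated by a minimal Python sketch. The function name mix_stereo and the representation of signals as lists of samples are assumptions made only for illustration; they do not form part of any recording or production tool.

```python
from typing import List, Sequence, Tuple


def mix_stereo(
    sources: Sequence[Sequence[float]],
    pans: Sequence[float],
    n1: Sequence[float],
    n2: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Build xL and xR from sources s_k, panning factors a_k and ambience n1, n2.

    Hypothetical helper: all signals are assumed to have the same length.
    """
    # Start from the ambient portions of each channel.
    x_left = list(n1)
    x_right = list(n2)
    for s_k, a_k in zip(sources, pans):
        for t in range(len(x_left)):
            x_left[t] += s_k[t]          # xL = sum_k s_k + n1
            x_right[t] += a_k * s_k[t]   # xR = sum_k a_k * s_k + n2
    return x_left, x_right


if __name__ == "__main__":
    # One center-panned source (a_k = 1) and one left-only source (a_k = 0).
    xl, xr = mix_stereo([[1.0, 2.0], [0.5, 0.5]], [1.0, 0.0], [0.0, 0.0], [0.1, 0.1])
    print(xl, xr)
```

In this sketch, a panning factor of 1 places a source equally in both channels, while a panning factor of 0 leaves the source in the left channel only.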
In surround systems, commonly, only some of the loudspeakers are assumed to be located in front of a listener's seat (for example, a center, a front left and a front right speaker), while other speakers are assumed to be located to the left and to the right behind a listener's seat (e.g., a left and a right surround speaker).
Signal components that are equally present in both channels of the stereo input signal (sk=ak·sk) appear to originate from a sound source at a center position in front of the listener. It may therefore be desirable that these signals are not reproduced by the left and the right surround speaker behind the listener.
It may moreover be desirable that signal components that are mainly present in the left stereo channel (sk>>ak·sk) are reproduced by the left surround speaker; and that signal components that are mainly present in the right stereo channel (sk<<ak·sk) are reproduced by the right surround speaker.
It may furthermore be desirable that the ambient signal portion n1 of the left stereo channel is reproduced by the left surround speaker, while the ambient signal portion n2 of the right stereo channel is reproduced by the right surround speaker.
To provide the left and the right surround speaker with suitable signals, it would therefore be highly appreciated to provide at least two output channels from two channels of a stereo input signal which are different from the two input channels and which possess the described properties.
The desire for generating a stereo output signal from a stereo input signal is however not limited to surround systems, but may also be applied in traditional stereo systems. A stereo output signal might also be useful to provide a different sound experience, for example, a wider sound field for traditional stereo systems having two loudspeakers, e.g., by providing stereo-base widening. Regarding replay using stereo loudspeakers or earphones, a broader and/or enveloping audio impression may be generated.
According to a first method of conventional technology, a mono input source is processed to generate a stereo signal for playback, thus creating two channels from the mono input source. To this end, the input signal is modified by complementary filters to generate a stereo output signal. When replayed by two loudspeakers, the generated stereo signal creates a wider sound than the unfiltered replay of the same signal. However, the sound sources comprised in the stereo signal are “smeared”, as no directional information is generated. Details are presented in:
Manfred Schroeder “An Artificial Stereophonic Effect Obtained From Using a Single Signal”, presented at the 9th annual AES meeting Oct. 8-12, 1957.
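The principle of complementary filtering may be illustrated by the following Python sketch. It assumes a pair of complementary comb filters built from a single delay; the function name pseudo_stereo and the parameters delay and gain are hypothetical, and the filters are not necessarily those used in the cited paper.

```python
from typing import List, Sequence, Tuple


def pseudo_stereo(
    x: Sequence[float], delay: int, gain: float
) -> Tuple[List[float], List[float]]:
    """Derive two channels from a mono signal with complementary comb filters.

    Assumed filter pair (illustrative only):
        left[n]  = x[n] + gain * x[n - delay]
        right[n] = x[n] - gain * x[n - delay]
    """
    # Delayed copy of the input, zero-padded at the start, same length as x.
    delayed = ([0.0] * delay + list(x))[: len(x)]
    left = [a + gain * d for a, d in zip(x, delayed)]
    right = [a - gain * d for a, d in zip(x, delayed)]
    return left, right


if __name__ == "__main__":
    left, right = pseudo_stereo([1.0, 0.0, 0.0, 0.0], delay=2, gain=0.5)
    print(left, right)
```

Because the two filters are complementary, the sum of both output channels equals twice the input signal. Each frequency is merely distributed differently between the channels, which is consistent with the observation that no directional information for individual sound sources is generated.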
Another proposed approach is presented in WO 9215180 A1: “Sound reproduction systems having a matrix converter”. According to this conventional technology, a stereo output signal is generated from a stereo input signal by applying a linear combination of the channels of the stereo input signal. By applying this method, output signals may be generated which significantly attenuate center-panned portions of the input signal. However, the method also results in a lot of crosstalk (from the left channel to the right channel and vice versa). Crosstalk may be reduced by limiting the influence of the right input signal on the left output signal and vice versa, in that the corresponding weighting factor of the linear combination is adjusted. This, however, would also result in reduced attenuation of center-panned signal portions in the surround speakers. Signals originating from a front-center location would unintentionally be reproduced by the rear surround speakers.
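The trade-off between center attenuation and crosstalk may be illustrated by the following Python sketch of a simple linear combination. The function name matrix_surround, the weighting factor w and the specific combination (each output channel being one input channel minus the weighted other channel) are assumptions made for illustration and are not taken from WO 9215180 A1.

```python
from typing import List, Sequence, Tuple


def matrix_surround(
    x_left: Sequence[float], x_right: Sequence[float], w: float
) -> Tuple[List[float], List[float]]:
    """Linear combination of the input channels (illustrative assumption).

        out_left  = xL - w * xR
        out_right = xR - w * xL
    """
    out_left = [l - w * r for l, r in zip(x_left, x_right)]
    out_right = [r - w * l for l, r in zip(x_left, x_right)]
    return out_left, out_right


if __name__ == "__main__":
    center = [1.0, -1.0]      # center-panned: identical in both channels
    left_only = [1.0, -1.0]   # present in the left channel only
    silence = [0.0, 0.0]

    for w in (1.0, 0.5):
        c_out = matrix_surround(center, center, w)
        l_out = matrix_surround(left_only, silence, w)
        print("w =", w, "center ->", c_out, "left-only ->", l_out)
```

With w = 1, the center-panned signal is cancelled in both outputs, but the left-only signal appears at full level (with inverted sign) in the right output. With w = 0.5, this crosstalk is halved, but the center-panned signal is only attenuated to half of its level instead of being removed.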
Another proposed concept of conventional technology is to determine direction and ambience of a stereo input signal in a frequency domain by applying complex signal analysis techniques. This concept of conventional technology is, e.g., presented in U.S. Pat. No. 7,257,231 B1, U.S. Pat. No. 7,412,380 B1 and U.S. Pat. No. 7,315,624 B2. According to this approach, both input signals are examined with respect to direction and ambience for each time-frequency bin and are repanned in a surround system depending on the result of the direction and ambience analysis. A correlation analysis is employed to determine ambient signal portions. Based on the analysis, surround channels are generated which comprise predominantly ambient signal portions and from which center-panned signal portions may be removed. However, as both directional analysis and ambience extraction are based on estimations which are not always free of errors, undesired artifacts may be generated. The problem of undesired artifacts increases if an input signal mix comprises several signals (e.g., of different instruments) with superimposed spectra. An effective signal-dependent filtering may be used for removing center-panned portions from the stereo signal, which however makes estimation errors caused by “musical noise” clearly apparent. Moreover, the combination of direction analysis and ambience extraction results in an addition of the artifacts from both methods.
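The basic idea of using correlation to distinguish panned sound sources from ambience may be illustrated by the following simplified Python sketch. It computes a single normalized correlation value for one block of time-domain samples, whereas the cited approaches operate per time-frequency bin; the function name channel_correlation is hypothetical and the sketch does not reproduce the method of any of the cited patents.

```python
import math
from typing import Sequence


def channel_correlation(x_left: Sequence[float], x_right: Sequence[float]) -> float:
    """Normalized inter-channel correlation of one block of samples.

    Values near 1 suggest a common (panned) source in both channels,
    values near 0 suggest uncorrelated (ambient) content.
    Simplified illustration: time domain, one block, no frequency bands.
    """
    cross = sum(l * r for l, r in zip(x_left, x_right))
    energy = sum(l * l for l in x_left) * sum(r * r for r in x_right)
    if energy == 0.0:
        return 0.0  # at least one channel is silent; no correlation defined
    return cross / math.sqrt(energy)


if __name__ == "__main__":
    # Same source in both channels, panned with factor 0.5.
    print(channel_correlation([1.0, -1.0, 1.0, -1.0], [0.5, -0.5, 0.5, -0.5]))
    # Two unrelated (orthogonal) signals standing in for ambience.
    print(channel_correlation([1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]))
```

For real recordings, such an estimate is computed from short blocks containing several overlapping sources and therefore fluctuates, which is one way in which the estimation errors mentioned above may arise.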