Various formats have been developed for providing surround sound to a four- or five-speaker configuration. For example, two input formats that contain surround channels are 5.1-channel Dolby Digital AC-3® and Dolby Pro Logic®. Although many home theatres include four or five speakers, many televisions are configured with only a pair of front speakers. It may be desired to play surround signals through a stereo system that has only two front speakers and still give the listener the surround sound effect that the rear speaker surround channels would otherwise produce.
The above-mentioned surround sound formats and other surround sound formats include rear speaker surround input signals that are intended to be played through a set of rear speakers. The rear speakers may be imaged by a pair of front speakers by transforming the rear speaker surround input signals into signals that have the same effect on a listener when played through the pair of front speakers. The surround sound effect is created by transforming the rear speaker surround input signals using the head-related transfer function (HRTF) of the listener (or an approximate or average HRTF). The transformed signals are output from a set of front speakers so that rear speakers are virtually rendered at a location behind the listener.
A series of IIR filters may be used to implement the HRTF, and a crosstalk canceler is used to cancel the crosstalk between the left and right front speakers. Crosstalk cancellation is described in Schroeder, M. R., and Atal, B. S. (1963): “Computer Simulation of Sound Transmission in Rooms”, IEEE International Convention Record (7), IEEE Press, New York, and HRTFs are described in Wightman, F. L. and Kistler, D. J. (1989): “Headphone Simulation of Free-Field Listening. II: Psychophysical validation”, J. Acoust. Soc. Am., vol. 85, pp. 868–878, both of which are herein incorporated by reference for all purposes. FIG. 1 is a block diagram illustrating a system for using an HRTF to virtually render sounds at different locations around a listener.
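By way of illustration only, the crosstalk cancellation referred to above may be sketched at a single frequency for a symmetrically seated listener. The sketch assumes that each speaker reaches the near ear through a complex gain h_ipsi and the far ear through a complex gain h_contra; these names, the function names, and the numeric values in the usage note are hypothetical and are not taken from the cited references. A practical canceler would apply the same inversion at every frequency, for example as a set of filters.

```python
def ears_from_speakers(s_left, s_right, h_ipsi, h_contra):
    """Model what each ear receives from two front speakers at one frequency.

    h_ipsi: assumed gain from a speaker to the ear on the same side.
    h_contra: assumed gain from a speaker to the opposite ear (the crosstalk).
    """
    e_left = h_ipsi * s_left + h_contra * s_right
    e_right = h_contra * s_left + h_ipsi * s_right
    return e_left, e_right


def crosstalk_cancel(b_left, b_right, h_ipsi, h_contra):
    """Return speaker feeds that deliver b_left and b_right to the two ears.

    This inverts the symmetric 2x2 acoustic matrix [[h_ipsi, h_contra],
    [h_contra, h_ipsi]] used in ears_from_speakers.
    """
    det = h_ipsi * h_ipsi - h_contra * h_contra
    if abs(det) < 1e-12:
        raise ValueError("acoustic matrix is singular at this frequency")
    s_left = (h_ipsi * b_left - h_contra * b_right) / det
    s_right = (h_ipsi * b_right - h_contra * b_left) / det
    return s_left, s_right
```

For example, with hypothetical gains h_ipsi = 1 and h_contra = 0.4 - 0.2j, feeding the outputs of crosstalk_cancel into ears_from_speakers returns the intended ear signals, so that each ear receives only the signal meant for it.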
Thus, when an appropriate HRTF is used, the rear speaker signals from a surround sound format may be made to appear to a listener to emanate from a set of virtual rear speakers. However, a problem occurs when the left and right rear speaker channels contain the same content, that is, when the left and right rear speaker channels are mono and not stereo. This is always the case for Pro Logic signals, which include one signal that is played in both of the rear channels. It is also the case with many movie soundtracks or at least portions of those soundtracks that are encoded with 5.1 channel Dolby Digital AC-3. Even though Dolby AC-3 provides for separate left and right rear surround speaker channels, it is often the case that the two channels contain completely mono or partially mono content. Only occasional sound effect sequences appear in stereo while the surround music track is often mono or very close to mono.
Unfortunately, in systems that include only front speakers, the surround mono signals do not virtualize behind the listener and instead tend to collapse to the center of the two front speakers. The surround sounds thus appear to emanate from a point directly in front of the listener between the two front speakers. In order to solve this problem, it would be desirable to convert the mono rear signal to a stereo rear signal. This mono to stereo conversion is also referred to as decorrelation. Ideally, the decorrelation should not alter the listener's perception of the two decorrelated signals any more than is necessary to create the perception of separation between the signals.
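The degree to which two rear channels are mono may be quantified, for illustration, by a zero-lag normalized correlation over a block of samples. The following is a minimal sketch under that assumption; the function name and the handling of silent blocks are choices made here and are not taken from any particular system. A value near 1 indicates that the channels are effectively identical, and lower values indicate increasingly decorrelated channels.

```python
import math


def normalized_correlation(left, right):
    """Zero-lag normalized cross-correlation of two equal-length sample blocks."""
    if len(left) != len(right):
        raise ValueError("blocks must have the same length")
    energy_left = sum(a * a for a in left)
    energy_right = sum(b * b for b in right)
    if energy_left == 0.0 or energy_right == 0.0:
        # Silent block: reported as uncorrelated (a convention assumed here).
        return 0.0
    cross = sum(a * b for a, b in zip(left, right))
    return cross / math.sqrt(energy_left * energy_right)


# Usage: a channel compared with itself versus with a quadrature copy.
tone = [math.sin(2 * math.pi * 5 * n / 200) for n in range(200)]
quadrature = [math.cos(2 * math.pi * 5 * n / 200) for n in range(200)]
mono_score = normalized_correlation(tone, tone)
stereo_score = normalized_correlation(tone, quadrature)
```

In this usage, mono_score is close to 1 because the two channels are identical, while stereo_score is close to 0 because the two tones are in quadrature over a whole number of cycles.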
Different methods have been developed to convert mono signals to stereo in order to provide separation between the sound output from a pair of speakers. One method is to shift the pitch in each of the signals slightly in opposite directions so that the average pitch remains the same but the two signals are sufficiently different from each other to create the perception of separation to the listener. This method tends to be computationally intensive, however, and is not desirable for that reason. In addition, when one speaker output is heard more than the other, the pitch shifting may be perceived by the listener, creating an undesirable effect.
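The opposite pitch shifts described above may be illustrated with a deliberately naive sketch. It assumes that the shift is specified in cents and uses linear-interpolation resampling, which changes duration as well as pitch; a practical pitch shifter would preserve duration, which is part of the computational cost noted above. The function names and the 10-cent figure are hypothetical.

```python
import math


def opposite_pitch_ratios(cents):
    """Return (up, down) playback ratios whose product is 1."""
    up = 2.0 ** (cents / 1200.0)
    return up, 1.0 / up


def resample_linear(x, ratio):
    """Read x at `ratio` times normal speed using linear interpolation.

    A ratio above 1 raises the pitch and shortens the signal; a ratio
    below 1 lowers the pitch and lengthens it.
    """
    n_out = int((len(x) - 1) / ratio) + 1
    out = []
    for n in range(n_out):
        pos = n * ratio
        i = int(pos)
        frac = pos - i
        nxt = x[i + 1] if i + 1 < len(x) else x[i]
        out.append((1.0 - frac) * x[i] + frac * nxt)
    return out


# Usage: derive two slightly different signals from one mono source.
mono = [math.sin(2 * math.pi * 0.01 * n) for n in range(1000)]
up, down = opposite_pitch_ratios(10.0)  # 10 cents is an assumed amount
left = resample_linear(mono, up)
right = resample_linear(mono, down)
```

Because the two ratios are reciprocals, the shifts are equal and opposite on a logarithmic pitch scale, so the average pitch of the pair matches that of the source.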
Another method is to pass the input signal to the two speakers through a pair of complementary comb filters. The outputs from the complementary comb filters combine to reproduce the original signal. However, this method relies on the two signals combining in the air to achieve the desired effect. The comb filtering of each signal results in objectionable coloration when one of the individually filtered signals is heard separately, and the effect does not work at all over headphones because the signals do not combine. Thus, the method is not desirable for converting identical rear surround signals to stereo: both signals must combine and reach the ears of the listener to achieve a desirable result, and when the listener hears one of the uncombined signals, the listener perceives significant coloration. It is also not feasible to apply 3D sound processing to individually comb-filtered signals and expect them to combine later in the air with a reasonable result. The signals should be properly decorrelated before 3D sound processing, which the complementary comb filter technique cannot accomplish, and so the technique is unsuitable.
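The complementary property described above may be illustrated with a minimal sketch. It assumes the simplest feedforward comb pair, in which one output adds a delayed copy of the input and the other subtracts it; the function name and the delay value are hypothetical, and practical comb filters may differ.

```python
def complementary_comb(x, delay):
    """Split x into two comb-filtered signals whose sample-wise sum equals x."""
    if delay < 1:
        raise ValueError("delay must be at least one sample")
    plus, minus = [], []
    for n, sample in enumerate(x):
        delayed = x[n - delay] if n >= delay else 0.0
        plus.append(0.5 * (sample + delayed))   # peaks where `minus` has notches
        minus.append(0.5 * (sample - delayed))  # notches where `plus` has peaks
    return plus, minus


# Usage: the two outputs recombine exactly, but each one alone is altered.
source = [1.0, 0.0, 0.0, 0.0, 0.5, -0.25, 0.0, 0.0]
left, right = complementary_comb(source, delay=3)
recombined = [a + b for a, b in zip(left, right)]
```

In this usage, recombined matches source sample for sample, whereas left and right each differ from source. This reflects the point made above: the original signal is recovered only where both outputs are summed, and either output heard alone carries the comb coloration.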
A better method of decorrelating two identical signals is needed. Ideally, each rear surround signal should sound acceptable without being combined with the other rear surround signal. Also, it would be desirable if the decorrelation could be performed in a computationally inexpensive manner. It would further be desirable if the decorrelation could be adjusted to occur only when the rear surround input signals are truly mono. In addition, such an improved method of decorrelation would be useful for real speakers to provide a sense of spaciousness around the listener instead of a middle-of-the-head sensation.