The present invention relates generally to audio signals. More specifically, a dynamic decorrelator for surround sound signals is disclosed.
Various formats have been developed for providing surround sound to a four or five speaker configuration. For example, two input formats that contain surround channels are 5.1 channel Dolby Digital AC-3 and Dolby Pro Logic. Although many home theatres include four or five speakers, many televisions are configured with only a pair of front speakers. It may be desired to play surround signals through a stereo system that has only two front speakers and still give the listener the surround sound effect produced by the rear speaker surround channels.
The above mentioned surround sound formats and other surround sound formats include rear speaker surround input signals that are intended to be played through a set of rear speakers. The rear speakers may be imaged by a pair of front speakers by transforming the rear speaker surround input signals to signals that have the same effect on a listener when the transformed signals are played through a pair of front speakers. A surround sound effect is created for a listener by transforming signals using the head related transfer function (HRTF) of the listener (or an approximate or average HRTF) to transform the rear speaker surround input signals. The transformed signals are output from a set of front speakers so that rear speakers are virtually rendered at a location behind the listener.
A series of IIR filters may be used to implement the HRTF, and a crosstalk canceler is used to cancel the crosstalk between the left and right front speakers. Crosstalk cancellation is described in Schroeder, M. R., and Atal, B. S. (1963): “Computer Simulation of Sound Transmission in Rooms”, IEEE International Convention Record (7), IEEE Press, New York, and HRTFs are described in Wightman, F. L. and Kistler, D. J. (1989): “Headphone Simulation of Free-Field Listening. II: Psychophysical validation.”, J. Acoust. Soc. Am., vol. 85, pp. 868-878, which are both herein incorporated by reference for all purposes. FIG. 1 is a block diagram illustrating a system for using an HRTF to virtually render sounds at different locations around a listener.
Thus, when an appropriate HRTF is used, the rear speaker signals from a surround sound format may be made to appear to a listener to emanate from a set of virtual rear speakers. However, a problem occurs when the left and right rear speaker channels contain the same content, that is, when the left and right rear speaker channels are mono and not stereo. This is always the case for Pro Logic signals, which include one signal that is played in both of the rear channels. It is also the case with many movie soundtracks, or at least portions of those soundtracks, that are encoded with 5.1 channel Dolby Digital AC-3. Even though Dolby AC-3 provides for separate left and right rear surround speaker channels, it is often the case that the two channels contain completely mono or partially mono content. Only occasional sound effect sequences appear in stereo, while the surround music track is often mono or very close to mono.
Unfortunately, in systems that include only front speakers, the surround mono signals do not virtualize behind the listener and instead tend to collapse to the center of the two front speakers. The surround sounds thus appear to emanate from a point directly in front of the listener between the two front speakers. In order to solve this problem, it would be desirable to convert the mono rear signal to a stereo rear signal. This mono to stereo conversion is also referred to as decorrelation. Ideally, the decorrelation should not alter the listener's perception of the two decorrelated signals any more than is necessary to create the perception of separation between the signals.
Different methods have been developed to convert mono signals to stereo in order to provide separation between the sound output from a pair of speakers. One method is to shift the pitch in each of the signals slightly in opposite directions so that the average pitch remains the same but the two signals are sufficiently different from each other to create the perception of separation to the listener. This method tends to be computationally intensive, however, and is not desirable for that reason. In addition, when one speaker output is heard more than the other, the pitch shifting may be perceived by the listener, creating an undesirable effect.
Another method is to pass the input signal to the two speakers through a pair of complementary comb filters. The outputs from the complementary comb filters combine to reproduce the original signal. However, this method relies on the two signals combining in the air to achieve the desired effect. The comb filtering of each signal results in objectionable coloration when one of the individually filtered signals is heard separately, and the effect does not work at all over headphones because the signals do not combine. Both signals must combine and reach the ears of the listener to achieve a desirable result. It is not feasible to apply 3D sound processing to individually comb-filtered signals and expect them to combine later in the air with a reasonable result. The signals should be properly decorrelated before 3D sound processing. That cannot be accomplished using the complementary comb filter technique, and so the technique is unsuitable for converting identical rear surround signals to stereo.
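The complementary property described above can be illustrated with a minimal sketch. It assumes one common textbook form of a complementary comb pair (a sum and a difference of the signal with a delayed copy of itself); the function name, the delay value, and the sample data are hypothetical and are not taken from any particular prior system.

```python
def complementary_comb(x, delay):
    """Split x into two comb-filtered signals whose sum equals x.

    Assumed form (illustrative only):
        left[n]  = 0.5 * (x[n] + x[n - delay])
        right[n] = 0.5 * (x[n] - x[n - delay])
    """
    left, right = [], []
    for n, sample in enumerate(x):
        delayed = x[n - delay] if n >= delay else 0.0
        left.append(0.5 * (sample + delayed))   # notches where right has peaks
        right.append(0.5 * (sample - delayed))  # notches where left has peaks
    return left, right


# Hypothetical test signal and delay.
x = [1.0, 0.5, -0.25, 0.75, 0.0, -1.0]
left, right = complementary_comb(x, delay=2)

# Only the sum of the two outputs restores the input; each output alone is colored.
recombined = [l + r for l, r in zip(left, right)]
```

Each output taken alone differs from the input, which is the coloration the listener hears when the two signals do not combine.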
A better method of decorrelating two identical signals is needed. Ideally, each rear surround signal should sound acceptable without being combined with the other rear surround signal. Also, it would be desirable if the decorrelation could be performed in a computationally inexpensive manner. Finally, it would be desirable if the decorrelation could be adjusted to occur only when the rear surround input signals are truly mono. In addition, such an improved method of decorrelation would be useful for real speakers to provide a sense of spaciousness around the listener instead of a middle-of-the-head sensation.
A dynamic decorrelator for surround sound signals is disclosed. In one embodiment, a mono detection circuit is used to detect the extent to which a left rear surround input signal and a right rear surround input signal are similar. To the extent that the surround input signals are similar, the signals are decorrelated. Decorrelation is performed by a pair of allpass filters that introduce complementary phase shifts in the left rear surround input signal and the right rear surround input signal. The complementary phase shifts are sufficient to prevent the surround signals from collapsing to the front of the listener when they are rendered using a pair of front speakers.
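As a rough illustration of decorrelation by allpass filtering, the following sketch uses a pair of first-order allpass filters with opposite-signed coefficients. The filter order, the coefficient value, and all names are assumptions made for illustration; the filters of the described embodiments may differ. The sketch shows that each filter has unity magnitude at every frequency, so each channel keeps its spectrum, while the phase difference between the two channels is nonzero.

```python
import cmath
import math


def allpass_response(c, omega):
    """Frequency response of H(z) = (c + z^-1) / (1 + c z^-1) at omega rad/sample."""
    z_inv = cmath.exp(-1j * omega)
    return (c + z_inv) / (1 + c * z_inv)


def allpass_filter(x, c):
    """Time-domain form of the same filter: y[n] = c*x[n] + x[n-1] - c*y[n-1]."""
    y, x_prev, y_prev = [], 0.0, 0.0
    for sample in x:
        out = c * sample + x_prev - c * y_prev
        y.append(out)
        x_prev, y_prev = sample, out
    return y


C = 0.5  # hypothetical coefficient; must satisfy |C| < 1 for stability
omegas = [k * math.pi / 16 for k in range(1, 16)]

left_resp = [allpass_response(+C, w) for w in omegas]   # left surround path
right_resp = [allpass_response(-C, w) for w in omegas]  # right surround path

# Interchannel phase difference in radians at each test frequency.
phase_diff = [cmath.phase(l / r) for l, r in zip(left_resp, right_resp)]
```

With these assumed coefficients, the two phases at a quarter of the sample rate deviate by equal and opposite amounts from the phase of a one-sample delay.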
It should be appreciated that the present invention can be implemented in numerous ways, including as a process, an apparatus, a system, a device, a method, or a computer readable medium such as a computer readable storage medium or a computer network wherein program instructions are sent over optical or electronic communication lines. Several inventive embodiments of the present invention are described below.
In one embodiment, a method of rendering a left rear surround input signal at a left rear virtual speaker location and rendering a right rear surround input signal at a right rear virtual speaker location is described. The method includes phase shifting the left rear surround input signal by a first phase shift. The right rear surround input signal is phase shifted by a second phase shift. The phase shifted left rear surround input signal is transformed using an HRTF selected to render the left rear surround input signal at the left rear virtual speaker location. The phase shifted right rear surround input signal is transformed using an HRTF selected to render the right rear surround input signal at the right rear virtual speaker location.
In another embodiment, a method of decorrelating a first input signal and a second input signal is described. The method includes phase shifting the first input signal by a first phase shift and phase shifting the second input signal by a second phase shift. The first input signal and the second input signal are decorrelated in a manner that does not distort either the first input signal or the second input signal in the perception of a listener when one of the input signals is heard without being combined with the other input signal.
In another embodiment, a method of converting a mono input signal to a pair of stereo input signals is described. The method includes filtering the mono input signal using a band pass filter. The band pass filter substantially passes frequencies in a vocal range of frequencies and substantially blocks frequencies outside of the vocal range of frequencies to produce a band pass filter output signal. The mono input signal is filtered using a high pass filter. The high pass filter substantially passes frequencies above a vocal range of frequencies and substantially blocks frequencies within the vocal range of frequencies and frequencies below the vocal range of frequencies to produce a high pass filter output signal. The mono input signal is filtered using a low pass filter. The low pass filter substantially passes frequencies below a vocal range of frequencies and substantially blocks frequencies within the vocal range of frequencies and frequencies above the vocal range of frequencies to produce a low pass filter output signal. The low pass filter output signal and the high pass filter output signal are decorrelated to produce at least a pair of decorrelated signals, and each of the decorrelated signals is combined with the band pass filter output signal to produce a stereo output signal that includes decorrelated signals above and below the vocal range of frequencies.
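The band-splitting method just described may be sketched as follows. The sketch is illustrative only: it assumes crude one-pole filters, hypothetical vocal-range edges of 300 Hz and 3000 Hz, a hypothetical sample rate, and a first-order allpass pair as the decorrelator. None of these values or names comes from the described embodiments, and practical filters would have much steeper slopes.

```python
import math


def one_pole_lowpass(x, cutoff_hz, sample_rate):
    """Crude one-pole low pass: y[n] = y[n-1] + a * (x[n] - y[n-1])."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    y, state = [], 0.0
    for sample in x:
        state += a * (sample - state)
        y.append(state)
    return y


def first_order_allpass(x, c):
    """First-order allpass: y[n] = c*x[n] + x[n-1] - c*y[n-1]."""
    y, x_prev, y_prev = [], 0.0, 0.0
    for sample in x:
        out = c * sample + x_prev - c * y_prev
        y.append(out)
        x_prev, y_prev = sample, out
    return y


def mono_to_stereo(x, sample_rate=48000, vocal_lo=300.0, vocal_hi=3000.0, c=0.5):
    """Decorrelate only the content below and above an assumed vocal band."""
    low = one_pole_lowpass(x, vocal_lo, sample_rate)        # below the vocal band
    below_hi = one_pole_lowpass(x, vocal_hi, sample_rate)
    band = [b - l for b, l in zip(below_hi, low)]           # vocal band, left untouched
    high = [s - b for s, b in zip(x, below_hi)]             # above the vocal band
    outer = [l + h for l, h in zip(low, high)]              # content to decorrelate
    left = [b + o for b, o in zip(band, first_order_allpass(outer, +c))]
    right = [b + o for b, o in zip(band, first_order_allpass(outer, -c))]
    return left, right


# Hypothetical mono input: a 1 kHz tone (in band) plus a 100 Hz tone (below band).
mono = [math.sin(2 * math.pi * 1000 * n / 48000)
        + 0.5 * math.sin(2 * math.pi * 100 * n / 48000) for n in range(256)]
left, right = mono_to_stereo(mono)
```

Because the band pass output is added unchanged to both channels, content in the assumed vocal range stays common to the left and right outputs, while the remaining content receives different phase shifts in each channel.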
In another embodiment, a dynamic decorrelator for decorrelating a first input signal and a second input signal is described. The dynamic decorrelator includes a first allpass filter configured to phase shift the first input signal by a first phase shift and a second allpass filter configured to phase shift the second input signal by a second phase shift. A mono detection circuit is configured to detect the similarity of the first input signal and the second input signal and to adjust the first phase shift and the second phase shift according to the similarity of the first input signal and the second input signal.
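One way to picture the dynamic adjustment is sketched below. The similarity measure used here, a normalized zero-lag correlation over a block of samples, is a stand-in chosen for illustration and is not the mono detection circuit of the described embodiments. The function names, the block length, and the maximum coefficient are likewise hypothetical.

```python
import math


def similarity(left, right):
    """Normalized zero-lag correlation in [-1, 1]; 1.0 means identical up to gain."""
    energy = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    if energy == 0.0:
        return 0.0  # silence: treat as nothing to decorrelate
    return sum(l * r for l, r in zip(left, right)) / energy


def decorrelation_amount(left, right, max_coeff=0.5):
    """Scale a hypothetical allpass coefficient by how mono the block is."""
    return max_coeff * max(0.0, similarity(left, right))


# Hypothetical blocks: identical channels (mono) and unrelated channels (stereo).
tone = [math.sin(2 * math.pi * n / 32) for n in range(64)]
other = [math.cos(2 * math.pi * n / 32) for n in range(64)]

mono_amount = decorrelation_amount(tone, tone)     # full decorrelation
stereo_amount = decorrelation_amount(tone, other)  # essentially none
```

In such a sketch, the resulting amount could be applied with opposite signs to the two allpass filters, so that surround content that is already stereo passes through with little or no added phase shift.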
These and other features and advantages of the present invention will be presented in more detail in the following detailed description and the accompanying figures which illustrate by way of example the principles of the invention.
The present invention will be readily understood by the following detailed description in conjunction with the accompanying drawings, wherein like reference numerals designate like structural elements, and in which:
FIG. 1 is a block diagram illustrating a 3D sound virtualization system that performs HRTF modeling and crosstalk cancellation for the purpose of virtually rendering a pair of speakers at a position relative to a listener where no real speakers are located.
FIG. 2A is a graph illustrating an exemplary phase excursion implemented using an allpass filter.
FIG. 2B is a graph illustrating the phase excursion implemented by the decorrelator in the other surround input.
FIG. 3A is a block diagram illustrating a system used to dynamically decorrelate surround sound signals.
FIG. 3B is a block diagram illustrating a system for producing a left surround virtual and a right surround virtual speaker input signal given left and right surround signals that are the same.
FIG. 3C is a block diagram illustrating a system for combining left and right front signals, designated LF and RF, with left and right surround virtual signals, LSV and RSV, to produce a combined left output signal, LO, and a combined right output signal, RO.
FIG. 4 is a block diagram illustrating a design of a monodetector implemented in block 302 of FIG. 3A in one embodiment.
FIGS. 5A and 5B are block diagrams illustrating a pair of allpass filters that provide complementary phase shifts to a left surround signal LS and a right surround signal RS.
FIG. 6 is a block diagram of a system for providing decorrelation of portions of a mono signal while not decorrelating dialogue.
FIG. 7 illustrates the frequency responses of three filters used in one embodiment.