The present invention relates generally to the extraction of direction-of-arrival information from two-channel stereo audio signals. However, it may also be employed in connection with all manner of multichannel or multitrack audio sources, provided that at least some channels associated with such sources can be considered pairwise for analysis.
In the preferred aspect utilizing a two-channel stereophonic audio source, the invention relates to determination of direction-of-arrival by comparing the two input channels in the frequency domain, and resolving the signal information, in a vector sense, into “left”, “center”, and “right” source directions. More specifically, the invention is based upon the assumption that the two input channels constitute a complementary pair, in which signal components that appear only in the left channel are intended to arrive from left of the listening position, components that appear only in the right channel are intended to arrive from right of the listening position, components that appear equally in the left and right channels are intended to arrive from directly in front-center, and components that appear unequally in the left and right channels are intended to arrive from directions proportionately between center and left or right, as appropriate.
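By way of illustration only, the complementary-pair assumption described above might be sketched per frequency bin as follows. This is a minimal sketch, not the method of the invention: the function names `dft`, `direction_weights`, and `analyze`, the linear pan rule (R − L)/(R + L), and the silence threshold `eps` are all hypothetical choices introduced for this example.

```python
import cmath

def dft(x):
    """Naive O(N^2) discrete Fourier transform (stdlib only, for illustration)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def direction_weights(l_mag, r_mag, eps=1e-9):
    """Split one bin into (left, center, right) weights.

    Hypothetical rule: pan = (R - L) / (R + L), which lies in [-1, 1].
    This mapping is an assumption, not the one defined by the invention.
    For a non-silent bin the three weights sum to 1.
    """
    total = l_mag + r_mag
    if total < eps:
        return (0.0, 0.0, 0.0)  # silent bin: nothing to steer
    pan = (r_mag - l_mag) / total
    # Left-only gives (1, 0, 0); equal gives (0, 1, 0); right-only gives (0, 0, 1).
    return (max(0.0, -pan), 1.0 - abs(pan), max(0.0, pan))

def analyze(left, right):
    """Return per-bin (left, center, right) weights for two equal-length channels."""
    L, R = dft(left), dft(right)
    half = len(left) // 2 + 1  # keep the non-negative frequencies only
    return [direction_weights(abs(L[k]), abs(R[k])) for k in range(half)]
```

Under this assumed rule, a component with left magnitude 3 and right magnitude 1 is placed halfway between left and center, with weights (0.5, 0.5, 0.0). Only the magnitudes are compared here; inter-channel phase is ignored in this sketch.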
The basis of stereophonic sound reproduction was, from the beginning, the re-creation of a realistic two-dimensional sound field that preserved, or at least approximated, direction-of-arrival information for presentation to the listener. Early systems were not limited to two audio channels; in fact, many of the earliest systems used in theaters incorporated a multitude of separate channels dispersed all around the listening location. For many reasons, particularly related to phonograph records and, later, radio transmission, most of the channels were dropped and the de facto standard for stereo signals became two channels [1].
Two-channel stereo has enjoyed a long and venerable career, and can in many circumstances provide a highly satisfying listening experience. Early attempts at incorporating more than two channels into the home listening environment did not improve the listening experience enough to justify their added cost and complexity over standard two-channel stereo, and they were eventually abandoned [2]. More recently, however, the increasing popularity of multichannel audio systems such as home theater and DVD-Audio has finally shown the shortcomings of the two-channel configuration and caused consumers to demand more realistic sound field presentations.
As a result, many modern recordings are being mixed for multichannel reproduction, generally in 5 or 5.1 channel formats. However, there is still a tremendous existing base of two-channel stereo material, in analog as well as digital form. Therefore, many heuristic methods have been, and continue to be, developed for distributing two-channel source material amongst more than two channels. These are generally based upon a “matrixing” operation in which the broadband levels of the left, right, (left+right), and (left−right) source channels are compared. In cases where the left level is much higher than the right level, the output is steered generally to the left, and vice versa. In cases where the (left+right) level is much higher than the (left−right) level, the signals are assumed to be highly correlated and are steered generally toward the front. In cases where the (left−right) level is much higher than the (left+right) level, the signals are assumed to be highly negatively correlated and are steered generally toward the rear surround channels [3]. Most of these techniques rely heavily upon heuristic algorithms to determine the steering direction for the audio, and usually require special encoding of the signal via phase-shifting, delay, etc., in order to work properly.
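The broadband “matrixing” comparison described above might be sketched as follows. This is a simplified illustration under stated assumptions, not any particular commercial decoder: the names `rms` and `steer`, the use of RMS as the level measure, the threshold `ratio` standing in for “much higher”, and the order in which the comparisons are tested are all hypothetical.

```python
import math

def rms(x):
    """Root-mean-square level of a block of samples (0.0 for an empty block)."""
    return math.sqrt(sum(v * v for v in x) / len(x)) if x else 0.0

def steer(left, right, ratio=2.0):
    """Pick a broadband steering direction for one block of stereo samples.

    `ratio` is an assumed threshold for "much higher"; real decoders use
    their own heuristics and usually smooth the decision over time.
    """
    l_level = rms(left)
    r_level = rms(right)
    sum_level = rms([a + b for a, b in zip(left, right)])   # (left+right)
    diff_level = rms([a - b for a, b in zip(left, right)])  # (left-right)

    if l_level > ratio * r_level:
        return "left"
    if r_level > ratio * l_level:
        return "right"
    if sum_level > ratio * diff_level:
        return "front"  # highly positively correlated
    if diff_level > ratio * sum_level:
        return "rear"   # highly negatively correlated
    return "none"       # no dominant direction
```

Because the decision is made once for the whole block, every frequency component in that block is steered the same way, which is the broadband limitation this kind of approach carries.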
The present invention is based upon the realization that the information that can be extracted from a comparison between two signals can be put to better use than has been demonstrated in the prior art. Two signals either have a lot in common (positively correlated) or they do not have a lot in common (uncorrelated or negatively correlated). Their amplitudes are either similar or different. In the prior art, these attributes are studied for full-bandwidth, or nearly full-bandwidth, signals, and special encoding is needed during the recording process to provide steering “cues” to the playback system. The present invention analyzes the attributes in the frequency domain, and does not require any special encoding.
The result is an improved system and method that can extract highly detailed, frequency-specific direction-of-arrival information from standard, non-encoded stereo signals.