1. Field of the Invention
This invention relates generally to audio engineering. More specifically, it relates to upmixing two-channel audio to three or more output channels.
2. Related Art
Presently, algorithms that upmix two channels to three or more channels fall into two categories: multichannel converters and ambience generators.
Multichannel converters, which include linear (“passive”) and steered (“active”) matrix methods, are used to derive additional loudspeaker signals in cases where there are more speakers than input channels. These methods are typically implemented in the time domain. While linear matrix methods are relatively inexpensive to implement, they reduce the width of the front image. In a two- to three-channel upmix, any signal intended for the center is also played through the left and right speakers; the channel separation between left and center, for example, is only 3 dB.
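By way of illustration only, the 3 dB figure can be reproduced with a minimal passive two- to three-channel matrix. The center gain of 1/sqrt(2), the constant-power pan law, the 1 kHz test tone and the names `passive_upmix` and `separation_db` are assumptions made for this sketch; they are not taken from any particular product.

```python
import math

def passive_upmix(left, right):
    """Fixed (signal-independent) 2-to-3 matrix.

    Left and right pass through unchanged; the center is a scaled sum.
    The 1/sqrt(2) center gain is an assumed, commonly used choice.
    """
    g = 1.0 / math.sqrt(2.0)
    center = [g * (l + r) for l, r in zip(left, right)]
    return list(left), center, list(right)

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

def separation_db(a, b):
    """Level of signal a relative to signal b, in dB."""
    return 20.0 * math.log10(rms(a) / rms(b))

# Hypothetical test signal: a 1 kHz tone panned to the center, i.e. present
# equally in both inputs with a constant-power pan gain of 1/sqrt(2).
sample_rate = 48000
tone = [math.sin(2 * math.pi * 1000 * k / sample_rate) for k in range(480)]
pan = 1.0 / math.sqrt(2.0)
left_in = [pan * s for s in tone]
right_in = [pan * s for s in tone]

left_out, center_out, right_out = passive_upmix(left_in, right_in)

# The center-panned source still plays from the left speaker,
# only about 3 dB below its level in the center speaker.
center_vs_left_db = separation_db(center_out, left_out)
print(round(center_vs_left_db, 2))
```

Because the matrix coefficients are constants, no choice of fixed gains removes the center source from the left and right outputs, which is why the front image narrows.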
Matrix steering methods update the matrix coefficients dynamically and provide the ability to extract and boost a dominant source. These methods are particularly useful for content such as movie soundtracks, in which one source may be of primary interest at any given time. With music, however, the signal-dependent gain changes may cause audible side effects.
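The idea of signal-dependent matrix coefficients can be sketched with a toy example. Everything in it is an assumption made for illustration: the block length, the dominance measure (sum power versus difference power) and the name `steered_upmix` are hypothetical and do not describe any commercial decoder.

```python
import math

def steered_upmix(left, right, block=64):
    """Toy steered 2-to-3 upmix with one center gain per block.

    Assumed dominance measure: compare the power of (L + R) with the power
    of (L - R). A center-dominant block steers the common signal to the center.
    """
    out_l, out_c, out_r = [], [], []
    for start in range(0, len(left), block):
        l = left[start:start + block]
        r = right[start:start + block]
        p_sum = sum((a + b) ** 2 for a, b in zip(l, r))
        p_diff = sum((a - b) ** 2 for a, b in zip(l, r))
        total = p_sum + p_diff
        dominance = (p_sum - p_diff) / total if total > 0.0 else 0.0
        g = max(dominance, 0.0)  # signal-dependent center gain in [0, 1]
        for a, b in zip(l, r):
            c = g * 0.5 * (a + b)
            out_c.append(c)
            out_l.append(a - c)  # remove the steered part from the sides
            out_r.append(b - c)
    return out_l, out_c, out_r

tone = [math.sin(2 * math.pi * 1000 * k / 48000) for k in range(256)]
silence = [0.0] * len(tone)

# Center-panned source: identical in both inputs.
c_l, c_c, c_r = steered_upmix(tone, tone)
# Hard-left source: present in the left input only.
h_l, h_c, h_r = steered_upmix(tone, silence)

print(max(abs(v) for v in c_l))  # side output for the centered source
print(max(abs(v) for v in h_c))  # center output for the hard-left source
```

In this sketch the gain `g` jumps from block to block. With dense program material such as music, the dominance estimate fluctuates, and the resulting gain changes are the kind of signal-dependent behavior that may become audible.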
Ambience generation methods attempt to extract or simulate the ambience of a recording. The term “ambience” refers to the components of a sound that create the impression of an acoustic environment, with sound coming from all around the listener but not from a specific place. Ambience may include room reverberation as well as other spatially distributed sounds such as applause, wind or rain. The goal of ambience extraction is to increase the sense of envelopment, typically using the rear speakers.
Ambience generation methods may either extract the natural reverberation from the audio signal or add artificial reverberation. Extraction may be performed, for example, by taking the difference of the left and right inputs, which attenuates centered sounds and preserves those that are weakly correlated or panned to the sides.
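The difference-signal approach can be illustrated with a short sketch. The 0.5 scale factor, the two test tones and the name `difference_ambience` are assumptions chosen for the example.

```python
import math

def difference_ambience(left, right):
    """Ambience estimate as the scaled difference of the two inputs.

    Components identical in both channels (center-panned) cancel;
    components present in only one channel survive. The 0.5 scale
    is an assumed normalization.
    """
    return [0.5 * (l - r) for l, r in zip(left, right)]

sample_rate = 48000
n = 480
# Hypothetical mix: a centered 440 Hz source plus a 1 kHz source
# panned hard left.
center_src = [math.sin(2 * math.pi * 440 * k / sample_rate) for k in range(n)]
side_src = [0.5 * math.sin(2 * math.pi * 1000 * k / sample_rate) for k in range(n)]

left = [c + s for c, s in zip(center_src, side_src)]
right = list(center_src)

ambience = difference_ambience(left, right)

# The output should contain only the side-panned source (at half level),
# with the centered source removed.
residual = max(abs(a - 0.5 * s) for a, s in zip(ambience, side_src))
print(residual)
```

The same subtraction leaves uncorrelated reverberation largely intact, since reverberant components in the two channels do not cancel the way an identical centered signal does.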
Recently, a number of researchers have developed frequency-domain upmix (and downmix) techniques for spatial audio coding and enhancement. These methods typically perform spatial decomposition and extract the existing ambience. Thus, these are categorized as ambience generation methods, but they can also be thought of as frequency-domain steering methods, because they dynamically change the panning of each frequency subband based on the correlation between the left and right input signals.
Frequency-domain upmix techniques have been presented that are based on inter-channel coherence measures, non-linear mapping functions and panning coefficients. Short-time Fourier transform (STFT)-based processing has been used to extract the ambient and direct components using least-squares estimation, principal component analysis (PCA) and other methods.
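As a hedged illustration of an inter-channel coherence measure, the sketch below computes a magnitude coherence per frequency bin by averaging cross- and auto-spectra over frames. The rectangular, non-overlapping frames, the plain average over frames, the naive DFT and the names `dft` and `interchannel_coherence` are assumptions made to keep the example self-contained; practical systems typically use a windowed, overlapping STFT with recursive smoothing.

```python
import cmath
import math

def dft(frame):
    """Naive DFT of a real frame, returning bins 0..N/2."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]

def interchannel_coherence(left_frames, right_frames, floor=1e-12):
    """Per-bin magnitude coherence |S_LR| / sqrt(S_LL * S_RR), in [0, 1]."""
    spectra = [(dft(l), dft(r)) for l, r in zip(left_frames, right_frames)]
    bins = len(spectra[0][0])
    coherence = []
    for k in range(bins):
        cross = sum(xl[k] * xr[k].conjugate() for xl, xr in spectra)
        p_l = sum(abs(xl[k]) ** 2 for xl, _ in spectra)
        p_r = sum(abs(xr[k]) ** 2 for _, xr in spectra)
        denom = math.sqrt(p_l * p_r)
        # Bins with no energy are reported as 0 rather than divided by ~0.
        coherence.append(abs(cross) / denom if denom > floor else 0.0)
    return coherence

# Hypothetical test frames: bin 2 carries the same component in both
# channels (direct, center-like); bin 5 carries a component whose
# inter-channel phase changes from frame to frame (ambience-like).
N, M = 16, 8
left_frames, right_frames = [], []
for m in range(M):
    shift = 2 * math.pi * m / M
    left_frames.append([math.cos(2 * math.pi * 2 * t / N)
                        + math.cos(2 * math.pi * 5 * t / N) for t in range(N)])
    right_frames.append([math.cos(2 * math.pi * 2 * t / N)
                         + math.cos(2 * math.pi * 5 * t / N + shift) for t in range(N)])

coh = interchannel_coherence(left_frames, right_frames)
print(round(coh[2], 3), round(coh[5], 3))
```

A bin with coherence near 1 behaves like a direct, panned source, and a bin with coherence near 0 behaves like ambience. A frequency-domain upmixer could then map these values to per-bin gains, which is where the mapping functions mentioned above would enter.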
One commercial upmix algorithm exhibits good center channel separation, but when the center channel is heard by itself, significant “watery sound” or “musical noise” artifacts are audible. Another commercial algorithm has no obvious center channel artifacts, but it appears to provide little center channel separation. There is a need for an upmix algorithm that provides good center channel separation without serious artifacts.