Audio may be recorded in a multi-channel format having two or more audio channels, each of which may be tailored to a specific position relative to the listener. One such format is “5.1 channel”, which has five channels of full-bandwidth audio, i.e., front left (Lf), center (Center), front right (Rf), rear left (Lr), and rear right (Rr), plus a sixth, narrow-bandwidth, low-frequency effect channel (Lfe). Reproducing audio media recorded in such a format typically requires one speaker and amplifier for each channel; therefore, the “5.1 channel” format described above would require six separate speakers and amplifiers.
This multiple-speaker, multiple-amplifier realization is costly. Also, even when fully implemented, the discrete manner in which the audio channels of a 3-dimensional audio environment are recorded makes it difficult to provide smooth, continuous, and rich 3D audio like that experienced in an actual venue, such as a music auditorium.
In living room entertainment, two-speaker system playback is both popular and cost-effective. A two-speaker system typically includes a left speaker (Spl) and a right speaker (Spr) along with corresponding amplifiers. In order to reproduce media recorded in the 5.1 channel format using a two-speaker system, the five channel signals (Lf, Rf, Center, Lr, and Rr) may be down-mixed into two channels, left down-mix channel (Ldm) and right down-mix channel (Rdm), then fed to the left and right speakers, Spl and Spr, accordingly. One example of down-mixing is shown in equations 1 and 2 below.
Ldm = 0.5*(Lf + 0.7*Center) + Lr  (eq. 1)
Rdm = 0.5*(Rf + 0.7*Center) + Rr  (eq. 2)
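As an illustration only, the example down-mix of equations 1 and 2 can be sketched in a few lines of Python. The function name downmix_5_1 and the representation of each channel as an equal-length list of float samples are assumptions made for this sketch. The Lfe channel is left out because equations 1 and 2 do not include it, and no gain normalization or clipping protection is applied.

```python
def downmix_5_1(lf, rf, center, lr, rr):
    """Down-mix five full-bandwidth channels to two, per eq. 1 and eq. 2.

    Each argument is a sequence of samples; all sequences are assumed
    (for this sketch) to have the same length.
    """
    # eq. 1: Ldm = 0.5*(Lf + 0.7*Center) + Lr
    ldm = [0.5 * (l + 0.7 * c) + s for l, c, s in zip(lf, center, lr)]
    # eq. 2: Rdm = 0.5*(Rf + 0.7*Center) + Rr
    rdm = [0.5 * (r + 0.7 * c) + s for r, c, s in zip(rf, center, rr)]
    return ldm, rdm
```

The two returned lists correspond to Ldm and Rdm, which would then be fed to the Spl and Spr speakers.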
In this example, while facing the Spl and Spr speakers, listeners may sense only limited front space, i.e., the audio sound positioned in front of them, and may not sense the rear space that usually surrounds the listeners from behind. In addition, in this example, the front space is perceived as neither smooth nor continuous. Consequently, such poor 3D performance illustrates the desirability of a 3D sound processing technique that restores or improves the corresponding spaces in the multi-channel sounds, thereby providing 3D effects even though only two speakers are physically present.
There are other sound processing methods available that receive multi-channel audio media and perform signal processing in an attempt to recreate the multi-channel audio media using a two-channel audio system. These sound processing methods rely on modeling 3D perception in the human auditory system. One method of 3D sound processing for a two-speaker system is based on an Interaural Time Delay (ITD) effect combined with filters that model the hearing behavior of the human ear.
In the ITD-based system 10, shown in FIG. 10, each channel branches out into two channels as follows. For the left channel input 12 (for example, the rear left channel), one branch includes a filter F1 16 that simulates the left ear response to the left side sound (same side or near ear response). The output of F1 16 is then sent to adder 28. In the other branch, the left channel input 12 passes through an ITD delay unit 18, followed by a filter F2 24 that simulates the right ear response to the left side sound (opposite side or far ear response), and is then sent to adder 30. The ITD delay unit 18 (approximately 10 samples) positions the left channel sound toward the right side.
The same process may be applied to the right channel input 14, which also branches out into two channels. Similarly, filter F1 22 simulates the right ear response to the right side sound, the ITD delay unit 20 positions the right channel sound toward the left side, and filter F2 26 simulates the left ear response to the right side sound.
Adder 28 adds the output of filter F1 16 to the output of filter F2 26 and sends the output to Left Output 32. Similarly, adder 30 adds the output of filter F1 22 to the output of filter F2 24 and sends the output to Right Output 34. Left Output 32 and Right Output 34 may then be combined with other channels and output by a two-speaker system.
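The signal flow of the ITD-based system 10 described above can be sketched roughly as follows. This is a minimal sketch under several assumptions that the description does not state: filters F1 and F2 are modeled as FIR filters, the same tap values are used on both sides, and the tap values themselves are placeholders supplied by the caller. The helper names fir, delay, and itd_process are hypothetical. Only the 10-sample default delay and the routing to adders 28 and 30 are taken from the text.

```python
def fir(signal, taps):
    """Direct-form FIR filter; the output has the same length as the input."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def delay(signal, samples):
    """Delay a signal by a whole number of samples, keeping its length."""
    return ([0.0] * samples + list(signal))[:len(signal)]


def itd_process(left_in, right_in, f1_taps, f2_taps, itd_samples=10):
    """Route two input channels through near-ear and far-ear branches."""
    # Left input 12: near-ear branch (F1 16) feeds adder 28.
    left_near = fir(left_in, f1_taps)
    # Left input 12: ITD delay 18, then far-ear filter F2 24, feeds adder 30.
    left_far = fir(delay(left_in, itd_samples), f2_taps)
    # Right input 14: near-ear branch (F1 22) feeds adder 30.
    right_near = fir(right_in, f1_taps)
    # Right input 14: ITD delay 20, then far-ear filter F2 26, feeds adder 28.
    right_far = fir(delay(right_in, itd_samples), f2_taps)
    # Adder 28 -> Left Output 32; adder 30 -> Right Output 34.
    left_out = [a + b for a, b in zip(left_near, right_far)]
    right_out = [a + b for a, b in zip(right_near, left_far)]
    return left_out, right_out
```

With a single impulse on the left input and silence on the right, the impulse appears immediately at the left output through the near-ear branch and, delayed by itd_samples and shaped by the far-ear taps, at the right output.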
This method, together with its modeling strategy, does not move the rear sound far enough behind the listener to create a satisfying rear surround effect.