1. Field of the Invention
This invention relates to multichannel audio and more specifically to a method of decoding two-channel matrix encoded audio to reconstruct multichannel audio that more closely approximates a discrete surround-sound presentation.
2. Description of the Related Art
Multichannel audio has become the standard for cinema and home theater, is gaining rapid acceptance in music, automotive, computers, gaming and other audio applications, and is being considered for broadcast television. Multichannel audio provides a surround-sound environment that greatly enhances the listening experience and the overall presentation of any audio-visual system. The move from stereo to multichannel audio has been driven by a number of factors, paramount among them being consumers' desire for higher quality audio presentation. Higher quality means not only more channels but higher fidelity channels and improved separation or “discreteness” between the channels. Another important factor to consumer and manufacturer alike is retention of backward compatibility with existing speaker systems and encoded content, together with enhancement of the audio presentation with those existing systems and content.
The earliest multichannel systems matrix encoded multiple audio channels, e.g. left, right, center and surround (L,R,C,S) channels, into left and right total (Lt,Rt) channels and recorded them in the standard stereo format. Although two-channel matrix encoded systems such as Dolby Prologic™ provide surround-sound audio, the audio presentation is not discrete but is characterized by crosstalk and phase distortion. The matrix decoding algorithms identify a single dominant signal, position that signal in a 5-point sound field, and reconstruct the L, R, C and S signals accordingly. The result can be a “mushy” audio presentation in which the different signals are not clearly spatially separated. In particular, less dominant but important signals may be effectively lost.
The current standard in consumer applications is discrete 5.1 channel audio, which splits the surround channel into left and right surround channels and adds a subwoofer channel (L,R,C,Ls,Rs,Sub). Each channel is compressed independently and the channels are then multiplexed together in a 5.1 format, thereby maintaining the discreteness of each signal. Dolby AC-3™, Sony SDDS™ and DTS Coherent Acoustics™ are all examples of 5.1 systems. Recently 6.1 channel audio, which adds a center surround channel Cs, has been introduced. Truly discrete audio provides a clear spatial separation of the audio channels and can support multiple dominant signals, thus providing a richer and more natural sound presentation.
Having become accustomed to discrete multichannel audio and having invested in a 5.1 speaker system for their homes, consumers will be reluctant to accept clearly inferior surround-sound presentations. Unfortunately, only a relatively small percentage of content is currently available in the 5.1 format. The vast majority of content is only available in a two-channel matrix encoded format, predominantly Dolby Prologic™. Because of the large installed base of Prologic decoders, it is expected that 5.1 content will continue to be encoded in the Prologic format as well. Accordingly, there remains an unfulfilled need in the industry to provide a method of decoding two-channel matrix encoded audio to reconstruct multichannel audio that more closely approximates “discrete” multichannel audio.
Dolby Prologic™ provided one of the earliest two-channel matrix encoded multichannel systems. Prologic squeezes four channels (L,R,C,S) into two channels (Lt,Rt) by introducing a phase-shifted surround sound term. These two channels are then encoded into the existing two-channel formats. Decoding is a two-step process in which an existing decoder receives Lt,Rt and then a Prologic decoder expands Lt,Rt into L,R,C,S. Because four signals (unknowns) are carried on only two channels (equations), the Prologic decoding operation is only an approximation and cannot provide true discrete multichannel audio.
As shown in FIG. 1, a studio 2 will mix several, e.g. 48, audio sources to provide a four-channel mix (L,R,C,S). The Prologic encoder 4 matrix encodes this mix as follows:
Lt=L+0.707C+S(+90°), and  (1)
Rt=R+0.707C+S(−90°),  (2)
which are carried on the two discrete channels, encoded into the existing two-channel format and recorded on a media 6 such as film, CD or DVD.
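As a rough illustration of equations (1) and (2), the following Python sketch encodes a single steady tone, representing each channel by a complex phasor so that the ±90° phase shifts become multiplication by ±j. The function name matrix_encode and the phasor representation are assumptions made for illustration only; a practical encoder would apply a wideband 90° phase-shift network to the time-domain surround signal.

```python
def matrix_encode(L, R, C, S):
    """Sketch of equations (1) and (2) for single-tone complex phasors.

    Assumption: each argument is the complex amplitude of one tone, so a
    +90 degree shift is multiplication by 1j and a -90 degree shift by -1j.
    """
    plus_90 = 1j
    minus_90 = -1j
    Lt = L + 0.707 * C + plus_90 * S   # equation (1)
    Rt = R + 0.707 * C + minus_90 * S  # equation (2)
    return Lt, Rt


# A surround-only input lands in Lt and Rt with opposite phase,
# so it cancels in Lt + Rt and reinforces in Lt - Rt.
Lt, Rt = matrix_encode(0.0, 0.0, 0.0, 1.0)
print(abs(Lt + Rt), abs(Lt - Rt))
```

A center-only input, by contrast, appears identically in Lt and Rt, which is why a decoder can use the sum and difference of the two channels to tell center from surround.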
A Prologic matrix decoder 8 decodes the two discrete channels Lt,Rt and expands them into four discrete reconstructed channels Lr,Rr,Cr and Sr that are amplified and distributed to a five speaker system 10. Many different proprietary algorithms are used to perform an active decode and all are based on measuring the power of Lt+Rt, Lt−Rt, Lt and Rt to calculate gain factors Gi whereby,
Lr=G1*Lt+G2*Rt  (3)
Rr=G3*Lt+G4*Rt  (4)
Cr=G5*Lt+G6*Rt, and  (5)
Sr=G7*Lt+G8*Rt.  (6)
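Equations (3) through (6) amount to a 4×2 gain matrix applied to (Lt,Rt). The Python sketch below shows that step in isolation. The function name matrix_decode and the fixed gain values in EXAMPLE_GAINS are hypothetical, chosen only to make the example concrete; they are not the coefficients of any particular proprietary decoder.

```python
def matrix_decode(Lt, Rt, G):
    """Apply equations (3)-(6); G holds the eight gains G1..G8 in order."""
    if len(G) != 8:
        raise ValueError("expected eight gain factors G1..G8")
    G1, G2, G3, G4, G5, G6, G7, G8 = G
    Lr = G1 * Lt + G2 * Rt  # equation (3)
    Rr = G3 * Lt + G4 * Rt  # equation (4)
    Cr = G5 * Lt + G6 * Rt  # equation (5)
    Sr = G7 * Lt + G8 * Rt  # equation (6)
    return Lr, Rr, Cr, Sr


# Hypothetical fixed gains for illustration only: Lt and Rt pass straight
# through, Cr is formed from their sum and Sr from their difference.
EXAMPLE_GAINS = [1.0, 0.0, 0.0, 1.0, 0.707, 0.707, 0.707, -0.707]

# Center-only content arrives as Lt == Rt.
print(matrix_decode(1.0, 1.0, EXAMPLE_GAINS))
```

With these fixed gains, center-only content cancels in Sr but still appears at full level in Lr and Rr, which illustrates the crosstalk that an active decoder tries to reduce by adapting the gains.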
More specifically, Dolby provides a set of gain coefficients for a null point at the center of a 5-point sound field 11 as shown in FIG. 2. The decoder measures the absolute power of the two-channel matrix encoded signals Lt and Rt and calculates power levels for the L,R,C and S channels according to:
Lpow(t)=C1*Lt+C2*Lpow(t−1)  (7)
Rpow(t)=C1*Rt+C2*Rpow(t−1)  (8)
Cpow(t)=C1*(Lt+Rt)+C2*Cpow(t−1)  (9)
Spow(t)=C1*(Lt−Rt)+C2*Spow(t−1)  (10)
where C1 and C2 are coefficients that dictate the degree of time averaging and the (t−1) parameters are the respective power levels at the previous instant.
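Equations (7) through (10) are first-order recursive averages. The Python sketch below performs one update step. Two points are assumptions rather than statements from the text: abs() is used as the "absolute power" measure of each term, and the default values c1=0.1 and c2=0.9 are placeholders. The function name update_power_levels is likewise hypothetical.

```python
def update_power_levels(prev, Lt, Rt, c1=0.1, c2=0.9):
    """One step of equations (7)-(10).

    prev maps 'L', 'R', 'C', 'S' to the power levels at the previous instant.
    Assumption: abs() stands in for the absolute power measurement, and the
    default c1 and c2 are placeholder averaging coefficients.
    """
    return {
        "L": c1 * abs(Lt) + c2 * prev["L"],       # equation (7)
        "R": c1 * abs(Rt) + c2 * prev["R"],       # equation (8)
        "C": c1 * abs(Lt + Rt) + c2 * prev["C"],  # equation (9)
        "S": c1 * abs(Lt - Rt) + c2 * prev["S"],  # equation (10)
    }


# Feed a few identical center-like samples (Lt == Rt) from a silent start.
levels = {"L": 0.0, "R": 0.0, "C": 0.0, "S": 0.0}
for _ in range(3):
    levels = update_power_levels(levels, 1.0, 1.0)
print(levels)
```

A larger c2 relative to c1 makes the levels respond more slowly to changes in the input, which is the time averaging the text refers to.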
These power levels are then used to calculate L/R and C/S dominance vectors according to:
If Lpow(t)>Rpow(t), Dom L/R=1−Rpow(t)/Lpow(t), else Dom L/R=Lpow(t)/Rpow(t)−1,  (11)
and
If Cpow(t)>Spow(t), Dom C/S=1−Spow(t)/Cpow(t), else Dom C/S=Cpow(t)/Spow(t)−1.  (12)
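Equations (11) and (12) share one form, so a single helper covers both. The Python sketch below assumes non-negative power levels and returns 0 when both levels are zero; that guard against division by zero is an added assumption, not part of the equations. The name dominance is hypothetical.

```python
def dominance(pow_a, pow_b):
    """Dominance of A over B per the form of equations (11) and (12).

    Assumes non-negative power levels. Returns a value between -1 and +1:
    positive when A dominates, negative when B dominates.
    """
    if pow_a > pow_b:
        return 1.0 - pow_b / pow_a
    if pow_b == 0.0:
        # Both levels are zero (silence): treat as no dominance (assumption).
        return 0.0
    return pow_a / pow_b - 1.0


dom_lr = dominance(4.0, 1.0)  # L/R axis, equation (11)
dom_cs = dominance(1.0, 4.0)  # C/S axis, equation (12)
print(dom_lr, dom_cs)         # prints 0.75 -0.75
```

Equal power levels give a dominance of zero on that axis, which corresponds to the null point at the center of the sound field.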
The vector sum of the L/R and C/S dominance vectors defines a dominance vector 12 in the 5-point sound field from which the single dominant signal should emanate. The decoder scales the set of gain coefficients at the null point according to the dominance vectors as follows:
[G]Dom=[G]Null+Dom L/R*[G]R+Dom C/S*[G]C  (13)
where [G] represents the set of gain coefficients G1, G2, . . . G8.
This assumes that the dominant point is located in the R/C quadrant of the 5-point sound field. In general, the appropriate power levels are inserted into the equation based on the quadrant in which the dominant point resides. The [G]Dom coefficients are then used to reconstruct the L,R,C and S channels according to equations 3–6, which are then passed to the amplifiers and on to the speaker configuration.
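Equation (13) is an element-wise combination of three coefficient sets. The Python sketch below implements it as written, leaving the choice of quadrant-specific sets to the caller. The function name scale_gains and all numeric values are hypothetical; the example uses two-element sets for brevity, whereas the sets described above hold eight coefficients.

```python
def scale_gains(g_null, g_lr, g_cs, dom_lr, dom_cs):
    """Equation (13): [G]Dom = [G]Null + Dom L/R * [G]lr + Dom C/S * [G]cs.

    g_lr and g_cs are the coefficient sets for the quadrant containing the
    dominant point (for example [G]R and [G]C for the R/C quadrant).
    """
    if not (len(g_null) == len(g_lr) == len(g_cs)):
        raise ValueError("coefficient sets must have the same length")
    return [
        null + dom_lr * lr + dom_cs * cs
        for null, lr, cs in zip(g_null, g_lr, g_cs)
    ]


# Hypothetical two-element coefficient sets and dominance values.
g_dom = scale_gains([1.0, 0.0], [0.5, 0.5], [0.0, 1.0], 0.5, 0.25)
print(g_dom)  # prints [1.25, 0.5]
```

When both dominance values are zero the result equals the null-point set, so the decoder falls back to its fixed coefficients when no signal dominates.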
When compared to a discrete 5.1 system the drawbacks are clear. The surround-sound presentation includes crosstalk and phase distortion and at best approximates a discrete audio presentation. Signals other than the single dominant signal, which either emanate from different locations or reside in different spectral bands, tend to get washed out by the single dominant signal.
5.1 surround-sound systems such as Dolby AC-3™, Sony SDDS™ and DTS Coherent Acoustics™ maintain the discreteness of the multichannel audio, thus providing a richer and more natural audio presentation. As shown in FIG. 3, the studio 20 provides a 5.1 channel mix. A 5.1 encoder 22 compresses each signal or channel independently, multiplexes them together and packs the audio data into a given 5.1 format, which is recorded on a suitable media 24 such as a DVD. A 5.1 decoder 26 decodes the bitstream a frame at a time by extracting the audio data, demultiplexing it into the 5.1 channels and then decompressing each channel to reproduce the signals (Lr,Rr,Cr,Lsr,Rsr,Sub). These 5.1 discrete channels, which carry the 5.1 discrete audio signals, are directed to the appropriate discrete speakers in speaker configuration 28 (subwoofer not shown).