Embodiments according to the invention are related to an upmixer for upmixing a downmix audio signal into an upmixed audio signal describing one or more upmixed audio channels. Some embodiments according to the invention are related to a method and to a computer program for upmixing a downmix audio signal.
Some embodiments according to the invention are related to improved phase processing for parametric multi-channel audio coding.
In the following, a short overview will be given and the context of the invention will be described. Recent developments in the area of parametric audio coding deliver techniques for jointly coding a multi-channel audio signal (e.g. 5.1) into one (or more) downmix channels plus a side information stream. These techniques are known, for example, as Binaural Cue Coding, Parametric Stereo, and MPEG Surround.
A number of publications describe the so-called “Binaural Cue Coding” parametric multi-channel coding approach, for example references [1], [2], [3], [4] and [5].
“Parametric Stereo” is a related technique for the parametric coding of a two-channel stereo signal based on a transmitted mono signal plus parameter side information. For details, reference is made to references [6] and [7].
“MPEG Surround” is an ISO (International Organization for Standardization) standard for parametric multi-channel coding. For details, reference is made to reference [8].
These techniques are based on transmitting the perceptual cues relevant to human spatial hearing in a compact form to the receiver, together with the associated mono or stereo downmix signal. Typical cues can be inter-channel level differences (ILD), inter-channel correlation or coherence (ICC), as well as inter-channel time differences (ITD) and inter-channel phase differences (IPD).
These parameters are transmitted in a frequency and time resolution adapted to the resolution of the human auditory system.
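As an illustration only, the cues named above can be estimated for a single time/frequency tile from the complex subband samples of two channels. The following minimal sketch assumes one common set of definitions (ILD as a power ratio in dB, ICC as the magnitude of the normalized cross-spectrum, IPD as the angle of the cross-spectrum); the function name `inter_channel_cues` and the small constant `eps` are hypothetical and are not taken from any of the cited techniques.

```python
import cmath
import math


def inter_channel_cues(left, right):
    """Estimate ILD (dB), ICC and IPD (radians) for one time/frequency tile.

    `left` and `right` are equal-length sequences of complex subband samples.
    The definitions used here are assumptions chosen for illustration.
    """
    eps = 1e-12  # guards against division by zero and log10(0)
    power_l = sum(abs(x) ** 2 for x in left)
    power_r = sum(abs(x) ** 2 for x in right)
    # Cross-spectrum between the two channels, summed over the tile.
    cross = sum(l * r.conjugate() for l, r in zip(left, right))

    ild_db = 10.0 * math.log10((power_l + eps) / (power_r + eps))
    icc = abs(cross) / math.sqrt((power_l + eps) * (power_r + eps))
    ipd = cmath.phase(cross)
    return ild_db, icc, ipd


# Example: the right channel is the left channel at half amplitude,
# delayed in phase by 90 degrees.
left = [1 + 0j, 0 + 1j, -1 + 0j]
right = [0.5 * x * cmath.exp(-1j * math.pi / 2) for x in left]
ild_db, icc, ipd = inter_channel_cues(left, right)
```

In this example the two channels are fully coherent, so the estimated ICC is close to 1, the ILD is about 6 dB, and the IPD is about π/2.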
To recreate the properties of the original signal, the decoder may produce one or more decorrelated versions of the transmitted downmix signal. Additionally, a phase rotation of the output signals may be performed in the decoder to restore the original inter-channel phase relation.
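The decoder-side processing described above can be sketched for one time/frequency tile as follows. This is a simplified illustration under stated assumptions, not the processing of any particular standard or of the invention: the decorrelated signal is assumed to be supplied by the caller with the same power as the downmix and no correlation to it, the phase difference is split symmetrically between the two outputs, and the function name `upmix_tile` and its parameters are hypothetical.

```python
import cmath
import math


def upmix_tile(downmix, decorrelated, ild_db, icc, ipd):
    """Derive two output channels from a mono downmix for one tile.

    Assumptions (illustrative only): `decorrelated` has the same power as
    `downmix` and is uncorrelated with it; 0 <= icc <= 1.
    """
    # Gains that realize the level difference while keeping the summed
    # power of the two outputs at twice the downmix power.
    ratio = 10.0 ** (ild_db / 10.0)  # power ratio left/right
    gain_l = math.sqrt(2.0 * ratio / (1.0 + ratio))
    gain_r = math.sqrt(2.0 / (1.0 + ratio))

    # Mixing angle: adding the decorrelated signal with opposite signs
    # yields a normalized correlation of cos(2 * angle) = icc.
    angle = 0.5 * math.acos(icc)
    dry, wet = math.cos(angle), math.sin(angle)

    # Phase rotation, split symmetrically so the outputs differ by ipd.
    rot_l = cmath.exp(1j * ipd / 2.0)
    rot_r = cmath.exp(-1j * ipd / 2.0)

    left = [gain_l * rot_l * (dry * s + wet * d)
            for s, d in zip(downmix, decorrelated)]
    right = [gain_r * rot_r * (dry * s - wet * d)
             for s, d in zip(downmix, decorrelated)]
    return left, right


# Example with a toy downmix and an orthogonal, equal-power stand-in
# for the decorrelator output.
s = [1.0, 1.0, 1.0, 1.0]
d = [1.0, -1.0, 1.0, -1.0]
left, right = upmix_tile(s, d, ild_db=6.0, icc=0.5, ipd=math.pi / 3)
```

Under the stated assumptions, re-estimating the level difference, coherence and phase difference from `left` and `right` returns the values passed in, which is the sense in which the decoder restores the original inter-channel relation.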
Example Binaural Cue Coding System of FIG. 4
In the following, a generic binaural cue coding scheme will be described with reference to FIG. 4. FIG. 4 shows a block schematic diagram of a binaural cue coding transmission system 400, which comprises a binaural cue coding encoder 410 and a binaural cue coding decoder 420. The binaural cue coding encoder 410 may, for example, receive a plurality of audio input signals 412a, 412b, and 412c. Further, the binaural cue coding encoder 410 is configured to downmix the audio input signals 412a-412c using a downmixer 414 to obtain a downmix signal 416, which may, for example, be a sum signal. Further, the binaural cue coding encoder 410 may be configured to analyze the audio input signals 412a-412c using an analyzer 418 to obtain a side information signal 419. The sum signal 416 and the side information signal 419 are transmitted from the binaural cue coding encoder 410 to the binaural cue coding decoder 420. The binaural cue coding decoder 420 may be configured to synthesize a multi-channel audio output signal comprising, for example, audio channels y1, y2, . . . , yN on the basis of the sum signal 416 and inter-channel cues 424. For this purpose, the binaural cue coding decoder 420 may comprise a binaural cue coding synthesizer 422, which receives the sum signal 416 and the inter-channel cues 424 and provides the audio signals y1, y2, . . . , yN. The binaural cue coding decoder 420 further comprises a side information processor 426, which is configured to receive the side information signal 419 and, optionally, a user input 427. The side information processor 426 is configured to provide the inter-channel cues 424 on the basis of the side information signal 419 and the optional user input 427.
To summarize, the audio input signals are analyzed and downmixed in the BCC encoder 410. The sum signal plus the side information is transmitted to the BCC decoder 420. The inter-channel cues are generated from the side information and local user input. The binaural cue coding synthesis generates the multi-channel audio output signal.
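The signal flow summarized above can be sketched in a heavily simplified form. The sketch below is an assumption-laden illustration and not the scheme of the cited article: it operates on real-valued time-domain samples in a single band, uses only per-channel level cues as side information, and models the user input as optional per-channel gain offsets in dB. The names `bcc_encode`, `process_side_info` and `bcc_synthesize` are hypothetical.

```python
import math


def bcc_encode(channels):
    """Downmix to a sum signal and analyze per-channel level cues (dB)."""
    eps = 1e-12  # guards against division by zero and log10(0)
    # Downmixer: sample-wise sum of all input channels.
    sum_signal = [sum(samples) for samples in zip(*channels)]
    sum_power = sum(s * s for s in sum_signal) + eps
    # Analyzer: power of each channel relative to the sum signal.
    side_info = [
        10.0 * math.log10((sum(x * x for x in ch) + eps) / sum_power)
        for ch in channels
    ]
    return sum_signal, side_info


def process_side_info(side_info, user_gains_db=None):
    """Turn side information and optional user input into linear gains."""
    if user_gains_db is None:
        user_gains_db = [0.0] * len(side_info)
    return [
        10.0 ** ((level_db + user_db) / 20.0)
        for level_db, user_db in zip(side_info, user_gains_db)
    ]


def bcc_synthesize(sum_signal, cues):
    """Generate one output channel per cue by scaling the sum signal."""
    return [[gain * s for s in sum_signal] for gain in cues]


# Example: two input channels, no user input.
channels = [[1.0, 0.0, -1.0, 0.0], [0.0, 2.0, 0.0, -2.0]]
sum_signal, side_info = bcc_encode(channels)
cues = process_side_info(side_info)
outputs = bcc_synthesize(sum_signal, cues)
```

In this level-only sketch, each output channel has the same power as the corresponding input channel, while the waveforms themselves are scaled copies of the sum signal. Restoring time, phase and coherence relations would require further cues and processing that the sketch omits.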
For details, reference is made to the article “Binaural Cue Coding Part II: Schemes and applications,” by C. Faller and F. Baumgarte (published in: IEEE Transactions on Speech and Audio Processing, vol. 11, no. 6, November 2003).
Discussion of the Conventional Approaches
In the above-described approaches, it is difficult to appropriately control the inter-channel relation.
Accordingly, it is desirable to create a concept for upmixing a downmix signal, which provides a good accuracy with respect to an inter-channel correlation.