1. Technical Field
The invention relates to generation of multi channel audio signals by spatial audio decoding and in particular, but not exclusively, to generation of multi channel audio signals from a matrix encoded surround sound stereo signal.
2. Description of Related Art
Digital encoding of various source signals has become increasingly important over the last decades as digital signal representation and communication has increasingly replaced analogue representation and communication. For example, mobile telephone systems, such as the Global System for Mobile communication, are based on digital speech encoding. Also, distribution of media content, such as video and music, is increasingly based on digital content encoding.
Furthermore, in the last decade there has been a trend towards multi channel audio and specifically towards spatial audio extending beyond conventional stereo signals. For example, traditional stereo recordings only comprise two channels whereas modern advanced audio systems typically use five or six channels, as in the popular 5.1 surround sound systems. This provides a more immersive listening experience where the user may be surrounded by sound sources.
Various techniques and standards have been developed for communication of such multi channel signals. For example, six discrete channels representing a 5.1 surround system may be transmitted in accordance with standards such as the Advanced Audio Coding (AAC) or Dolby Digital standards.
However, in order to provide backwards compatibility, it is known to down-mix the higher number of channels to a lower number. Specifically, a 5.1 surround sound signal is frequently down-mixed to a stereo signal, allowing the stereo signal to be reproduced by legacy (stereo) decoders and a 5.1 signal to be reproduced by surround sound decoders.
Such existing methods for backwards-compatible multi-channel transmission without additional multi-channel information can typically be characterized as matrixed-surround methods. Examples of matrix surround sound encoding include methods such as Dolby Pro Logic II and Logic-7. The common principle of these methods is that they matrix-multiply the multiple channels of the input signal by a suitable non-square matrix, thereby generating an output signal with a lower number of channels. Specifically, a matrix encoder typically applies phase shifts to the surround channels prior to mixing them with the front and center channels. The generation of the down-mixed signal (Lt, Rt) may e.g. be given by:
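$$
\begin{bmatrix} Lt \\ Rt \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & q & a \cdot j & b \cdot j \\
0 & 1 & q & -b \cdot j & -a \cdot j
\end{bmatrix}
\begin{bmatrix} Lf \\ Rf \\ C \\ Ls \\ Rs \end{bmatrix}
\tag{1}
$$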
Thus, the left down-mix signal (Lt) is the sum of the left-front signal (Lf), the center signal (C) multiplied by a factor q, the left-surround signal (Ls) phase rotated by 90 degrees (‘j’) and scaled by a factor a, and finally the right-surround signal (Rs), which is also phase rotated by 90 degrees and scaled by a factor b. The right down-mix signal (Rt) is generated similarly, but with the surround contributions scaled by -b and -a respectively. Typical down-mix factors are 0.707 for q and a, and 0.408 for b.
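The down-mix of equation (1) can be illustrated with a minimal Python sketch. It assumes that each channel is represented by a single complex (phasor) sample, so that multiplication by `1j` stands in for the 90 degree phase rotation; a practical encoder would typically realize this rotation with a wide-band phase-shift filter instead. The names `DOWNMIX` and `matrix_downmix` are hypothetical and chosen only for this illustration.

```python
# Typical down-mix factors quoted in the text for equation (1).
Q, A, B = 0.707, 0.707, 0.408

# 2x5 down-mix matrix of equation (1); 1j models the 90 degree phase rotation
# (assumption: signals are complex phasor samples).
DOWNMIX = [
    [1, 0, Q,  A * 1j,  B * 1j],   # row producing Lt
    [0, 1, Q, -B * 1j, -A * 1j],   # row producing Rt
]

def matrix_downmix(lf, rf, c, ls, rs):
    """Return (Lt, Rt) for one complex sample per input channel."""
    x = [lf, rf, c, ls, rs]
    lt = sum(m * v for m, v in zip(DOWNMIX[0], x))
    rt = sum(m * v for m, v in zip(DOWNMIX[1], x))
    return lt, rt

# A center-only input appears in phase in both down-mix channels,
# whereas a left-surround-only input appears in anti-phase.
lt_c, rt_c = matrix_downmix(0, 0, 1, 0, 0)
lt_s, rt_s = matrix_downmix(0, 0, 0, 1, 0)
```

In this sketch the center-only input yields identical Lt and Rt values, while the left-surround-only input yields Lt and Rt values of opposite sign and unequal magnitude (scaled by a and b respectively).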
The rationale for the opposite signs for the right down-mix signal (Rt) is that the surround channels are mixed in anti-phase in the down-mix pair (Lt, Rt). This property helps the decoder to discriminate between front and rear channels from the down-mix signal pair. A decoder can (partially) reconstruct the multi-channel signal from the stereo down-mix by applying a de-matrixing operation. How accurately the re-created multi-channel signal resembles the original multi-channel signal will depend on the specific properties of the multi-channel audio content.
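The benefit of the anti-phase mixing can be illustrated with a small Python sketch that compares the energy of the sum and of the difference of the down-mix pair. This is only an illustration of the principle under the same complex (phasor) assumption as equation (1); it is not the de-matrixing algorithm of any particular decoder, and the function name `sum_difference_energies` is hypothetical.

```python
def sum_difference_energies(lt, rt):
    """Return (|Lt + Rt|^2, |Lt - Rt|^2) for one complex down-mix sample pair."""
    return abs(lt + rt) ** 2, abs(lt - rt) ** 2

# Typical down-mix factors quoted in the text.
Q, A, B = 0.707, 0.707, 0.408

# Center-only input: equation (1) gives Lt = Rt = q*C, i.e. in phase.
center_sum, center_diff = sum_difference_energies(Q, Q)

# Left-surround-only input: equation (1) gives Lt = a*j*Ls and Rt = -b*j*Ls,
# i.e. anti-phase contributions.
surround_sum, surround_diff = sum_difference_energies(A * 1j, -B * 1j)
```

For the in-phase (front) case the difference energy vanishes, whereas for the anti-phase (surround) case the difference energy exceeds the sum energy. A decoder can exploit this kind of cue to steer signal components towards front or rear outputs.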
Although matrixed surround sound systems provide for backwards compatibility, they can only provide low audio quality compared to discrete surround systems/coders, such as AAC or Dolby Digital systems.
A coding/decoding technique known as Spatial Audio Coding (SAC) has been developed to provide improved quality for down-mixed audio signals. In SAC, the encoder down-mixes channels to a lower number and in addition generates parametric data which describes characteristics of the multi-channel signals relative to the down-mixed signals. The additional parametric data is then included in the bit stream together with the down-mix signal, which typically is a mono or stereo audio signal. Thus, legacy decoders can ignore the additional parametric data and re-generate a mono or stereo signal (or possibly a matrix decoded surround sound signal of low quality). Furthermore, SAC decoders can extract the parametric data and use this to generate a multi-channel signal of higher quality.
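The division of labour between encoder, legacy decoder and SAC decoder can be sketched as follows. For brevity the sketch assumes a stereo input down-mixed to mono, with a single level-difference parameter in dB per frame as the only parametric data; practical SAC systems operate on more channels, per frequency band, and with further parameters. All function names are hypothetical.

```python
import math

def sac_encode(left, right):
    """Down-mix one stereo frame to mono and derive a level-difference parameter (dB)."""
    mono = [(l + r) / 2 for l, r in zip(left, right)]
    e_left = sum(l * l for l in left) + 1e-12    # small offset avoids log(0)
    e_right = sum(r * r for r in right) + 1e-12
    level_diff_db = 10 * math.log10(e_left / e_right)
    return mono, level_diff_db

def legacy_decode(mono, _params=None):
    """A legacy decoder ignores the parametric data and reproduces the down-mix."""
    return mono

def sac_decode(mono, level_diff_db):
    """Redistribute the down-mix so the output energy ratio matches the parameter."""
    ratio = 10 ** (level_diff_db / 10)           # E_left / E_right
    g_left = math.sqrt(2 * ratio / (1 + ratio))
    g_right = math.sqrt(2 / (1 + ratio))
    return [g_left * m for m in mono], [g_right * m for m in mono]

# Example frame in which the left channel carries four times the energy of the right.
mono, param = sac_encode([1.0, -1.0, 0.5], [0.5, -0.5, 0.25])
out_left, out_right = sac_decode(mono, param)
```

The legacy path returns the down-mix unchanged, while the SAC path restores the original left/right energy ratio from the transmitted parameter. The waveform detail that was lost in the down-mix is not recovered; only the spatial characteristic described by the parameter is.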
However, a problem with this approach is that many systems are not equipped for SAC encoded signals. For example, many systems only utilize matrix surround sound encoding that does not generate SAC parametric data. Furthermore, many signal and decoder standards do not provide the flexibility to allow additional parametric data to be included thus requiring a complete switch to a new standard before SAC can be deployed. This may require that all existing encoders and decoders in the system are replaced by SAC enabled encoders and decoders. Specifically, there are many two-channel stereo-based legacy systems (such as radio, digital radio, etc.) where the effort to add the additional information necessary for SAC is unfeasibly large, i.e. the cost to extend such systems to use SAC is too high. Furthermore, there are already large amounts of matrix-encoded audio material available and this would need re-encoding by a SAC encoder before the benefits of SAC decoding can be achieved.
Hence, an improved system for processing and/or communicating multi channel audio signals would be advantageous and in particular functionality allowing increased flexibility, increased audio quality, increased applicability of SAC principles and/or improved performance would be advantageous.