Sound sources or sound objects have spatial attributes that include their perceived position, and a perceived size or width. In general, the perceived width of an object is closely related to the mathematical concept of inter-aural correlation or coherence of the two signals arriving at our eardrums. Decorrelation is generally used to make an audio signal sound more spatially diffuse. The modification or manipulation of the correlation of audio signals is therefore commonly found in audio processing, coding, and rendering applications. Manipulation of the correlation or coherence of audio signals is typically performed by using one or more decorrelator circuits, which take an input signal and produce one or more output signals. Depending on the topology of the decorrelator, the output is decorrelated from its input, or outputs are mutually decorrelated from each other. The correlation measure of two signals can be determined by calculating the cross-correlation function of the two signals. In general, the correlation measure is the value of the peak of the cross-correlation function (often referred to as coherence) or the value at lag (relative delay) zero (the correlation coefficient). Decorrelation is defined as having a normalized cross-correlation coefficient or coherence smaller than +1 when computed over a certain time interval of duration T:
\[
\rho = \frac{\int_0^T x(t)\,y(t)\,dt}{\sqrt{\int_0^T x^2(t)\,dt \int_0^T y^2(t)\,dt}}
\]

\[
\Phi = \max_{\tau} \frac{\int_0^T x(t+\tau/2)\,y(t-\tau/2)\,dt}{\sqrt{\int_0^T x^2(t+\tau/2)\,dt \int_0^T y^2(t-\tau/2)\,dt}}
\]
In the above equations, x(t) and y(t) are the signals that are intended to have a low mutual correlation, ρ is the normalized cross-correlation coefficient, and Φ is the coherence. The coherence value is the maximum of the normalized cross-correlation function across relative delays τ.
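As an illustration only, the two measures can be approximated for sampled signals by replacing the integrals with sums. The following is a minimal sketch and not part of the described subject matter; the function names, the integer-lag search range `max_lag`, the 1000-sample noise signal, and the 5-sample delay are all assumptions chosen for the example. The symmetric shift of τ/2 in the coherence definition is approximated here by a single relative delay applied over the overlapping portion of the two signals.

```python
import math
import random

def correlation_coefficient(x, y):
    """Discrete analogue of rho: normalized cross-correlation at lag zero."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den if den > 0 else 0.0

def coherence(x, y, max_lag):
    """Discrete analogue of Phi: maximum normalized cross-correlation
    over integer relative delays in [-max_lag, max_lag]."""
    n = len(x)
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        # Keep only the overlapping region, pairing x[i + lag] with y[i].
        xs = x[max(0, lag): n + min(0, lag)]
        ys = y[max(0, -lag): n - max(0, lag)]
        best = max(best, correlation_coefficient(xs, ys))
    return best

# Hypothetical test signals: white noise and a copy delayed by 5 samples.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(1000)]
y = [0.0] * 5 + x[:-5]

rho = correlation_coefficient(x, y)   # close to zero for a delayed noise signal
phi = coherence(x, y, max_lag=10)     # close to one, found at the 5-sample delay
```

In this example a pure delay yields a low correlation coefficient but a coherence near one, which shows why the two measures are distinguished.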
In spatial audio processing, signal decorrelation can have a significant impact on the perception of sound imagery, and the correlation measure is a significant predictor of perceptual effects in audio reproduction. FIG. 1 illustrates two configurations of a simple decorrelator, as known in the prior art. The upper circuit 100 decorrelates the output signal y(t) from the input signal x(t), while the lower circuit 101 produces two mutually decorrelated outputs y(t) and x(t), which may or may not be decorrelated from the common input. A wide variety of decorrelation processes have been proposed for use in current systems, including simple delays, frequency-dependent delays, random-phase all-pass filters, lattice all-pass filters, and combinations thereof. These processes all significantly modify their input signals, such as by changing their waveforms. For stationary or smoothly continuous signals, such modification is generally not problematic. However, for impulsive or fast-changing signals (transients), such modification may result in unwanted distortion. For example, at the onset of a transient signal, modifying the waveform by decorrelation can cause temporal smearing or similar effects. Likewise, upon cessation of the transient signal, the inherent decay times of the filters and associated circuitry may produce post-echo or reverberation-like effects, which are audible when the input signal has a steep decrease in level over time. Thus, the filtering process involved in decorrelation often results in a degraded transient response, or loss of transient ‘crispness’.
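The temporal smearing described above can be illustrated with a single Schroeder-type all-pass section, used here only as a simple stand-in for the all-pass decorrelation filters mentioned. The following is a minimal sketch under stated assumptions: the function name, the 48-sample delay (1 ms at an assumed 48 kHz sample rate), and the gain of 0.5 are hypothetical values, not parameters taken from any particular decorrelator.

```python
def schroeder_allpass(x, delay, g):
    """All-pass section with H(z) = (-g + z^-delay) / (1 - g * z^-delay).

    Implemented as y[n] = -g*x[n] + x[n-delay] + g*y[n-delay].
    """
    y = [0.0] * len(x)
    for n in range(len(x)):
        x_d = x[n - delay] if n >= delay else 0.0
        y_d = y[n - delay] if n >= delay else 0.0
        y[n] = -g * x[n] + x_d + g * y_d
    return y

# A unit impulse models an idealized transient (assumed test input).
impulse = [1.0] + [0.0] * 4799
h = schroeder_allpass(impulse, delay=48, g=0.5)

total_energy = sum(v * v for v in h)      # all-pass: energy is preserved
tail_energy = total_energy - h[0] ** 2    # energy arriving after the onset
```

Although the filter leaves the magnitude spectrum unchanged, with these assumed values about three quarters of the impulse energy arrives after the original onset, spread over a decaying series of echoes. This is the kind of post-echo that degrades a transient.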
To overcome such undesirable effects, decorrelation circuits often have a level adjustment stage following the filter structures to attenuate these artifacts, or other similar post-decorrelation processing. Thus, present decorrelation circuits are limited in that they attempt to correct temporal smearing and other degradation effects after the decorrelation filters, rather than performing an appropriate amount of decorrelation based on the characteristics and components of the input signal itself. Such systems, therefore, do not adequately solve the issues associated with impulse or transient signal processing. Specific drawbacks associated with present decorrelation circuits include degraded transient response, susceptibility to downmix artifacts, and a limitation on the number of mutually-decorrelated outputs.
With respect to the issue of degraded transient response, the aim of current decorrelators is to decorrelate the complete input signal, irrespective of its contents or structure. In actual recordings, however, transient signals (e.g., the onset of percussive instruments) are usually not decorrelated, whereas their sustaining part, or the reverberant part present in a recording, is often decorrelated. Prior-art decorrelation circuits are generally not capable of reproducing this distinction, and hence their output can sound unnatural or may have a degraded transient response as a result.
With respect to the issue of downmix artifacts, the outputs of decorrelators are often not suitable for downmixing because part of the decorrelation process involves delaying the input. Summing a signal with a delayed version thereof results in undesirable comb-filter artifacts due to the repetitive occurrence of peaks and notches in the summed frequency spectrum. As downmixing is a process that occurs frequently in audio coders, AV receivers, amplifiers, and the like, this property is problematic in many applications that rely on decorrelation circuits.
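The comb-filter effect follows directly from the sum of a signal and its delayed copy, whose gain at frequency f is |1 + exp(-j2πfD)| for a delay D. The following is a minimal sketch of that relationship; the function name and the 10 ms delay are assumptions chosen for illustration.

```python
import cmath
import math

def downmix_gain(freq_hz, delay_s):
    """Gain at freq_hz when a signal is summed with a copy delayed by delay_s."""
    return abs(1 + cmath.exp(-2j * math.pi * freq_hz * delay_s))

delay_s = 0.010  # hypothetical 10 ms decorrelator delay

# Notches fall at odd multiples of 1 / (2 * delay); peaks at multiples of 1 / delay.
notch_freqs = [(2 * k + 1) / (2 * delay_s) for k in range(3)]
peak_freqs = [k / delay_s for k in range(3)]

notch_gains = [downmix_gain(f, delay_s) for f in notch_freqs]
peak_gains = [downmix_gain(f, delay_s) for f in peak_freqs]
```

For the assumed 10 ms delay, the notches are spaced 100 Hz apart starting near 50 Hz, so a downmix cancels some frequencies almost completely and doubles the amplitude of others.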
With respect to the issue of the limited number of mutually decorrelated outputs, in order to prevent audible echoes and undesirable temporal smearing artifacts, the total delay applied in a decorrelator is often fairly small, such as on the order of 10 to 30 ms. This limits the number of mutually independent outputs that can be produced, if more than one is required. In practice, delays can be used to construct only two or three outputs that are significantly decorrelated from one another and do not suffer from the aforementioned downmix artifacts.
The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also be inventions.