The basic principle of so-called surround processors is to enhance a two-channel stereophonic source signal so as to drive a multiplicity of loudspeakers arranged to surround the listener, in a manner to provide a high-definition soundfield directly comparable to discrete multitrack sources in perceived performance. An illusion of space may thus be created enabling the listener to experience the fullness, directional quality and aural dimension or "spaciousness" of the original sound environment. The foregoing so-called periphonic reproduction of sound can be distinguished from the operation of conventional soundfield processors which rely on digitally generated time delay of audio signals to simulate reverberation or "ambience" associated with live sound events. These conventional systems do not directionally localize sounds based on information from the original performance space and the resulting reverberation characteristics are noticeably artificial.
Within the home and commercial entertainment field, extensive research and development has been conducted in the area of surround processors and in particular with regard to decoding apparatus for the decoding of audio signals encoded by phase and amplitude matrixing onto two channels, for transmission or recording using stereophonic media. In multichannel decoding apparatus according to the prior art, there are both fixed matrix decoders and variable matrix decoders. Fixed matrix decoders are those in which a plurality of input signals containing encoded information relating to the directions of sound sources are summed in appropriate proportions and phases to yield a plurality of output signals suitable, after amplification, for driving a corresponding plurality of surrounding loudspeakers in a room, the process being describable in terms of a matrix transformation in which the matrix coefficients are fixed and time-invariant. The optimum performance of such decoders occurs when the decoding matrix is the pseudo-inverse of the encoding matrix, and no further improvement in performance is possible unless the coefficients can be varied dynamically.
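By way of illustration only, the pseudo-inverse relationship noted above can be sketched numerically. The following example assumes a hypothetical amplitude-only encoding of four source directions (left, center, right, surround) onto two channels with a coefficient of 0.707; practical encoders generally also employ phase shifts, which are omitted here. The function names and the encoding coefficients are assumptions made for this sketch and are not taken from any of the cited decoders.

```python
import math

def pseudo_inverse_decoder(encode):
    """Return D = E^T (E E^T)^-1 for a real 2 x N encoding matrix E."""
    left, right = encode
    # Entries of the symmetric 2 x 2 matrix E E^T = [[a, b], [b, d]].
    a = sum(x * x for x in left)
    b = sum(x * y for x, y in zip(left, right))
    d = sum(y * y for y in right)
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("rows of the encoding matrix are linearly dependent")
    # Row i of D is [x_i, y_i] multiplied by the inverse of E E^T.
    return [[(x * d - y * b) / det, (y * a - x * b) / det]
            for x, y in zip(left, right)]

def decode(decoder, lt, rt):
    """Apply the fixed N x 2 decoding matrix to one two-channel sample."""
    return [row[0] * lt + row[1] * rt for row in decoder]

G = math.sqrt(0.5)  # assumed -3 dB coefficient for center and surround
# Hypothetical encoder; columns are left, center, right, surround.
ENCODE = [
    [1.0, G, 0.0, G],   # left-total channel
    [0.0, G, 1.0, -G],  # right-total channel
]

DECODER = pseudo_inverse_decoder(ENCODE)
# A unit-amplitude source encoded at the center position only.
outputs = decode(DECODER, lt=G, rt=G)
```

Under these assumptions the center output is 0.5 while the left and right outputs are each about 0.354, so that the center source is reproduced only about 3 dB above its neighbors. This is the kind of separation limit that fixed coefficients cannot overcome.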
Variable matrix decoders also matrix a plurality of encoded input signals to produce a plurality of output signals suitable for driving a multichannel loudspeaker system, but the decoding matrix coefficients do not remain fixed. Instead, they are varied by means of a direction-sensing and control system, which continually monitors the correlations in phase and amplitude ratios between the input signals and adjusts the decoding coefficients to provide the maximum possible enhancement of directional cues for the most prominent sound sources at any instant in time. So-called "logic steering" or dynamic separation enhancement techniques typical of variable matrix decoders are described in Scheiber, U.S. Pat. No. 3,632,886; Bauer, U.S. Pat. No. 3,708,631; Ito and Takahashi, U.S. Pat. No. 3,836,715; Kameoka et al., U.S. Pat. No. 3,864,516; Tsurushima, U.S. Pat. No. 3,883,692; Gravereaux et al., U.S. Pat. No. 3,943,287; Willcocks, U.S. Pat. No. 3,944,735; and Scheiber, U.S. Pat. No. 4,704,728. While the detailed logic steering circuitry and methods used to implement the variation of decoding matrix coefficients in these and numerous other matrix decoders differ, all of the known decoder systems utilize means for determining, from the signals present at their input terminals, the predominant components of the soundfield, and then deriving therefrom a number of control signals, which are in turn used to vary gain parameters of the decoder and thereby modify the decoding coefficients to optimize the directional cues in the reproduction of those sounds.
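The direction sensing described above may be pictured, in greatly simplified form, as a comparison of short-term signal energies. The following sketch is not the circuitry of any cited patent; it assumes that a block of samples from the two input channels is available and derives two hypothetical control values in decibels, one indicating left/right dominance from the amplitude ratio and one indicating front/surround dominance from the ratio of the sum and difference signals. The name `dominance` and the `floor` parameter are assumptions of this example.

```python
import math

def dominance(lt, rt, floor=1e-12):
    """Return (left_right_db, sum_diff_db) for one block of samples.

    left_right_db > 0 suggests a source toward the left channel;
    sum_diff_db > 0 suggests an in-phase (front/center) source, while
    sum_diff_db < 0 suggests an anti-phase (surround) source.
    """
    e_left = sum(x * x for x in lt)
    e_right = sum(y * y for y in rt)
    e_sum = sum((x + y) ** 2 for x, y in zip(lt, rt))
    e_diff = sum((x - y) ** 2 for x, y in zip(lt, rt))
    # The small floor keeps the ratios finite when a channel is silent.
    left_right_db = 10.0 * math.log10((e_left + floor) / (e_right + floor))
    sum_diff_db = 10.0 * math.log10((e_sum + floor) / (e_diff + floor))
    return left_right_db, sum_diff_db

# Identical signals in both channels: an in-phase, centered source.
center_lr, center_sd = dominance([1.0, -1.0, 1.0, -1.0], [1.0, -1.0, 1.0, -1.0])
# Signal in the left channel only.
left_lr, left_sd = dominance([1.0, -1.0], [0.0, 0.0])
```

In a variable matrix decoder, values of this general kind would then be mapped to gain parameters that modify the decoding coefficients; that mapping differs from one decoder to another and is not shown.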
For a well-designed decoder system, the control signals and their sum generally behave so as to provide correct separation, localization and placement of individual predominant sound sources. However, careful attention must also be paid to psychoacoustic performance where the control signals and their corresponding matrix coefficients vary, to ensure a natural perception of sound by the ear-brain combination. Where extreme dynamic conditions cause the control signals to vary quickly in order to follow every variation of predominant directionality, the resulting presentation can suffer from an anomaly known as "pumping" or "breathing", in which it is plainly audible when a channel is turned on or off. Other audible problems known by those skilled in the art to occur include intermodulation distortion, mislocalization or apparent wandering of sound sources, and modulation of noise or rumble associated with the signals.
Some of the prior art decoder systems have attempted to address the foregoing. Willcocks, U.S. Pat. No. 3,944,735 describes an attack and decay time constant processor section wherein each control signal is stored on a capacitor which is discharged at a variable rate depending upon the relative strength of other control signals present. The "attack" time constants refer to the charging time of each of these capacitors and are always short, so as to generate a fast control signal responsive to the new predominant source. The decay time constants refer to the discharge time of these capacitors and allow the control signal associated with the then predominant sound direction to fall slowly, thus providing a smooth, more realistic sound.
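The general behavior of a fast-attack/slow-decay processor can be sketched digitally as a one-pole smoother whose coefficient depends on whether the control signal is rising or falling. The following is a simplified illustration only: it uses a fixed decay time constant, whereas the Willcocks arrangement discharges each capacitor at a variable rate depending on the other control signals present. The function name and the default time constants (2 ms attack, 200 ms decay) are assumptions of this example.

```python
import math

def smooth_control(raw, sample_rate, attack_s=0.002, decay_s=0.2):
    """Smooth a control signal with a short attack and a long decay."""
    attack_coeff = math.exp(-1.0 / (attack_s * sample_rate))
    decay_coeff = math.exp(-1.0 / (decay_s * sample_rate))
    smoothed = []
    y = 0.0  # plays the role of the voltage stored on the capacitor
    for x in raw:
        # Rising input: charge quickly. Falling input: discharge slowly.
        coeff = attack_coeff if x > y else decay_coeff
        y = coeff * y + (1.0 - coeff) * x
        smoothed.append(y)
    return smoothed

# A control signal that is asserted for 100 ms and then released,
# sampled at an assumed control rate of 1000 Hz.
raw = [1.0] * 100 + [0.0] * 100
out = smooth_control(raw, sample_rate=1000)
```

With these values the smoothed signal reaches essentially full level within a few milliseconds of the onset, but has fallen only to about 0.6 of full level 100 ms after the release.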
While the provision of a fast-attack/slow-decay time constant processing circuit has some benefit, a side effect is that the sum of the control coefficient signals can exceed the optimum level, causing more severe level variations and deterioration of the sharpness of localization under some circumstances. Further, as rapid changes in the predominant source occur, the dynamic separation suffers since the signal that was predominant is still decaying and the effective direction sensed by the logic steering circuitry is different from the actual direction of the predominant source. Thus, where a system is slowed sufficiently to be smooth in all circumstances, it will have inferior separation in response to music with well-defined "attacks" from different encoded directions. Attacks in this sense refer to rapid increases of the audio signal amplitude envelope.
Scheiber, U.S. Pat. No. 4,704,728 describes a method for adjustment of both attack and decay time constants in accordance with overall signal levels and with detected attacks in the signal content, employing a slew-rate limiting technique. However, the slow decay time constants are generally too slow, resulting in smooth but nondefinitive performance. Also, as the signal falls the time constants become even slower, which has been found to be undesirable. The only circumstance in which such slowing is appropriate is when the signal-to-noise ratio drops to such a level that control signals are mainly being generated in response to random noise. Further, the attack sensing circuitry and associated method of responding to signal attacks do not permit fast control signal variations to occur in a short enough period of time to avoid audible distortion effects, and are not controlled to the extent required for optimum performance.
Heretofore unrealized improvements in psychoacoustic performance of such decoder systems would therefore include attack and decay time constants which are continuously variable over a wide range, and varied in response to both the strength of the individual control signal and the rate of change of the control signals occurring prior to the generation of these time constants. The effect would be that audio signal attacks are detected and responded to with very brief periods of shortening of time constants, with longer and smoother time constants restored as soon as the attack demand has been met.
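The desired effect described above, namely a brief shortening of the time constant on an attack followed by prompt restoration of a longer one, can be illustrated with a smoother whose time constant is continuously variable. The following sketch is one hypothetical way of obtaining that behavior and is not a description of any particular decoder: it uses the gap between the raw and smoothed control signals as a stand-in for the unmet attack demand, and interpolates the time constant geometrically between an assumed fast value (2 ms) and an assumed slow value (200 ms). All names and parameter values are assumptions of this example.

```python
import math

def adaptive_smooth(raw, sample_rate, tau_slow=0.2, tau_fast=0.002,
                    threshold=0.5):
    """Smooth a control signal with a continuously variable time constant.

    Returns (smoothed, taus), where taus[i] is the time constant in
    seconds that was used to compute smoothed[i].
    """
    smoothed, taus = [], []
    y = 0.0
    for x in raw:
        # Unmet demand: how far the smoothed value lags the raw value.
        gap = abs(x - y)
        blend = min(1.0, gap / threshold)
        # Geometric interpolation: tau_slow at blend 0, tau_fast at blend 1.
        tau = tau_slow * (tau_fast / tau_slow) ** blend
        coeff = math.exp(-1.0 / (tau * sample_rate))
        y = coeff * y + (1.0 - coeff) * x
        smoothed.append(y)
        taus.append(tau)
    return smoothed, taus

# A sudden attack in the control signal, at an assumed 1000 Hz control rate.
step = [1.0] * 300
out, taus = adaptive_smooth(step, sample_rate=1000)
```

In this sketch the time constant is at its shortest only for the first few samples after the attack, and lengthens again as the smoothed signal approaches the new level.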
Improvement of the dynamic separation performance of decoders has also been attempted by split-band processing. Split-band processing allows for improved audio separation and thus improved directional effects since the separation occurs over a smaller audio signal frequency range, as opposed to being averaged over the entire frequency band. The noise and distortion at lower frequencies caused by imperfections in the presentation are also effectively eliminated by band-specific processing techniques. However, known split-band surround processors typically employ a filter network for first receiving input signals in the direct audio path and splitting the signals into high and low-frequency bands, which are then processed by two separate decoders, one for the high and one for the low-frequency band. The provision of multiple decoders and associated circuitry complicates these arrangements and adds significantly to their cost. Further, the placement of filters in the audio path has a tendency to degrade the audio signal because of the added stages and summing techniques.
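The band-splitting step of the known arrangements can be sketched as follows. This example assumes a simple one-pole low-pass filter with a hypothetical 200 Hz crossover frequency, the high band being formed as the complement of the low band so that the two bands sum back to the input; practical crossover networks are typically of higher order. The function name and the crossover frequency are assumptions of this example.

```python
import math

def split_bands(signal, sample_rate, crossover_hz=200.0):
    """Split a signal into complementary low and high bands."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * crossover_hz / sample_rate)
    low, high = [], []
    state = 0.0
    for x in signal:
        state += alpha * (x - state)  # one-pole low-pass filter
        low.append(state)
        high.append(x - state)        # complement of the low band
    return low, high

# A constant (zero-frequency) input at an assumed 8000 Hz sample rate.
signal = [1.0] * 200
low, high = split_bands(signal, sample_rate=8000)
```

In the known split-band processors each channel would be split in this manner and the two bands passed to separate decoders, which is the source of the added circuitry and the additional stages in the audio path noted above.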