The present invention relates to the field of audio signal processing and, in particular, discloses efficient convolution methods for the convolution of input audio signals with impulse response functions or the like.
In International PCT Application No. PCT/AU93/00330 entitled “Digital Filter Having High Accuracy and Efficiency” filed by the present applicant, there is disclosed a process of convolution which has an extremely low latency in addition to allowing for effective long convolution of detailed impulse response functions.
It is known to utilize the convolution of impulse response functions to add “color” to audio signals so that when, for example, played back over headphones, the signals provide for an “out of head” listening experience. Unfortunately, the process of convolution, whilst utilizing advanced algorithmic techniques such as the fast Fourier transform (FFT), often requires excessive computational time. The computational requirements are often increased when multiple channels must be independently convolved, as is often the case when full surround sound capabilities are required. Modern DSP processors are often unable to provide the resources for full convolution of signals, especially where real time restrictions are placed on the latency of the convolution.
Hence, there is a general need to reduce the processing requirements of a full convolution system whilst substantially maintaining the overall quality of the convolution process.
In accordance with a first aspect of the present invention, there is provided a method of processing a series of input audio signals representing a series of virtual audio sound sources placed at predetermined positions around a listener to produce a reduced set of audio output signals for playback over speaker devices placed around a listener, the method comprising the steps of: (a) for each of the input audio signals and for each of the audio output signals: (i) convolving the input audio signals with an initial head portion of a corresponding impulse response mapping substantially the initial sound and early reflections for an impulse response of a corresponding virtual audio source to a corresponding speaker device so as to form a series of initial responses; (b) for each of the input audio signals and for each of the audio output signals: (i) forming a combined mix from the audio input signals; (ii) determining a single convolution tail; and (iii) convolving the combined mix with the single convolution tail to form a combined tail response; (c) for each of the audio output signals: (i) combining a corresponding series of initial responses and a corresponding combined tail response to form the audio output signal.
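By way of illustration only, the head and tail split described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the invention: the names `convolve`, `add_into`, `render`, `heads`, `tails` and `head_len` are hypothetical, direct time-domain convolution is used purely for clarity, and the shared tail is assumed to begin exactly `head_len` samples after the start of each impulse response.

```python
def convolve(x, h):
    """Direct (time-domain) linear convolution of two sequences."""
    if not x or not h:
        return []
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            y[n + k] += xn * hk
    return y


def add_into(acc, y, offset=0):
    """Add y into acc starting at offset, growing acc if needed."""
    need = offset + len(y)
    if len(acc) < need:
        acc.extend([0.0] * (need - len(acc)))
    for n, v in enumerate(y):
        acc[offset + n] += v


def render(inputs, heads, tails, head_len):
    """Hypothetical helper (assumed data layout):
    inputs[i]   samples of virtual source i
    heads[i][o] first head_len taps of the response from source i to output o
    tails[o]    single shared tail for output o (taps after head_len)
    """
    # Step (b)(i): one combined mix of all the inputs.
    mix = []
    for x in inputs:
        add_into(mix, x)
    outputs = []
    for o in range(len(tails)):
        out = []
        # Step (a): one short head convolution per source.
        for i, x in enumerate(inputs):
            add_into(out, convolve(x, heads[i][o]))
        # Steps (b)(iii) and (c): a single tail convolution, delayed by head_len.
        add_into(out, convolve(mix, tails[o]), offset=head_len)
        outputs.append(out)
    return outputs
```

In this sketch the long tail convolution is performed once per output signal rather than once per source and output pair, which is where the reduction in processing would be expected to arise. When every source happens to share the same tail, the result matches a full convolution.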
The single convolution tail can be formed by combining the tails of the corresponding impulse responses. Alternatively, the single convolution tail can be a chosen one of the virtual speaker tail impulse responses. Ideally, the method further comprises the step of preprocessing the impulse response functions by: (a) constructing a set of corresponding impulse response functions; (b) dividing the impulse response functions into a number of segments; (c) for a predetermined number of the segments, reducing the impulse response values at the ends of the segments.
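The two preprocessing ideas above can be sketched as follows. This is an illustrative sketch only: the function names `combined_tail` and `taper_segments` are hypothetical, a plain average is assumed as the way of combining the tails, and a linear fade is assumed as the way of reducing the values at the segment ends. The text above does not prescribe either choice.

```python
def combined_tail(responses, head_len):
    """Average the tails (taps from head_len onward) of several responses.

    Averaging is an assumed way of 'combining' the tails.
    """
    tails = [r[head_len:] for r in responses]
    length = max(len(t) for t in tails)
    out = [0.0] * length
    for t in tails:
        for n, v in enumerate(t):
            out[n] += v / len(tails)
    return out


def taper_segments(h, seg_len, ramp, n_segments=None):
    """Split h into seg_len-sized segments and fade both ends of the first
    n_segments segments (all segments when n_segments is None).

    The linear ramp of `ramp` samples is an assumption for illustration.
    """
    out = list(h)
    total = (len(h) + seg_len - 1) // seg_len
    count = total if n_segments is None else min(n_segments, total)
    for s in range(count):
        start = s * seg_len
        end = min(start + seg_len, len(h))
        r = min(ramp, (end - start) // 2)
        for k in range(r):
            gain = (k + 1) / (r + 1)  # rises toward 1 away from the edge
            out[start + k] *= gain
            out[end - 1 - k] *= gain
    return out
```

For example, `taper_segments([1.0] * 8, seg_len=4, ramp=1)` halves the first and last value of each four-sample segment and leaves the interior values unchanged.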
The input audio signals are preferably translated into the frequency domain and the convolution can be carried out in the frequency domain. The impulse response functions can be simplified in the frequency domain by zeroing higher frequency coefficients and eliminating the multiplication steps in which the zeroed higher frequency coefficients would otherwise be utilized.
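The saving obtained by zeroing higher frequency coefficients can be illustrated with the following sketch. It is not taken from the specification: the names `dft`, `kept_bins`, `zero_high_bins`, `multiply_sparse` and the cutoff parameter `keep` are hypothetical, and a naive discrete Fourier transform is used in place of an FFT to keep the example short.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform (for illustration, not speed)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def kept_bins(n, keep):
    """Indices of bins at or below the cutoff, including their mirror images."""
    return [k for k in range(n) if min(k, n - k) <= keep]


def zero_high_bins(H, keep):
    """Zero every coefficient whose frequency index exceeds `keep`."""
    n = len(H)
    kept = set(kept_bins(n, keep))
    return [H[k] if k in kept else 0j for k in range(n)]


def multiply_sparse(X, H, keep):
    """Bin-wise product that skips the bins known to be zero in H."""
    n = len(X)
    Y = [0j] * n
    for k in kept_bins(n, keep):
        Y[k] = X[k] * H[k]
    return Y
```

With a transform size of 8 and `keep=2`, only 5 of the 8 complex multiplications are performed; the product is identical to the full bin-wise product because the skipped coefficients are zero.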
The convolutions are preferably carried out utilizing a low latency convolution process. The low latency convolution process can include the steps of: transforming first predetermined block sized portions of the input audio signals into corresponding frequency domain input coefficient blocks; transforming second predetermined block sized portions of the impulse response signals into corresponding frequency domain impulse coefficient blocks; combining each of the frequency domain input coefficient blocks with predetermined ones of the corresponding frequency domain impulse coefficient blocks in a predetermined manner to produce combined output blocks; adding together predetermined ones of the combined output blocks to produce frequency domain output responses for each of the audio output signals; transforming the frequency domain output responses into corresponding time domain audio output signals; and outputting the time domain audio output signals.
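The block-based steps listed above can be sketched for a single input and a single output as follows. This is a simplified sketch under stated assumptions and does not reproduce the specific process of PCT/AU93/00330: the input and impulse response are assumed to use the same block size, the block size is assumed to be a power of two, the impulse response is assumed to be non-empty, and overlap-save is assumed as the method of avoiding circular aliasing. The names `fft` and `partitioned_convolve` are hypothetical.

```python
import cmath


def fft(a, inverse=False):
    """Recursive radix-2 FFT; len(a) must be a power of two. Inverse is unscaled."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def partitioned_convolve(x, h, block):
    """Overlap-save convolution of x with h, both cut into block-sized pieces."""
    size = 2 * block
    # Impulse response blocks -> frequency domain impulse coefficient blocks.
    parts = [h[i:i + block] for i in range(0, len(h), block)]
    H = [fft([complex(v) for v in p] + [0j] * (size - len(p))) for p in parts]
    # Pad the input with zeros so the end of the response is flushed out.
    total = len(x) + len(h) - 1
    n_blocks = (total + block - 1) // block
    x = list(x) + [0.0] * (n_blocks * block - len(x))
    history = []           # recent input coefficient blocks, newest first
    prev = [0.0] * block   # previous input block, needed for overlap-save
    y = []
    for b in range(n_blocks):
        cur = x[b * block:(b + 1) * block]
        # Input block -> frequency domain input coefficient block.
        X = fft([complex(v) for v in prev + cur])
        prev = cur
        history.insert(0, X)
        del history[len(H):]
        # Combine each input block with its matching impulse block and add.
        acc = [0j] * size
        for Xj, Hj in zip(history, H):
            for k in range(size):
                acc[k] += Xj[k] * Hj[k]
        # Back to the time domain; the first half is circularly aliased.
        t = fft(acc, inverse=True)
        y.extend(v.real / size for v in t[block:])
    return y[:total]
```

In this sketch each output block becomes available as soon as one input block has been received, so the delay is governed by the block size rather than by the length of the impulse response.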
In accordance with a further aspect of the present invention, there is provided a method of processing a series of input audio signals representing a series of virtual audio sound sources placed at predetermined positions around a listener to produce a reduced set of audio output signals for playback over speaker devices placed around a listener, the method comprising the steps of: (a) forming a series of impulse response functions mapping substantially a corresponding virtual audio source to a corresponding speaker device; (b) dividing the impulse response functions into a number of segments; (c) for a predetermined number of the segments, reducing the impulse response values at the ends of the segment to produce modified impulse responses; (d) for each of the input audio signals and for each of the audio output signals: (i) convolving the input audio signals with portions of a corresponding modified impulse response mapping substantially a corresponding virtual audio source to a corresponding speaker device.
In accordance with a further aspect of the present invention, there is provided a method for providing for the simultaneous convolution of multiple audio signals representing audio signals from different first sound sources, so as to simulate an audio environment for projection from a second series of output sound sources, comprising the steps of: (a) independently filtering each of the multiple audio signals with a first initial portion of an impulse response function substantially mapping the first sound sources when placed in the audio environment; and (b) providing for the combined reverberant tail filtering of the multiple audio signals with a reverberant tail filter formed from subsequent portions of the impulse response functions.
The filtering can occur via convolution in the frequency domain and the audio signals are preferably first transformed into the frequency domain. The series of input audio signals can include a left front channel signal, a right front channel signal, a front centre channel signal, a left back channel signal and a right back channel signal. The audio output signals can comprise left and right headphone output signals.
The present invention can be implemented in a number of different ways. For example, it can be implemented utilizing a skip protection processor unit located inside a CD-ROM player unit; utilizing a dedicated integrated circuit comprising a modified form of a digital to analog converter; utilizing a dedicated or programmable Digital Signal Processor; or utilizing a DSP processor interconnected between an Analog to Digital Converter and a Digital to Analog Converter. Alternatively, the invention can be implemented using a separately detachable external device connected intermediate of a sound output signal generator and a pair of headphones, the sound output signals being output in a digital form for processing by the external device.
Further modifications can include utilizing a variable control to alter the impulse response functions in a predetermined manner.