The invention relates to a method of processing a plural channel audio signal including left and right channels, the information in the channels representing a three dimensional sound-field for generation by respective left and right loudspeakers arranged at a given distance from the preferred position of a listener in use.
The processing of audio signals to reproduce a three dimensional sound-field on replay to a listener having two ears has been a goal for inventors for many years. One approach has been to use many sound reproduction channels to surround the listener with a multiplicity of sound sources such as loudspeakers. Another approach has been to use a dummy head having microphones positioned in the auditory canals of artificial ears to make sound recordings for headphone listening. An especially promising approach to the binaural synthesis of such a sound-field has been described in EP-B-0689756, which describes the synthesis of a sound-field using a pair of loudspeakers and only two signal channels, the sound-field nevertheless having directional information allowing a listener to perceive sound sources appearing to lie anywhere on a sphere surrounding the head of a listener placed at the centre of the sphere.
The goal of researchers developing and studying the synthesis of 3D sound-fields from conventional two speaker systems has been to provide for complete and effective transaural crosstalk cancellation.
The fundamental Head Response Transfer Function (HRTF) characteristics which are required to implement a transaural acoustic crosstalk cancellation scheme are the left- and right-ear transfer functions associated with the azimuth angle at which the loudspeakers are situated (FIG. 1). For most applications, this is commonly accepted to be ±30°. The near-ear function is sometimes referred to as the “same” side function (or “S” function), and the far-ear function as the “alternate” (or “A”) function. These A and S characteristics form the basis of all transaural acoustic crosstalk cancellation schemes (FIG. 2). Transaural acoustic crosstalk cancellation is described in more detail in WO 95/15069, which is incorporated herein by reference, and from which application the present application is a continuation-in-part. The A and S functions are combined to form filter blocks of the form:
\[ \frac{1}{1 - C^{2}} \tag{1} \]
(where C = −A/S), and:
\[ \frac{1}{S} \tag{2} \]
These terms are often compounded together and simplify to form:
\[ \frac{S}{S^{2} - A^{2}} \tag{3} \]
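The simplification can be checked by substituting C = −A/S into expression (1) and multiplying by expression (2). The intermediate steps below are a worked restatement using only the symbols already defined:

\[
\frac{1}{1 - C^{2}} \cdot \frac{1}{S}
  = \frac{1}{1 - \dfrac{A^{2}}{S^{2}}} \cdot \frac{1}{S}
  = \frac{S^{2}}{S^{2} - A^{2}} \cdot \frac{1}{S}
  = \frac{S}{S^{2} - A^{2}}
\]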
It is not possible to obtain reliable measurements of HRTF data (A and S) at low frequencies for several reasons, including the following.
1. Poor LF Response of Measurement Actuator (Loudspeaker)
In practice, it is known to make measurements from an artificial head in order to derive a library of HRTF data. It is common practice to make these measurements at distances of 1 meter or thereabouts, for several reasons. Firstly, the sound source used for such measurements is, ideally, a point source, and usually a loudspeaker is used. However, there is a physical limit on the minimum size of loudspeaker diaphragms. Typically, a diameter of several inches is as small as is practical whilst retaining the power capability and low-distortion properties which are needed. Hence, in order for these loudspeaker signals to be representative of a point source, the loudspeaker must be spaced at a distance of around 1 meter from the artificial head. (Sound effects for PC games and the like are often required to possess apparent distances of several meters or greater; because there is little difference between HRTFs measured at 1 meter and those measured at much greater distances, the 1 meter measurement is used.) However, loudspeakers of this size and configuration possess very poor LF performance, and their LF response begins to fail at frequencies of around 200 Hz and below.
2. Poor LF Response of Measurement Sensor (Microphone in Artificial Head)
3. DC Offsets in Instrumentation
It is not uncommon to find spurious DC level offsets of 5-10 mV in digital tape recorders and other instruments used in HRTF measurements. (A DC offset corresponds directly to a gain error at 0 Hz.)
4. Wind Pressure Artefacts
In an anechoic measurement chamber, external wind pressure can cause significant pressure fluctuations within the chamber, giving rise to substantially large data offsets. Consequently, it is convenient to filter off the LF components of the HRTF signals prior to recording them, thus making the mid and high frequency information reliable and reproducible, but at the expense of loss of LF data.
5. Standing Waves
Even in an anechoic chamber, residual reflected energy can combine to cause standing waves, and these are most apparent at long wavelengths; hence the procedures used for (4) above are doubly useful.
6. Impulse Measurement Method
HRTFs are measured by means of impulse responses, and this measurement does not provide LF data, because there is insufficient energy in the transient impulse below around 200 Hz. Even when a “stretch” pulse method is used, this is still the case.
7. Time Domain Windowing
When measuring HRTFs, it is essential to “window” the measured impulses in the time domain to a period of several milliseconds in order to avoid incorporating reflected waves into the measurement (even in an anechoic chamber), and this cuts off the spectrum of the resultant data, again, below around 200 Hz.
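The link between window length and the lowest usable frequency can be illustrated with simple arithmetic. The sketch below assumes the common rule of thumb that a window of duration T cannot resolve spectral detail finer than about 1/T. The function name and the example window lengths are illustrative assumptions, not values taken from the measurements described here.

```python
def lowest_resolvable_frequency_hz(window_ms: float) -> float:
    """Approximate frequency resolution (Hz) of a time window.

    Rule of thumb (assumption): a window lasting T seconds cannot resolve
    spectral detail finer than about 1/T Hz.
    """
    if window_ms <= 0:
        raise ValueError("window length must be positive")
    return 1000.0 / window_ms


if __name__ == "__main__":
    for window_ms in (2.0, 5.0, 10.0):  # hypothetical window lengths
        limit_hz = lowest_resolvable_frequency_hz(window_ms)
        print(f"{window_ms:4.1f} ms window -> about {limit_hz:.0f} Hz")
```

Under this rule of thumb, a window of about 5 ms corresponds to a limit of about 200 Hz, which is consistent with the cut-off mentioned above.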
As a consequence, HRTFs measured by the prior art methods do not contain LF information, although, of course, the LF response is present in reality. The results of a typical HRTF measurement are shown in FIG. 3, depicting the A and S functions at 30° azimuth, measured from a commercial artificial head. The uncertainty in the non-valid data, below several hundred Hz, is apparent. Accordingly, the missing LF properties must be replaced in order to create valid HRTFs, and this is conveniently done by extrapolating the amplitude data at the lowest valid frequency (200 Hz) back to 0 Hz (or in practice, back to the lowest practical frequency, say 10 Hz). Because the LF amplitude data do not contain a great deal of “detail” (unlike the HF characteristics), it might be supposed that back-extrapolation would be simple, but it is not entirely straightforward. This is because the HRTF curves are not flat at the lowest valid frequency, but still curving, and the near- and far-ear characteristics exhibit slightly differently shaped curves. Consequently, one must make an intelligent estimate of the y-axis intercept, and extrapolate both curves accordingly, as is shown in FIG. 4. Any LF errors can create significant quality problems, as low-frequency artefacts, often termed “phase errors”, are very noticeable in high quality audio applications. For this reason any LF errors in the processing must be avoided, and so in practice both near- and far-ear characteristics of the HRTF are extrapolated to the same value at low frequencies.
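The back-extrapolation described above can be sketched in a few lines. This is a minimal illustration, not the procedure used to produce FIG. 4: the function name, the sample data, and the choice of a straight line on a log-frequency axis are all assumptions. It replaces every amplitude value below the lowest valid frequency with a value interpolated between the first valid measurement and a chosen intercept.

```python
import math


def extrapolate_lf(freqs_hz, amp_db, f_valid_hz, f_low_hz, intercept_db):
    """Replace amplitude values below f_valid_hz with an extrapolated curve.

    Below f_valid_hz the curve is a straight line on a log-frequency axis
    between the chosen intercept at f_low_hz and the first valid measured
    point. The straight-line rule is an assumption made for illustration.
    freqs_hz must be ascending and f_low_hz must lie below f_valid_hz.
    """
    # Index of the first point at or above the validity limit.
    first_valid = next(i for i, f in enumerate(freqs_hz) if f >= f_valid_hz)
    anchor_f = freqs_hz[first_valid]
    anchor_db = amp_db[first_valid]
    span = math.log(anchor_f / f_low_hz)

    result = list(amp_db)
    for i in range(first_valid):
        f = max(freqs_hz[i], f_low_hz)  # clamp anything below f_low_hz
        t = math.log(f / f_low_hz) / span  # 0 at f_low_hz, 1 at the anchor
        result[i] = intercept_db + t * (anchor_db - intercept_db)
    return result


if __name__ == "__main__":
    freqs = [10.0, 50.0, 100.0, 200.0, 400.0]  # hypothetical frequency grid
    near_db = [9.0, -3.0, 4.0, 1.5, 2.0]       # hypothetical S data, invalid below 200 Hz
    far_db = [-12.0, 5.0, -7.0, -1.0, -2.5]    # hypothetical A data, invalid below 200 Hz
    # Prior-art practice: both curves are taken to the same LF value.
    print(extrapolate_lf(freqs, near_db, 200.0, 10.0, 0.0))
    print(extrapolate_lf(freqs, far_db, 200.0, 10.0, 0.0))
```

Passing the same intercept for both curves reproduces the prior-art choice of extrapolating the near- and far-ear characteristics to a common low-frequency value; the measured data at and above the lowest valid frequency are left untouched.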
Prior art transaural crosstalk cancellation methods have always used A and S functions which tend to the same value at low frequencies (see for example, Atal and Schroeder, U.S. Pat. No. 3,236,949). Using such functions, the anticipated crosstalk signal at the far ear is equal to the primary signal at the near ear at low frequencies, hence the ratio of crosstalk signal to primary signal is always 1:1 at low frequencies.
According to a first aspect of the invention there is provided a method of processing a plural channel audio signal including left and right channels, the information in the channels representing a three dimensional sound-field for generation by respective left and right loudspeakers arranged at a distance from the preferred position of a listener in use, the method including:
a) choosing a distance between said loudspeakers and said preferred position;
b) determining from the magnitude of this chosen distance an optimal amount of transaural acoustic crosstalk compensation, said optimal amount being a function of the chosen distance; and
c) applying said optimal amount of crosstalk compensation to said left and right channels.
Preferably, the method further includes choosing an angle between the left channel loudspeaker and the right channel loudspeaker as viewed from said preferred position, and determining from both said chosen angle and said chosen distance an optimal amount of transaural acoustic crosstalk compensation, said optimal amount being a function of both the chosen angle and the chosen distance.
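One way to picture steps a) to c) is with a deliberately simple free-field model. The sketch below is not the method of the invention and makes several loudly stated assumptions: the ears are treated as two points with no head between them, amplitude falls off as 1/r, the compensation is a single frequency-independent gain, and the ear spacing constant and all function names are hypothetical. It only shows how a compensation amount could depend on both the chosen distance and the chosen angle.

```python
import math

EAR_OFFSET_M = 0.075  # hypothetical half-spacing of the ears, in metres (assumption)


def lf_crosstalk_gain(distance_m, speaker_angle_deg, ear_offset_m=EAR_OFFSET_M):
    """Far-ear/near-ear amplitude ratio at low frequencies.

    Assumes point ears in free space (no head shadow) and 1/r attenuation,
    so the ratio is the near-path length divided by the far-path length.
    """
    azimuth = math.radians(speaker_angle_deg / 2.0)  # each speaker sits at half the angle
    cross = 2.0 * distance_m * ear_offset_m * math.sin(azimuth)
    base = distance_m ** 2 + ear_offset_m ** 2
    near = math.sqrt(base - cross)
    far = math.sqrt(base + cross)
    return near / far


def compensate(left, right, gain):
    """Apply frequency-independent crosstalk compensation to one sample pair.

    Inverts the 2x2 mixing [[1, gain], [gain, 1]] that the loudspeaker-to-ear
    paths impose under the assumptions above; requires gain != 1.
    """
    norm = 1.0 - gain * gain
    if norm == 0.0:
        raise ValueError("a gain of 1 makes the compensation singular")
    return (left - gain * right) / norm, (right - gain * left) / norm


if __name__ == "__main__":
    for distance_m in (0.3, 1.0, 3.0):  # hypothetical listening distances
        g = lf_crosstalk_gain(distance_m, 60.0)
        out_l, out_r = compensate(1.0, 0.0, g)
        print(f"{distance_m:.1f} m: gain {g:.3f}, outputs ({out_l:.3f}, {out_r:.3f})")
```

Under these assumptions the far-ear to near-ear ratio moves towards 1 as the loudspeakers are placed further away, and is noticeably smaller at short distances, so the amount of compensation applied changes with the chosen distance and angle.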
According to a second aspect of the invention there is provided transaural acoustic crosstalk filter means constructed and arranged for performing said method. According to a third aspect of the invention there is provided an audio signal produced by said method. A further aspect of the invention provides apparatus according to claims 8 and 9.