Three-dimensional (3D) audio is based on binaural technology and the study of head-related transfer functions (HRTFs). The impulse response from a sound source 140 in 3D space to one of the ears 120, 130 of a listener 110 is called the head-related impulse response (HRIR), as illustrated in FIG. 1. The corresponding HRTF may be determined as the Fourier transform of the HRIR.
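The relationship between HRIR and HRTF can be sketched with numpy; this is a minimal illustration, and the HRIR here is a hypothetical placeholder rather than a measured response:

```python
import numpy as np

# Hypothetical HRIR: 256 samples at 44.1 kHz (an assumed length
# comfortably covering a response on the order of 1 ms).
fs = 44100
n = np.arange(256)
hrir = np.random.randn(256) * np.exp(-n / 32.0)

# The HRTF is the Fourier transform of the HRIR; rfft returns the
# non-negative-frequency half of the spectrum for a real signal.
hrtf = np.fft.rfft(hrir)

# Frequency in Hz corresponding to each HRTF bin.
freqs = np.fft.rfftfreq(len(hrir), d=1.0 / fs)
```

For a 256-sample real HRIR, the HRTF has 129 complex bins spanning 0 Hz to the Nyquist frequency (22 050 Hz here).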
The transfer function corresponding to the impulse response from the source 140 to the near ear 120 (the ear on the same side of the head as the source) is called the ipsi-lateral HRTF. The transfer function corresponding to the impulse response from the source 140 to the far ear 130 (the ear on the opposite side of the head from the source) is called the contra-lateral HRTF.
Furthermore, the sound that arrives at the far ear 130 is slightly delayed relative to the sound at the near ear 120, and this delay is referred to as the Interaural Time Difference (ITD). In practice, the duration of an HRIR may be of the order of 1 ms, and the ITD may be smaller than 1 ms.
A single virtual source is conventionally implemented by using two digital filters and a delay, as shown in FIG. 2a, where the overall gains used to emulate distance attenuation are omitted for convenience.
The filters Hi 210 and Hc 220 correspond to ipsi-lateral and contra-lateral HRTFs, respectively, and the ITD 225 is inserted in the contra-lateral path (which goes to the listener's 110 left ear 130 when the source 140 is on the right as in the example shown in FIG. 1). The filters Hi 210 and Hc 220 are commonly implemented by FIR filters.
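The structure of FIG. 2a can be sketched as follows; the function name, signal lengths, and filter coefficients are illustrative assumptions, not part of the described system:

```python
import numpy as np

def render_virtual_source(x, h_ipsi, h_contra, itd_samples):
    """Render mono signal x as a single virtual source using two FIR
    filters (ipsi- and contra-lateral HRIRs) and an ITD delay inserted
    in the contra-lateral path, as in FIG. 2a."""
    near = np.convolve(x, h_ipsi)    # Hi: ipsi-lateral path
    far = np.convolve(x, h_contra)   # Hc: contra-lateral path
    # Delay the contra-lateral signal by the ITD (in samples).
    far = np.concatenate([np.zeros(itd_samples), far])
    # Pad the near-ear signal so both channels have equal length.
    near = np.concatenate([near, np.zeros(itd_samples)])
    return near, far
```

Feeding a unit impulse through trivial one-tap "HRIRs" shows the ITD appearing only in the contra-lateral channel.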
A modest performance improvement may be achieved by replacing Hc 220 by a low-order IIR filter IATF (Interaural Transfer Function) 240 that processes the output from Hi 210, as depicted in FIG. 2b.
The two filter structures represent alternative implementations of the same algorithm when the cascade of Hi 210 and IATF 240 is approximately equal to Hc 220. In practice, the IATF may, for example, be chosen as a first-order lowpass filter with good results.
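The FIG. 2b variant can be sketched with a first-order IIR lowpass standing in for the IATF; the coefficient `a` and the function names are assumptions made for illustration:

```python
import numpy as np

def iatf_lowpass(x, a=0.9):
    """Hypothetical first-order IIR lowpass used as the IATF:
    y[n] = (1 - a) * x[n] + a * y[n-1]."""
    y = np.zeros(len(x))
    state = 0.0
    for i, xn in enumerate(x):
        state = (1.0 - a) * xn + a * state
        y[i] = state
    return y

def render_fig2b(x, h_ipsi, itd_samples, a=0.9):
    """FIG. 2b structure: the contra-lateral signal is obtained by
    filtering the ipsi-lateral output with the IATF, then applying
    the ITD delay, so only one FIR filter (Hi) is needed."""
    near = np.convolve(x, h_ipsi)
    far = iatf_lowpass(near, a)
    far = np.concatenate([np.zeros(itd_samples), far])
    near = np.concatenate([near, np.zeros(itd_samples)])
    return near, far
```

Because the IATF here is a one-pole lowpass with unity DC gain, a constant input converges to the same constant at its output, consistent with the low frequencies being largely shared between the two ears.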
When more virtual sources are needed, more copies of the structure in FIG. 2a or FIG. 2b are required. It is not possible to combine the processing blocks Hi 210, Hc 220, IATF 240, and ITD 225 because they are specific to the position of each virtual source.
An alternative method based on the principal component analysis (PCA) technique can be used for the implementation of virtual sound sources. This approach differs from the filtering method shown in FIG. 2b (or FIG. 2a) in that it uses a set of filters having unvarying frequency or impulse response characteristics and a set of gains varying with sound source location. These filters and gains are derived through PCA, a statistical analysis technique that allows the extraction of common trends in the data (note that Singular Value Decomposition (SVD) and the Karhunen-Loève expansion are variants of this technique). In practice, an HRIR or HRTF dataset is arranged in a two-way array in which each column represents the response at one ear for a given sound source position, the position being determined by a single parameter that does not enable, e.g., a distinction between elevation and azimuth angles. A PCA is then applied to this matrix.
The outcome of this statistical decomposition is a set of N orthogonal basis functions, representing the desired unvarying filters, and N corresponding sets of gains, each set comprising a gain value for each of the sound source positions represented in the original HRTF dataset. An approximation of any of the original HRIR or HRTF filters can therefore be reconstructed as a linear combination of the basis functions, multiplying each basis function by the gain value associated with the respective sound source position.