If sound is emitted in a room, the sound waves travel across the space until they are reflected at the room boundaries. The reflections are reflected again, and over time an increasingly complex pattern of sound waves evolves, the so-called reverberation. FIG. 8 shows a schematic single-channel representation of reverberation, namely an impulse response of a typical room with direct sound 1002, early reflections 1004 and late reverberation 1006. At a receiver position, and as depicted along the abscissa of FIG. 8, the direct sound 1002 is received first by the receiver. The direct sound 1002 travels to the receiver without being reflected. Afterwards, the early reflections 1004 are received. The early reflections 1004 consist of a number of distinct reflections, which over time condense into the late reverberation 1006. The direct sound 1002 and the early reflections 1004 are particularly dependent on the source and receiver positions relative to the room geometry. The reflections in the late reverberation 1006 are characterized by being equally distributed in direction and relatively independent of the source and receiver positions.
However, in spatial reproduction every sound has a direction of arrival (DOA), i.e., the sound arrives from a certain angular direction given by azimuth and elevation. For a better illustration, FIG. 9 shows a schematic spatial representation of reverberation in only two dimensions. The DOA is clearly perceivable for the direct sound 1002 and determines mainly the source localization. The DOA is also important for the early reflections 1004 as it helps to create a sense of room geometry, spatial depth of the source and angular source localization. The late reverberation 1006 is diffuse and no explicit DOA can be perceived.
With increasing time t, depicted along the abscissa, the receiver first perceives the direct sound 1002, then the early reflections 1004, followed by the late reverberation 1006. An angular direction is the azimuth angle of the direction of arrival of the sound wave, the azimuth angle depicted as radial dimension. The distance to the receiver is the time of arrival. The darkness of the points depicts the perceived level of the reflection. Thus, FIG. 9 depicts a spatial representation of reverberation in two dimensions.
In the course of audio postproduction, artificial reverberation is added to the sound to enhance the spatial quality. The desired objectives range from enhancing the musicality and improving the sound design to recreating a physical acoustic space. A realistic acoustic space can be created by the use of multiple loudspeakers, source-dependent early reflections and uncorrelated late reverberation. In this sense, the term multichannel refers to having a high number of audio sources and a high number of output channels.
Practical reverberation algorithms generally fall into one of two categories, although hybrids exist:
1) delay networks, in which the input signal is delayed, filtered and fed back;
2) convolutional, wherein the input signal is simply convolved with a recorded or estimated impulse response of an acoustic space.
Convolutional reverberators reproduce a given acoustic space with high precision, but also with high computational costs, i.e., efforts. Multichannel convolutional reverberators have been devised, but the computational costs scale linearly with the number of source and channel pairs.
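For illustration only, the convolution of an input signal with an impulse response can be sketched as follows. This is a minimal direct-form sketch and not a method taken from the cited references; the function name `convolve` and the toy impulse response `toy_ir` are hypothetical and chosen merely to mimic a direct sound followed by two reflections.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            # Every input sample excites a scaled, shifted copy of the response.
            out[n + k] += x * h
    return out


# Hypothetical toy impulse response: direct sound, then two weaker reflections.
toy_ir = [1.0, 0.0, 0.5, 0.25]
dry = [1.0, 0.0, 0.0, -1.0]
wet = convolve(dry, toy_ir)
```

The nested loop makes the cost visible: each output sample involves as many multiplications as the impulse response has taps, which is why long room responses are expensive to apply.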
For low-channel applications, i.e., mono and stereo, a wide variety of parametric reverberators has been developed. None of these developments, however, has been extended in an efficient manner to a reverberator with a high number of channels. In particular, they lack flexibility in coping with arbitrary source inputs and loudspeaker setups.
Many artificial reverberators have been developed in recent years; a brief overview of their application in multichannel reverberation is given in the following. The vast majority of commercially available reverberators have a low number of input and output channels. Although they have reached a high standard in usability, computational efficiency and sound quality, they scale inefficiently to high numbers of output channels.
One way to achieve a high number of channels using low-channel reverberators is to instantiate multiple similar reverberators. This increases the memory requirements and computational costs considerably. To obtain uncorrelated output channels, the reverberators are parameterized differently, so that they might become individually distinguishable. Distinctly perceivable reverberators can be avoided by cross-feeding signals between the reverberators.
However, the DOA of the early reflections cannot be implemented in this way, as the desired DOA might lie between the output channels of two reverberators. Consequently, there is no explicit way to position multiple sources by means of a combination of multiple reverberators. Further, the handling of multiple instances can become awkward and complicated.
While convolution-based reverberators can reproduce a given physical acoustic space with high precision, as described, for example, in [1], they scale very inefficiently with a high number of sound sources and output channels. Each pair of sound source and output channel is processed by a separate convolution. Consequently, the number of convolutions to be performed is the product of the number of sound sources and the number of output channels. The impulse responses are difficult to acquire, and they lack flexibility in the source and receiver positioning and in other room parameters.
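The scaling described above can be made concrete with a small calculation. The following is a rough sketch assuming direct-form convolution with one impulse response per source and channel pair; the function name `convolution_cost` and the example figures (32 sources, 64 channels, 96000 taps) are hypothetical.

```python
def convolution_cost(num_sources, num_channels, ir_length):
    """Return (number of convolutions, multiply-accumulates per output sample).

    Assumes one direct-form convolution per source/channel pair, each with
    an impulse response of ir_length taps.
    """
    num_convolutions = num_sources * num_channels
    macs_per_sample = num_convolutions * ir_length
    return num_convolutions, macs_per_sample


# Hypothetical setup: 32 sources, 64 loudspeaker channels and an impulse
# response of 96000 taps (2 seconds at a 48 kHz sampling rate).
pairs, macs = convolution_cost(32, 64, 96000)
```

Fast convolution based on the FFT lowers the cost per pair, but the number of pairs, and hence the number of impulse responses to acquire and store, remains the product of sources and channels.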
In contrast, delay-network-based reverberators allow wide control over any detail of the reverberated sound. Also, delay-network reverberators have recently reached a high standard of sound quality in low-channel applications. Currently existing algorithms, however, either do not offer a consistent approach to recreating multichannel audio or do so only inefficiently.
Typically, the reverberation is created in two stages, the early reflections and the late reverberation, as depicted in FIG. 10 and described in [2,3]. The early reflections 1004a and 1004b are delayed (1008a and 1008b) and attenuated (1012a and 1012b) copies of the monaural sources 1014a and 1014b. The delay lines 1008a and 1008b, labeled as Tsi, the outtap gains 1012a and 1012b, labeled as bsi, and the panning 1016 are dependent on the source position and are exclusive to each source. Hence, for every source 1014a and 1014b, the early reflection section 1018 has to be duplicated. To enhance the quality of the early reflections 1004a and 1004b, they are processed by a diffusor unit 1022. The diffusor 1022 is typically implemented as an allpass filter or a short finite impulse response (FIR) filter to emulate the effect of non-specular wall reflections. The particular order and placement of the diffusor 1022 and panning 1016 units can vary, e.g., for accurate panning of every single early reflection 1004a and 1004b, a dedicated panning unit 1016 for each source 1014a and 1014b can be employed, or the diffusor 1022 can be placed directly at the source input of the delay lines 1008a and 1008b. Hence, the particular design is a tradeoff between detailed control and computational efficiency.
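The early reflection section for a single source may be sketched as a tapped delay line with per-tap gains and panning. This is a simplified sketch under stated assumptions: the diffusor 1022 is omitted, the panning is reduced to fixed per-channel gains, and the names `early_reflections`, `delays`, `gains` and `pan_gains` are hypothetical.

```python
def early_reflections(source, delays, gains, pan_gains, num_channels):
    """Render the early reflections of one monaural source.

    delays[i]    : delay of reflection i in samples (role of Tsi)
    gains[i]     : outtap gain of reflection i (role of bsi)
    pan_gains[i] : per-channel panning gains of reflection i (assumed fixed)
    """
    length = len(source) + max(delays)
    out = [[0.0] * length for _ in range(num_channels)]
    for delay, gain, pan in zip(delays, gains, pan_gains):
        for n, x in enumerate(source):
            for ch in range(num_channels):
                # Delayed, attenuated copy of the source, panned to channel ch.
                out[ch][n + delay] += gain * pan[ch] * x
    return out


# Hypothetical example: a unit impulse with two reflections, the first
# panned fully to channel 0 and the second fully to channel 1.
er = early_reflections(
    source=[1.0],
    delays=[2, 5],
    gains=[0.5, 0.25],
    pan_gains=[[1.0, 0.0], [0.0, 1.0]],
    num_channels=2,
)
```

Because the delays, gains and panning all depend on the source position, this function has to be evaluated once per source, which reflects the duplication of the early reflection section noted above.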
The late reverberation is created by the feedback delay network (FDN) 1024. The FDN 1024 is based around a set of N delay lines 1025, labeled as τ1, τ2, . . . , τN, and a feedback mixing matrix A to evolve a complex echo pattern over time. The reverberation time and diffusion are controlled by the attenuation filters 1026, labeled as α1, α2, . . . , αN. The implementation of the attenuation filters ranges from a simple lowpass filter, as described in [4], to absorbent allpass filters, as described in [5].
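A feedback delay network of this kind may be sketched as follows. The sketch makes several simplifying assumptions that are not taken from the cited references: the attenuation filters are replaced by frequency-independent gains, a unit impulse is fed into every delay line, and the output is the plain sum of the attenuated delay line outputs. The function name `fdn_impulse_response` and all parameter values are hypothetical.

```python
import math


def fdn_impulse_response(delays, feedback_matrix, attenuations, num_samples):
    """Impulse response of a simple feedback delay network.

    delays          : delay line lengths in samples (tau_1 .. tau_N)
    feedback_matrix : N x N feedback mixing matrix A
    attenuations    : plain gains standing in for the attenuation filters
    """
    n_lines = len(delays)
    buffers = [[0.0] * d for d in delays]  # one circular buffer per delay line
    pos = [0] * n_lines
    out = []
    for n in range(num_samples):
        x = 1.0 if n == 0 else 0.0  # unit impulse as input
        # Read the attenuated output of every delay line.
        taps = [attenuations[i] * buffers[i][pos[i]] for i in range(n_lines)]
        out.append(sum(taps))
        for i in range(n_lines):
            # Mix all delay line outputs through row i of A and feed them back.
            fb = sum(feedback_matrix[i][j] * taps[j] for j in range(n_lines))
            buffers[i][pos[i]] = x + fb
            pos[i] = (pos[i] + 1) % delays[i]
    return out


# Hypothetical two-line network with an orthogonal mixing matrix.
s = 1.0 / math.sqrt(2.0)
fdn_ir = fdn_impulse_response(
    delays=[2, 3],
    feedback_matrix=[[s, s], [s, -s]],
    attenuations=[0.5, 0.5],
    num_samples=64,
)
```

With an orthogonal mixing matrix and gains below one, the echoes multiply in number while decaying in level, so the gains play the role of the reverberation time control described above.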
The early reflections are fed into the FDN loop to increase the initial density of the delayed reverberation. The delayed reverberation is mixed and added to the panned early reflections. The resulting channels are fed into the loudspeakers 1028 of the reproduction room 1032. Optionally, a channel-dependent equalization filter (EQ) 1034 can be applied to the speaker channels for spectral corrections and to account for the speaker-dependent frequency response.
In the listening position, all output channels in the reproduction room 1032 are delayed and summed to form the receiver signal. Hence, premixing of the delay line signals, as it is typically performed in prior designs, increases the echo density in every output channel but does not increase the echo density perceived in the room. Rather, it tends to introduce unpleasant coherence and comb-filter-like artifacts. One extreme example, which may occur with a Hadamard mixing matrix, is the distribution of the output of a delay line to all output channels, which creates a multichannel mono signal with a phase flip.
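The Hadamard example can be illustrated with a short sketch. It assumes an unnormalized Hadamard matrix of order four built by the Sylvester construction and a single active delay line; the function names `hadamard` and `mix` and the sample value 0.8 are hypothetical.

```python
def hadamard(order):
    """Unnormalized Hadamard matrix via the Sylvester construction.

    The order is assumed to be a power of two.
    """
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def mix(matrix, delay_line_outputs):
    """Map delay line outputs to output channels through a mixing matrix."""
    return [sum(m * s for m, s in zip(row, delay_line_outputs)) for row in matrix]


# Only the second delay line carries a signal at this instant.
channels = mix(hadamard(4), [0.0, 0.8, 0.0, 0.0])
```

Every output channel carries the same sample magnitude and differs only in sign, i.e., the channels are fully coherent copies of one delay line output, some of them phase-flipped.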
Known designs offer no efficient and convenient way to handle multichannel reverberation including spatial cues and direction dependency. Further, the early reflections, which are most important for the spatial perception of the reverberation, are rendered separately in known concepts, which is computationally costly.
Currently, many different multi-speaker configurations exist, meaning that multichannel reverberation with flexible speaker configurations is highly desirable. Hence, for example, there is a need for audio reproduction concepts that allow for multichannel reverberators with a more flexible speaker configuration and/or for an efficient way of obtaining the reverberation.