Natural reverberation, often abbreviated to reverb, is the gradual decay of sound resulting from reflections off surfaces in a confined room. The sound emanating from its source strikes wall surfaces and is reflected off them at various angles. Some of these reflections are perceived immediately, while others continue to be reflected off other surfaces before being perceived. Hard and massive surfaces reflect the sound with moderate attenuation, while softer surfaces absorb much of the sound, especially the high-frequency components. The combination of room size, complexity, angle of the walls, nature of the surfaces and room contents defines the room's sound characteristics and thus the reverb.
Since reverb is a linear, time-invariant effect, it can be recreated by applying a room impulse response to an audio signal, either during recording or during playback. The room impulse response can be understood as the room's response, in the form of reverberation, to an instantaneous, all-frequency sound burst; it typically looks like decaying noise. If a digitized room impulse response is available, digital signal processing allows an exact room characteristic to be added to any digitized “dry” sound. It is also possible to place an audio signal into different spaces simply by using different room impulse responses.
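Applying a room impulse response to a dry signal amounts to a discrete convolution. The following is a minimal sketch in Python with NumPy, not a definitive implementation. The function names `apply_room_impulse_response` and `synthetic_rir`, the sample rate and the decay time are assumptions made for illustration. The decaying-noise response is only a placeholder for a measured room impulse response.

```python
import numpy as np


def apply_room_impulse_response(dry, rir):
    """Convolve a dry mono signal with a room impulse response.

    Returns the full convolution, of length len(dry) + len(rir) - 1,
    so that the reverberant tail after the end of the dry signal is kept.
    """
    dry = np.asarray(dry, dtype=np.float64)
    rir = np.asarray(rir, dtype=np.float64)
    return np.convolve(dry, rir)


def synthetic_rir(sample_rate=8000, rt60=0.3, seed=0):
    """Hypothetical placeholder for a measured response: decaying noise.

    The envelope falls by 60 dB over rt60 seconds (an assumed decay time).
    """
    rng = np.random.default_rng(seed)
    n = int(sample_rate * rt60)
    t = np.arange(n) / sample_rate
    envelope = 10.0 ** (-3.0 * t / rt60)  # -60 dB at t == rt60
    return rng.standard_normal(n) * envelope


# A unit impulse as the "dry" input reproduces the impulse response itself.
rir = synthetic_rir()
dry = np.zeros(1000)
dry[0] = 1.0
wet = apply_room_impulse_response(dry, rir)
```

Placing the same dry signal into a different space then only requires passing a different `rir` array. For long responses, direct convolution becomes costly and an FFT-based method is usually preferred.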
The transmission and use of real, i.e. measured, room impulse responses for the reproduction of sound signals with a given room characteristic has been the object of research and development in recent years. When using MPEG-4 as defined in the MPEG-4 Audio and Systems standard ISO/IEC 14496, the transmission of long impulse responses has turned out to be difficult due to the following problems:
1. Room impulse responses can be loaded into an MPEG-4 player as MPEG-4 ‘sample dumps’, a technique that requires a full Structured Audio (SA, the MPEG-4 audio programming language) implementation including MIDI with the appropriate MIDI and SA profiles. This solution places extremely high demands on code size, complexity and processing power. It is therefore currently impracticable for MPEG-4 players, and may not even be available in future devices.
2. Using synthetic room impulse responses by means of the ‘DirectiveSound’ node, which is defined especially for Virtual Reality applications, has the disadvantage that such parametric synthetic room impulse responses differ significantly from real measured room impulse responses and sound far less natural.
3. Adding a new node specifically designed for the transmission and use of real room impulse responses is undesirable, both because the possible but suboptimal solutions 1 and 2 above already exist and because the introduction of new nodes is to be avoided whenever possible.
4. Applying the same coding to the transmission of room impulse responses as to the audio signals themselves is not reasonable. Typical MPEG audio encoding schemes take advantage of psychoacoustic phenomena, which are especially suited to reducing the audio data rate by suppressing imperceptible parts of the audio signal. However, since room impulse responses relate not to the human ear but to the room's characteristics, applying psychoacoustics to room impulse responses would falsify them.