Embodiments according to the invention relate to processing audio scenes and, in particular, to an apparatus and a method for changing an audio scene and to an apparatus and a method for generating a directional function.
The production process of audio content consists of three important steps: recording, mixing and mastering. During the recording process, the musicians are recorded and a large number of separate audio files are generated. To generate a format that can be distributed, these audio data are combined into a standard format, such as stereo or 5.1 surround. During the mixing process, a large number of processing devices are involved in order to generate the desired signals, which are played back over a given speaker system. Once the signals of the musicians have been mixed, they can no longer be separated or processed separately. The last step is the mastering of the final audio data format. In this step, the overall impression is adjusted or, when several sources are compiled for a single medium (e.g. a CD), the characteristics of the sources are matched to one another.
In the context of channel-based audio representation, mastering is the process of processing the final audio signals for the different speakers. In comparison, in the previous production step of mixing, a large number of audio signals are processed in order to achieve a speaker-based reproduction or representation, e.g. left and right. In the mastering stage, only these speaker signals, e.g. the two signals left and right, are processed. This process is important in order to adjust the overall balance or frequency distribution of the content.
In the context of an object-based scene representation, the speaker signals are generated on the reproduction side. This means that a master in terms of speaker audio signals does not exist. Nevertheless, the production step of mastering is still required to adapt and optimize the content.
Different audio effect processing schemes exist which extract a feature of an audio signal and modify the processing stage by using this feature. In “Dynamic Panner: An Adaptive Digital Audio Effect for Spatial Audio, Morrell, Martin; Reis, Joshua, presented at the 127th AES Convention, 2009”, a method for automatic panning (acoustically placing a sound in the audio scene) of audio data using the extracted feature is described. The features are extracted from the audio stream. Another specific effect of this type has been published in “Concept, Design, and Implementation of a General Dynamic Parametric Equalizer, Wise, Duane K., JAES Volume 57 Issue 1/2 pp. 16-28; January 2009”. In this case, an equalizer is controlled by features extracted from an audio stream. With regard to object-based scene description, a system and a method have been published in “System and method for transmitting/receiving object-based audio, Patent application US 2007/0101249”. This document discloses a complete content chain for object-based scene description. Dedicated mastering processing is disclosed, for example, in “Multichannel surround sound mastering and reproduction techniques that preserve spatial harmonics in three dimensions, Patent application US2005/0141728”. This patent application describes the adaptation of a number of audio streams to a given loudspeaker layout by setting the amplifications of the loudspeakers and the matrix of the signals.
Generally, flexible processing, in particular of object-based audio content, is desirable for changing audio scenes or for generating, processing or amplifying audio effects.