There is an increasing demand for digital signal processing techniques that perform extreme signal manipulations in order to fit pre-recorded audio signals, e.g. taken from a database, into a new musical context. To do so, high-level semantic signal properties such as pitch, musical key and scale mode need to be adapted. All these manipulations have in common that they aim at substantially altering the musical properties of the original audio material while preserving subjective sound quality as well as possible. In other words, these edits strongly change the musical content of the audio material but should nevertheless preserve the naturalness of the processed audio sample and thus maintain believability. This ideally involves signal processing methods that are broadly applicable to different classes of signals, including polyphonic mixed music content.
Today, many concepts for modifying audio signals are known. Some of these concepts are based on vocoders.
For example, in S. Disch and B. Edler, “An amplitude- and frequency modulation vocoder for audio signal processing,” Proc. of the Int. Conf. on Digital Audio Effects (DAFx), 2008; S. Disch and B. Edler, “Multiband perceptual modulation analysis, processing and synthesis of audio signals,” Proc. of the IEEE-ICASSP, 2009; and S. Disch and B. Edler, “An iterative segmentation algorithm for audio signal spectra depending on estimated local centers of gravity,” 12th International Conference on Digital Audio Effects (DAFx-09), 2009, the concept of the modulation vocoder (MODVOC) has been introduced and its general capability to perform a meaningful selective transposition on polyphonic music content has been pointed out. This makes possible applications that aim at changing the key mode of pre-recorded PCM music samples (see for example S. Disch and B. Edler, “Multiband perceptual modulation analysis, processing and synthesis of audio signals,” Proc. of the IEEE-ICASSP, 2009). A first commercial software product that can handle such a polyphonic manipulation task (Melodyne editor by Celemony) is also available. The software implements a technology that has been branded and marketed under the term direct note access (DNA). A patent application (EP2099024, P. Neubäcker, “Method for acoustic object-oriented analysis and note object-oriented processing of polyphonic sound recordings,” September 2009) has recently been published, presumably covering and thus disclosing the essential functionality of DNA. Independently of the method used for modifying an audio signal, it is desired to obtain an audio signal with high perceptual quality.