Perceptual coding of audio signals for data reduction, enabling efficient storage or transmission of these signals, is a widely used practice. In particular when the lowest bit rates are to be achieved, the coding employed reduces audio quality, often primarily because the encoder limits the bandwidth of the audio signal to be transmitted. Contemporary codecs include well-known methods for decoder-side signal restoration through audio signal Bandwidth Extension (BWE), e.g. Spectral Band Replication (SBR).
In low bit rate coding, so-called noise-filling is often employed as well. Prominent spectral regions that have been quantized to zero due to strict bit rate constraints are filled with synthetic noise in the decoder.
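The noise-filling step described above can be illustrated with a minimal sketch. The function name `noise_fill`, the uniform noise distribution, and the single `noise_level` parameter are assumptions made purely for illustration; actual codecs signal and shape the noise level in codec-specific ways.

```python
import random

def noise_fill(quantized, noise_level, rng):
    """Return a copy of the spectrum in which every coefficient that was
    quantized to zero is replaced by low-level synthetic noise.
    Non-zero (waveform-coded) coefficients are left untouched."""
    return [
        c if c != 0 else noise_level * rng.uniform(-1.0, 1.0)
        for c in quantized
    ]

# Hypothetical decoded spectrum: zeros mark lines lost to coarse quantization.
rng = random.Random(0)
quantized = [12, 0, 0, -7, 0, 3, 0, 0]
filled = noise_fill(quantized, noise_level=0.5, rng=rng)
```

In this sketch the waveform-coded lines keep their decoded values, while each former zero receives a random value whose magnitude is bounded by the assumed noise level.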
Usually, both techniques are combined in low bit rate coding applications. Moreover, integrated solutions such as Intelligent Gap Filling (IGF) exist that combine audio coding, noise-filling and spectral gap filling.
However, all these methods have in common that, in a first step, the baseband or core audio signal is reconstructed using waveform decoding and noise-filling, and, in a second step, the BWE or IGF processing is performed on the already reconstructed signal. As a result, the same noise values that were filled into the baseband by noise-filling during reconstruction are reused for regenerating the missing parts in the highband (in BWE) or for filling the remaining spectral gaps (in IGF). Using highly correlated noise for reconstructing multiple spectral regions in BWE or IGF may lead to perceptual impairments.
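The correlation that arises from this order of operations can be made concrete with a small numerical sketch. It assumes a simple copy-up of the noise-filled baseband into the highband; the independently generated noise is shown only as a reference for comparison and is not taken from any of the cited methods. The helper `pearson` and all variable names are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

rng = random.Random(1)
n_bins = 256

# Noise values written into the baseband by noise-filling (step one).
baseband_noise = [rng.gauss(0.0, 1.0) for _ in range(n_bins)]

# Step two as described above: the highband is regenerated by copying
# the already noise-filled baseband, so it carries the same noise values.
highband_copied = list(baseband_noise)

# Reference only: noise drawn independently of the baseband.
highband_independent = [rng.gauss(0.0, 1.0) for _ in range(n_bins)]

corr_copied = pearson(baseband_noise, highband_copied)
corr_independent = pearson(baseband_noise, highband_independent)
```

Under these assumptions the copied highband is perfectly correlated with the baseband noise (a coefficient of 1 up to rounding), whereas the independently drawn reference yields a coefficient close to zero.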
Relevant topics in the state of the art comprise:
    SBR as a post processor to waveform decoding [1-3]
    AAC PNS [4]
    MPEG-D USAC noise-filling [5]
    G.719 and G.722.1C [6]
    MPEG-H 3D IGF [8]
The following papers and patent applications describe methods that are considered to be relevant for the application:
    [1] M. Dietz, L. Liljeryd, K. Kjörling and O. Kunz, “Spectral Band Replication, a novel approach in audio coding,” in 112th AES Convention, Munich, Germany, 2002.
    [2] S. Meltzer, R. Böhm and F. Henn, “SBR enhanced audio codecs for digital broadcasting such as ‘Digital Radio Mondiale’ (DRM),” in 112th AES Convention, Munich, Germany, 2002.
    [3] T. Ziegler, A. Ehret, P. Ekstrand and M. Lutzky, “Enhancing mp3 with SBR: Features and Capabilities of the new mp3PRO Algorithm,” in 112th AES Convention, Munich, Germany, 2002.
    [4] J. Herre and D. Schulz, “Extending the MPEG-4 AAC Codec by Perceptual Noise Substitution,” in 104th AES Convention, Preprint 4720, Amsterdam, Netherlands, 1998.
    [5] European Patent application EP2304720, USAC noise-filling.
    [6] ITU-T Recommendations G.719 and G.722.1C.
    [7] EP 2704142.
    [8] EP 13177350.
Audio signals processed with these methods suffer from artifacts such as roughness, modulation distortions and a timbre perceived as unpleasant, in particular at low bit rates and, consequently, low bandwidth and/or when spectral holes occur in the low-frequency (LF) range. The reason for this is, as will be explained below, primarily that the reconstructed components of the extended or gap-filled spectrum are based on one or more direct copies containing noise from the baseband. The temporal modulations resulting from this unwanted correlation in the reconstructed noise are audible in a disturbing manner as perceptual roughness or objectionable distortion. All existing methods such as mp3+SBR, AAC+SBR, USAC, G.719 and G.722.1C, and also MPEG-H 3D IGF, first perform a complete core decoding including noise-filling before filling spectral gaps or the highband with copied or mirrored spectral data from the core.