Removing noise from a video signal before the signal is encoded is an important feature of most modern video encoding architectures, since it can considerably enhance coding efficiency while at the same time improving the objective and subjective quality of the resulting encoded video signal. Digital still or video pictures can contain noise due to the capturing process, the analog-to-digital conversion process, transcoding along the delivery channel, transmission effects, or other causes. Noise produces effects that a user can perceive in the video display, resulting in a visually displeasing picture. It can also have a severe adverse effect in many video applications, particularly video compression. Due to its random nature, noise can considerably decrease spatial and temporal correlation, thus limiting the coding efficiency of such noisy video signals. Furthermore, at low bit rates, the uncorrelated compression artifacts between successive pictures coded with different encoding modules can lead to temporal artifacts in the form of flicker or pulsation between pictures. Thus, it is desirable to remove noise. However, it is equally important not to remove any of the important details of the picture, such as edges or texture.
Several conventional algorithms exist in which removal of noise, or de-noising, is performed using spatial and/or temporal methods. Such noise reduction schemes can be spatial in nature, addressing one frame at a time. Conventional spatial algorithms tend to remove spatially redundant information along with noise. Conventional temporal schemes, apart from removing noise and enhancing details such as edges that may be lost due to spatial filtering, also tend to enhance temporal correlation between adjacent frames. However, these conventional architectures perform this process outside the encoder loop. As a result, they take no account of the artifacts introduced by the encoding process.
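The distinction between spatial and temporal de-noising described above can be illustrated with a minimal sketch. It is not taken from any particular conventional scheme: the function names, the 3x3 averaging window, and the blending weight `alpha` are all assumptions chosen for illustration, and the temporal filter shown uses no motion compensation.

```python
def spatial_mean_filter(frame):
    """Spatial de-noising sketch: 3x3 box average within a single frame.

    `frame` is a list of rows of pixel values (hypothetical representation).
    """
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Average only over the neighbours that fall inside the frame.
            ys = range(max(0, y - 1), min(h, y + 2))
            xs = range(max(0, x - 1), min(w, x + 2))
            window = [frame[j][i] for j in ys for i in xs]
            out[y][x] = sum(window) / len(window)
    return out


def temporal_recursive_filter(frames, alpha=0.5):
    """Temporal de-noising sketch: blend each frame with the previous
    filtered frame at co-located pixels (no motion compensation).

    `alpha` (assumed parameter) is the weight given to the current frame.
    """
    filtered = [[[float(p) for p in row] for row in frames[0]]]
    for frame in frames[1:]:
        prev = filtered[-1]
        cur = [
            [alpha * p + (1 - alpha) * prev[y][x] for x, p in enumerate(row)]
            for y, row in enumerate(frame)
        ]
        filtered.append(cur)
    return filtered
```

Both functions operate purely on the source pictures. Neither has access to the reconstructed (coded) pictures, which is the limitation of out-of-loop filtering noted above.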
Many noise reduction schemes applied as pre-processing prior to compression address coding efficiency and improved subjective quality compared to coding an unfiltered source. In this context, knowledge of the encoding process could lead to further improvements both subjectively and objectively, but to date it has not been considered. Conventional temporal filtering methods may use motion compensation for improved performance. Feedback from the encoder, where it exists, is typically limited to adapting certain parameters of the filtering process, such as those based on the target bit rate, thereby increasing or decreasing the filtering applied to the current picture. These methods still do not include any information about the nature of previously coded pictures.
Conventional schemes can be used to address coding efficiency and subjective quality compared to coding an unfiltered source, but none adequately addresses temporal artifacts that are apparent as defects in the resulting video picture. More specifically, it can be observed that, at very low bit rates, using fixed GOP (Group Of Pictures) structures (i.e., a repetitive sequence of intra-coded (I) pictures followed by a sequence of inter-coded (P and B) pictures) can result in distinct temporal artifacts (i.e., a pumping/beating/pulsation picture effect) at GOP boundaries. These artifacts are a result of the different coding artifacts introduced by the different picture/prediction coding types, and of the lack of temporal correlation at GOP boundaries. These artifacts are apparent in all existing video compression standards, such as MPEG-2 [1] and MPEG-4, but can be even more prominent for standards such as JVT/H.264/MPEG AVC [2], where additional processes are applied for intra and inter coding, including the prediction process and de-blocking. These artifacts can persist even when a conventional spatio-temporal pre-filtering scheme is used, regardless of the resulting increase in temporal correlation between adjacent original filtered pictures.
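To make the notion of a fixed GOP structure and its boundaries concrete, the following sketch assigns picture types in display order and lists the positions at which a new GOP begins. The function names, the GOP length of 12, and the use of two B pictures between reference pictures are assumptions chosen for illustration and are not prescribed by the text.

```python
def fixed_gop_picture_types(num_pictures, gop_size=12, b_between=2):
    """Return picture types ("I", "P", "B") in display order for a fixed GOP.

    `gop_size` and `b_between` are hypothetical parameters: the number of
    pictures per GOP and the number of B pictures between reference pictures.
    """
    types = []
    for n in range(num_pictures):
        pos = n % gop_size
        if pos == 0:
            types.append("I")      # each GOP starts with an intra-coded picture
        elif pos % (b_between + 1) == 0:
            types.append("P")      # forward-predicted reference picture
        else:
            types.append("B")      # bi-directionally predicted picture
    return types


def gop_boundaries(types):
    """Indices where a new GOP starts, i.e. where the coding type switches
    from inter (P/B) back to intra (I)."""
    return [i for i, t in enumerate(types) if t == "I" and i > 0]


types = fixed_gop_picture_types(25)
print("".join(types))          # picture-type pattern in display order
print(gop_boundaries(types))   # positions where pumping/pulsation may appear
```

With a fixed structure, these boundary positions recur at a constant interval, which is why the resulting pulsation can be perceived as periodic.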
Therefore, given conventional solutions, there still exists a need to adequately remove such artifacts from a video picture. As will be seen, the invention addresses this need in an elegant manner.