Among image signal processing technologies, removing noise included in an image is essential for reproducing a captured image more clearly.
In relation to a technology of removing noise in a captured image, PTLs 1 to 3 and NPLs 1 and 2 disclose the following technologies.
PTL 1 discloses a method of improving image quality of an image through wavelet transformation and inverse wavelet transformation.
PTL 2 discloses a method of creating a tentative high-resolution image (base image) by enlarging an input image to an output size.
PTL 3 discloses a method of wavelet-transforming an original image, restoring damage of the image by interpolation with respect to a low-frequency component, performing an inverse wavelet transformation by use of the restored low-frequency component and a high-frequency component, and reconstructing the image to obtain a final restored image.
NPL 1 discloses a technique of denoising processing (referring to noise removal processing; hereinafter the same) based on component separation of an image.
NPL 2 discloses a technique of denoising processing based on wavelet shrinkage.
For the purpose of facilitating understanding of the present invention, outlines of the technologies disclosed in NPLs 1 and 2 will be briefly described below.
First, the technology in NPL 1 will be described.
FIG. 15 is a conceptual diagram for illustrating the technology in NPL 1.
First, a structure-texture decomposition (STD) unit 1001 separates an input original image signal fin into a structure component u composed of an edge component and a flat component of the image, and a texture component v composed of noise and a fine pattern. A total variation minimization (TV) method in NPL 3, a bilateral filter in NPL 4, or the like may be used for the separation.
Next, a texture component (TC) shrinkage unit 1002 applies processing of suppressing a noise component in the texture component v to generate a noise-suppressed texture component v′. While various methods may be applied to the noise suppression processing, soft-decision threshold processing expressed as equation (1) below is effective.
$$v' = \operatorname{Sign}(v) \times \max(|v| - \tau,\ 0) \tag{1}$$
In equation (1), Sign(v) is a function returning the sign of v, and τ denotes the amount of noise attenuation. Alternatively, hard-decision threshold processing expressed as equation (2) below may be applied.
$$v' = \begin{cases} 0 & |v| \le \tau \\ v & |v| > \tau \end{cases} \tag{2}$$
Finally, a combining unit 1003 combines the structure component u with the noise-suppressed texture component v′ to generate an output image signal.
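The three stages of FIG. 15 can be sketched as follows. This is a minimal illustration rather than the implementation of NPL 1: it assumes an additive model in which the texture component is the residual v = f − u, and the names `soft_threshold`, `hard_threshold`, `denoise_by_separation`, and the `decompose` callable are hypothetical.

```python
import numpy as np

def soft_threshold(v, tau):
    """Equation (1): reduce each magnitude by tau and clip at zero."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def hard_threshold(v, tau):
    """Equation (2): keep v where |v| > tau, output 0 elsewhere."""
    return np.where(np.abs(v) > tau, v, 0.0)

def denoise_by_separation(f_in, decompose, tau, shrink=soft_threshold):
    """Decompose, shrink the texture component, then recombine.

    `decompose` is a hypothetical callable that returns the structure
    component u for an input image. The texture component is assumed
    to be the residual f_in - u (an assumption of this sketch).
    """
    f_in = np.asarray(f_in, dtype=float)
    u = decompose(f_in)          # role of the STD unit 1001
    v = f_in - u                 # assumed additive model
    v_shrunk = shrink(v, tau)    # role of the TC shrinkage unit 1002
    return u + v_shrunk          # role of the combining unit 1003
```

Soft thresholding shrinks every surviving coefficient by τ, whereas hard thresholding leaves surviving coefficients unchanged, so the two differ only for |v| > τ.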
The TV method, which is one of the techniques for separating an input image signal into a structure component u and a texture component v, will now be described. The structure component u can be obtained by adding a regularization term to the total variation norm TV(u) expressed as equation (3) below, and minimizing the resulting expression, equation (4) below.
$$TV(u) = \int_{\Omega} |\nabla u| \, dx \tag{3}$$
$$TV(u) + \frac{\mu}{2} \int_{\Omega} (u - u_0)^2 \, dx \tag{4}$$
In equation (4), u0 denotes an original image signal f, and μ is a parameter indicating fidelity to the original image signal.
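For a sampled image, the quantities in equations (3) and (4) can be evaluated numerically. The sketch below is one possible discretization, assuming forward differences with a zero difference at the last row and column; the function names `total_variation` and `tv_objective` are hypothetical.

```python
import numpy as np

def total_variation(u):
    """Discrete stand-in for equation (3) using forward differences."""
    u = np.asarray(u, dtype=float)
    # Appending the last column/row makes the final difference zero.
    dx = np.diff(u, axis=1, append=u[:, -1:])
    dy = np.diff(u, axis=0, append=u[-1:, :])
    return float(np.sum(np.sqrt(dx ** 2 + dy ** 2)))

def tv_objective(u, u0, mu):
    """Discrete stand-in for equation (4): TV(u) + (mu / 2) * sum((u - u0)^2)."""
    u = np.asarray(u, dtype=float)
    u0 = np.asarray(u0, dtype=float)
    return total_variation(u) + 0.5 * mu * float(np.sum((u - u0) ** 2))
```

A larger μ penalizes deviation from u0 more strongly, so the minimizer stays closer to the original image; a smaller μ favors a smoother structure component.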
Methods for solving equation (4) include a digital TV filter (DTVF) in NPL 5. It is assumed that a pixel value of an image u at a pixel position α=(i,j) is denoted as uα, and that a set of neighborhood pixel positions of α is denoted as N(α). When eight neighbors are assumed as the neighborhood, N(α) = {(i, j±1), (i±1, j), (i+1, j±1), (i−1, j±1)} holds. In the DTVF, filtering processing based on a local variation is used for solving equation (4). Assuming that an input image signal is denoted as u(0), and an output image signal after n iterations of filtering is denoted as u(n), a filter output uα(n) at the pixel position α is expressed as equation (5) below.
$$u_\alpha^{(n)} = \sum_{\beta \in N(\alpha)} h_{\alpha\beta}\, u_\beta^{(n-1)} + h_{\alpha\alpha}\, u_\alpha^{(0)} \tag{5}$$
In equation (5), hαβ and hαα are filter coefficients (referring to coefficients in filtering processing; hereinafter the same), and are expressed as equations (6), (7), and (8) below.
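$$h_{\alpha\beta} = \frac{w_{\alpha\beta}(u)}{\mu + \sum_{\gamma \in N(\alpha)} w_{\alpha\gamma}(u)} \tag{6}$$
$$h_{\alpha\alpha} = \frac{\mu}{\mu + \sum_{\gamma \in N(\alpha)} w_{\alpha\gamma}(u)} \tag{7}$$
$$w_{\alpha\gamma}(u) = \frac{1}{|\nabla_\alpha u|} + \frac{1}{|\nabla_\gamma u|} \tag{8}$$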
In equation (8), |∇αu| and |∇γu| denote local variations defined by equation (9) below.
                                                                  ∇              α                        ⁢            u                                    =                                            ∑                              β                ∈                                  N                  ⁡                                      (                    α                    )                                                                        ⁢                                                  ⁢                                          (                                                      u                    β                                    -                                      u                    α                                                  )                            2                                                          (        9        )            
From the equations, when a local variation |∇βu| at a pixel β adjacent to α is sufficiently larger than a noise component, hαα becomes approximately 1 (hαα≈1), and therefore blurring of an edge can be prevented. Conversely, when the local variation |∇βu| is small, the DTVF assumes the region to be flat, hαα becomes approximately 0 (hαα≈0), and the DTVF therefore behaves like an ordinary low-pass filter. The parameter μ may be determined as μ = 1/σ² using an estimated noise standard deviation σ. In practice, in order to prevent division by zero in equation (8), the regularized local variation |∇αu|ε = √(|∇αu|² + ε²) is used in place of the local variation |∇αu|.
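The iteration of equations (5) to (9) can be sketched as follows. This is an illustrative reading, not the reference implementation of NPL 5. It assumes that the additive term in the denominators of equations (6) and (7) is the fidelity parameter μ, that the local variation is the root of the summed squared neighbor differences, that the weights are recomputed from the previous iterate, and that pixels outside the image are handled by edge replication. The names `shifted`, `local_variation`, and `dtvf`, and the defaults for `n_iter` and `eps`, are hypothetical.

```python
import numpy as np

# Offsets of the eight neighbors N(alpha) of a pixel alpha = (i, j).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def shifted(a, di, dj):
    """Return b with b[i, j] = a[i + di, j + dj], replicating edge pixels."""
    h, w = a.shape
    p = np.pad(a, 1, mode="edge")
    return p[1 + di:1 + di + h, 1 + dj:1 + dj + w]

def local_variation(u, eps):
    """Regularized local variation sqrt(|grad_alpha u|^2 + eps^2)."""
    sq = sum((shifted(u, di, dj) - u) ** 2 for di, dj in OFFSETS)
    return np.sqrt(sq + eps ** 2)

def dtvf(u0, sigma, n_iter=10, eps=1e-3):
    """Apply n_iter passes of the filter in equation (5)."""
    u0 = np.asarray(u0, dtype=float)
    mu = 1.0 / sigma ** 2                      # mu = 1 / sigma^2
    u = u0.copy()
    for _ in range(n_iter):
        inv_g = 1.0 / local_variation(u, eps)
        # Equation (8): w = 1/|grad_alpha u| + 1/|grad_gamma u| per neighbor.
        weights = [inv_g + shifted(inv_g, di, dj) for di, dj in OFFSETS]
        denom = mu + sum(weights)              # shared denominator of (6), (7)
        acc = sum(w * shifted(u, di, dj) for w, (di, dj) in zip(weights, OFFSETS))
        # Equation (5): neighbors from the previous iterate, center from u0.
        u = (acc + mu * u0) / denom
    return u
```

Because the coefficients are non-negative and sum to one at every pixel, each output value is a convex combination of input values, so a constant image passes through unchanged.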
Next, the technology in NPL 2 (wavelet shrinkage) will be described.
FIG. 16 is a conceptual diagram for illustrating the technology in NPL 2.
First, wavelet transformation (WT) units 2001, 2002, and 2003 separate an input image signal into a plurality of frequency components by wavelet transformation (multi-resolution decomposition).
While this example illustrates a three-level wavelet transformation, the number of levels of transformation may be arbitrarily set, and the same applies to the wavelet shrinkage in FIG. 16. Further, it is assumed in the description below that a resolution becomes lower as the number of levels becomes greater.
Next, a WC shrinkage unit 2004 (WC stands for wavelet coefficient) applies shrinkage processing, which sets a wavelet coefficient with a small absolute value to zero, to the high-frequency components LH3, HL3, and HH3 at the lowest resolution, to obtain LH3′, HL3′, and HH3′. While various methods exist for this processing, equation (1) or (2) may simply be used. When the noise is random noise, the noise component included in the input image is distributed across all wavelet coefficients, and therefore noise can be removed by subtracting a noise portion from each wavelet coefficient. Then, an IWT unit 2005 (IWT stands for inverse wavelet transformation) generates a noise-suppressed low-frequency component LL2′ at the level one step higher than the lowest resolution by an inverse wavelet transformation. The inputs to this transformation are the low-frequency component LL3 at the lowest resolution and the high-frequency components LH3′, HL3′, and HH3′ at the lowest resolution to which the shrinkage processing has been applied.
Subsequently, at each resolution other than the lowest resolution, the shrinkage processing is applied to the high-frequency wavelet coefficients at that resolution in the same manner as at the lowest resolution (WC shrinkage units 2006 and 2008). Then, from the low-frequency component at that resolution, which is obtained from the resolution one level lower, and the high-frequency components at that resolution to which the shrinkage processing has been applied, a low-frequency component at the resolution one level higher is generated by an inverse wavelet transformation (IWT units 2007 and 2009). The inverse wavelet transformation result at the highest resolution is determined to be the output image.
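The coarse-to-fine flow of FIG. 16 can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of NPL 2: it uses an orthonormal Haar wavelet (the source does not name a wavelet), soft thresholding per equation (1) with the same τ at every level, and image sides divisible by 2 raised to the number of levels. The function names and the labeling of the LH and HL subbands are hypothetical conventions.

```python
import numpy as np

R2 = np.sqrt(2.0)

def haar_dwt2(x):
    """One level of a 2-D orthonormal Haar transform (even sides assumed)."""
    lo = (x[0::2, :] + x[1::2, :]) / R2      # vertical low-pass
    hi = (x[0::2, :] - x[1::2, :]) / R2      # vertical high-pass
    ll = (lo[:, 0::2] + lo[:, 1::2]) / R2
    lh = (lo[:, 0::2] - lo[:, 1::2]) / R2
    hl = (hi[:, 0::2] + hi[:, 1::2]) / R2
    hh = (hi[:, 0::2] - hi[:, 1::2]) / R2
    return ll, (lh, hl, hh)

def haar_idwt2(ll, highs):
    """Exact inverse of haar_dwt2."""
    lh, hl, hh = highs
    h, w = ll.shape
    lo = np.empty((h, 2 * w))
    hi = np.empty((h, 2 * w))
    lo[:, 0::2], lo[:, 1::2] = (ll + lh) / R2, (ll - lh) / R2
    hi[:, 0::2], hi[:, 1::2] = (hl + hh) / R2, (hl - hh) / R2
    x = np.empty((2 * h, 2 * w))
    x[0::2, :], x[1::2, :] = (lo + hi) / R2, (lo - hi) / R2
    return x

def soft_threshold(c, tau):
    """Equation (1) applied to wavelet coefficients."""
    return np.sign(c) * np.maximum(np.abs(c) - tau, 0.0)

def wavelet_shrink(image, tau, levels=3):
    """Decompose, shrink high-frequency subbands, reconstruct upward."""
    ll = np.asarray(image, dtype=float)
    pyramid = []
    for _ in range(levels):                  # role of WT units 2001 to 2003
        ll, highs = haar_dwt2(ll)
        pyramid.append(highs)
    for highs in reversed(pyramid):          # lowest resolution first
        shrunk = tuple(soft_threshold(c, tau) for c in highs)  # WC shrinkage
        ll = haar_idwt2(ll, shrunk)          # role of the IWT units
    return ll
```

With τ = 0 the function reproduces its input exactly, which is a convenient check that the forward and inverse transformations match; with a very large τ all high-frequency coefficients vanish and the output collapses to the image mean.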
FIG. 17 is a diagram illustrating an application example of the technology in NPL 2. The left part of FIG. 17 is an input image, the middle part of FIG. 17 is a single-level wavelet transformation result, and the right part of FIG. 17 is a three-level wavelet transformation result. In the diagram, LL_l denotes the l-th level low-frequency component, and LH_l, HL_l, and HH_l denote the l-th level high-frequency components, where l is the level index.
Further, in relation to the present invention, other technologies described in NPLs 3 to 7 are disclosed.