1. Technical Field
The present disclosure relates to upsampling of images and, more specifically, to guided image upsampling of an image to increase resolution.
2. Related Art
Image upsampling, or generation of an image at a greater (i.e., higher) resolution than an initial image, is a fundamental image processing problem. The need for generating a higher resolution image from an initial image is rapidly increasing, in part, because of the rate at which content is being created, and because of the enhanced resolution of display devices, including the latest smartphones, tablets, and high-definition televisions. There is a particular need for increasing the resolution of images for display on such devices from traditionally low resolution images, such as images sent via email, downloaded from the Internet, or streamed from a content provider.
Upsampling is typically achieved by convolving an initial (lower resolution) image with an interpolation kernel or filter, and resampling the result on a target (higher resolution) grid. Interpolation filters, such as bilinear and bicubic filters, for use in such upsampling processes are well-known. Such methods include setting the value of a pixel in the target image at a higher resolution (“HR”) as a weighted average of the values in the neighborhood of the pixel in an initial lower resolution (“LR”) image. While the same filter is applied across the entire image for computational efficiency, these methods generate unsatisfactory artifacts in many cases. These artifacts are most perceptible along sharp edges within the image content, and can take the form of blurring, aliasing, ringing, and blocking.
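By way of illustration only, the weighted-average interpolation described above may be sketched for the bilinear case as follows. The function name `bilinear_upsample`, the integer scale factor, the pixel-center sampling convention, and the clamping at the image border are assumptions made for this sketch and are not taken from any particular implementation.

```python
def bilinear_upsample(lr, scale):
    """Upsample a 2D grayscale image (list of rows) by an integer factor.

    Each HR pixel is a weighted average of the (up to) four nearest LR pixels.
    Illustrative sketch only: pixel-center alignment and border clamping
    are assumptions.
    """
    h, w = len(lr), len(lr[0])
    out_h, out_w = h * scale, w * scale
    hr = [[0.0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        # Map the HR pixel center back onto the LR grid and clamp to the image.
        sy = min(max((y + 0.5) / scale - 0.5, 0.0), h - 1.0)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        for x in range(out_w):
            sx = min(max((x + 0.5) / scale - 0.5, 0.0), w - 1.0)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            # Interpolate horizontally on the two rows, then vertically.
            top = lr[y0][x0] * (1.0 - fx) + lr[y0][x1] * fx
            bottom = lr[y1][x0] * (1.0 - fx) + lr[y1][x1] * fx
            hr[y][x] = top * (1.0 - fy) + bottom * fy
    return hr


# A sharp step edge in the LR image ...
lr_image = [[0, 0, 100, 100],
            [0, 0, 100, 100]]
hr_image = bilinear_upsample(lr_image, 2)
```

In this example, the HR pixels on either side of the edge take the intermediate values 25 and 75, which is the blurring along sharp edges noted above: the same fixed weights are applied regardless of image content.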
To address these shortcomings, many new adaptive image interpolation algorithms have been developed, which can generally be classified into three different categories: edge-guided; example-based or “super-resolution” algorithms; and “other.” The majority of known methods are edge-guided. Most edge-guided methods interpolate the missing higher resolution pixels in the target (HR) image using a local covariance that is estimated for the target image from the local covariance coefficients of the initial LR image.
Example-based super-resolution algorithms typically utilize a database to look up, or learn, correspondences between initial lower resolution image patches and target image patches. These databases can be constructed ahead of time from large image collections, but have more recently been constructed on the fly using only the initial input image, or only a small portion of it.
Additional methods that fall outside the edge-based and example-based categories include game-theoretic approaches to upsampling, which generally incorporate an iterative feedback-control loop. While this technique can lead to high quality results, performance degradation can occur when the number of iterations of the loop increases.
Another image filter is the well-known “bilateral filter,” which is an edge-preserving non-linear filter that uses both a spatial (or domain) filter kernel and a range filter kernel evaluated on the data values themselves. It has been used primarily in denoising and tone mapping applications. In particular, the rationale behind the bilateral filter is that for a pixel to influence another pixel, it should not only occupy a nearby spatial location but should also have a similar color intensity value. More formally, the bilateral filtered result for a pixel p is:
$$
BF[I]_p = \frac{1}{w_p} \sum_{q \in S} I_q \, G_{\sigma_s}\!\left(\lVert p - q \rVert\right) G_{\sigma_r}\!\left(\lvert I_p - I_q \rvert\right), \tag{1}
$$

where $I$ is the input image, $G_{\sigma_s}$ and $G_{\sigma_r}$, which are typically 2D Gaussian kernels, represent a spatial filter and range filter, respectively, and $S$ (in $q \in S$) denotes the neighborhood of interest in the spatial domain. $w_p$ is a normalization factor that ensures that the pixel weights sum to one.
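By way of illustration only, equation (1) may be sketched in a brute-force form as follows. The function names `gaussian` and `bilateral_filter`, the square window of a given radius used for the neighborhood S, the truncation of the window at the image border, and the restriction to grayscale images are assumptions made for this sketch.

```python
import math


def gaussian(x, sigma):
    """Unnormalized Gaussian kernel; normalization is handled by w_p."""
    return math.exp(-(x * x) / (2.0 * sigma * sigma))


def bilateral_filter(img, radius, sigma_s, sigma_r):
    """Brute-force bilateral filter of a 2D grayscale image (list of rows).

    Illustrative sketch only: S is assumed to be a square window of the
    given radius, truncated at the image border.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for py in range(h):
        for px in range(w):
            total = 0.0
            w_p = 0.0  # normalization factor of equation (1)
            for qy in range(max(0, py - radius), min(h, py + radius + 1)):
                for qx in range(max(0, px - radius), min(w, px + radius + 1)):
                    # Spatial weight from the distance between p and q.
                    spatial = gaussian(math.hypot(py - qy, px - qx), sigma_s)
                    # Range weight from the intensity difference.
                    rng = gaussian(abs(img[py][px] - img[qy][qx]), sigma_r)
                    total += img[qy][qx] * spatial * rng
                    w_p += spatial * rng
            out[py][px] = total / w_p
    return out


step = [[0.0, 0.0, 0.0, 100.0, 100.0, 100.0] for _ in range(3)]
filtered = bilateral_filter(step, radius=2, sigma_s=2.0, sigma_r=5.0)
```

With a range sigma that is small relative to the step height, pixels across the edge receive a negligible weight, so the edge in `step` is left essentially intact. Increasing `sigma_r` makes the filter behave more like an ordinary Gaussian blur.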
One variant of the bilateral filter is the Joint Bilateral Filter (“JBF”). In the JBF, a second so-called “guiding image” (Ĩ), which is of the same resolution as the input image, is used to define the edges that must be preserved:
$$
JBF[I]_p = \frac{1}{w_p} \sum_{q \in S} I_q \, G_{\sigma_s}\!\left(\lVert p - q \rVert\right) G_{\sigma_r}\!\left(\lvert \tilde{I}_p - \tilde{I}_q \rvert\right). \tag{2}
$$

While straightforward implementations of the bilateral filter and its variants are computationally expensive, there have been attempts to significantly improve running times by approximating the filter equation. For example, a guided filter that is based on a local linear model and that provides a faster and superior edge-preserving filter is disclosed in K. He, et al., “Guided image filtering,” IEEE Trans. Pattern Analysis Machine Intelligence (Accepted), vol. 99, no. PrePrints, pp. 1-14, 2012 (referred to herein as “He”), the entirety of which is incorporated herein by reference.
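By way of illustration only, the Joint Bilateral Filter of equation (2) may be sketched in brute-force form as follows. This is not the accelerated guided filter of He. The function names, the square window used for S, and the grayscale guiding image are assumptions made for this sketch.

```python
import math


def gaussian(x, sigma):
    """Unnormalized Gaussian kernel; normalization is handled by w_p."""
    return math.exp(-(x * x) / (2.0 * sigma * sigma))


def joint_bilateral_filter(img, guide, radius, sigma_s, sigma_r):
    """Brute-force joint bilateral filter (illustrative sketch only).

    `img` is the image being filtered and `guide` is the guiding image of
    the same resolution. The only change from the plain bilateral filter is
    that the range weight is evaluated on `guide` rather than on `img`.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for py in range(h):
        for px in range(w):
            total = 0.0
            w_p = 0.0  # normalization factor of equation (2)
            for qy in range(max(0, py - radius), min(h, py + radius + 1)):
                for qx in range(max(0, px - radius), min(w, px + radius + 1)):
                    spatial = gaussian(math.hypot(py - qy, px - qx), sigma_s)
                    # Edges to preserve are defined by the guiding image.
                    rng = gaussian(abs(guide[py][px] - guide[qy][qx]), sigma_r)
                    total += img[qy][qx] * spatial * rng
                    w_p += spatial * rng
            out[py][px] = total / w_p
    return out


# A noisy input whose edge location is known from a clean guiding image.
noisy = [[10.0, 14.0, 12.0, 90.0, 94.0, 92.0]]
guide = [[0.0, 0.0, 0.0, 100.0, 100.0, 100.0]]
smoothed = joint_bilateral_filter(noisy, guide, radius=5, sigma_s=10.0, sigma_r=5.0)
```

In this example the values on each side of the edge are averaged among themselves, while the edge defined by `guide` is preserved. Passing the input image itself as the guiding image reduces the sketch to equation (1).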
The methodology of the Joint Bilateral Filter has been successfully applied to a particular upsampling problem that arises from the need to reduce memory costs associated with performing computationally intensive processing tasks, such as stereo depth, image colorization, and tone mapping. In particular, an input image is first downsampled to a lower resolution in order to perform the necessary image processing tasks at computationally manageable levels. After processing, the low resolution solution must then be upsampled to the original resolution of the input image.
One technique for upsampling the low resolution solution, which is based on the Joint Bilateral Filter, is disclosed in Kopf, J., et al., “Joint Bilateral Upsampling,” ACM Transactions on Graphics, Vol. 26, No. 3, pp. 96-100 (2007) (referred to herein as “Kopf”), the entirety of which is incorporated herein by reference. Given a high-resolution guiding image I, and a low resolution solution L obtained after processing an input image, Kopf discloses that an upsampled solution can be obtained by applying a “Joint Bilateral Upsampling” (referred to also as “JBU”) technique as follows:
$$
JBU[\tilde{L}]_p = \frac{1}{w_p} \sum_{q_\downarrow \in S} L_{q_\downarrow} \, G_{\sigma_s}\!\left(\lVert p_\downarrow - q_\downarrow \rVert\right) G_{\sigma_r}\!\left(\lvert I_p - I_q \rvert\right) \tag{3}
$$
where $p$ and $q$ denote integer coordinates of the pixels in $I$, and $p_\downarrow$ and $q_\downarrow$ denote the corresponding coordinates in the low resolution solution $L$. The guiding image $I$ for interpolating to the upsampled resolution in the JBU technique disclosed in the Kopf reference is the original input image. Accordingly, unlike the application of a Joint Bilateral Filter as shown in equation (2), the JBU of equation (3) operates at two different resolutions: the high resolution of the guiding image and the low resolution of the solution $L$.
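By way of illustration only, the two-resolution operation of equation (3) may be sketched as follows. This is not the implementation of Kopf. The function names, the integer scale factor, the mapping of each low resolution sample q↓ to the high resolution pixel q = q↓ × scale, and the square low resolution window used for S are all assumptions made for this sketch.

```python
import math


def gaussian(x, sigma):
    """Unnormalized Gaussian kernel; normalization is handled by w_p."""
    return math.exp(-(x * x) / (2.0 * sigma * sigma))


def joint_bilateral_upsample(low, guide, scale, radius, sigma_s, sigma_r):
    """Brute-force joint bilateral upsampling (illustrative sketch only).

    `low` is the low resolution solution L and `guide` is the high
    resolution guiding image I. The spatial kernel is evaluated on low
    resolution coordinates; the range kernel on the high resolution guide.
    """
    lh, lw = len(low), len(low[0])
    hh, hw = len(guide), len(guide[0])
    out = [[0.0] * hw for _ in range(hh)]
    for py in range(hh):
        for px in range(hw):
            # p-down: the (possibly fractional) position of p in L.
            pdy, pdx = py / scale, px / scale
            cy, cx = int(pdy), int(pdx)
            total = 0.0
            w_p = 0.0  # normalization factor of equation (3)
            for qdy in range(max(0, cy - radius), min(lh, cy + radius + 1)):
                for qdx in range(max(0, cx - radius), min(lw, cx + radius + 1)):
                    # Assumed mapping from q-down back to a guide pixel q.
                    qy = min(qdy * scale, hh - 1)
                    qx = min(qdx * scale, hw - 1)
                    spatial = gaussian(math.hypot(pdy - qdy, pdx - qdx), sigma_s)
                    rng = gaussian(abs(guide[py][px] - guide[qy][qx]), sigma_r)
                    total += low[qdy][qdx] * spatial * rng
                    w_p += spatial * rng
            out[py][px] = total / w_p
    return out


# High resolution guide with a step edge, and a 2x smaller solution L.
guide = [[0.0, 0.0, 0.0, 0.0, 100.0, 100.0, 100.0, 100.0] for _ in range(4)]
low = [[1.0, 1.0, 9.0, 9.0] for _ in range(2)]
upsampled = joint_bilateral_upsample(low, guide, scale=2, radius=2,
                                     sigma_s=1.0, sigma_r=5.0)
```

In this example the upsampled solution follows the edge of the high resolution guiding image: pixels on the dark side of the guide take values near 1 and pixels on the bright side take values near 9, rather than a blend of the two.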