Digital images are currently used in many different applications. These applications include new-generation acquisition devices, such as digital still cameras (DSC). The availability of sensors providing increased resolution at lower cost, together with low-consumption digital signal processors (DSP), has led to a considerable commercial diffusion of digital still cameras. For this reason, there is now a need for low-cost acquisition devices that can also acquire high-quality digital images.
The quality of an image substantially depends on the characteristics of the sensor that acquires the image, especially its resolution. In digital still cameras, the sensor will typically be either a CCD (Charge Coupled Device) or a CMOS (Complementary Metal Oxide Semiconductor) device. These sensors are integrated circuits comprising a matrix of photosensitive cells or elements, each associated with a corresponding pixel. When the image is acquired from a real scene, each cell produces an electrical signal proportional to the light that strikes it. More precisely, each cell responds to the radiance (emitted quantity of light) of a particular portion of the real scene. This portion of the real scene forms the receptive field of the pixel.
The larger the number of photosensitive cells, or equivalently the greater the spatial resolution of the sensor, the denser the information relating to the real scene that is captured in the acquisition process. However, obtaining a higher image resolution by increasing the number of pixels of the sensor is not always feasible, for reasons of technology and cost.
When acquiring a digital photograph, a sensor, no matter how good its resolution, will always produce an approximation of the scene that is to be shot. The photosensitive cells of the sensor are always separated by a certain distance because not all of the sensor area can be uniformly covered with photosensitive elements. Technology makes it inevitable that there should be a certain minimum distance between adjacent cells. This spacing between adjacent cells is the cause of a first loss of information in the acquisition process.
Another reason why a digital image acquired with a sensor of the type commonly used in digital still cameras forms only an approximation of the real scene is the interpolation of the data acquired by the sensor. As is well known, a digital image can be represented by a matrix of elements (pixels) corresponding to elementary portions of the image, and each of these elements has associated with it one or more digital values representative of the optical components. In a monochromatic image, for example, only a single digital value is associated with each pixel. In this case, it is usually said that the image is made up of only a single channel or plane.
In a color image, which may be in an RGB (Red, Green and Blue) format, each pixel has associated with it three digital values that correspond, respectively, to the three components (red, green, blue) of the additive chromatic synthesis. In this case, the image can be broken down into three distinct planes. Each plane contains the information relating to just one of the chromatic components.
A typical sensor will dedicate a single and substantially monochromatic photosensitive cell to each pixel of the image. Furthermore, the sensor is provided with an optical filter that includes a matrix of filtering elements, each of which covers one photosensitive cell. Subject to a minimal absorption, each filtering element transmits to the photosensitive cell with which it is associated the luminous radiation corresponding to the wavelength of only the red light, only the green light or only the blue light. For each pixel there is thus revealed just one of the three primary components (R, G, B) of the additive chromatic synthesis.
The type of filter employed varies from one manufacturer to another. The most common is the so-called Bayer filter. In this filter the arrangement of the filtering elements, the so-called Bayer pattern, is as shown in the element matrix 10 reproduced in FIG. 2.
The electrical signals produced by the photosensitive cells are converted into digital values in accordance with conventional methods. The digital image obtained in this manner is incomplete because it is made up of only a single component (R, G or B) for each pixel. The format of this image is conventionally referred to as a CFA (Color Filter Array) image.
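The way a filter matrix reduces a full-color scene to a single-plane CFA image can be sketched as follows. This is only an illustration under stated assumptions: the 2x2 tile used here (green and red on even rows, blue and green on odd rows) is one common Bayer variant and is not necessarily the arrangement of FIG. 2, and the names `BAYER_TILE`, `filter_color` and `rgb_to_cfa` are hypothetical.

```python
# Assumed 2x2 tile repeated over the whole sensor (one common Bayer variant;
# the arrangement shown in FIG. 2 may differ).
BAYER_TILE = (("G", "R"),
              ("B", "G"))
CHANNEL_INDEX = {"R": 0, "G": 1, "B": 2}


def filter_color(row, col):
    """Return the color ('R', 'G' or 'B') of the filter element at (row, col)."""
    return BAYER_TILE[row % 2][col % 2]


def rgb_to_cfa(rgb):
    """Keep, for each pixel, only the component passed by its filter element.

    `rgb` is a list of rows, each row a list of (r, g, b) tuples.
    The result is a single plane holding one value per pixel.
    """
    return [
        [pixel[CHANNEL_INDEX[filter_color(r, c)]] for c, pixel in enumerate(row)]
        for r, row in enumerate(rgb)
    ]


# A tiny 2x2 "scene" with three components per pixel.
scene = [
    [(10, 20, 30), (11, 21, 31)],
    [(12, 22, 32), (13, 23, 33)],
]
cfa = rgb_to_cfa(scene)
print(cfa)  # [[20, 11], [32, 23]]
```

Two of the three components of every pixel are discarded, which is why the CFA image is incomplete.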
The CFA image is then subjected to a complex reconstruction process to produce a “complete” image in the RGB format, for example, in which three digital values will be associated with each pixel. This reconstruction implies a passage from a representation of the image in a single plane (Bayer plane) to a representation in three planes (R, G, B). The reconstruction is obtained by well known interpolation algorithms.
It should be noted that the interpolation produces only an approximation of the image that would be obtained with a sensor capable of acquiring three optical components per pixel. Therefore, the interpolation process introduces yet another approximation into the acquired image. Given these limitations of the quality of the acquired image introduced by the sensor characteristics and the interpolation process, further processing operations are often required to obtain a high-resolution digital image.
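As a rough illustration of such a reconstruction, the sketch below fills in each missing component by averaging the nearest samples of that color within a 3x3 neighborhood. This is a deliberately simple stand-in chosen for brevity, not one of the specific interpolation algorithms the text refers to. The filter layout and the names `BAYER_TILE`, `filter_color` and `demosaic` are assumptions, and the image is assumed to be at least 2x2 pixels.

```python
# Assumed 2x2 filter tile (one common Bayer variant; hypothetical here).
BAYER_TILE = (("G", "R"),
              ("B", "G"))


def filter_color(row, col):
    """Return the color of the filter element covering pixel (row, col)."""
    return BAYER_TILE[row % 2][col % 2]


def demosaic(cfa):
    """Rebuild three planes (R, G, B) from a single-plane CFA image.

    Where a component was measured it is copied; elsewhere it is estimated
    as the mean of the same-color samples in the surrounding 3x3 window.
    Assumes the image is at least 2x2 so every window holds all three colors.
    """
    height, width = len(cfa), len(cfa[0])
    planes = {ch: [[0.0] * width for _ in range(height)] for ch in "RGB"}
    for r in range(height):
        for c in range(width):
            for ch in "RGB":
                if filter_color(r, c) == ch:
                    planes[ch][r][c] = float(cfa[r][c])
                    continue
                samples = [
                    cfa[rr][cc]
                    for rr in range(max(0, r - 1), min(height, r + 2))
                    for cc in range(max(0, c - 1), min(width, c + 2))
                    if filter_color(rr, cc) == ch
                ]
                planes[ch][r][c] = sum(samples) / len(samples)
    return planes


cfa = [[20, 11],
       [32, 23]]
planes = demosaic(cfa)
print(planes["G"])  # [[20.0, 21.5], [21.5, 23.0]]
```

The green values at the red and blue positions are estimates derived from neighboring pixels rather than measurements, which is the additional approximation the interpolation introduces.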
The prior art proposes numerous methods that are generally based on the principle of reconstructing the original information of the real scene lost in the acquisition process for the reasons set out above, by combining the information contained in a plurality of initially acquired low-resolution digital images that all represent the same scene. To this end, it is essential that the initially acquired images, which will be referred to more briefly as the starting images, should together contain some additional information that could not be obtained from identical images.
Some of the known methods operate in the space domain (that is, in the pixel domain) and others in the frequency domain. The latter combine a certain number of low-resolution starting images after having transformed them into the spatial frequency domain. After the image in the frequency domain obtained from this combination has been brought back into the space domain, it has a better resolution than the starting images. However, the methods operating in the frequency domain call for a very considerable computational effort.
The methods that operate in the space domain, on the other hand, include a particular class that employs an approach known as “back projection”, which is very similar to the one utilized, for example, in so-called computerized axial tomography (CAT), in which a two-dimensional object is reconstructed from a series of one-dimensional projections thereof.
The back-projection approach assumes that the low-resolution starting images of the same scene represent different projections of a high-resolution image that reproduces the real scene. The projection operator is the acquisition process itself, which depends to a large extent on the acquisition device and is assumed to be known. The problem is thus reduced to reconstructing the high-resolution image from its various projections.
In particular, the method employed by M. Irani and S. Peleg, described among others in “Super Resolution From Image Sequences” (IEEE, 1990), reconstructs the high-resolution image iteratively. In several successive steps, the current estimate of the high-resolution image is corrected on the basis of the differences between the starting images and the images obtained by simulating the projections of that estimate.
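The iterative principle can be illustrated with a one-dimensional toy example. This is a minimal sketch under strong assumptions and is not the implementation of the cited method: the acquisition process is modeled as a circular average of two neighboring samples, the shift of each starting image is assumed to be known exactly, and the names `project` and `back_project` are hypothetical.

```python
def project(high_res, shift, factor=2):
    """Simulate acquisition: each low-resolution sample is the mean of
    `factor` neighboring high-resolution samples, taken circularly and
    starting at offset `shift` (an assumed, idealized sensor model)."""
    n = len(high_res)
    return [
        sum(high_res[(factor * i + shift + j) % n] for j in range(factor)) / factor
        for i in range(n // factor)
    ]


def back_project(low_res_images, shifts, iterations=100, factor=2):
    """Iteratively correct a high-resolution estimate so that its simulated
    projections approach the observed low-resolution images."""
    n = len(low_res_images[0]) * factor
    # Initial guess: replicate each sample of the first low-resolution image.
    estimate = [float(low_res_images[0][i // factor]) for i in range(n)]
    for _ in range(iterations):
        correction = [0.0] * n
        for observed, shift in zip(low_res_images, shifts):
            simulated = project(estimate, shift, factor)
            for i, (obs, sim) in enumerate(zip(observed, simulated)):
                # Spread the difference back over the samples that produced it.
                for j in range(factor):
                    correction[(factor * i + shift + j) % n] += obs - sim
        estimate = [e + c / len(low_res_images)
                    for e, c in zip(estimate, correction)]
    return estimate


truth = [0, 0, 4, 8, 8, 4, 0, 0]           # stand-in for the real scene
shifts = [0, 1]                            # two starting images, offset by one sample
observations = [project(truth, s) for s in shifts]
restored = back_project(observations, shifts)
print([round(v, 3) for v in restored])
```

Each iteration requires simulating every starting image and accumulating a correction for every high-resolution sample, so the cost grows with both the number of starting images and the number of iterations. The two shifted observations carry different information about the scene, which is what allows the estimate to improve beyond what a single starting image provides.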
This method is associated with a first drawback that derives from the fact that obtaining high-quality images calls for an accurate modeling of the acquisition process or device that obtains the low-resolution images. For this reason, the method in question is complicated and does not lend itself to being implemented in a commercial acquisition device, such as a digital still camera.
A second drawback derives from the fact that the method calls for a considerable number of processing operations at each iteration step. This, in turn, creates numerous problems in devices in which the optimization of energy, processing and memory resources is a factor that has an important bearing on their commercial success.