Chromagenic illuminant estimation exploits the relationship between RGBs captured with a conventional digital camera and RGBs captured when a coloured filter is placed in front of the camera. This approach has two problems. First, performance is fragile: occasionally the estimation is poor. Second, the approach requires registered images, yet typical chromagenic cameras (e.g. a stereo rig or two surveillance cameras) produce images whose pixels are not registered.
In embodiments of the present invention, we carry out a detailed colour space error analysis of chromagenic illuminant estimation and identify RGBs which are likely to lead to good and to poor performance. While the good and poor sets overlap, they are not the same, and we find that bright RGBs tend to yield correct illuminant estimates. The bright-chromagenic algorithm attempts to find these RGBs by selecting a fixed percentage of the brightest pixels in the filtered and unfiltered images and using these for chromagenic estimation. This simple strategy leads to very good estimation performance. On a large set of images, including synthetic, half-synthetic and real images, the bright-chromagenic algorithm delivers excellent estimation which is at least as good as all antecedent colour constancy algorithms. Bright-chromagenic plus gamut mapping delivers estimation performance which is strictly better than all other algorithms tested. Because the selection of the brightest pixels is carried out independently for the filtered and unfiltered images, the bright-chromagenic algorithm does not need any image registration, and this idea is also validated in the experiments.
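The bright-pixel selection step described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the experiments: the function name `brightest_pixels`, the definition of brightness as R + G + B, and the `fraction` parameter are all assumptions made for the example, since the text specifies only that a fixed percentage of the brightest pixels is selected.

```python
import numpy as np

def brightest_pixels(image, fraction):
    """Return the brightest `fraction` of pixels as an (n, 3) array, brightest first.

    `image` is any array whose last axis holds R, G, B.
    Assumption: brightness is taken to be R + G + B.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rgb = np.asarray(image, dtype=float).reshape(-1, 3)
    brightness = rgb.sum(axis=1)
    # Keep at least one pixel, even for very small images or fractions.
    n = max(1, int(round(fraction * rgb.shape[0])))
    # Sort ascending, reverse to descending, keep the n brightest.
    order = np.argsort(brightness, kind="stable")[::-1][:n]
    return rgb[order]
```

Because the function is called once on the unfiltered image and once on the filtered image, the two images may differ in size and need not be registered. How the two selected sets are then put into correspondence (for example, by brightness rank) is not fixed by this sketch.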
The human visual system is reasonably colour constant: the colours of objects are stable when viewed under different colours of light. However, it has proven difficult to emulate this colour constancy in manufactured devices. This is a problem not only in image reproduction (e.g. digital photography) but also in a variety of computer vision tasks, such as tracking [9], indexing [16] and scene analysis [10], where stable measures of reflectance are sought or assumed for objects in a scene.
Colour constancy is generally broken down into two parts. First, the colour of the prevailing illuminant is estimated. Then, in a second stage, the colour bias due to illumination is removed. This second part is in fact quite easy, and so most colour constancy algorithms focus on the illuminant estimation problem. Starting with Land's retinex [12], numerous algorithms for illuminant estimation have been proposed. The first group of algorithms makes simple assumptions about the scene being observed. Examples are MaxRGB, which assumes that a maximally reflective patch exists in the image (e.g. a white reflectance or, equivalently, surfaces such as yellow and blue that added together would make white), and Gray World, which assumes that the average reflectance in a scene is gray [3].
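The two simple estimators just described can be sketched in a few lines. This is a minimal illustration under the stated assumptions of each method; the function names `max_rgb` and `gray_world`, and the choice to return a unit-length illuminant vector, are conventions adopted for the example only.

```python
import numpy as np

def _unit(vector):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = np.linalg.norm(vector)
    return vector / norm if norm > 0 else vector

def max_rgb(image):
    """MaxRGB: take the per-channel maximum as the illuminant colour."""
    rgb = np.asarray(image, dtype=float).reshape(-1, 3)
    return _unit(rgb.max(axis=0))

def gray_world(image):
    """Gray World: take the per-channel mean as the illuminant colour."""
    rgb = np.asarray(image, dtype=float).reshape(-1, 3)
    return _unit(rgb.mean(axis=0))
```

Both estimates are exact only when the underlying assumption holds, that is, when the scene contains a maximally reflective patch (MaxRGB) or when its reflectances average to gray (Gray World).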
Another group of algorithms comprises more sophisticated approaches such as neural networks; colour by correlation, a Bayesian method that correlates the RGBs in the image with plausible RGBs under various illuminants to find the best illuminant [6]; and gamut mapping methods [7]. The last approach exploits the observation that the range, or gamut, of colours recorded by a camera depends on the colour of the light. If an RGB does not fall inside the gamut for a given light, then that light cannot be the solution to illuminant estimation. The gamut constrained illuminant estimation algorithm of Finlayson and Hordley delivers the best performance of all algorithms tested on the Simon Fraser set of real images. However, this performance is bought at the price of quite a complex algorithm. This is generally true: the simple normalisation approaches deliver reasonable performance but, thus far, the best performance requires complex algorithmic inference.
Chromagenic theory proposes that the illuminant estimation problem is easier to solve if two images of a scene are recorded: the first image is captured as normal but the second image is captured through a coloured filter. This idea seems reasonable, as we often take multiple images in computer vision to help solve problems that are hard to solve from a single image. For example, in stereo, triangulation across two images is used to recover the 3-D position of points in the scene, and in photometric vision, a pair of images captured through two orthogonal polarising filters can be used to identify and remove specular highlights [13].
The standard chromagenic colour constancy algorithm [5] works in two stages. The training stage is a preprocessing step in which the relationship between filtered and unfiltered RGBs, a linear mapping, is calculated for each of a number of candidate lights. In the testing stage, those maps are applied to other images in order to estimate the actual scene illuminant. Encouragingly, like a basic stereo algorithm, the basic chromagenic algorithm often works well, and this indicates both that a linear map models the relationship between filtered and unfiltered RGBs well and that the maps for different lights are different from one another. Discouragingly, however, the basic algorithm can also fail badly.
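The two stages can be sketched as follows. This is a minimal sketch, not the algorithm of [5] as published: it assumes a 3x3 least-squares map per candidate light and a plain Euclidean fitting error, and the names `train_maps` and `estimate_light` are hypothetical. Corresponding rows of the unfiltered and filtered arrays are assumed to come from the same surface.

```python
import numpy as np

def train_maps(unfiltered_by_light, filtered_by_light):
    """Training: fit one 3x3 map M per candidate light so that unfiltered @ M fits filtered.

    Each argument is a list with one (n, 3) array of RGBs per candidate light.
    """
    maps = []
    for rgb, rgb_filtered in zip(unfiltered_by_light, filtered_by_light):
        # Least-squares solution of rgb @ M = rgb_filtered.
        M, *_ = np.linalg.lstsq(rgb, rgb_filtered, rcond=None)
        maps.append(M)
    return maps

def estimate_light(maps, rgb, rgb_filtered):
    """Testing: return the index of the map with the smallest fitting error, plus all errors."""
    errors = [float(np.linalg.norm(rgb @ M - rgb_filtered)) for M in maps]
    return int(np.argmin(errors)), errors
```

The index returned by `estimate_light` identifies the candidate light whose precomputed map best explains the observed filtered RGBs. The estimate is only as reliable as the assumption that the best-fitting map belongs to the correct light.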
The chromagenic algorithm's poor performance is due to two problems. Firstly, the map that best models the relationship between filtered and unfiltered RGBs can correspond to the wrong light. That is to say, the precalculated maps do a good job of modelling the broad trends in the data, but there are specific instances (combinations of reflectances) where they work more poorly. This problem is analogous to the difficulties encountered in colour correction, where the colours a camera records are mapped for display. In general most camera colours are mapped correctly, but there are always colours in photographs that look wrong (e.g. the violet colour of the ‘morning glory’ flower is generally poorly reproduced). The second problem is more down to basic engineering. The chromagenic theory is predicated on the assumption that we have pixel-wise correspondence. Indeed, to achieve the best performance, one has to compare RGB transitions that occur between identical reflectances. Recalling the analogy to stereo, we know that stereo works when we have good pixel correspondence, but finding the correspondences is the essence of much stereo research. Similarly, experiments have shown that chromagenic illuminant estimation can work, but again we need appropriate correspondences.