When an imaging device such as a camera takes pictures under one or more sources of light, the image off the sensor will have a color bias that depends on the color temperature of the specific light source(s). For example, in scenes lit by a tungsten source, unmodified pictures may have an overall yellowish-orange cast. Under natural lighting during twilight, however, images will often have a very bluish cast. To mitigate the potentially heavy color bias that occurs under varying light conditions, adjustments are typically performed either internally within the device or during the processing phase to balance the sensor response so that the resulting images appear relatively normalized to the human eye. This process is referred to as white balancing.
According to contemporary photographic techniques, each pixel in a scene or image can be represented as a vector with one dimension for each of a multitude of color channels. For example, in a three-color image, each pixel can be represented as a three-dimensional vector (typically the vector [R,G,B]). This vector can be projected down to a lower-dimensional space, such as by transforming it to a two-dimensional luminance/chrominance color space (e.g., the YUV color space). A YUV pixel can be represented by just its color terms as a two-dimensional vector [u,v] for a given luminance (y). The YUV space assumes that white balance has already been performed, so for colorimetry, a space called xy (based on the response of the human eye) or the related little-r, little-b space (based on a particular sensor) is commonly used instead. All these spaces share the property that light intensity is factored out and only color is considered. Points in such a space are thus called chromaticity coordinates. In the xy space, colors of natural illuminants will have a distribution that falls along a smooth curve called the Planckian locus. The human visual system has evolved to recognize a wide range of natural lights as color neutral, but will perceive lights with a significantly different chromaticity as colorful. Artificial lights are, therefore, generally designed to produce a chromaticity that lies near the Planckian locus.
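The idea of factoring out intensity can be illustrated with a minimal sketch. The text does not define the little-r, little-b space, so the normalization used here (each channel divided by the channel sum) is an assumption chosen for illustration, and `rb_chromaticity` is a hypothetical helper name.

```python
def rb_chromaticity(rgb):
    """Return (r, b) for a linear [R, G, B] triple by dividing out the channel sum.

    Assumption: r = R / (R + G + B) and b = B / (R + G + B). Other
    conventions (for example dividing by G) are also in use.
    """
    red, green, blue = rgb
    total = red + green + blue
    if total == 0:
        # A black pixel has no defined chromaticity; fall back to the
        # equal-energy point (an arbitrary convention for this sketch).
        return (1.0 / 3.0, 1.0 / 3.0)
    return (red / total, blue / total)


# The same surface at two exposure levels: every channel doubles.
dim = rb_chromaticity((40, 60, 20))
bright = rb_chromaticity((80, 120, 40))
print(dim, bright)
```

Both calls yield the same coordinates, which is the property the paragraph describes: only color, not light intensity, survives the projection.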
There exist several approaches to automatic white balancing. In several conventional approaches, characteristics of an image (e.g., the coordinates of the pixels comprising the image) are used to estimate the color of the illumination. This estimated illumination, represented as a value, is subsequently factored out of the pixel colors. A popular method is known as the “Gray World” approach. According to the Gray World method, the color values corresponding to pixels of an image are averaged, and the average color of the image is used as the estimated color of the illuminant, which is then removed. Scale factors for each color channel are chosen so that the average color, after scaling is performed, is a neutral (gray) color value.
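The Gray World procedure described above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the function names are hypothetical, pixels are assumed to be linear (R, G, B) tuples in a non-empty list, and the neutral target is assumed to be the mean of the three channel averages (many implementations normalize to the green channel instead).

```python
def channel_means(pixels):
    """Average each color channel over a non-empty list of (R, G, B) tuples."""
    count = len(pixels)
    return tuple(sum(p[c] for p in pixels) / count for c in range(3))


def gray_world_gains(pixels):
    """Per-channel scale factors that map the average color to a neutral gray."""
    means = channel_means(pixels)
    # Assumption: the gray target is the mean of the three channel averages.
    target = sum(means) / 3.0
    # Guard against an all-zero channel by leaving it unscaled.
    return tuple(target / m if m > 0 else 1.0 for m in means)


def apply_gains(pixels, gains):
    """Scale every pixel by the per-channel gains."""
    return [tuple(p[c] * gains[c] for c in range(3)) for p in pixels]


# Illustrative pixel values with a warm (reddish) overall cast.
scene = [(180, 120, 60), (90, 60, 30), (120, 100, 80), (200, 140, 90)]
gains = gray_world_gains(scene)
balanced = apply_gains(scene, gains)
balanced_means = channel_means(balanced)
print(gains, balanced_means)
```

After scaling, the three channel averages coincide, which is exactly the condition the method enforces regardless of whether the scene content justifies it.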
Unfortunately, one of the major failings of standard Gray World is that it typically performs very poorly in scenes with a dominant color or large colored surfaces, such as a close-up of a human face or an image dominated by large portions of blue sky. For these scenes, the Gray World technique biases the illuminant estimate toward the dominant color (e.g., the color of human skin, or blue). After the illumination is factored out, the resulting skin or sky can appear overly gray (neutral), with all other objects having an unintended (and inaccurate) hue. The default Gray World approach also has the problem that nothing prevents it from rendering actual gray objects in a scene inaccurately as green or magenta as a result of applying the bias from the overall scene, and in certain cases Gray World can even perform worse than a fixed white balance.
Furthermore, the estimated illuminant color derived from the average of the pixel values can be sub-optimal for the purposes of normalization. In certain circumstances, the estimated illuminant color can be a highly unlikely color for an illuminant and factoring the illuminant color out of the image will result in images with distorted coloring. For example, in scenes with mostly green foliage, the average color value will be a value that approximates some shade of green. According to the Gray World model, the illuminant will be estimated as a green light and will be subsequently factored out of the image, thus resulting in foliage that appears neutral, i.e., gray, and adversely affecting the appearance of the image.
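The foliage failure can be reproduced numerically. The sketch below uses made-up pixel values for a scene that is assumed to be lit by neutral light, so the correct gains would be (1, 1, 1). The helper name and the choice of gray target (mean of the channel averages) are assumptions for illustration.

```python
def gray_world_gains(pixels):
    """Gray World gains for a non-empty list of (R, G, B) tuples."""
    count = len(pixels)
    means = [sum(p[c] for p in pixels) / count for c in range(3)]
    target = sum(means) / 3.0  # assumed gray target
    return tuple(target / m if m > 0 else 1.0 for m in means)


# Hypothetical scene under neutral light: 90% foliage, 10% gray card.
foliage = (40, 120, 30)
gray_card = (100, 100, 100)
scene = [foliage] * 90 + [gray_card] * 10

gains = gray_world_gains(scene)
foliage_after = tuple(foliage[c] * gains[c] for c in range(3))
card_after = tuple(gray_card[c] * gains[c] for c in range(3))
print("gains:", gains)
print("foliage after:", foliage_after)
print("gray card after:", card_after)
```

Because green dominates the average, the green gain falls well below the red and blue gains. The foliage is pulled toward gray, while the truly neutral card ends up with less green than red or blue, i.e., a magenta cast.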
There are a number of conventional alternatives to the Gray World technique, and the technique itself is often considered a poor method of white balancing in academic literature. However, Gray World remains a popular choice for white balancing implementations in practice due to its robustness to sensor variation, simplicity of implementation, computational speed, and general stability. Previous attempts to improve Gray World performance include constraining the Gray World estimate so that it corresponds to a light, a “difference Gray World” that uses only edge samples to reduce the impact of solid colored objects, and various methods that modify the contribution of a sample to the white balance estimates as a function of sample brightness. However, each of these attempts, while providing advantages over the basic Gray World solution, unfortunately still suffers from some flaws. Constraining the estimate still fails to address the fundamental problems of Gray World, and merely reduces the worst-case error. Difference Gray World helps reduce the impact of large solid colors, but reduces the number of viable samples, which increases sensitivity to scene changes, and also tends to have a bias towards highly textured surfaces. Methods that weight bright samples more heavily have severe problems in indoor scenes where the outdoors is visible, as sunlight tends to be much brighter than interior lighting.
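The last problem, brightness weighting in a mixed indoor/outdoor scene, can be illustrated with a rough sketch. The weighting rule here (mean channel value raised to a `power` exponent) is an assumption made for this example and is not taken from any specific prior method. The pixel values are hypothetical.

```python
def weighted_illuminant_estimate(pixels, power=1.0):
    """Estimate illuminant color as a brightness-weighted channel average.

    Assumption: each sample's weight is its mean channel value raised to
    `power`; power=0.0 reduces to the plain Gray World average.
    """
    weights = [(sum(p) / 3.0) ** power for p in pixels]
    total = sum(weights)
    if total == 0:
        return (0.0, 0.0, 0.0)  # all-black input: no usable estimate
    return tuple(
        sum(w * p[c] for w, p in zip(weights, pixels)) / total for c in range(3)
    )


# Hypothetical indoor scene: warm, dim interior plus a small, bright window.
interior = (60, 45, 25)
window = (220, 230, 250)
scene = [interior] * 90 + [window] * 10

plain = weighted_illuminant_estimate(scene, power=0.0)
weighted = weighted_illuminant_estimate(scene, power=1.0)
print("plain R/B:", plain[0] / plain[2])
print("weighted R/B:", weighted[0] / weighted[2])
```

Although the window covers only a tenth of the samples, weighting by brightness drags the red-to-blue ratio of the estimate toward the daylight color, so the interior, which makes up most of the image, would be left with a warm cast after correction.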