Color is one of the main visual cues and has been studied extensively on many different levels, from the physics and psychophysics of color to the use of color principles in practical problems. These include accurate rendering, display and reproduction, image filtering, coding, retrieval, and numerous other applications in scientific visualization, computer graphics, image and video processing. Interestingly, although color naming represents one of the most common visual tasks, it has not received significant attention in the engineering community. Yet today, with rapidly emerging visual technologies and multimedia, and the development of sophisticated user interfaces and human-machine interactions, the ability to name individual colors, point to objects of a certain color, and convey the impression of color composition becomes an increasingly important task. Color cues can be used in interactive visualization and computer graphics. Color naming facilitates natural user interface design. The extraction of higher-level color descriptors represents a challenging problem in image analysis and computer vision, as these descriptors often provide a link to image content. When combined with image segmentation, color naming could be used to select objects by color, describe the appearance of the image and even generate semantic annotations.
For example, regions labeled as light blue and strong green may represent sky and grass, vivid colors are typically found in man-made objects, while modifiers such as brownish, grayish and dark convey the impression of the atmosphere in the scene.
The applications mentioned so far require a flexible computational model for color categorization, color naming or extraction of color composition (i.e., the color appearance of a given scene or image to a human observer). Modeling human behavior in color categorization involves solving, or at least providing some answers to, several important problems. The first problem involves the definition of the basic color categories and their “most representative examples”, called prototypical colors, which play a special role in structuring these color categories. Another issue is how to expand the notion of basic color terms into a “general” yet precise vocabulary of color names that can be used in different applications. The next problem involves the definition of category membership. Although the idea that color categories are formed around prototypical examples has received striking support in many studies, the mechanisms of color categorization and category membership are not yet fully understood.
According to the theories postulated to explain human perception, color vision is initiated in the retina, where the three types of cones receive the light stimulus. The cone responses are then coded into one achromatic and two antagonistic chromatic signals. These signals are interpreted in the cortex, in the context of other visual information received at the same time and the previously accumulated visual experience (memory). Once the intrinsic character of a colored surface has been represented internally, one may think that the color processing is complete. However, an ever-present fact about human cognition is that people go beyond the purely perceptual experience to classify things as members of categories and attach linguistic labels to them. Color is no exception. That color categories are perceptually significant can be demonstrated by the “striped” appearance of the rainbow. In physical terms, the rainbow is just light whose wavelength changes smoothly from 400 to 700 nm. The unmistakable stripes of color in the rainbow suggest an experimental basis for the articulation of color into at least some categories. However, to model color naming, it is not sufficient to define the color names as functions of the wavelength range. This would account only for pure monochromatic stimuli, which are very rare in real-world situations, and would also leave out non-spectral colors like brown, white and black. Breakthroughs in the current understanding of color categorization came from several sources. These include a cross-cultural study of color naming behavior with subjects from a variety of languages. Twenty languages were examined experimentally and another 78 through a literature review, and the study discovered remarkable regularities in the shape of the basic color vocabulary. As a result of this study, the concept of basic color terms was introduced, which led to work on defining the color categories corresponding to these basic terms.
Eleven basic terms were identified in English: black, white, red, green, yellow, blue, brown, pink, orange, purple and gray. Experiments also demonstrated that humans perform much better in picking the “best example” for each of the color terms than in establishing the boundaries between the categories. This led to the definition of focal colors representing the centers of color categories, and to the hypothesis of graded (fuzzy) membership. Many later studies have supported this hypothesis, indicating that prototypical colors play a crucial role in the internal representation of color categories, and that membership in color categories seems to be represented relative to the prototype. Unfortunately, the mechanism of color naming is still not completely understood. There exist a few theoretical models of color naming based explicitly on the neurophysiology of color vision and addressing the universality of color foci and graded membership. Apart from not being developed or implemented as full-fledged computational models, these have important drawbacks. In one model, membership in color categories is formalized in terms of fuzzy set theory, by allowing objects to be members of a given set to some degree. In terms of color categories, this means that a focal or prototypical color will be represented as having a membership degree of 1 for its category. Other, non-focal colors will have membership degrees that decrease systematically with the distance from the focal color in some color space. However, this model considers only four fuzzy sets (red, green, yellow and blue), and supporting other color terms requires the introduction of new and ad hoc fuzzy set operations. Furthermore, it is not clear how the non-spectral basic color categories, such as brown, pink and gray, are to be dealt with, nor how to incorporate the learning of color names into the model.
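The graded membership described above can be illustrated with a minimal sketch. It is not the cited model: the focal coordinates, the Euclidean distance measure, the linear falloff and the `radius` cutoff are all assumptions chosen only to show a membership degree of 1 at the focal color that decreases with distance.

```python
import math

# Hypothetical focal colors in a generic three-dimensional color space.
# The coordinates are illustrative values, not measured color foci.
FOCAL_COLORS = {
    "red": (50.0, 70.0, 50.0),
    "green": (50.0, -60.0, 40.0),
    "yellow": (90.0, -5.0, 85.0),
    "blue": (35.0, 15.0, -60.0),
}


def membership(color, focal, radius=60.0):
    """Return a membership degree in [0, 1].

    The degree is 1 at the focal color and falls linearly to 0 at
    `radius`, an assumed cutoff distance in the color space.
    """
    distance = math.dist(color, focal)
    return max(0.0, 1.0 - distance / radius)


def memberships(color):
    """Membership degree of `color` in each of the four fuzzy sets."""
    return {name: membership(color, focal) for name, focal in FOCAL_COLORS.items()}
```

A color can belong to several categories at once under this scheme; for example, a point midway between the assumed red and yellow foci would receive a nonzero degree for both.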
Another model defines four physical parameters of the stimulus: wavelength, intensity, purity and adaptation state of the retina. According to this model, the pre-cortical visual system performs analog-to-digital conversion of these four parameters, and represents eleven basic color categories as specific combinations of the quantized values. Although interesting for its attempt to take adaptation into account, this model is clearly a gross simplification, which cannot hold in general.
Although color spaces allow for color specification in an unambiguous manner, in everyday life colors are mainly identified by their names. Although this requires a fairly general color vocabulary and is far from being precise, identifying a color by its name is a method of communication that everyone understands. Hence, there have been several attempts at designing a vocabulary, syntax and standard method for choosing color names. The Munsell color order system, known to those skilled in the art, is widely used in applications requiring precise specification of colors. Examples include the production of paints, textiles, etc. It is often used as an industry standard, complemented by Munsell's Book of Color, which includes 1,200 precisely controlled samples of colors (chips). The chips are arranged such that unit steps between them are intended to be perceptually equal. Each chip is identified by a 3-part code. The brightness scale is represented by the Munsell value, with black denoted by 0/ and white by 10/. Munsell chroma increases in steps of two (/2, /4, . . . , /10). The hue scale is divided into 10 hues: red (R), yellow-red (YR), yellow (Y), green-yellow (GY), green (G), blue-green (BG), blue (B), purple-blue (PB), purple (P) and red-purple (RP). Each hue can be further divided into ten sections. One notable disadvantage of the Munsell system for color-based processing is the lack of an exact transform from other color spaces to Munsell. For example, a transform proposed by others is fairly complicated and sometimes inaccurate for certain regions of CIE XYZ.
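As an illustration of the 3-part code, the following sketch parses a chip designation written in the conventional form "hue value/chroma", for example "5R 4/10". The names `MunsellColor`, `parse_munsell` and `hue_position` are hypothetical helpers introduced here, and neutral grays (written without a hue) are deliberately not handled.

```python
import re
from typing import NamedTuple

# The ten Munsell hues in the order listed above.
HUES = ("R", "YR", "Y", "GY", "G", "BG", "B", "PB", "P", "RP")


class MunsellColor(NamedTuple):
    hue_step: float  # section within the hue, greater than 0 and at most 10
    hue: str         # one of HUES
    value: float     # 0 (black) to 10 (white)
    chroma: float


_NUMBER = r"(\d+(?:\.\d+)?)"
_PATTERN = re.compile(
    rf"^\s*{_NUMBER}\s*([A-Z]{{1,2}})\s+{_NUMBER}\s*/\s*{_NUMBER}\s*$"
)


def parse_munsell(code):
    """Parse a designation such as '5R 4/10' into its three parts."""
    match = _PATTERN.match(code)
    if match is None:
        raise ValueError(f"not a Munsell designation: {code!r}")
    step, hue, value, chroma = match.groups()
    if hue not in HUES:
        raise ValueError(f"unknown hue: {hue!r}")
    hue_step, value = float(step), float(value)
    if not 0 < hue_step <= 10:
        raise ValueError(f"hue section out of range: {hue_step}")
    if not 0 <= value <= 10:
        raise ValueError(f"value out of range: {value}")
    return MunsellColor(hue_step, hue, value, float(chroma))


def hue_position(color):
    """Position on a 0-100 hue circle (ten hues of ten sections each)."""
    return (HUES.index(color.hue) * 10 + color.hue_step) % 100
```

Parsing only recovers the chip designation; it does not address the transform between Munsell and other color spaces, which is the difficulty noted above.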
The first listing of over 3000 English words and phrases used to name colors was devised by Maerz and Paul and published in A Dictionary of Color. Even more detailed was a dictionary published by the National Bureau of Standards. It included about 7500 different names that came into general use in specific fields such as biology, geology, philately, and the textile, dye and paint industries. Both dictionaries include examples of rare or esoteric words, and the terms are listed in an entirely unsystematic manner, making them unsuitable for general use. Following the recommendation of the Inter-Society Color Council, the National Bureau of Standards developed the ISCC-NBS dictionary of color names for 267 regions in color space. This dictionary employs English terms to describe colors along the three dimensions of the color space: hue, brightness and saturation. One problem with the ISCC-NBS model is the lack of a systematic syntax. This was addressed during the design of a new Color-Naming System (CNS). The CNS was based in part on the ISCC-NBS model. It uses the same three dimensions; however, the rules used to combine words from these dimensions are defined in a formal syntax. An extension of the CNS model, called the Color-Naming Method (CNM), uses a systematic syntax similar to the one described in the CNS model, and maps the color names from the CNM into color ranges in the Munsell system. All the aforementioned methods are closely related to the Munsell model and thus explain how to locate each name within the Munsell color space. However, it is not obvious how to use these methods to automatically attach a color name to a color sample, point out examples of named colors, describe the color regions and objects in a scene, and ultimately communicate the color composition of the image.
One approach to these problems discloses a process for creating a color name dictionary and for querying an image by color name. The steps of the disclosed process are to identify a preferred color space, to divide it into a plurality of color space segments, and to assign a color name to each of the plurality of color segments. In accordance with that disclosure, a color name dictionary defines a set of color names and color name boundaries, advantageously in a three-dimensional visually uniform color space. Each color name is represented by a volume in the color space. Given an input pixel, the color name is assigned using a disclosed method, which identifies the volume that includes the color value of the input pixel. However, many psychophysical experiments have demonstrated that humans perform much better in picking the “best example” for each of the color terms than in establishing the boundaries between the color names or color categories and, most importantly, that prototypical colors play a crucial role in the internal representation of color categories, as membership in color categories seems to be represented relative to the prototype.
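The volume-based assignment described above can be sketched as follows. This is an illustration only, not the disclosed process: the dictionary `COLOR_NAME_VOLUMES`, the choice of axis-aligned boxes as volumes, and all numeric ranges are assumptions made for the example.

```python
from typing import Optional, Sequence

# Hypothetical color name dictionary: each name owns one axis-aligned box,
# given as (low, high) ranges along three color space axes. The ranges are
# illustrative and were chosen so that the boxes do not overlap.
COLOR_NAME_VOLUMES = {
    "red": ((20.0, 70.0), (40.0, 100.0), (0.0, 80.0)),
    "green": ((20.0, 90.0), (-100.0, -30.0), (0.0, 90.0)),
    "blue": ((10.0, 70.0), (-20.0, 60.0), (-110.0, -30.0)),
}


def name_for_pixel(color: Sequence[float]) -> Optional[str]:
    """Return the name whose volume contains `color`, or None if no volume does."""
    for name, box in COLOR_NAME_VOLUMES.items():
        if all(low <= c < high for c, (low, high) in zip(color, box)):
            return name
    return None
```

The result depends entirely on where the box boundaries are drawn, which is the aspect of the approach that the psychophysical findings above call into question.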
The aforementioned approach also provides a method for querying an image by color name. The steps of the disclosed process involve direct application of the color naming method to individual image pixels and computing the fractional count for each color name from the dictionary. To allow for more specific descriptions, the image is divided into a fixed set of regions defined by the image region dictionary (center, bottom, bottom left, etc.), the fractional counts are also computed for each region, and that representation is used to answer queries such as “Which of the images in the database have the most of color name red in the top-right region”. However, this representation is not in agreement with the way humans perceive images and describe their color composition. Humans do not perceive image content as being in the top or bottom right portion of the image; they perform logical analysis (image segmentation) and extract meaningful regions and objects from that image. Humans then describe these objects with a single color, e.g. “sky is blue”, not by the fractional count of the color names occurring within. Furthermore, it is well known that although digital images may include millions of colors, only a very small number of these are actually perceived. Therefore, the direct representation of the color name histogram does not match the representation generated by the human visual system.
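The fractional-count representation can be sketched as below. The sketch assumes that every pixel has already been assigned a color name and that the fixed regions form a regular grid; both the grid layout and the function names are assumptions for illustration, not the disclosed image region dictionary.

```python
from collections import Counter


def fractional_counts(names):
    """Fraction of pixels carrying each color name in a flat list of names."""
    counts = Counter(names)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}


def region_counts(named_image, rows=2, cols=2):
    """Fractional counts for each cell of a fixed rows x cols grid.

    `named_image` is a list of pixel rows, each pixel being a color name.
    Keys of the result are (grid_row, grid_col) pairs.
    """
    height, width = len(named_image), len(named_image[0])
    result = {}
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            pixels = [
                named_image[y][x]
                for y in range(top, bottom)
                for x in range(left, right)
            ]
            result[(r, c)] = fractional_counts(pixels)
    return result
```

A query for the amount of red in the top-right region then reduces to reading `region_counts(image)[(0, cols - 1)].get("red", 0.0)` for each image, which makes plain that the regions are tied to image geometry rather than to segmented objects.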
A computational model that is better matched to human behavior in naming individual colors has been proposed. This method uses color naming data and applies a variant of the Gaussian normal distribution as a category model. However, this method is constrained to the lowest level of color naming, as it was fitted to the eleven basic color names. For example, although it allows for intermediate hues, such as greenish yellow, the model does not account for commonly used saturation or luminance modifiers, such as vivid orange or light blue. Since the quality of color categorization depends on an intricate fitting procedure, there is no straightforward extension of the model to include these attributes, and the model cannot be used with other sets of color names.
As may be appreciated, due to the shortcomings of the existing methodologies, there is a long-felt and unfulfilled need for a broader computational color naming method that will provide more detailed color descriptions and allow for higher-level color communication to: automatically attach a color name to a color sample, point out examples of named colors, describe the color regions and objects in a scene, and ultimately communicate the overall color composition of an image.