1. Field of the Invention
The present invention relates generally to image processing techniques and, more particularly, to an image analyzing and expression adding apparatus for analyzing images such as quantified full-color pictures so that an operator may determine a suitable design plan according to the result of such analysis, the apparatus further allowing displayed images to be supplemented by necessary expressions in keeping with the design plan thus obtained.
2. Description of the Related Art
Today, desk-top publishing (abbreviated to DTP hereunder) is gaining widespread use and finding its way into small offices and households. Processing of images for use in DTP is becoming easier than before thanks to the development of image editing devices and image retouch systems. However, DTP has posed two major obstacles for ordinary users (i.e., nonspecialists) when it comes to adding suitable expressions to images such as photographs in a manner consistent with the purposes of the target document being prepared.
The first obstacle is that so-called suitable expressions to be added to an image are something that is difficult for people to understand in the first place. The second obstacle is the lack of a method for efficiently transmitting to the image processing system in use those locations within the image to which to add the desired expressions.
The second obstacle applies not only to ordinary users but also to specialists such as photo retouchers. There exist some conventional methods for designating necessary areas in images. One such method involves enclosing designated pixels with appropriate circles or rectangles. Another traditional method involves regarding contiguous pixels of similar colors as the same area and expanding that area collectively. One disadvantage of these methods is the difficulty in effectively designating areas in keeping with a given pattern. Yet another conventional method proposes applying area dividing techniques such as the k-means algorithm to designating image areas. This method is not very practical because the speed of processing drops considerably the larger the image. However, the second obstacle is being overcome by the recent development of algorithms of image segmentation such as the division k-means algorithm designed to divide large images into areas in a short time.
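The conventional method of regarding contiguous pixels of similar colors as the same area can be sketched illustratively as follows. This is a minimal sketch under stated assumptions, not the procedure of any cited system: the function name `grow_region`, the 4-connected neighborhood, the per-channel tolerance test and the sample pixel values are all hypothetical.

```python
from collections import deque


def grow_region(image, seed, tolerance):
    """Collect the pixels 4-connected to `seed` whose color lies within
    `tolerance` of the seed color (hypothetical similarity test)."""
    height, width = len(image), len(image[0])
    seed_color = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        row, col = queue.popleft()
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (row + d_row, col + d_col)
            if not (0 <= nxt[0] < height and 0 <= nxt[1] < width):
                continue  # outside the image
            if nxt in region:
                continue  # already collected
            color = image[nxt[0]][nxt[1]]
            # Similar if the largest per-channel difference is small enough.
            if max(abs(a - b) for a, b in zip(color, seed_color)) <= tolerance:
                region.add(nxt)
                queue.append(nxt)
    return region


# Illustrative 3x3 RGB image: two reds that are close, and a blue.
RED, NEAR_RED, BLUE = (200, 10, 10), (190, 20, 15), (10, 10, 200)
image = [
    [RED, NEAR_RED, BLUE],
    [RED, BLUE, BLUE],
    [BLUE, BLUE, RED],
]
region = grow_region(image, (0, 0), 30)
```

In this sketch the red pixel in the lower right corner is left out of the area grown from the upper left corner because no path of similar pixels connects the two, which illustrates why such methods have difficulty following a given pattern.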
To return to the first obstacle, the so-called suitable expression to be added to the image merits scrutiny. In DTP, images are often pasted onto the target document. Suitable images are images that conform to the purposes of the document in question and assist in achieving such purposes effectively. For example, suppose that a bicycle retailer plans to stock a large number of mountain bicycles which are sturdy, designed to run at high speed, and operable by nonmuscular female users; that in a bid to sell the bicycles, the retailer develops a sales campaign and promotes it using leaflets targeted to the retailer's region; and that each of the leaflets has full-color photographs of the mountain bicycle being marketed. In this hypothetical case, the image of the leaflet as a whole needs to be determined, and the messages to be conveyed by the photographs are to be appropriately designed as well. To begin with, sensitive language (i.e., a term or a set of terms for expressing aspects of sensitivity or impressions) is used to express what is desired to be conveyed to potential customers. The terms of sensitive language to be used in the sales campaign are assumed to be selected from among those most commonly used, such as the terms listed in FIG. 3, "Color Image Scale" of Japanese Patent Laid-Open No. Sho 52-80043 (1977). (In this laid-open publication, images and impressions expressed in colors are represented by terms of sensitive language called color image terms.) Since the sales campaign is targeted to female users, the retailer wants the leaflet to convey the "gentleness" of the product for use by female customers, the "sportiness" of the product because the bicycle is designed to run at high speed, and the "safety" of the product because the bicycle while running is most unlikely to overturn. 
With the sales campaign targeted exclusively to female prospects, the retailer decides to assign high priority to the term "gentleness" and medium priority to "sportiness" and "safety."
Suppose next that the leaflet is to be designed by a DTP system. First to be prepared are the layout of the leaflet and the photographs of the marketed product in compliance with the overall design plan. This is where some questions arise with respect to the photographs actually taken of the product and prepared for the campaign: Are the photographs fully representative of the "gentleness," "sportiness" and "safety" of the product in appropriate proportions? Are the images indicative of the retailer's originality and appealing to those who see them? Should the images be judged inadequate, where do they need to be retouched, and in what manner? These are the questions which stem from image analysis and which concern additional expressions based on such analysis.
To those who use the DTP system, answering the above questions requires specialized knowledge about design in general. It is almost impossible for them to answer these questions directly by simply referring to physical features of the image in question. This is a major barrier that hampers efforts to render the designing process automatic or semiautomatic.
How far the research on image analysis has advanced and what can be done based on what has been found so far will now be outlined in three aspects: symbolic, concrete and abstract. The symbolic aspect of image analysis involves examining images in terms of characters and symbols. The concrete aspect of image analysis involves interpreting images as concrete objects such as human beings and automobiles, while the abstract aspect involves analyzing images in terms of color surface areas and compositions as in abstract paintings. Japanese Patent Laid-Open No. Hei 5-225266 (1993) proposes a design apparatus for supporting design activities in their concrete aspect. However, as the preferred embodiment of the laid-open patent is shown to address only automotive parts, systems of the disclosed kind need to have their targets limited to certain fields. Otherwise the computers of such systems are incapable of processing the huge quantities of image knowledge involved. At present, it is apparently difficult to implement general-purpose design apparatuses working on the concrete aspect of image analysis. Meanwhile, research on the abstract aspect of image analysis is making significant progress through the efforts of a number of researchers. The abstract aspect of image analysis is particularly important for photographic images because their symbolic aspect was already determined when they were taken.
How the abstract aspect of images is analyzed is discussed illustratively by D. A. Dondis in "A primer of visual literacy", MIT Press, 1973, ("Katachi wa kataru" in Japanese, translated by Takayoshi Kaneko, Science-sha), and by Arnheim in "Arts and Vision" ("Bijutsu to shikaku" in Japanese, translated by Kanji Hatano and Yorio Seki, Bijutsu-Shuppansha). With respect to the synesthetic effects of colors, attempts have been made to grasp quantitatively the relations between the brightness of colors and their perceived heaviness as well as the relations between chromaticity and warmness (e.g., by Hideaki Chijiwa in "Chromatics" from Fukumura-Shuppan). In addition, the effects of colors are known to differ depending on the area occupied by each color, as described by Masako Ashizawa and Mitsuo Ikeda in "Area Effects of Chromatic Prominence" ("Iro no Medachi no Menseki-Koka" in Japanese, the periodical of the Japan Society of Chromatics, Vol. 18, No. 3, 1994). The chromatic effects of an image can be predicted to a certain extent with the colors and their areas in the image taken as parameters. The color image scale disclosed in the above-cited Japanese Patent Laid-Open No. Sho 52-80043 is an application of the synesthetic effects of colors to image analysis. The disclosed color image scale is prepared by polling a large number of people on the impressions they receive from colors (i.e., color images) and by laying out the results of the poll in a space composed of three axes: warm/cool, hard/soft, and clear/grayish. The color image scale is applied extensively not only to colors but also to other image elements in diverse fields of design activities. However, since this scale was originally provided to address colors alone, inconveniences have often been experienced. 
For example, while colors give impressions of warmness or coolness mainly on the basis of their chromaticity, textures like furs tend to give warm impressions and those of metal surfaces are likely to give cool impressions. It follows illustratively that the texture of a blue fur and that of a red metal surface are both located close to the origin of the color image scale. That is, images are more likely to be located near the origin of the scale the more complex their component elements become. This negates the initial purpose of the color image scale, i.e., that of classifying images. Although the texture of, say, a red metal surface has a specific significance, its image is mapped near the origin of the scale and is thus expressed as something featureless and bland. To circumvent this problem, the laid-open patent proposes examining more detailed image terms in advance and mapping these terms in the color-based coordinates, whereby the expressions of diverse images are made possible in a simple, three-dimensional space. This technique promises an improvement in image differentiation but tends to highlight awkward aspects of the mapping of a design composed of numerous elements on the basis of the proposed image terms. The reason for such awkwardness is that distances in the image space are originally based on colors. That is, a long distance in the image space does not necessarily mean a large difference between images.
Another solution to the above problems in the abstract aspect of image analysis is a method of having a computer compute numerous physical and psychophysical quantities of target image colors. The results of the computations are arranged into axes constituting a space in which to map terms of sensitive language through "fuzzy" processing. Such a method is supposed to analyze a large number of image expressions. The method is expected to be effective when used on designs with simple visual effects but will not be suitable for analyzing complicated visual effects. The major reason for this is the difficulty experienced in furnishing design knowledge to the image analyzing system in use. The system is required to ensure direct correspondence between numerous physical and psychophysical quantities on the one hand, and the types of sensitivity (i.e., elements of sensitive language) given as impressions on the other hand. It is an enormously time-consuming task for the designer not familiar with physical or psychophysical quantities to prepare design knowledge in a way usable by the system. If the correspondence between a large number of physical and psychophysical quantities and the image types (elements of sensitive language) is acquired through statistical processing in order to eliminate the designer's chore, individual design techniques tend to cancel out one another the more complicated the composition of the elements in the design in question. That is, it is difficult to determine which of the techniques in the design is of utmost significance. In the example above of the bicycle retailer's sales campaign, two of the three key concepts, "sportiness" and "safety," are located at two extreme ends of one axis on the color image scale of FIG. 3 in Japanese Patent Laid-Open No. Sho 52-80043 (the scale is one of a number of known image scales proposed at present). There is no way that these two concepts are both satisfied on the proposed scale. 
An attempt forcibly to satisfy the two concepts simultaneously will simply map their images close to the origin of the scale.
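The collapse toward the origin can be illustrated numerically. The following is a minimal sketch under stated assumptions: the coordinates are hypothetical values in the range -1 to 1 on the three axes (warm/cool, hard/soft, clear/grayish) and are not taken from the cited color image scale; the axis on which the two terms oppose each other and the use of a component-wise average to stand for "satisfying both concepts" are likewise assumptions made for illustration.

```python
import math


def blend(points):
    """Average several scale positions component by component."""
    count = len(points)
    return tuple(sum(p[i] for p in points) / count for i in range(3))


def distance_from_origin(point):
    """Euclidean distance of a scale position from the origin."""
    return math.sqrt(sum(c * c for c in point))


# Hypothetical positions: the two terms sit at opposite ends of one axis.
sportiness = (0.25, 0.75, 0.0)
safety = (0.25, -0.75, 0.0)

blended = blend([sportiness, safety])
print(blended, distance_from_origin(blended))
print(distance_from_origin(sportiness), distance_from_origin(safety))
```

With these assumed values the opposing components cancel, so the blended position lies much closer to the origin than either term alone and no longer distinguishes the image from a featureless one.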
Furthermore, a good designer should first of all have a firm grasp of general design techniques and of the effects that such techniques can exert on viewers. Secondly, the designer should be able to fully project his or her originality in the works rendered. This means that a competent system should also have a solid grasp of general design techniques and of the effects that such techniques can exert on viewers. In addition, the system should be capable of providing its works with its own distinctiveness (i.e., designer's originality). Only such a system can adequately address the tasks of image analysis and processing. As described above, conventional image analyzing systems merely compute the images that viewers directly receive from physical and psychophysical quantities and are incapable of sufficient image analysis. A system for analyzing complicated image expressions needs a knowledge database structure ready to accommodate designers' knowledge and expertise. This kind of knowledge database requires arrangements for incorporating general design techniques, designers' originality, and the differences in image perception between viewers of, say, different age groups and ethnic origins.
Another disadvantage of existing image analyzing systems is that they can do no more than indicate, in percentage points, how much of the purpose of the image in question will have been achieved overall. (In the example of the bicycle retailer's sales campaign, the purposes of the images are expressed in priorities 2, 1 and 1 assigned respectively to the concepts "gentleness," "sportiness" and "safety." The larger the number, the higher the priority.) Although the percentage thus obtained provides a rule of thumb in the subsequent steps of image retouch and expression supplementation, the result of the analysis cannot be used directly in retouching images or in adding suitable expressions to images.
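Why a single overall percentage offers little guidance for retouching can be shown with a small worked example. The source does not state how existing systems compute the percentage; the priority-weighted mean below, the function name `overall_achievement` and the per-concept scores are assumptions made purely for illustration. Only the priorities 2, 1 and 1 come from the example above.

```python
def overall_achievement(priorities, scores):
    """Priority-weighted mean of per-concept scores (each 0.0 to 1.0),
    returned as a percentage. The formula is a hypothetical stand-in."""
    total_weight = sum(priorities.values())
    weighted = sum(priorities[term] * scores[term] for term in priorities)
    return 100.0 * weighted / total_weight


# Priorities from the bicycle retailer example.
priorities = {"gentleness": 2, "sportiness": 1, "safety": 1}

# Two hypothetical photographs with very different shortcomings.
photo_a = {"gentleness": 0.5, "sportiness": 1.0, "safety": 0.0}
photo_b = {"gentleness": 0.25, "sportiness": 0.5, "safety": 1.0}

result_a = overall_achievement(priorities, photo_a)
result_b = overall_achievement(priorities, photo_b)
print(result_a, result_b)
```

Under these assumptions both photographs receive the same overall figure, although the first fails to express "safety" and the second is weak in "gentleness." The percentage alone therefore does not indicate which expression needs to be added or where.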
As outlined above, there has yet to be an image analyzing system capable of checking whether the concepts of, say, "gentleness," "sportiness" and "safety" in the above-cited hypothetical bicycle retailer's sales campaign are suitably expressed with priorities 2, 1 and 1 respectively, and whether the target document exhibits originality.
Techniques for adding expressions to images will now be outlined. An ideal technique would be one allowing the result of the image analysis to be utilized directly in retouching images or in supplementing images with additional expressions. Since such a technique does not exist, the description that follows will focus on techniques for adding expressions to images.
Adobe Systems Incorporated is marketing the software product called PhotoShop (TM) that permits photo retouch with DTP systems. The same company also offers the software product called PageMaker (TM), which provides editing features. These products are inexpensive and may readily be purchased by individuals. As discussed earlier, for those who have never retouched photographs, it is no easy task to define those areas in the image to which to add expressions. A function that assists in defining such areas has yet to be implemented by PhotoShop. Japanese Patent Laid-Open No. Hei 6-83924 (1994) proposes an image editing apparatus aimed at bringing about that function. The proposed apparatus divides a given image into segments, each of which is labeled, and allows the user to operate a pointing device to select collectively the pixels included in a given segment. Because the apparatus selects only one of three factors, i.e., hue, chroma or value, as the image information used for segmentation, it suffers from the disadvantage of grouping long distances in an image into a single segment. Another disadvantage of the apparatus is that natural images used in printing tend to be large in size and thus take considerable time to segment.
Another method that has been practiced to convey additional expressions to the image editing system is one which allows a pen-like tool to be manipulated through the use of a pointing device in the same manner that a painter draws a painting. Another practiced method involves applying special effects to part or all of the image in question and/or carrying out color conversion on the image. There has also been proposed a similar method for performing color conversion by inputting not the values representing colors but the terms of sensitive language (image terms) mapped on the color image scale introduced earlier. Another proposed method involves classifying images in sensitive language and searching therethrough. In any case, it is not preferable to map on the color image scale the images containing complex information other than colors for the reasons discussed above.
A more serious problem is the absence of a system capable of carrying out the whole series of processing from image analysis to the addition of expressions to images in achieving the purposes of the target document using such images.
The major technical challenges discussed so far are summarized as follows:
(1) There exist no image analyzing and expressing systems capable of utilizing the result of image analysis in the process of adding expressions to the target image.
(2) Operators having no design knowledge are unable to add suitable expressions to images.
(3) Conventional image analyzing systems based on the existing image scales are incapable of fully analyzing an image that is expressed in those terms of sensitive language which have contradictory meanings.
(4) Conventional expression adding systems based on the existing image scales, when using a pair of sensitive language terms having contradictory meanings, are incapable of adding the applicable expressions simultaneously to an image.
(5) Designers' originalities cannot be expressed by conventional systems.
(6) It is not easy for systems to acquire knowledge specific to individual designers.
(7) Systems can only obtain simplified design knowledge.
(8) There has yet to be an effective way of handling images.